Article

Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection

PATTERN RECOGNITION (2019)

Journal

PATTERN RECOGNITION

Volume 86, Pages 376-385

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.patcog.2018.08.007

Keywords

RGB-D; Convolutional neural networks; Multi-path; Saliency detection

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

Research Grants Council of Hong Kong [11205015, CityU 11255716]

Abstract

Paired RGB and depth images are becoming popular multi-modal data adopted in computer vision tasks. Traditional methods based on Convolutional Neural Networks (CNNs) typically fuse RGB and depth by combining their deep representations in a late stage with only one path, which can be ambiguous and insufficient for fusing large amounts of cross-modal data. To address this issue, we propose a novel multi-scale multi-path fusion network with cross-modal interactions (MMCI), in which the traditional two-stream fusion architecture with single fusion path is advanced by diversifying the fusion path to a global reasoning one and another local capturing one and meanwhile introducing cross-modal interactions in multiple layers. Compared to traditional two-stream architectures, the MMCI net is able to supply more adaptive and flexible fusion flows, thus easing the optimization and enabling sufficient and efficient fusion. Concurrently, the MMCI net is equipped with multi-scale perception ability (i.e., simultaneously global and local contextual reasoning). We take RGB-D saliency detection as an example task. Extensive experiments on three benchmark datasets show the improvement of the proposed MMCI net over other state-of-the-art methods. (C) 2018 Elsevier Ltd. All rights reserved.
