Article

Learning Discriminative Cross-Modality Features for RGB-D Saliency Detection

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 31, Issue -, Pages 1285-1297

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2022.3140606

Keywords

Feature extraction; Correlation; Saliency detection; Fuses; Convolution; Task analysis; Object detection; RGB-D saliency detection; cross-modality features; correlation-fusion

Funding

  1. National Key Research and Development Program of China [2018AAA0102002]
  2. National Natural Science Foundation of China [61732007, 61922043]
  3. Fundamental Research Funds for the Central Universities [30920041109]

Abstract

This paper addresses the modality-gap problem in RGB-D saliency detection by learning discriminative cross-modality features. The method calculates cross-modality correlations and combines them with local depth correlations to predict saliency maps. Experimental results demonstrate the superior performance of the proposed algorithm.
How to exploit useful information from depth is key to the success of RGB-D saliency detection methods. Because RGB and depth images come from different domains, the modality gap makes simple feature concatenation yield unsatisfactory results. To improve performance, most methods focus on bridging this gap by designing various cross-modal feature-fusion modules, while neglecting to explicitly extract the consistent information shared by the two modalities. To overcome this problem, we develop a simple yet effective RGB-D saliency detection method that learns discriminative cross-modality features with a deep neural network. The proposed method first learns modality-specific features for the RGB and depth inputs. It then separately calculates the correlation of every pixel pair in a cross-modality consistent way, i.e., the correlations computed from RGB features (RGB correlation) and from depth features (depth correlation) share the same distribution range. Although derived from different perspectives, color and spatial, the RGB and depth correlations both depict how tightly each pixel pair is related. Second, to gather complementary RGB and depth information, we propose a novel correlation-fusion module that fuses the RGB and depth correlations into a single cross-modality correlation. Finally, the features are refined with both the long-range cross-modality correlation and the local depth correlation to predict saliency maps: the long-range cross-modality correlation provides context information for accurate localization, while the local depth correlation preserves subtle structures for fine segmentation. In addition, a lightweight DepthNet is designed for efficient depth feature extraction. The whole network is trained in an end-to-end manner. Both quantitative and qualitative experimental results demonstrate that the proposed algorithm achieves favorable performance against state-of-the-art methods.
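The pipeline the abstract describes (consistent pixel-pair correlations per modality, correlation-level fusion, then long-range feature refinement) can be sketched in a few lines. The following PyTorch snippet is a minimal illustration, not the paper's implementation: the function names, the use of cosine similarity to keep both correlation ranges consistent, and the simple averaging fusion are assumptions standing in for the learned modules described above.

```python
# Minimal sketch of correlation-fusion for RGB-D saliency detection.
# All names and design choices here are illustrative assumptions,
# not the authors' exact formulation.
import torch
import torch.nn.functional as F


def pairwise_correlation(feat: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between every pixel pair of a (B, C, H, W) feature map.

    L2-normalizing the per-pixel feature vectors keeps the correlations of
    both modalities in the same [-1, 1] range, i.e. "cross-modality consistent".
    """
    b, c, h, w = feat.shape
    flat = feat.flatten(2)                         # (B, C, HW)
    flat = F.normalize(flat, dim=1)                # unit norm per pixel
    return torch.bmm(flat.transpose(1, 2), flat)   # (B, HW, HW)


def fuse_and_refine(rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
    """Fuse RGB and depth correlations, then refine the RGB features with the
    resulting long-range cross-modality correlation (non-local-style update)."""
    corr_rgb = pairwise_correlation(rgb_feat)      # color perspective
    corr_depth = pairwise_correlation(depth_feat)  # spatial perspective

    # Correlation-level fusion; a plain average stands in for the paper's
    # learned correlation-fusion module.
    corr = 0.5 * (corr_rgb + corr_depth)
    attn = F.softmax(corr, dim=-1)                 # row-normalized affinities

    b, c, h, w = rgb_feat.shape
    v = rgb_feat.flatten(2)                        # (B, C, HW)
    refined = torch.bmm(v, attn.transpose(1, 2))   # aggregate context per pixel
    return refined.view(b, c, h, w) + rgb_feat     # residual refinement
```

Under these assumptions, the key point is that fusion happens on the correlation matrices rather than on the raw features, so the complementary color and spatial affinities are combined before any context aggregation takes place.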
