Journal
IEEE TRANSACTIONS ON MULTIMEDIA
Volume 17, Issue 11, Pages 1887-1898
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2015.2476655
Keywords
Deep learning; large-margin feature learning; multi-modality; RGB-D object recognition
Funding
- Singapore National Research Foundation under its International Research Centre@Singapore Funding Initiative
- Singapore Ministry of Education (MOE) [RG 138/14]
- MOE [ARC28/14]
- Singapore A*STAR Science and Engineering Research Council [PSF1321202099]
Abstract
Most existing feature-learning methods for RGB-D object recognition either combine RGB and depth data in an undifferentiated manner from the outset or learn features from color and depth separately; neither approach adequately exploits the distinct characteristics of the two modalities or the relationship shared between them. In this paper, we propose a general CNN-based multi-modal learning framework for RGB-D object recognition. We first construct deep CNN layers for color and depth separately, and then connect them with a carefully designed multi-modal layer. This layer is designed not only to discover the most discriminative features for each modality, but also to harness the complementary relationship between the two modalities. The outputs of the multi-modal layer are back-propagated to update the parameters of the CNN layers, and the multi-modal feature learning and back-propagation are performed iteratively until convergence. Experimental results on two widely used RGB-D object datasets show that our method for general multi-modal learning achieves performance comparable to state-of-the-art methods specifically designed for RGB-D data.
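To make the two-stream design described in the abstract concrete, below is a minimal PyTorch-style sketch of the overall pipeline. The stream depths, layer sizes, input resolution, and the simple concatenation-plus-fully-connected fusion layer are illustrative assumptions; they do not reproduce the paper's carefully designed multi-modal layer or its large-margin learning objective, only the general structure of two modality-specific CNNs joined by a shared layer and trained end-to-end by back-propagation.

```python
# Hypothetical sketch of a two-stream RGB-D network with a fusion layer.
# The paper's multi-modal layer is a specially designed component; here a
# plain concatenation followed by a fully connected layer stands in for it.
import torch
import torch.nn as nn


class StreamCNN(nn.Module):
    """A small convolutional feature extractor for one modality (illustrative)."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(4),  # -> 64 x 4 x 4 feature map
        )

    def forward(self, x):
        return torch.flatten(self.features(x), start_dim=1)


class MultiModalNet(nn.Module):
    """Two modality-specific streams joined by a shared (multi-modal) layer."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.rgb_stream = StreamCNN(in_channels=3)    # color stream
        self.depth_stream = StreamCNN(in_channels=1)  # depth stream
        feat_dim = 64 * 4 * 4
        self.fusion = nn.Sequential(                  # stand-in multi-modal layer
            nn.Linear(2 * feat_dim, 512),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, rgb, depth):
        joint = torch.cat([self.rgb_stream(rgb), self.depth_stream(depth)], dim=1)
        return self.classifier(self.fusion(joint))


# Usage: gradients from the fusion layer flow back into both streams,
# mirroring the iterative multi-modal learning / back-propagation loop.
model = MultiModalNet(num_classes=10)   # num_classes is arbitrary here
rgb = torch.randn(8, 3, 64, 64)
depth = torch.randn(8, 1, 64, 64)
logits = model(rgb, depth)              # shape: (8, 10)
```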