Article

Cross-Modal Retrieval Using Multiordered Discriminative Structured Subspace Learning

IEEE TRANSACTIONS ON MULTIMEDIA (2017)

Journal

IEEE TRANSACTIONS ON MULTIMEDIA

Volume 19, Issue 6, Pages 1220-1233

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TMM.2016.2646219

Keywords

Cross-modal retrieval; documents and images; multimedia

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications

Funding

National Basic Research Program of China (973 Program) [2015CB351800, 2012CB316400]
National Natural Science Foundation of China [61572465, 61332016, 61429201, 61620106009, U1636214, 61303153]
Key Research Program of Frontier Sciences, Chinese Academy of Sciences [QYZDJ-SSW-SYS013]
ARO [W911NF-15-1-0290]
Faculty Research Gift Awards by the NEC Laboratories of America and Blippar

Abstract

This paper proposes a novel method for cross-modal retrieval. In addition to the traditional vector (text)-to-vector (image) framework, we adopt a matrix (text)-to-matrix (image) framework to faithfully characterize the structures of the different feature spaces. Moreover, we propose a novel metric learning framework to learn a discriminative structured subspace in which the underlying data distribution is preserved to ensure a desirable metric. Concretely, the proposed method has three steps. First, multiorder statistics are used to represent images and texts, enriching the feature information: we jointly use the covariance (second-order), the mean (first-order), and bags of visual (textual) features (zeroth-order) to characterize each image and text. Second, because the heterogeneous covariance matrices lie on different Riemannian manifolds and the other features lie in different Euclidean spaces, we propose a unified metric learning framework that integrates multiple distance metrics, one for each order of statistical feature. This framework preserves the underlying data distribution and exploits complementary information to better match heterogeneous data. Finally, the similarity between the modalities is measured by transforming the multiorder statistical features into the common subspace. Experiments on two public datasets demonstrate that the proposed method outperforms previous methods.
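The abstract names three orders of statistics but gives no formulas, so the following is only a minimal sketch of how such features might be computed for a single image or text. Everything in it is an assumption for illustration: the function names, the nearest-codeword histogram, the small regularizer added to the covariance, and the use of the log-Euclidean distance as one common way to compare covariance matrices on a Riemannian manifold. It does not implement the paper's learned metric or the projection into a common subspace.

```python
import numpy as np


def multiorder_stats(local_feats, codebook):
    """Return (histogram, mean, covariance) for one image or text.

    local_feats: (n, d) array of local descriptors (hypothetical input, n >= 2, d >= 2).
    codebook:    (k, d) array of codeword centers (hypothetical input).
    """
    # Zeroth order: bag-of-features histogram from nearest-codeword assignment.
    dists = np.linalg.norm(local_feats[:, None, :] - codebook[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    hist = np.bincount(assign, minlength=len(codebook)).astype(float)
    hist /= hist.sum()

    # First order: mean of the local descriptors.
    mean = local_feats.mean(axis=0)

    # Second order: covariance, regularized (assumed epsilon) so that it is
    # strictly positive definite and its matrix logarithm is defined.
    cov = np.cov(local_feats, rowvar=False) + 1e-6 * np.eye(local_feats.shape[1])
    return hist, mean, cov


def spd_log(mat):
    """Matrix logarithm of a symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def log_euclidean_distance(a, b):
    """Log-Euclidean distance between two SPD matrices (an assumed choice,
    not necessarily the metric used in the paper)."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")
```

The histogram and mean are ordinary vectors and can be compared with Euclidean distances, whereas the covariance is compared through its matrix logarithm. That split mirrors the abstract's distinction between Euclidean and Riemannian features, although the distance the paper actually learns may differ.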
