Article

Multi-modal semantic autoencoder for cross-modal retrieval

Journal

NEUROCOMPUTING
Volume 331, Pages 165-175

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2018.11.042

Keywords

Cross-modal retrieval; Multi-modal data; Autoencoder

Funding

  1. National Natural Science Foundation of China [61672497, 61332016, 61620106009, 61650202, U1636214]
  2. National Basic Research Program of China (973 Program) [2015CB351802]
  3. Key Research Program of Frontier Sciences of CAS [QYZDJ-SSW-SYS013]

Abstract

Cross-modal retrieval has gained much attention in recent years. As the research mainstream, most existing approaches learn projections that map data from different modalities into a common space where the data can be compared directly. However, these approaches neglect the preservation of feature and semantic information, and are therefore unable to obtain satisfactory results. In this paper, we propose a two-stage learning method that learns multi-modal mappings projecting multi-modal data to low-dimensional embeddings which preserve both feature and semantic information. In the first stage, we combine low-level feature and high-level semantic information to learn feature-aware semantic code vectors. In the second stage, we use an encoder-decoder paradigm to learn the projections: the encoder maps feature vectors to code vectors, and the decoder maps code vectors back to feature vectors. The encoder-decoder paradigm ensures that the embeddings preserve both feature and semantic information. An alternating minimization procedure is developed to solve the resulting multi-modal semantic autoencoder optimization problem. Extensive experiments on three benchmark datasets demonstrate that the proposed method outperforms state-of-the-art cross-modal retrieval methods. (C) 2018 Elsevier B.V. All rights reserved.
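The abstract describes, but does not formalize, the encoder-decoder step. As an illustration only, the sketch below shows one common way such a tied-weight linear semantic autoencoder is fitted, assuming the objective min_W ||X - W^T S||^2 + lam ||W X - S||^2 over an encoder W with the decoder tied as W^T. The function name, dimensions, and closed-form Sylvester-equation solution are illustrative assumptions, not the paper's published algorithm.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def fit_semantic_autoencoder(X, S, lam=1.0):
    """Illustrative tied-weight linear autoencoder (an assumption,
    not necessarily the paper's exact model).

    X   : d x n matrix of feature vectors from one modality.
    S   : k x n matrix of code vectors learned in stage one.
    lam : weight balancing decoding and encoding errors.

    Minimizing ||X - W.T @ S||^2 + lam * ||W @ X - S||^2 over the
    encoder W yields the Sylvester equation
        (S S^T) W + W (lam X X^T) = (1 + lam) S X^T,
    which has a closed-form solution.
    """
    A = S @ S.T                    # k x k
    B = lam * (X @ X.T)            # d x d
    C = (1.0 + lam) * (S @ X.T)    # k x d
    return solve_sylvester(A, B, C)  # encoder W (k x d); decoder is W.T

# Toy usage: embed 100 features (d=512) into a k=10 dimensional code space.
rng = np.random.default_rng(0)
X = rng.standard_normal((512, 100))
S = rng.standard_normal((10, 100))
W = fit_semantic_autoencoder(X, S, lam=1.0)
codes = W @ X  # low-dimensional embeddings for retrieval
```

In a multi-modal setting, one such encoder would be fitted per modality against the shared code vectors, matching the two-stage procedure the abstract outlines.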
