Article

Learning Feature Representation and Partial Correlation for Multimodal Multi-Label Data

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 23, Pages 1882-1894

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2020.3004963

Keywords

Semantics; Correlation; Task analysis; Data models; Learning systems; Kernel; Deep learning; Cross-modal retrieval; correlation learning; feature learning; partial correlation

Funding

  1. National Key R&D Program of China [2018AAA0102003]
  2. National Natural Science Foundation of China [61672497, 61836002, 61620106009, U1636214, 61931008]
  3. Key Research Program of Frontier Sciences of CAS [QYZDJ-SSW-SYS013]

Abstract

User-provided annotations in existing multimodal datasets are sometimes inappropriate for model learning and can hinder cross-modal retrieval. To handle this issue, we propose a discriminative and noise-robust cross-modal retrieval method, called FLPCL, which consists of deep feature learning and partial correlation learning. Deep feature learning uses label supervision to guide the training of a deep neural network for each modality, aiming to find modality-specific deep feature representations that preserve the similarity and discrimination information among multimodal data. Building on deep feature learning, partial correlation learning infers the direct association between different modalities by removing the effect of the common underlying semantics from each modality. This is achieved by maximizing the canonical correlation of the feature representations of different modalities conditioned on the label modality. Unlike existing works that build an indirect association between modalities by incorporating semantic labels, FLPCL learns more effective and robust multimodal latent representations by explicitly preserving both intra-modal and inter-modal relationships among multimodal data. Extensive experiments on three cross-modal datasets show that our method outperforms state-of-the-art methods on cross-modal retrieval tasks.
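
As a rough illustration of the partial correlation learning step described above (canonical correlation between two modalities conditioned on the label modality), the following NumPy sketch residualizes each feature matrix against the labels by least squares and then runs classical CCA on the residuals. The function name partial_cca, the toy data, and the closed-form solver are illustrative assumptions; the paper itself optimizes this objective end-to-end through modality-specific deep networks, which this sketch does not reproduce.

```python
# Minimal sketch of partial canonical correlation analysis (partial CCA):
# regress the label modality Z out of feature matrices X and Y, then run
# classical CCA on the residuals. This is a generic illustration, not the
# authors' FLPCL implementation.
import numpy as np

def _residualize(A, Z):
    """Remove the linear effect of Z (with intercept) from A via least squares."""
    Z1 = np.hstack([Z, np.ones((Z.shape[0], 1))])      # add intercept column
    coef, *_ = np.linalg.lstsq(Z1, A, rcond=None)       # regress A on Z
    return A - Z1 @ coef                                 # residuals of A given Z

def partial_cca(X, Y, Z, n_components=2, reg=1e-4):
    """Canonical correlations of X and Y conditioned on Z."""
    Xr, Yr = _residualize(X, Z), _residualize(Y, Z)
    Xr -= Xr.mean(axis=0)
    Yr -= Yr.mean(axis=0)
    n = X.shape[0]
    Cxx = Xr.T @ Xr / n + reg * np.eye(Xr.shape[1])      # regularized covariances
    Cyy = Yr.T @ Yr / n + reg * np.eye(Yr.shape[1])
    Cxy = Xr.T @ Yr / n
    # Whiten both modalities; the singular values of the whitened cross-covariance
    # are the canonical correlations of the residuals.
    Mx = np.linalg.inv(np.linalg.cholesky(Cxx)).T
    My = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    U, s, Vt = np.linalg.svd(Mx.T @ Cxy @ My)
    Wx = Mx @ U[:, :n_components]                        # projection for modality X
    Wy = My @ Vt.T[:, :n_components]                     # projection for modality Y
    return s[:n_components], Wx, Wy

# Toy usage: one-hot labels Z, plus a latent factor shared by X and Y beyond the labels.
rng = np.random.default_rng(0)
Z = np.eye(5)[rng.integers(0, 5, size=200)]
shared = rng.normal(size=(200, 3))
X = Z @ rng.normal(size=(5, 10)) + shared @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(200, 10))
Y = Z @ rng.normal(size=(5, 8)) + shared @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(200, 8))
corrs, Wx, Wy = partial_cca(X, Y, Z)
print(corrs)  # high values indicate a direct X-Y association not explained by the labels
```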
