Journal
PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17)
Volume: -, Issue: -, Pages: 1698-1706
Publisher
ASSOC COMPUTING MACHINERY
DOI: 10.1145/3123266.3123369
Keywords
Cross-media retrieval; rich semantic embeddings; multi-sensory fusion; TextNet
Funding
- Shenzhen Peacock Plan [20130408183003656]
- Shenzhen Key Laboratory for Intelligent Multimedia and Virtual Reality [ZDSYS201703031405467]
- Guangdong Science and Technology Project [2014B010117007]
Abstract
Cross-media retrieval aims to discover semantic associations between different media types. Most existing methods have focused on learning mapping functions or finding optimal common spaces, but have neglected how people actually perceive images and texts. This paper proposes a brain-inspired cross-media retrieval framework that learns rich semantic embeddings of multimedia. Rather than directly using off-the-shelf image features, we combine the visual and descriptive senses of an image, following the view of human perception, via a joint model called the multi-sensory fusion network (MSFN). A topic-model-based TextNet maps texts into the same semantic space as images according to their shared ground-truth labels. Moreover, to overcome the limitations of insufficient training data for neural networks and the limited complexity of existing text corpora, we introduce a large-scale image-text dataset, the Britannica dataset. Extensive experiments demonstrate the effectiveness of our framework on texts of different lengths across three benchmark datasets as well as the Britannica dataset. Above all, we report the best known average results for Img2Text and Text2Img compared with several state-of-the-art methods.
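As a minimal sketch of the retrieval step the abstract describes (not the paper's MSFN or TextNet, whose architectures are not given here): once images and texts are embedded in a shared semantic space, both Img2Text and Text2Img reduce to nearest-neighbour ranking by cosine similarity. All function names and dimensions below are illustrative assumptions.

```python
# Hypothetical sketch: cross-media retrieval by cosine ranking in a
# shared embedding space. The embedding networks themselves are assumed
# to exist; here we just rank pre-computed embeddings.
import numpy as np

def l2_normalize(x, axis=1, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def cross_media_rank(query_emb, gallery_emb):
    """Return gallery indices sorted from most to least similar to the query."""
    q = l2_normalize(query_emb.reshape(1, -1), axis=1)
    g = l2_normalize(gallery_emb, axis=1)
    sims = (g @ q.T).ravel()   # cosine similarity of query vs. each gallery item
    return np.argsort(-sims)   # indices in descending similarity order

# Toy example: one text-query embedding against three image embeddings.
rng = np.random.default_rng(0)
text_query = rng.normal(size=128)
images = rng.normal(size=(3, 128))
images[1] = text_query + 0.01 * rng.normal(size=128)  # semantically close image
ranking = cross_media_rank(text_query, images)
print(ranking[0])  # index 1 ranks first, since it nearly matches the query
```

The same ranking function serves both directions: swapping which modality supplies the query and which supplies the gallery switches between Text2Img and Img2Text.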