4.6 Article

Cross-modal retrieval with dual optimization

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 82, Issue 5, Pages 7141-7157

Publisher

SPRINGER
DOI: 10.1007/s11042-022-13650-0

Keywords

Cross-modal retrieval; Modality gap; Inter-modal optimization; Intra-modal optimization

This paper proposes a dual optimization method for cross-modal retrieval (CMRDO). It improves retrieval accuracy by optimizing the common representation space and by introducing an efficient sample construction strategy, and it shows strong generalization ability.
Cross-modal retrieval, which allows data in different modalities to be retrieved flexibly, has attracted growing attention from researchers. However, data of different modalities are separated by a heterogeneity gap, so their similarity cannot be measured directly. To address this, researchers project data of different modalities into a common representation space that compensates for the heterogeneity. Existing methods that rely on pair or triplet constraints, however, ignore the rich information between samples, which degrades retrieval performance. To fully mine the information in the samples, this paper proposes a cross-modal retrieval method with dual optimization (CMRDO). First, the method optimizes the common representation space from both the inter-modal and the intra-modal perspective. Second, it introduces an efficient sample construction strategy that avoids sample pairs carrying little information. Finally, it introduces a bi-directional retrieval strategy that effectively captures the latent structure of the query modality. On three public datasets, CMRDO improves cross-modal retrieval accuracy and shows strong generalization ability.
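
The abstract does not give CMRDO's loss functions, so the following is only a minimal sketch of the general idea it describes: an objective that combines an inter-modal term (image to text and text to image) with an intra-modal term (within each modality), evaluated on embeddings already projected into a common space. The hinge form, the hardest-sample selection standing in for the "efficient sample construction strategy", and the names `hinge_loss`, `dual_loss`, `margin`, and `weight` are all illustrative assumptions, not the authors' formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hinge_loss(anchors, a_labels, cands, c_labels, margin=0.2, skip_self=False):
    """Mean hinge loss over anchors (hypothetical helper).

    For each anchor, only the least similar same-class candidate and the
    most similar different-class candidate are used, so easy pairs that
    carry little information do not contribute.
    """
    total, count = 0.0, 0
    for i, (a, la) in enumerate(zip(anchors, a_labels)):
        pos, neg = [], []
        for j, (c, lc) in enumerate(zip(cands, c_labels)):
            if skip_self and i == j:
                continue  # intra-modal case: do not compare a sample with itself
            (pos if lc == la else neg).append(cosine(a, c))
        if pos and neg:
            total += max(0.0, margin + max(neg) - min(pos))
            count += 1
    return total / count if count else 0.0


def dual_loss(img, txt, labels, margin=0.2, weight=0.5):
    """Inter-modal term (both retrieval directions) plus weighted intra-modal term."""
    inter = (hinge_loss(img, labels, txt, labels, margin)
             + hinge_loss(txt, labels, img, labels, margin))
    intra = (hinge_loss(img, labels, img, labels, margin, skip_self=True)
             + hinge_loss(txt, labels, txt, labels, margin, skip_self=True))
    return inter + weight * intra


# Toy embeddings in a 2-D common space, one sample per class.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]   # each text matches its image
swapped_text = [[0.0, 1.0], [1.0, 0.0]]   # each text matches the wrong image
labels = [0, 1]

print(dual_loss(images, aligned_text, labels))
print(dual_loss(images, swapped_text, labels))
```

In this toy setup the aligned embeddings incur no loss, while the swapped ones are penalized in both retrieval directions. With one sample per class the intra-modal term has no positive pairs and contributes nothing; it only becomes active once a modality contains several samples of the same class.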
