Journal
PATTERN RECOGNITION
Volume 120, Issue -, Pages -
Publisher
ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2021.108084
Keywords
Hashing; Multi-label; Cross-modal retrieval; Deep learning
Funding
- National Science Foundation of China [61976115, 61732006, 61876087]
- Natural Science Foundation of Jiangsu Province [SBK2021043459]
- AI+ Project of NUAA [NZ2020012, 56XZA18009, 315025305]
- China Scholarship Council [201906830057]
RMSH is designed for more accurate multi-label cross-modal retrieval, addressing modality discrepancies and noise through fine-grained similarity over rich semantics and a robust margin-adaptive triplet loss. Effective bounds derived from an information coding-theoretic analysis enable the method to achieve state-of-the-art performance on multiple benchmarks.
Hashing-based cross-modal retrieval has recently made significant progress. However, straightforwardly embedding data from different modalities with rich semantics into a joint Hamming space inevitably produces false codes, owing to intrinsic modality discrepancy and noise. We present a novel deep Robust Multilevel Semantic Hashing (RMSH) method for more accurate multi-label cross-modal retrieval. It seeks to preserve fine-grained similarity among data with rich semantics, i.e., multi-label data, while explicitly requiring the distances between dissimilar points to exceed a specific value for strong robustness. To this end, we derive an effective bound on this value from an information coding-theoretic analysis, and both goals are embodied in a margin-adaptive triplet loss. Furthermore, we introduce pseudo-codes, obtained by fusing multiple hash codes, to explore seldom-seen semantics and alleviate the sparsity of similarity information. Experiments on three benchmarks show the validity of the derived bounds, and our method achieves state-of-the-art performance. (c) 2021 Published by Elsevier Ltd.
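The two ingredients named in the abstract, a triplet loss with an explicit margin on Hamming distances and pseudo-codes fused from multiple hash codes, can be illustrated with a minimal sketch. This is not the paper's implementation: the margin is left as a free parameter (the paper derives its effective bound from coding theory), and the majority-vote fusion in `fuse_codes` is an assumed stand-in for the fusion scheme described in the text.

```python
import numpy as np

def hamming_dist(a, b):
    # Hamming distance between two {-1, +1} hash codes via inner product:
    # matching bits contribute +1, mismatching bits -1.
    return 0.5 * (len(a) - int(a @ b))

def margin_triplet_loss(anchor, pos, neg, margin):
    # Hinge-style triplet loss: zero once the dissimilar (negative) code is
    # at least `margin` bits farther from the anchor than the similar one.
    return max(0.0, hamming_dist(anchor, pos) - hamming_dist(anchor, neg) + margin)

def fuse_codes(codes):
    # Pseudo-code as the bitwise majority vote (sign of the sum) of several
    # hash codes; ties are broken to +1. Illustrative assumption only.
    s = np.sign(np.sum(codes, axis=0))
    s[s == 0] = 1
    return s.astype(int)

a = np.array([ 1,  1,  1,  1, -1, -1, -1, -1])
p = np.array([ 1,  1,  1, -1, -1, -1, -1, -1])   # 1 bit from the anchor
n = np.array([-1, -1, -1, -1,  1,  1,  1,  1])   # 8 bits from the anchor
print(margin_triplet_loss(a, p, n, margin=4))    # 1 - 8 + 4 < 0, so 0.0
print(fuse_codes(np.stack([a, p])))
```

With the negative already 4 bits farther than the positive, the loss vanishes; swapping the roles of `p` and `n` yields a positive penalty, which is what drives dissimilar codes apart during training.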