4.6 Article

A Cross-Modal Hash Retrieval Method with Fused Triples

期刊

APPLIED SCIENCES-BASEL
卷 13, 期 18, 页码 -

出版社

MDPI
DOI: 10.3390/app131810524

关键词

triples; cross-modal; similarity retrieval; hash algorithm

向作者/读者索取更多资源

This paper proposes Tri-CMH, a cross-modal hash retrieval method with fused triples, which effectively utilizes the dataset's supervisory information and improves the model's ability to judge cross-modality semantic similarity.
Due to the fast retrieval speed and low storage cost, cross-modal hashing has become the primary method for cross-modal retrieval. Since the emergence of deep cross-modal hashing methods, cross-modal retrieval significantly improved. However, the existing cross-modal hash retrieval methods still need to effectively utilize the dataset's supervisory information and the lack of similarity expression ability. This means that the label information needs to be maximized, and the potential semantic relationship between two modalities cannot be fully explored, thus affecting the judgment of semantic similarity between two modalities. To address these problems, this paper proposes Tri-CMH, a cross-modal hash retrieval method with fused triples, which is an end-to-end modeling framework consisting of two parts: feature extraction and hash learning. Firstly, the multi-modal data are preprocessing into the form of triple groups. The data supervision matrix is constructed so that the samples with labels and their meanings are aggregated together. In contrast, the samples with labels and their opposite meanings are separated, thus avoiding the problem of the under-utilization of supervisory information in the data set and achieving the effect of efficiently utilizing the global supervisory information. Meanwhile, the loss function of the hash learning part is optimized by considering the Hamming distance loss, single-modality internal loss, cross-modality loss, and quantization loss to explicitly constrain semantically similar hash codes and semantically dissimilar hash codes and to improve the model's ability to judge cross-modality semantic similarity. The method is trained and tested on the IAPR-TC12, MIRFLICKR-25K, and NUS-WIDE datasets, and the experimental evaluation criteria are mAP and PR curve, and the experimental results show the effectiveness and practicality of the method.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据