Article

Mask Cross-Modal Hashing Networks

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 23, Pages 550-558

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2020.2984081

Keywords

Semantic mask; inter-modal similarity; intra-modal similarity; hashing network; cross-modal retrieval

Funding

  1. National Science Foundation of China [61771322, 61375015]
  2. China Scholarship Council
  3. Fundamental Research Foundation of Shenzhen [JCYJ20160307154630057]

Abstract

The rapid development of deep learning has led to significant progress in cross-modal retrieval and to growing interest in cross-modal hashing. The semantic heterogeneity gap between different modalities remains a challenging problem. To address it, we propose the mask deep cross-modal hashing (MDCH) approach, which introduces semantic mask information and alternately trains intra-modal and inter-modal networks to obtain more effective hash codes.
Driven by the rapid development of deep learning, cross-modal retrieval has made significant progress in recent years. Cross-modal hashing in particular has attracted considerable attention in multi-modal retrieval applications because of its low storage cost and fast retrieval speed. However, it remains a challenging problem because of the semantic heterogeneity gap between different modalities. To further narrow this gap and obtain more effective hash codes, we put forward a novel mask deep cross-modal hashing (MDCH) approach that explores the similarity between inter-modal instances. The main contributions of this paper are: (1) we attempt to introduce semantic mask information into cross-modal hashing retrieval, and (2) we alternately train intra-modal and inter-modal networks to fully mine the semantic relationship between different modalities. The semantic mask enriches the semantic information of the image features. Inter-modal similarity, explored by the inter-modal networks, enforces images and their corresponding text tags to have similar hash codes, while intra-modal similarity, explored by the intra-modal networks, retains the local structural information embedded in each modality. Extensive experiments on three datasets demonstrate that the proposed MDCH approach is superior to several state-of-the-art cross-modal hashing approaches.
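The storage and speed advantages mentioned above come from comparing short binary codes by Hamming distance. The following is a minimal sketch of that retrieval step only, not of the MDCH networks themselves. The 4-bit codes, the item names, and the helper functions `hamming` and `retrieve` are hypothetical and chosen purely for illustration; in practice the codes would be produced by the learned image and text hashing networks.

```python
def hamming(a, b):
    """Count the differing bits between two integer-packed hash codes."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database_codes, top_k=2):
    """Return the names of the top_k database items closest to the query code."""
    ranked = sorted(
        database_codes.items(),
        key=lambda item: hamming(query_code, item[1]),
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical 4-bit image codes (assumed values, not taken from the paper).
image_codes = {
    "img_cat": 0b1011,
    "img_dog": 0b0100,
    "img_kitten": 0b1001,
}

# Hypothetical code for a text query; a text-to-image search ranks the images
# by Hamming distance to this code.
text_query = 0b1011

print(retrieve(text_query, image_codes))
```

Because each comparison is one XOR followed by a bit count, ranking stays cheap even for large databases, and each item needs only a few bytes of storage. This is why the inter-modal objective of giving an image and its text tags similar hash codes translates directly into retrieval quality.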
