Journal
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Volume 30, Issue 7, Pages 2262-2275
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2019.2911359
Keywords
Approximate nearest neighbor search; learning to hash; cross-modal retrieval
Funding
- National Natural Science Foundation of China [61872428, 61573212, 61772310]
- Key Research and Development Program of Shandong Province [2018CXGC0708]
In this paper, we present a novel supervised cross-modal hashing framework, namely Scalable disCRete mATrix faCtorization Hashing (SCRATCH). First, it applies collective matrix factorization to the original features, together with label semantic embedding, to learn latent representations in a shared latent space. It then generates binary hash codes from these latent representations. During optimization, it avoids using a large n × n similarity matrix and generates the hash codes discretely. Furthermore, based on different objective functions, learning strategies, and features, we present three models within this framework: SCRATCH-o, SCRATCH-t, and SCRATCH-d. The first is a one-step method that learns the hash functions and the binary codes in the same optimization problem. The second is a two-step method that first generates the binary codes and then learns the hash functions from the learned codes. The third is a deep version of SCRATCH-t that uses deep neural networks as hash functions. Extensive experiments on two widely used benchmark datasets demonstrate that SCRATCH-o and SCRATCH-t outperform some state-of-the-art shallow hashing methods for cross-modal retrieval, and that SCRATCH-d outperforms some state-of-the-art deep hashing models.
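The two-step idea described in the abstract (learn a shared latent space, binarize it, then fit hash functions to the codes) can be illustrated with a deliberately simplified sketch. This is not the published SCRATCH objective or its discrete optimization: it assumes a plain ridge-regularized alternating least-squares factorization, sign thresholding to obtain codes, and linear hash functions. All names (`toy_two_step_hashing`, `encode`, `hamming`) and parameters (`n_bits`, `n_iters`, `lam`) are hypothetical. Like the framework described above, the sketch never forms an n × n matrix; the largest systems it solves have side length equal to the number of bits or the feature dimension.

```python
import numpy as np


def toy_two_step_hashing(X1, X2, Y, n_bits=16, n_iters=10, lam=1e-2, seed=0):
    """Toy two-step cross-modal hashing (illustrative only, not SCRATCH itself).

    X1, X2: (n, d1) and (n, d2) feature matrices for two modalities.
    Y:      (n, c) label matrix.
    Returns binary codes B in {-1, +1} and one linear hash matrix per modality.
    """
    rng = np.random.default_rng(seed)
    n = X1.shape[0]
    sources = (X1, X2, Y)
    V = rng.standard_normal((n, n_bits))  # shared latent representation
    I_r = np.eye(n_bits)

    for _ in range(n_iters):
        # Fix V: ridge least squares for each basis U_k, so that M_k ~ V @ U_k.
        A = V.T @ V + lam * I_r  # (n_bits, n_bits), never n x n
        U = [np.linalg.solve(A, V.T @ M) for M in sources]
        # Fix all U_k: closed-form update of the shared V.
        G = sum(Uk @ Uk.T for Uk in U) + lam * I_r
        R = sum(M @ Uk.T for M, Uk in zip(sources, U))
        V = np.linalg.solve(G, R.T).T  # V = R @ inv(G), G is symmetric

    # Step 1 result: binarize the latent representation (assumed relaxation).
    B = np.where(V >= 0, 1, -1)

    # Step 2: fit linear hash functions W_k by ridge regression, X_k @ W_k ~ B.
    W = [
        np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ B)
        for X in (X1, X2)
    ]
    return B, W


def encode(X, W):
    """Map features of one modality to {-1, +1} codes with its hash matrix."""
    return np.where(X @ W >= 0, 1, -1)


def hamming(code, B):
    """Hamming distance from one code (n_bits,) to every row of B."""
    return np.count_nonzero(code != B, axis=1)
```

For retrieval, a query from one modality would be passed through `encode` with that modality's hash matrix and ranked against the database codes with `hamming`. A one-step variant would instead learn the codes and the hash matrices inside the same optimization loop, and a deep variant would replace the linear maps in step 2 with neural networks.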