Article

Hash Learning With Variable Quantization for Large-Scale Retrieval

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TCSVT.2021.3051358

Keywords

Quantization (signal); Principal component analysis; Hamming distance; Binary codes; Kernel; Eigenvalues and eigenfunctions; Iterative methods; Approximate nearest neighbor search; multimodal retrieval; hashing; quantization

Funding

  1. National Key Research and Development Program of China [2019YFB2102400]
  2. NSFC [62172090, 61772112, 61572463, 61902367]
  3. Natural Science Foundation of Shandong Province [ZR2020QF041]
  4. Postdoctoral Applied Research Program of Qingdao [862005040007]
  5. Science Innovation Foundation of Dalian [2019J12GX037]
  6. Fundamental Research Funds for the Central Universities [202113037]
  7. Jiangsu Provincial Double-Innovation Doctor Program [JSSCBS20210075]
  8. Alibaba Group through Alibaba Innovative Research Program
  9. CAAI-Huawei MindSpore Open Fund

Abstract

This paper analyzes the accuracy loss introduced in the quantization step of hashing algorithms and proposes two new quantization methods: Variable Integer-based Quantization (VIQ) and Variable Codebook-based Quantization (VCQ). Experimental results show that both methods improve accuracy; VCQ performs better than VIQ, while VIQ provides higher search efficiency.
Approximate Nearest Neighbor (ANN) search is the core problem in many large-scale machine learning and computer vision applications, such as multimodal retrieval. Hashing is becoming increasingly popular because it provides efficient similarity search and compact data representations, making it well suited to such large-scale ANN search problems. Most hashing algorithms concentrate on learning more effective projection functions; however, the accuracy loss in the quantization step has been largely ignored and barely studied. In this paper, we analyze the importance of the various projected dimensions, partition them into several groups, and quantize them with two types of values, both of which better preserve the neighborhood structure of the data. One is Variable Integer-based Quantization (VIQ), which quantizes each projected dimension with integer values. The other is Variable Codebook-based Quantization (VCQ), which quantizes each projected dimension with values from a corresponding codebook. We conduct experiments on five common public data sets containing up to one million vectors. The results show that the proposed VCQ and VIQ algorithms both achieve much higher accuracy than state-of-the-art quantization methods. Furthermore, although VCQ performs better than VIQ, ANN search with VIQ provides much higher search efficiency.
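To make the distinction between the two quantization styles concrete, here is a minimal illustrative sketch, not the authors' implementation: a VIQ-style quantizer maps each projected dimension to a variable number of integer levels (more bits for more important dimensions), while a VCQ-style quantizer snaps each dimension to its nearest entry in a per-dimension codebook. The bit allocation and quantile-based codebooks below are toy assumptions for illustration only.

```python
import numpy as np

def integer_quantize(projected, bits_per_dim):
    """VIQ-style sketch: map each projected dimension to one of
    2**b uniformly spaced integer levels, with b varying per dimension."""
    codes = np.empty(projected.shape, dtype=np.int64)
    for d, b in enumerate(bits_per_dim):
        col = projected[:, d]
        levels = 2 ** b
        # scale the column into [0, levels - 1] and round to the nearest level
        lo, hi = col.min(), col.max()
        scaled = (col - lo) / (hi - lo + 1e-12) * (levels - 1)
        codes[:, d] = np.round(scaled).astype(np.int64)
    return codes

def codebook_quantize(projected, codebooks):
    """VCQ-style sketch: replace each projected value by the index of
    its nearest codeword in that dimension's codebook."""
    codes = np.empty(projected.shape, dtype=np.int64)
    for d, cb in enumerate(codebooks):
        # nearest codeword index for every sample in dimension d
        codes[:, d] = np.abs(projected[:, d][:, None] - cb[None, :]).argmin(axis=1)
    return codes

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                 # toy "projected" data
viq_codes = integer_quantize(X, bits_per_dim=[3, 2, 1, 1])
codebooks = [np.quantile(X[:, d], np.linspace(0, 1, 4)) for d in range(4)]
vcq_codes = codebook_quantize(X, codebooks)
```

The sketch mirrors the trade-off reported in the abstract: the integer codes support fast distance computation directly, while per-dimension codebooks can adapt levels to the data distribution at some extra lookup cost.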
