Article

Low-time-complexity document clustering using memristive dot product engine

Journal

SCIENCE CHINA-INFORMATION SCIENCES
Volume 65, Issue 2, Pages -

Publisher

SCIENCE PRESS
DOI: 10.1007/s11432-021-3316-x

Keywords

linear-time clustering; cosine similarity; spherical K-means; memristor; in-memory computing

Funding

  1. National Key Research and Development Plan of MOST of China [2019YFB2205100]
  2. National Natural Science Foundation of China [61874164, 92064012, 61841404]
  3. Hubei Key Laboratory for Advanced Memories, Hubei Engineering Research Center on Microelectronics
  4. Chua Memristor Institute

This article introduces a method to accelerate document clustering using memristive in-memory computing, which reduces the time complexity by performing similarity measurement in one step. It also proposes a normalization scheme to reduce normalization steps during clustering and discusses the impact of non-ideal factors in memristors on clustering tasks.
Document clustering is widely used in data analysis. Its main bottleneck on the von Neumann architecture is the massive number of similarity measurement operations, which consume a great deal of time. Memristive in-memory computing offers a new way to address this problem. In this article, using the memristive dot product engine, we demonstrate a cosine-similarity-accelerated document clustering method for the first time. The memristor-based clustering method lowers the time complexity from the O(N·d) of the conventional algorithm to O(N) by executing each similarity measurement in one step. Focusing on unit-length vectors, we propose an in-situ normalization scheme for the vectors stored in the crossbar array, which provides an efficient hardware training scheme and reduces the normalization steps during clustering. Using the BBCSport dataset as a benchmark, we further discuss the impact of non-ideal factors in the memristors, including the number of available quantized states, the inevitable programming noise, and device failure. Simulation results indicate that 6-bit quantized states and 5% programming noise are acceptable for document clustering tasks. In addition, a high-resistance state is recommended for failed cells to obtain better clustering results.
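
The clustering loop described in the abstract can be illustrated with a minimal software sketch of spherical K-means, in which cosine similarity between unit-length vectors reduces to a dot product. This is not the authors' implementation. The function names (`normalize`, `quantize`, `program`, `assign`, `update`), the assumption that weights lie in [0, 1], and the relative Gaussian noise model are all hypothetical choices made for illustration. In software each dot product still costs O(d); in the memristive crossbar that dot product is the part performed in a single step.

```python
import math
import random


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def quantize(w, bits):
    """Round a weight in [0, 1] to the nearest of 2**bits evenly spaced levels."""
    levels = (1 << bits) - 1
    return round(w * levels) / levels


def program(centroids, bits=6, noise=0.05, rng=None):
    """Model writing centroids into the array (assumed model, not from the paper):
    quantize each weight, then apply relative Gaussian programming noise."""
    rng = rng or random.Random(0)
    return [
        [quantize(w, bits) * (1.0 + rng.gauss(0.0, noise)) for w in c]
        for c in centroids
    ]


def assign(doc, stored):
    """Return the index of the stored centroid with the largest dot product.
    For unit-length vectors this is the centroid with the highest cosine similarity."""
    sims = [sum(x * w for x, w in zip(doc, c)) for c in stored]
    return max(range(len(stored)), key=sims.__getitem__)


def update(docs, labels, k, old):
    """Spherical K-means update: sum each cluster's members and renormalize.
    An empty cluster keeps its previous centroid."""
    new = []
    for j in range(k):
        members = [d for d, label in zip(docs, labels) if label == j]
        if not members:
            new.append(old[j])
            continue
        summed = [sum(col) for col in zip(*members)]
        new.append(normalize(summed))
    return new


# Toy data standing in for non-negative document vectors (e.g. term weights).
docs = [normalize(v) for v in [[1, 0.1, 0], [0.9, 0.2, 0], [0, 0.1, 1], [0, 0.2, 0.9]]]
centroids = [docs[0], docs[2]]
stored = program(centroids, bits=6, noise=0.0)
labels = [assign(d, stored) for d in docs]
centroids = update(docs, labels, 2, centroids)
```

Raising `noise` or lowering `bits` in `program` gives a rough way to probe how the assignment step degrades under the non-ideal factors the abstract lists, although the paper's own noise and failure models may differ.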
