3.8 Proceedings Paper

Accelerating the Yinyang K-Means Algorithm Using the GPU

出版社

IEEE COMPUTER SOC
DOI: 10.1109/ICDE51399.2021.00163

关键词

GPGPU; K-Means; Lloyd's Algorithm; Yinyang Algorithm

资金

  1. U.S. Department of Energy by Lawrence Livermore National Laboratory [DE-AC52-07NA27344]

向作者/读者索取更多资源

The k-means clustering algorithm is widely used for unsupervised learning and requires significant computational effort to compute results accurately. Alternative algorithms have been proposed to avoid distance calculations while producing precise results. This paper explores the use of GPU acceleration for the Yinyang algorithm, achieving up to 8x speedup on real-world datasets compared to multi-core CPU implementations.
The k-means clustering algorithm is widely employed for unsupervised learning. The algorithm takes as input a multidimensional dataset of points and number of clusters/centroids, k, where each point is assigned to one of the clusters. For exact k-means clustering, the algorithm must compute the same result as Lloyd's algorithm, which is well-known to be computationally expensive due to the large number of distance comparisons between each point and the k centroids. Several algorithms have been proposed for k-means clustering that avoid distance calculations but produce an exact result. However, these algorithms have all been designed for execution using the CPU, and no published works have examined using the GPU to accelerate k-means while simultaneously avoiding distance calculations. This paper examines the state-of-the-art Yinyang algorithm that avoids distance calculations as executed on the GPU. Since Lloyd's algorithm is well-suited to a GPU execution, it is not clear whether the Yinyang algorithm will obtain significant performance gains on GPU hardware. In this context, this paper: (i) proposes the first GPU-accelerated Yinyang algorithm in the literature; (ii) advances several optimizations to GPU kernels; (iii) contrasts and evaluates different degrees of distance calculation pruning; and, (iv) compares the performance of our GPU-accelerated Yinyang algorithm to four reference implementations. Our GPU algorithm achieves a speedup over the multi-core CPU Yinyang algorithm of up to 8x on real-world datasets.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

3.8
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据