4.8 Article

Ball k-Means: Fast Adaptive Clustering With No Bounds

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2020.3008694

Keywords

Ball k-means; k-means; ball cluster; stable area; active area; neighbor cluster

Funding

  1. National Natural Science Foundation of China [61806030, 61936001, 11690011, 61721002, U1811461]
  2. Natural Science Foundation of Chongqing [cstc2019jcyj-msxmX0485, cstc2019jcyj-cxttX0002]
  3. National Key Research and Development Program of China [2019QY(Y)0301, 2016QY01W0200]
  4. NICE: NRT for Integrated Computational Entomology, US National Natural Science [1631776]

Ask authors/readers for more resources

This paper presents a novel accelerated exact k-means algorithm called Ball k-means, which uses a ball to describe each cluster. The algorithm focuses on reducing the point-centroid distance computation by finding neighbor clusters for each cluster. It divides each cluster into stable and active areas, with the latter further divided into annular areas. The points in the stable area remain unchanged, while the points in each annular area are adjusted among a few neighbor clusters. The Ball k-means achieves higher performance and requires fewer distance calculations compared to other state-of-the-art accelerated exact bounded methods, making it a versatile replacement for the naive k-means algorithm.
This paper presents a novel accelerated exact k-means called as Ball k-means by using the ball to describe each cluster, which focus on reducing the point-centroid distance computation. The Ball k-means can exactly find its neighbor clusters for each cluster, resulting distance computations only between a point and its neighbor clusters' centroids instead of all centroids. What's more, each cluster can be divided into stable area and active area, and the latter one is further divided into some exact annular area. The assignment of the points in the stable area is not changed while the points in each annular area will be adjusted within a few neighbor clusters. There are no upper or lower bounds in the whole process. Moreover, ball k-means uses ball clusters and neighbor searching along with multiple novel stratagems for reducing centroid distance computations. In comparison with the current state-of-the art accelerated exact bounded methods, the Yinyang algorithm and the Exponion algorithm, as well as other top-of-the-line tree-based and bounded methods, the ball k-means attains both higher performance and performs fewer distance calculations, especially for large-k problems. The faster speed, no extra parameters and simpler design of Ball k-means make it an all-around replacement of the naive k-means.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available