4.8 Article

Coordinate Descent Method for k-means

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2021.3085739

Keywords

Coordinate descent method; k-means method; clustering; Lloyd heuristic

Funding

  1. National Key Research and Development Program of China [2018AAA0101902]
  2. Natural Science Basic Research Program of Shaanxi [2021JM-071]
  3. National Natural Science Foundation of China [61936014, 61772427]
  4. Fundamental Research Funds for Central Universities [G2019KY0501]

Ask authors/readers for more resources

This paper proposes a solution to solve the k-means problem using coordinate descent method. Through theoretical analysis and experiments, we find that the proposed method performs better in terms of objective function value, local minimum and iterations, and is more robust.
k-means method using Lloyd heuristic is a traditional clustering method which has played a key role in multiple downstream tasks of machine learning because of its simplicity. However, Lloyd heuristic always finds a bad local minimum, i.e., the bad local minimum makes objective function value not small enough, which limits the performance of k-means. In this paper, we use coordinate descent (CD) method to solve the problem. First, we show that the k-means minimization problem can be reformulated as a trace maximization problem, then a simple and efficient coordinate descent scheme is proposed to solve the maximization problem. Two interesting findings through theory are that Lloyd cannot decrease the objective function value of k-means produced by our CD further, and our proposed method CD to solve k-means problem can avoid produce empty clusters. In addition, according to the computational complexity analysis, it is verified CD has the same time complexity with original k-means method. Extensive experiments including statistical hypothesis testing, on several real-world datasets with varying number of clusters, varying number of samples and varying number of dimensions show that CD performs better compared to Lloyd, i.e., lower objective value, better local minimum and fewer iterations. And CD is more robust to initialization than Lloyd whether the initialization strategy is random or initialization of k-means++.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available