Article

Clustering Large Datasets by Merging K-Means Solutions

JOURNAL OF CLASSIFICATION (2020)

Journal

JOURNAL OF CLASSIFICATION

Volume 37, Issue 1, Pages 97-123

Publisher

SPRINGER

DOI: 10.1007/s00357-019-09314-8

Keywords

K-means; Finite mixture models; Merging components; Pairwise overlap; Classification EM algorithm

Categories

Mathematics, Interdisciplinary Applications; Psychology, Mathematical

Abstract

Existing clustering methods range from simple but very restrictive to complex but more flexible. The K-means algorithm is one of the most popular clustering procedures due to its computational speed and intuitive construction. Unfortunately, the application of K-means in its traditional form, based on Euclidean distances, is limited to cases with spherical clusters of approximately the same volume and spread of points. Recent developments in the area of merging mixture components for clustering show good promise. We propose a general framework for hierarchical merging based on pairwise overlap between components, which can be readily applied in the context of the K-means algorithm to produce meaningful clusters. Such an approach preserves the main advantage of the K-means algorithm: its speed. The developed ideas are illustrated on examples, studied through simulations, and applied to the problem of digit recognition.
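As a rough illustration of the merging idea described in the abstract, the sketch below groups K-means centers hierarchically by pairwise overlap. It is not the authors' implementation. The abstract does not state the exact overlap definition or linkage rule, so this sketch assumes equal-weight spherical Gaussian components with a common standard deviation `sigma` (for which the sum of the two misclassification probabilities is 2Φ(−d/(2σ)), with d the distance between centers) and a single-linkage rule. The function names `pairwise_overlap` and `merge_by_overlap` are hypothetical.

```python
import math
from itertools import combinations


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pairwise_overlap(c1, c2, sigma):
    """Overlap of two components, ASSUMING equal-weight spherical Gaussians
    with common standard deviation sigma: the sum of the two
    misclassification probabilities, 2 * Phi(-d / (2 * sigma))."""
    d = math.dist(c1, c2)
    return 2.0 * normal_cdf(-d / (2.0 * sigma))


def merge_by_overlap(centers, sigma, n_groups):
    """Greedy hierarchical merging (single linkage, an assumption):
    repeatedly join the two groups containing the most-overlapping
    pair of components until n_groups groups remain."""
    k = len(centers)
    overlap = {
        (i, j): pairwise_overlap(centers[i], centers[j], sigma)
        for i, j in combinations(range(k), 2)
    }
    groups = [{i} for i in range(k)]

    def linkage(a, b):
        # Largest component-level overlap between groups a and b.
        return max(
            overlap[min(i, j), max(i, j)]
            for i in groups[a]
            for j in groups[b]
        )

    while len(groups) > n_groups:
        a, b = max(combinations(range(len(groups)), 2),
                   key=lambda ab: linkage(*ab))
        groups[a] |= groups[b]  # a < b, so deleting b keeps index a valid
        del groups[b]
    return [sorted(g) for g in groups]


# Five hypothetical K-means centers forming two elongated clusters.
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
merged = merge_by_overlap(centers, sigma=1.0, n_groups=2)
```

In this toy setup, each point would inherit the merged group of its K-means component, so the cost beyond the initial K-means run depends only on the number of components, not on the number of observations.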
