Comprehensive survey on hierarchical clustering algorithms and the recent developments

ARTIFICIAL INTELLIGENCE REVIEW (2023)

Journal

ARTIFICIAL INTELLIGENCE REVIEW

Volume 56, Issue 8, Pages 8219-8264

Publisher

SPRINGER

DOI: 10.1007/s10462-022-10366-3

Keywords

Hierarchical clustering; Divisive; Agglomerative; Dissimilarity; Similarity

Automated Summary

Data clustering is a widely used technique in various fields to divide objects into different clusters based on similarity measures. Hierarchical clustering methods generate consistent partitions of data at different levels, allowing analysis of complex data structures. This article comprehensively reviews various hierarchical clustering methods, including recent developments, and examines the role of similarity measures in the clustering process.

Abstract

Data clustering is a commonly used data processing technique in many fields. It divides objects into clusters according to some similarity measure between data points. Compared with partitioning clustering methods, which give a flat partition of the data, hierarchical clustering methods give multiple consistent partitions of the same data at different levels without rerunning the clustering, so they can be used to better analyze the complex structure of the data. There are usually two kinds of hierarchical clustering methods: divisive and agglomerative. For divisive clustering, the key issues are how to select a cluster for the next splitting step according to dissimilarity and how to divide the selected cluster. For agglomerative hierarchical clustering, the key issue is the similarity measure used to select the two most similar clusters for the next merge. Although both types of methods produce a dendrogram of the data as output, the clustering results may differ greatly depending on the dissimilarity or similarity measure used, and the method should be chosen according to the type of data and the application scenario. In this work, we therefore comprehensively review various hierarchical clustering methods, especially the most recently developed ones. Because the similarity measure plays a crucial role in the hierarchical clustering process, we also review different types of similarity measures alongside the clustering methods. More specifically, different types of hierarchical clustering methods are reviewed from six aspects, and their advantages and drawbacks are analyzed. The application of some methods in real life is also discussed. Furthermore, we include some recent work combining deep learning techniques with hierarchical clustering, which is worth serious attention and may improve hierarchical clustering significantly in the future.
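
The agglomerative procedure described in the abstract (repeatedly merge the two most similar clusters, then read partitions off the resulting hierarchy at any level) can be illustrated with a minimal sketch. It is not taken from the survey: the function names `single_linkage`, `agglomerate`, and `cut` are hypothetical, and single linkage with Euclidean distance is only one assumed choice among the many dissimilarity measures the survey covers. The naive pair search is written for readability, not efficiency.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b, points):
    """Assumed dissimilarity: smallest Euclidean distance between members."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points, linkage=single_linkage):
    """Merge the two closest clusters until a single cluster remains.

    Returns the partition at every level: levels[0] holds n singletons,
    levels[-1] holds one cluster containing every point index.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    levels = [list(clusters)]
    while len(clusters) > 1:
        # Select the pair of clusters with the smallest dissimilarity.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage(pair[0], pair[1], points),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        levels.append(list(clusters))
    return levels


def cut(levels, k):
    """Read off the k-cluster partition without rerunning the clustering."""
    return sorted(sorted(c) for c in levels[len(levels) - k])


# Illustrative data only: two tight pairs and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (20.0, 20.0)]
levels = agglomerate(points)
print(cut(levels, 3))
print(cut(levels, 2))
```

A single run of `agglomerate` yields every level of the hierarchy, so `cut` can return partitions for different values of `k` from the same result. Passing a different `linkage` function (for example, one based on the largest or the average pairwise distance) may change the merge order and therefore the partitions, which is the sensitivity to the similarity measure that the abstract points out.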
