Article

A novel hierarchical clustering algorithm with merging strategy based on shared subordinates

Journal

APPLIED INTELLIGENCE
Volume 52, Issue 8, Pages 8635-8650

Publisher

SPRINGER
DOI: 10.1007/s10489-021-02830-4

Keywords

Hierarchical clustering; Natural neighbor; Local representatives; Shared subordinates

Funding

  1. Project of National Natural Science Foundation for Young Scientists of China [61802360]

Hierarchical clustering is a common unsupervised learning technique used to discover relationships in data sets. A novel Hierarchical Clustering algorithm with a Merging strategy based on Shared Subordinates (HCMSS) is proposed to address challenges such as low accuracy and high time cost. Experiments show that HCMSS improves clustering accuracy and runs faster than state-of-the-art benchmarks.
Hierarchical clustering is a common unsupervised learning technique that is used to discover potential relationships in data sets. Despite their conciseness and interpretability, hierarchical clustering algorithms still face challenges such as low accuracy, high time cost, and difficulty in choosing a merging strategy. To overcome these limitations, we propose a novel Hierarchical Clustering algorithm with a Merging strategy based on Shared Subordinates (HCMSS), which defines two new concepts: the local core representative, and the shared subordinate, a point that belongs to multiple representatives. First, the state-of-the-art natural neighbor (NaN) method is used to compute the local neighborhood and the local density of each data point. Next, a sharing-based local core searching algorithm (SLORE) is proposed to find local core points and divide the input data set into numerous small initial clusters. Lastly, these small clusters are merged hierarchically to form the final clustering result. We split the merging process into two sub-steps: first, small clusters are pre-connected according to a shared-subordinates-based indicator that measures the stickiness between clusters; second, the pre-connected intermediate clusters and the remaining unconnected small clusters are merged in a classical hierarchical way. Experiments on 8 synthetic and 8 real-world data sets demonstrate that HCMSS effectively improves clustering accuracy and is less time-consuming than 2 state-of-the-art benchmarks.
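
The first step above, the natural neighbor search, can be illustrated with a minimal sketch. This is not the authors' implementation: it follows the commonly described form of the search, in which the neighborhood size r grows until every point has at least one reverse neighbor or the number of points without one stops changing. The function name `natural_neighbors`, the brute-force distance sort, and the exact stopping test are assumptions made for illustration. SLORE and the shared-subordinates merging indicator are not sketched, because the abstract does not define them.

```python
from math import dist


def natural_neighbors(points):
    """Illustrative natural neighbor search (assumed form, not the paper's code).

    Returns (r, natural), where r is the neighborhood size reached by the
    search and natural[i] is the set of indices j such that i and j are in
    each other's r-nearest-neighbor sets.
    """
    n = len(points)
    # For every point, rank all other points by Euclidean distance (brute force).
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j: dist(points[i], points[j]))
        for i in range(n)
    ]
    knn = [set() for _ in range(n)]  # r-nearest neighbors collected so far
    reverse_count = [0] * n          # how many points list i as a neighbor
    prev_orphans = -1                # orphan count from the previous round
    r = 0
    while r < n - 1:
        # Extend every neighborhood by the next-closest point.
        for i in range(n):
            j = order[i][r]
            knn[i].add(j)
            reverse_count[j] += 1
        r += 1
        # Orphans are points that no other point lists as a neighbor yet.
        orphans = sum(1 for c in reverse_count if c == 0)
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    # Natural neighbors are mutual r-nearest neighbors.
    natural = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    return r, natural


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    r, natural = natural_neighbors(pts)
    print(r, natural)
```

On the two well-separated triples in the usage example, the search stops at r = 2 and each point's natural neighbors are the other two points in its own group. A local density for each point could then be derived from these neighborhoods, though the abstract does not say which density measure HCMSS uses.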
