Article

A novel hierarchical clustering algorithm with merging strategy based on shared subordinates

Journal

APPLIED INTELLIGENCE
Volume 52, Issue 8, Pages 8635-8650

Publisher

SPRINGER
DOI: 10.1007/s10489-021-02830-4

Keywords

Hierarchical clustering; Natural neighbor; Local representatives; Shared subordinates

Funding

  1. Project of National Natural Science Foundation for Young Scientists of China [61802360]

Hierarchical clustering is a common unsupervised learning technique used to discover relationships in data sets. A novel Hierarchical Clustering algorithm with a Merging strategy based on Shared Subordinates (HCMSS) is proposed to address challenges such as low accuracy and long running time. Experiments show that HCMSS improves clustering accuracy and runs faster than state-of-the-art benchmarks.
Hierarchical clustering is a common unsupervised learning technique that is used to discover potential relationships in data sets. Despite their conciseness and interpretability, hierarchical clustering algorithms still face challenges such as low accuracy, high time cost, and difficulty in choosing a merging strategy. To overcome these limitations, we propose a novel Hierarchical Clustering algorithm with a Merging strategy based on Shared Subordinates (HCMSS), which introduces two new concepts: the local core representative, and the shared subordinate, a point that belongs to multiple representatives. First, the state-of-the-art natural neighbor (NaN) method is used to compute the local neighborhood and the local density of each data point. Next, a sharing-based local core searching algorithm (SLORE) is proposed to find local core points and divide the input data set into numerous small initial clusters. Lastly, these small clusters are merged hierarchically to form the final clustering result. The merging process is split into two sub-steps: first, small clusters are pre-connected according to a shared-subordinates-based indicator that measures the stickiness between clusters; second, the pre-connected intermediate clusters and the remaining unconnected small clusters are merged in a classical hierarchical way. Experiments on 8 synthetic and 8 real-world data sets demonstrate that HCMSS improves clustering accuracy and takes less time than 2 state-of-the-art benchmarks.
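
The pre-connection sub-step can be illustrated with a minimal Python sketch. The abstract does not define the stickiness indicator, so this sketch assumes the simplest possible stand-in: the raw number of points that are subordinate to both representatives, compared against a hypothetical threshold `min_shared`. The function names `shared_counts` and `pre_connect`, the input layout, and the toy data are all illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def shared_counts(memberships):
    """Count, for each pair of representatives, the points subordinate to both.

    `memberships` maps a point id to the set of representatives it belongs to.
    ASSUMPTION: a raw count stands in for the paper's stickiness indicator,
    which the abstract does not specify.
    """
    counts = defaultdict(int)
    for reps in memberships.values():
        for a, b in combinations(sorted(reps), 2):
            counts[(a, b)] += 1
    return dict(counts)


def pre_connect(representatives, memberships, min_shared=1):
    """Group small clusters whose representatives share enough subordinates.

    `min_shared` is a hypothetical threshold chosen for illustration.
    Returns a sorted list of sorted groups of representative ids.
    """
    # Union-find over representatives; each starts as its own group.
    parent = {r: r for r in representatives}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    for (a, b), n in shared_counts(memberships).items():
        if n >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for r in representatives:
        groups[find(r)].add(r)
    return sorted(sorted(g) for g in groups.values())


# Toy data: p2 and p3 are shared by A and B; p4 is shared by B and C.
memberships = {
    "p1": {"A"},
    "p2": {"A", "B"},
    "p3": {"A", "B"},
    "p4": {"B", "C"},
    "p5": {"C"},
    "p6": {"D"},
}
groups = pre_connect(["A", "B", "C", "D"], memberships, min_shared=2)
print(groups)
```

In this toy setting only A and B share at least two subordinates, so they are pre-connected while C and D stay separate. The resulting groups and the leftover small clusters would then be passed to an ordinary agglomerative merge, which corresponds to the second sub-step described above.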
