4.7 Article

DenPEHC: Density peak based efficient hierarchical clustering

Journal

INFORMATION SCIENCES
Volume 373, Issue -, Pages 200-218

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2016.08.086

Keywords

Hierarchical clustering; Density peaks; Grid granulation; Granular computing

Funding

  1. National Key Research and Development Program of China [2016YFB1000905]
  2. National Natural Science Foundation of China [61272060, 61572091]

Existing hierarchical clustering algorithms involve a flat clustering component and an additional agglomerative or divisive procedure. This paper presents a density peak based hierarchical clustering method (DenPEHC), which directly generates clusters on each possible clustering layer, and introduces a grid granulation framework to enable DenPEHC to cluster large-scale and high-dimensional (LSHD) datasets. This study consists of three parts: (1) selecting clustering centers by combining a linear fitting approach with the distribution of the parameter gamma, which is defined in clustering by fast search and find of density peaks (DPClust) as the product of the local density rho and the minimal distance delta to data points with higher density, with the clustering hierarchy decided by finding the stairs in the gamma curve; (2) analyzing the leading tree (in which each node except the root is led by its parent to join the same cluster) as an intermediate result of DPClust, and constructing the clustering hierarchy efficiently based on the tree; and (3) designing a framework to enable DenPEHC to cluster LSHD datasets when a large number of attributes can be grouped by their semantics. The proposed method builds the clustering hierarchy by simply disconnecting the center points from their parents, with linear computational complexity O(m), where m is the number of clusters. Experiments on synthetic and real datasets show that the proposed method has promising efficiency, accuracy and robustness compared to state-of-the-art methods. © 2016 Elsevier Inc. All rights reserved.
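
The quantities named in the abstract (rho, delta, gamma and the leading tree) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Gaussian kernel, the cutoff `dc`, the function names `leading_tree` and `labels_for_layer`, and the choice of the k largest gamma values as centers are all assumptions made for illustration; the paper selects centers with a linear fitting approach and stairs in the gamma curve, which is not reproduced here.

```python
import numpy as np


def leading_tree(X, dc):
    """Compute rho, delta, gamma and the parent array of the leading tree.

    Assumption: rho uses a Gaussian kernel with cutoff dc (one common choice).
    """
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density; subtract 1 to remove each point's contribution to itself.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    # Point indices sorted by descending density (stable, so ties are ordered).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)  # -1 marks the root of the leading tree
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]  # points ranked as denser than i
        j = higher[np.argmin(dist[i, higher])]
        parent[i] = j  # i is led by its nearest denser point
        delta[i] = dist[i, j]
    # The densest point has no parent; by convention it gets the largest distance.
    delta[order[0]] = dist[order[0]].max()
    gamma = rho * delta
    return rho, delta, gamma, parent, order


def labels_for_layer(parent, order, centers):
    """Cut the tree at the given centers and label every point by its subtree."""
    centers = [int(c) for c in centers]
    if int(order[0]) not in centers:
        raise ValueError("the root of the leading tree must be one of the centers")
    labels = np.full(len(parent), -1)
    for k, c in enumerate(centers):
        labels[c] = k  # disconnecting a center makes it the root of a cluster
    for i in order:  # descending density, so a parent is labelled before its child
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Hypothetical toy data: two well-separated Gaussian blobs of 50 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
rho, delta, gamma, parent, order = leading_tree(X, dc=1.0)
by_gamma = np.argsort(-gamma, kind="stable")  # candidate centers, largest gamma first
hierarchy = {k: labels_for_layer(parent, order, by_gamma[:k]) for k in (1, 2)}
```

In this sketch each layer of `hierarchy` reuses the same leading tree, and moving from one layer to the next only adds one more disconnected center, which mirrors the idea of building the hierarchy by cutting center points from their parents.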
