Article

Density peak clustering using global and local consistency adjustable manifold distance

Journal

INFORMATION SCIENCES
Volume 577, Pages 769-804

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2021.08.036

Keywords

Clustering; Density peaks; Euclidean distance; Manifold distance; Global and local consistency

Funding

  1. National Natural Science Foundation of China [62176050]
  2. Fundamental Research Funds for the Central Universities [2572017EB02]
  3. Innovative Talent Fund of Harbin Science and Technology Bureau [2017RAXXJ018]


The study introduces a novel density peak clustering algorithm that uses a global and local consistency adjustable manifold distance to effectively capture clusters with different densities and sizes. Experimental results demonstrate that this method outperforms other clustering techniques with statistical significance.
Density Peak Clustering (DPC), a novel density-based clustering algorithm, has recently received great attention due to its efficient clustering performance and simplicity of implementation. However, empirical studies have demonstrated that the distance measures commonly used in DPC cannot simultaneously consider global and local consistency. As a result, the local densities estimated from them fail to capture the ground-truth data structure and produce poor clustering results, especially when the clusters in a dataset exhibit multi-density manifold structures of different sizes. To address these limitations, we propose a novel density peak clustering algorithm using a global and local consistency adjustable manifold distance. In the proposed algorithm, a novel manifold distance with an exponential term and a scaling factor is introduced to estimate the local densities of all data points. By modifying its exponential term and scaling factor, we can flexibly adjust the ratio of the distance between data within the same manifold to the distance between data across different manifolds. This flexible adjustment helps the estimated local densities reflect the global and local consistency of the data structure more accurately. In addition, to deal effectively with clusters of different densities and sizes, a compensation strategy for the distance from the nearest point with larger density, called the local-scale tuning distance, is developed. With the local-scale tuning distance, the underlying centers of clusters with different densities and sizes, especially clusters with low densities or small sizes, stand out clearly in the decision graph, so that the proposed method can accurately identify the number of underlying clusters in the decision graph and thus obtain satisfactory clustering results.
In the experimental part, the effect of the scaling factor on the performance of the proposed technique is discussed and some suggestions for determining the parameters are given. Theoretical analysis and experimental results on several synthetic and real-world datasets demonstrate that the proposed approach is superior to other existing clustering techniques in terms of three evaluation metrics, with statistical significance. (c) 2021 Elsevier Inc. All rights reserved.
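
The abstract does not give the paper's formulas, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a hypothetical edge length `(exp(p * d) - 1) / s`, where `d` is the Euclidean distance and `p` and `s` are stand-ins for the exponential term and scaling factor, and takes the manifold distance to be the shortest path over those edges. The density and decision-graph steps follow the standard DPC recipe (local density `rho`, distance `delta` to the nearest point of higher density, ranking by `rho * delta`). The paper's local-scale tuning distance is not reproduced here, and all function names and the cutoff `dc` are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manifold_distances(points, p=1.0, s=1.0):
    """All-pairs shortest-path distances over a complete graph.

    ASSUMPTION: the edge length (exp(p * d) - 1) / s is a hypothetical
    stand-in for the paper's adjustable manifold distance. Because it grows
    faster than linearly in d, a chain of short hops is cheaper than one
    long jump, so points on the same manifold end up relatively close.
    """
    n = len(points)
    dist = [[(math.exp(p * euclidean(points[i], points[j])) - 1.0) / s
             for j in range(n)] for i in range(n)]
    # Floyd-Warshall relaxation over every intermediate point k.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][k] + dist[k][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


def density_peaks(dist, dc):
    """Standard DPC quantities computed from a distance matrix.

    rho[i]   : Gaussian-kernel local density with cutoff dc (assumed value).
    delta[i] : distance to the nearest point with strictly higher density;
               the densest point(s) get their largest distance instead.
    """
    n = len(dist)
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def pick_centers(rho, delta, k):
    """Return the indices of the k points with the largest rho * delta."""
    gamma = [r * d for r, d in zip(rho, delta)]
    ranked = sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)
    return sorted(ranked[:k])


# Two small, well-separated square clusters, each with a midpoint.
demo_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
               (3.0, 3.0), (3.1, 3.0), (3.0, 3.1), (3.1, 3.1), (3.05, 3.05)]
dist = manifold_distances(demo_points, p=1.0, s=1.0)
rho, delta = density_peaks(dist, dc=0.1)
centers = pick_centers(rho, delta, k=2)
print(centers)
```

On this toy input, the two midpoints have the highest density and a large `delta`, so they are the points selected as centers. Raising `p` in this sketch makes long jumps more expensive relative to short hops, which is one plausible reading of how the ratio of within-manifold to across-manifold distance could be tuned.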
