4.7 Article

Spectral clustering algorithm using density-sensitive distance measure with global and local consistencies

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 170, Issue -, Pages 26-42

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2019.01.026

Keywords

Spectral clustering; Euclidean distance; Relative density sensitive term; Global and local consistencies; Robustness

Funding

  1. Fundamental Research Funds for the Central Universities [2572017EB02, 2572017CB07]
  2. Innovative Talent Fund of Harbin Science and Technology Bureau [2017RAXXJ018]
  3. Double First-Class Scientific Research Foundation of Northeast Forestry University [411112438]


Spectral clustering (SC) has recently received great attention for its high performance in large-scale data clustering and its simplicity of implementation. However, previous studies have demonstrated that the distance measures commonly used in SC cannot simultaneously consider global and local consistencies and are sensitive to various kinds of noise. As a result, the obtained similarity matrices are unable to capture the actual data structure and thus produce poor clustering results, especially when the data exhibit nonlinear and local manifold structure characteristics. To address these limitations, we present a spectral clustering algorithm using a density-sensitive distance measure with global and local consistencies. In the presented algorithm, a novel manifold distance with an exponential term and a scaling factor is introduced as the pairwise similarity measure. By modifying its exponential term and scaling factor, we can flexibly adjust the ratio of the similarities between data within the same manifold to the similarities between data across different manifolds. This flexible adjustment helps the obtained similarity matrix reflect the global and local consistencies of the data structure more accurately. In addition, to eliminate the effect of noise on clustering performance, we incorporate a relative density sensitive term into the distance measure to take into account the local distribution characteristics of the data. Finally, to further improve clustering performance, we provide an SC-based method for determining the value of k for the k nearest neighbors (KNN) graph. In the experimental part, the effect of the parameters on the performance of the proposed technique is discussed and some suggestions on how to determine them are given. Theoretical analysis and experimental results on several synthetic datasets, UCI benchmark datasets, and generated large MNIST handwritten digit datasets demonstrate that the proposed approach is superior to other existing spectral clustering techniques and has good robustness. © 2019 Elsevier B.V. All rights reserved.
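The abstract does not give the formula of the proposed distance, so the following is only a minimal sketch of the general idea of a density-sensitive manifold distance, not the paper's measure. It assumes the classical form in which an edge of Euclidean length d has adjusted length rho**d - 1 and the distance between two points is the shortest path over a KNN graph. The names `rho`, `k`, `density_sensitive_distances` and `similarity` are hypothetical, and the relative density sensitive term and the k-selection method described above are not modelled.

```python
import math


def euclidean(p, q):
    """Plain Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def density_sensitive_distances(points, rho=2.0, k=2):
    """All-pairs shortest-path distances over a symmetrised KNN graph.

    Assumption (not taken from the paper): an edge of Euclidean length d
    gets the adjusted length rho**d - 1, with rho > 1 acting as the
    scaling factor. Long jumps are penalised exponentially, so paths made
    of many short hops through dense regions become comparatively cheap.
    """
    n = len(points)
    d = [[euclidean(p, q) for q in points] for p in points]
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]

    # Connect every point to its k nearest neighbours (undirected edges).
    for i in range(n):
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: d[i][j])[:k]
        for j in neighbours:
            w = rho ** d[i][j] - 1.0
            dist[i][j] = min(dist[i][j], w)
            dist[j][i] = min(dist[j][i], w)

    # Floyd-Warshall: fine for a small illustration, O(n^3) in general.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


def similarity(dist):
    """Map distances to similarities in [0, 1]; unreachable pairs get 0."""
    return [[1.0 / (1.0 + x) for x in row] for row in dist]


# Two parallel chains of points: chain A on y = 0, chain B on y = 2.
points = [(0.5 * i, 0.0) for i in range(7)] + [(0.5 * i, 2.0) for i in range(7)]
dist = density_sensitive_distances(points, rho=4.0, k=len(points) - 1)
sim = similarity(dist)
print("same chain, far apart:", dist[0][6], sim[0][6])
print("other chain, nearby:  ", dist[0][7], sim[0][7])
```

In this toy data the two ends of chain A are farther apart in Euclidean terms than a point and its counterpart on chain B, yet the path-based distance ranks the same-chain pair as closer. Raising `rho` widens that gap, which is one way to picture the adjustable ratio of within-manifold to cross-manifold similarity that the abstract describes. The resulting similarity matrix could then be passed to any spectral clustering routine that accepts a precomputed affinity.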
