4.3 Article

Automatic Density Peaks Clustering Using DNA Genetic Algorithm Optimized Data Field and Gaussian Process

Publisher

WORLD SCIENTIFIC PUBL CO PTE LTD
DOI: 10.1142/S0218001417500239

Keywords

ADPC; Data field; DNA genetic algorithm; Gaussian

Funding

  1. Excellent Young Scholars Research Fund of Shandong Normal University, China
  2. National Science Foundation of China [61472231, 61402266]
  3. Jinan Youth Science and Technology Star Project [20120108]
  4. Soft science research on the national economy and social information of Shandong, China [2015EI013]

Clustering by fast search and find of density peaks (DPC), introduced by Alex Rodriguez and Alessandro Laio, has attracted much attention in pattern recognition and artificial intelligence. However, DPC still has unresolved defects. First, the local density rho(i) of point i depends on the cutoff distance d(c), whose choice can change the clustering result, especially for small real-world data sets. Second, the number of clusters is still chosen intuitively, by using the decision diagram to select the cluster centers. To overcome these defects, this paper proposes an automatic density peaks clustering approach using a DNA genetic algorithm optimized data field and a Gaussian process (ADPC-DNAGA). ADPC-DNAGA extracts the optimal threshold value using the potential entropy of the data field and automatically determines the cluster centers with a Gaussian method. For any data set to be clustered, the threshold is calculated objectively from the data rather than estimated empirically. The proposed algorithm is benchmarked on publicly available synthetic and real-world data sets that are commonly used to test clustering algorithms. The results are compared not only with those of DPC but also with those of several well-known clustering algorithms, such as Affinity Propagation, DBSCAN and Spectral Clustering. The experimental results demonstrate that the proposed algorithm can find the optimal cutoff distance d(c), automatically identify clusters regardless of their shape and the dimension of the embedding space, and often outperform the compared algorithms.
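
To make the role of the cutoff distance concrete, the following is a minimal Python sketch of the two quantities that baseline DPC computes for each point: the local density rho(i) and the distance delta(i) to the nearest point of higher density. It is an illustration under stated assumptions, not the authors' implementation. The cutoff-kernel density and the function names are choices made here. The `potential_entropy` function uses one common formulation from the data field literature (a Gaussian-like potential with impact factor `sigma`), which is assumed rather than taken from this abstract, and `pick_sigma` substitutes a plain grid search for the DNA genetic algorithm the paper describes.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def local_density(D, dc):
    """rho(i): number of other points closer than the cutoff distance dc."""
    # The diagonal (distance 0) always passes the test, so subtract 1.
    return (D < dc).sum(axis=1) - 1

def delta_distance(D, rho):
    """delta(i): distance to the nearest point with strictly higher density.

    Points with no denser neighbour get their largest distance instead.
    """
    delta = np.empty(len(rho))
    for i in range(len(rho)):
        higher = np.where(rho > rho[i])[0]
        delta[i] = D[i, higher].min() if higher.size else D[i].max()
    return delta

def potential_entropy(D, sigma):
    """Entropy of the normalised data-field potentials (assumed formulation)."""
    phi = np.exp(-(D / sigma) ** 2).sum(axis=1)  # potential at each point
    p = phi / phi.sum()
    return float(-(p * np.log(p)).sum())

def pick_sigma(D, grid):
    """Hypothetical helper: grid search standing in for the DNA genetic algorithm."""
    return min(grid, key=lambda s: potential_entropy(D, s))

# Two small, well-separated groups; indices 4 and 9 are the group centres.
group = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [0.05, 0.05]])
X = np.vstack([group, group + 5.0])

D = pairwise_distances(X)
rho = local_density(D, dc=0.12)
delta = delta_distance(D, rho)
gamma = rho * delta  # large gamma marks a candidate cluster centre
centers = set(int(i) for i in np.argsort(gamma)[-2:])
sigma = pick_sigma(D, grid=[0.05, 0.1, 0.5, 1.0])
```

Changing `dc` changes `rho`, and through it `delta` and the ranking by `gamma`, which is the sensitivity the abstract describes. Picking the two largest `gamma` values by hand corresponds to reading the decision diagram; the paper's Gaussian method is meant to replace that manual step and is not reproduced here.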
