Journal
SOFT COMPUTING
Volume 24, Issue 12, Pages 9227-9241
Publisher
SPRINGER
DOI: 10.1007/s00500-019-04449-7
Keywords
Cluster validity index; Euclidean distance; Dynamic clustering; Near-optimal number of clusters; Cluster validity evaluation
Funding
- National Natural Science Foundation of China [61862042, 61762062, 61601215, 61862044]
- Science and Technology Innovation Platform Project of Jiangxi Province [20181BCD40005]
- Major Discipline Academic and Technical Leader Training Plan Project of Jiangxi Province [20172BCB22030]
- Primary Research & Development Plan Project of Jiangxi Province [20192BBE50075, 20181ACE50033, 20171BBE50064, 2013ZBBE50018]
- Natural Science Foundation of Jiangxi Province [20192BAB207019, 20192BAB207020, 20171BAB202027]
- Graduate Innovation Fund Project of Jiangxi Province [YC2019-S100, YC2019-S048]
Cluster validity evaluation is an active topic in clustering algorithm research. To determine the optimal number of clusters, this paper proposes a new cluster validity index, the Ratio of Deviation of Sum-of-squares and Euclid distance (RDSED), and designs an RDSED-based cluster validity evaluation method suited to dynamically determining the near-optimal number of clusters. First, from an analysis of intra-class and inter-class relationships, the concepts of within-cluster sum-of-squares, between-cluster sum-of-squares, total sum-of-squares, sum of intra-cluster distances, and average distance between clusters are introduced, and the RDSED index is constructed from them. Second, an RDSED-based evaluation method for dynamically determining the near-optimal number of clusters is designed. In this method, the RDSED value is computed for candidate cluster numbers from the largest to the smallest in the search range, and the index value is used to terminate the validity verification process dynamically, yielding the near-optimal number of clusters and the corresponding clustering partition. Experimental results on artificial and real datasets show that, compared with several classical cluster validity evaluation methods, the proposed method obtains in most cases the near-optimal number of clusters closest to the true number of clusters and can effectively evaluate clustering partition results.
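The abstract names five quantities from which RDSED is built but does not give the index formula, so the sketch below only illustrates those building blocks and does not compute RDSED itself. The function names (`centroid`, `cluster_quantities`) are hypothetical, and the centroid-based definitions used for the sum of intra-cluster distances and the average distance between clusters are assumptions that may differ from the paper's own definitions. The sum-of-squares terms follow the standard decomposition, in which the total sum-of-squares equals the within-cluster plus the between-cluster sum-of-squares.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def cluster_quantities(points, labels):
    """Compute illustrative building blocks for a cluster validity index.

    Assumption: distances are Euclidean and measured to cluster centroids;
    the paper's exact definitions may differ.
    """
    # Group points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    overall = centroid(points)
    centers = {label: centroid(members) for label, members in clusters.items()}

    # Within-cluster sum-of-squares: squared distance of each point to its own centroid.
    wss = sum(dist(p, centers[label]) ** 2
              for label, members in clusters.items() for p in members)
    # Between-cluster sum-of-squares: size-weighted squared distance of centroids
    # to the overall mean.
    bss = sum(len(members) * dist(centers[label], overall) ** 2
              for label, members in clusters.items())
    # Total sum-of-squares: squared distance of each point to the overall mean.
    tss = sum(dist(p, overall) ** 2 for p in points)

    # Assumed definition: sum of point-to-centroid Euclidean distances.
    intra_distance_sum = sum(dist(p, centers[label])
                             for label, members in clusters.items() for p in members)
    # Assumed definition: mean pairwise Euclidean distance between centroids.
    keys = sorted(centers)
    pairs = [(a, b) for i, a in enumerate(keys) for b in keys[i + 1:]]
    avg_between = (sum(dist(centers[a], centers[b]) for a, b in pairs) / len(pairs)
                   if pairs else 0.0)

    return {
        "wss": wss,
        "bss": bss,
        "tss": tss,
        "intra_distance_sum": intra_distance_sum,
        "avg_between_distance": avg_between,
    }
```

A validity index in the spirit of the abstract would combine these values for each candidate number of clusters, scanning from the largest candidate downward; the combination rule and the stopping criterion are specified in the paper, not here.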