Article

Object-based cluster validation with densities

Journal

PATTERN RECOGNITION
Volume 121, Issue -, Pages -

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2021.108223

Keywords

Clustering; Clustering validity index; Internal index; Density-based cluster validation; Unsupervised

Clustering validity indices are used to determine the correct number of clusters and to evaluate the quality of clusters formed by clustering algorithms. OCVD is an internal validity index that captures the separation and compactness of clusters through the density of the data objects. It is a single number that averages the density-based contribution of individual data objects, and the authors report that it performs well in detecting the correct number of clusters, including in data sets with clusters of arbitrary shapes.
Clustering validity indices are typically used as tools to find the correct number of clusters in a data set and/or to evaluate the quality of the clusters formed by clustering algorithms. Clustering validity indices measure separation and compactness of clusters. Typically, when applying a clustering algorithm, the input includes the number of clusters. After applying the algorithm with several different numbers of clusters, we determine the number of clusters to be the one with the best validity index. There are two types of clustering validity indices: external indices that are supervised, and internal indices that are unsupervised. The focus of this paper is on internal validity indices. Some existing internal validity indices capture the properties of the clusters by using representative statistics such as mean, variance, diameter, etc.; however, these do not perform well when clusters have arbitrary shapes. One approach to overcome this issue is to use the density of the data objects in each cluster. That provides the advantage of capturing the full characteristics of the cluster, which is most beneficial when there are clusters with arbitrary shapes. In the literature, a few density-based clustering validity indices have been proposed. However, some of them show poor performance when the clusters are not perfectly separated. Some others perform poorly because they use only representative objects from each cluster instead of all objects. The contribution of this paper is an internal validity index named the object-based clustering validity index with densities (OCVD). OCVD is a single number that averages the density-based contribution of individual data objects to both separation and compactness of clusters. The methodology behind calculating the density-based contributions of the objects is kernel density estimation. We show through several experiments that OCVD performs well in detecting the correct number of clusters in data sets with different cluster shapes, including arbitrary shapes. © 2021 Elsevier Ltd. All rights reserved.
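
The abstract does not give the OCVD formula, so the following is only a rough sketch of the underlying idea: using kernel density estimation to compare, for each object, the density contributed by its own cluster with the density contributed by the other clusters. The function names (gaussian_kde, per_object_densities), the fixed bandwidth, and the toy data are all assumptions made for illustration; this is not the authors' index.

```python
import math


def gaussian_kde(x, sample, bandwidth):
    """Gaussian kernel density estimate at point x, built from the points in `sample`."""
    d = len(x)
    norm = len(sample) * (bandwidth * math.sqrt(2.0 * math.pi)) ** d
    total = sum(
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * bandwidth ** 2))
        for p in sample
    )
    return total / norm


def per_object_densities(points, labels, bandwidth=1.0):
    """Hypothetical helper (not the OCVD formula).

    For each object, return a pair:
      (density estimated from its own cluster,
       highest density estimated from any other cluster).
    The object itself is included in its own cluster's estimate.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    result = []
    for p, lab in zip(points, labels):
        own = gaussian_kde(p, clusters[lab], bandwidth)
        other = max(
            (gaussian_kde(p, pts, bandwidth)
             for other_lab, pts in clusters.items() if other_lab != lab),
            default=0.0,  # only one cluster present
        )
        result.append((own, other))
    return result


# Toy data: two well-separated groups of three objects each (made up).
points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (5.0, 5.0), (5.4, 4.8), (4.9, 5.5)]
good_labels = [0, 0, 0, 1, 1, 1]
bad_labels = [0, 0, 1, 1, 1, 1]  # third object assigned to the far group

good = per_object_densities(points, good_labels)
bad = per_object_densities(points, bad_labels)

for i, ((g_own, g_other), (b_own, b_other)) in enumerate(zip(good, bad)):
    print(i, "good:", g_own > g_other, "bad:", b_own > b_other)
```

With the sensible labeling, every object sits in a region where its own cluster is denser than any other cluster. With the third object mislabeled, that object's own-cluster density falls below the density of the cluster it was taken from. An object-level index could average a signal of this kind over all objects, but how OCVD actually combines separation and compactness is defined in the paper, not here.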
