Article

A new cluster validity for data clustering

Journal

NEURAL PROCESSING LETTERS
Volume 23, Issue 3, Pages 325-344

Publisher

SPRINGER
DOI: 10.1007/s11063-006-9005-x

Keywords

cluster validity; data clustering; deterministic annealing; structural risk minimization; Vapnik-Chervonenkis-bound

Cluster validity indexes are widely used to evaluate the fitness of partitions produced by clustering algorithms. This paper presents a new validity index for data clustering, called the Vapnik-Chervonenkis-bound (VB) index. It is estimated according to the structural risk minimization (SRM) principle, which optimizes the bound simultaneously over the distortion function (empirical risk) and the VC-dimension (model complexity). The cluster number that attains the smallest bound on the guaranteed risk is taken as the best description of the data structure. The deterministic annealing (DA) algorithm serves as the underlying clustering technique that produces the partitions. Five numerical examples and two real data sets illustrate the use of VB as a validity index, and its effectiveness is compared with that of several popular cluster-validity indexes. The results of the comparative study show that the proposed VB index is highly capable of producing a good estimate of the cluster number and, in addition, provides a new approach to cluster validity from the viewpoint of statistical learning theory.
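The selection rule described in the abstract (compute a penalized risk for each candidate cluster number and keep the smallest) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's VB index: the abstract does not give the bound's formula, so the sketch uses one commonly cited practical form of a VC-type penalization factor; plain Lloyd-style k-means stands in for deterministic annealing; and the complexity proxy `h = k * (dim + 1)` and all function names are hypothetical.

```python
import numpy as np


def kmeans_distortion(X, k, n_iter=50, seed=0):
    """Mean squared distance to the nearest centre after Lloyd iterations.

    Assumption: plain k-means is used here as a simple stand-in for the
    deterministic annealing (DA) algorithm named in the abstract.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points (fancy indexing returns a copy).
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).mean())


def vc_penalised_risk(emp_risk, h, n):
    """Empirical risk inflated by a VC-type penalization factor.

    Assumption: a generic practical form of the bound, not the exact
    expression used in the paper. Returns inf when the bound is vacuous.
    """
    p = h / n
    eps = p - p * np.log(p) + np.log(n) / (2 * n)
    denom = 1.0 - np.sqrt(eps)
    return emp_risk / denom if denom > 0 else float("inf")


def select_cluster_number(distortions, n, dim):
    """Pick the k whose penalized risk is smallest.

    distortions maps each candidate k to its empirical distortion.
    Assumption: h = k * (dim + 1) is a hypothetical complexity proxy.
    """
    bounds = {k: vc_penalised_risk(r, h=k * (dim + 1), n=n)
              for k, r in distortions.items()}
    best_k = min(bounds, key=bounds.get)
    return best_k, bounds


if __name__ == "__main__":
    # Synthetic data: three 2-D Gaussian blobs of 20 points each.
    rng = np.random.default_rng(1)
    blobs = [rng.normal(loc=c, scale=0.5, size=(20, 2))
             for c in ([0.0, 0.0], [6.0, 0.0], [3.0, 6.0])]
    X = np.vstack(blobs)
    distortions = {k: kmeans_distortion(X, k) for k in range(1, 7)}
    best_k, bounds = select_cluster_number(distortions, n=len(X), dim=X.shape[1])
    print("selected k:", best_k)
```

The penalty grows with `h / n`, so adding clusters only pays off when the distortion drops by more than the penalty rises. How strongly extra clusters are penalized depends heavily on the chosen complexity proxy and on the sample size, which is why both are flagged as assumptions here.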
