Article

Estimating the number of clusters

Publisher

WILEY
DOI: 10.2307/3315985

Keywords

cluster analysis; density estimates; level sets; number of modes; smoothed bootstrap; support estimation

Hartigan (1975) defines the number q of clusters in a d-variate statistical population as the number of connected components of the set {f > c}, where f denotes the underlying density function on R^d and c is a given constant. Some common clustering algorithms treat q as an input that must be given in advance. The authors propose a method for estimating this parameter, based on computing the number of connected components of an estimate of {f > c}. This set estimator is constructed as a union of balls centred at the points of an appropriate subsample, which is selected via a nonparametric density estimator of f. The asymptotic behaviour of the proposed method is analyzed. A simulation study and an example with real data are also included.
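The construction summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the density estimator, the subsample selection rule, or the ball radius, so the sketch assumes a Gaussian kernel density estimate (SciPy's `gaussian_kde` with its default bandwidth), keeps the sample points whose estimated density exceeds c, and treats the radius as a user-chosen tuning parameter. The function name `estimate_num_clusters` and the parameter `radius` are hypothetical. Two balls of equal radius r intersect exactly when their centres are at most 2r apart, so the connected components of the union of balls can be counted as the connected components of a graph on the retained points.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree
from scipy.stats import gaussian_kde


def estimate_num_clusters(X, c, radius):
    """Count connected components of a union of balls estimating {f > c}.

    X      : (n, d) array of observations.
    c      : density level defining the set {f > c}.
    radius : ball radius (assumed tuning parameter, not taken from the paper).
    """
    # Assumption: Gaussian KDE as the nonparametric density estimator of f.
    density = gaussian_kde(X.T)(X.T)

    # Subsample: keep the observations whose estimated density exceeds c.
    centres = X[density > c]
    m = len(centres)
    if m == 0:
        return 0

    # Balls of radius r around two centres overlap iff the centres are
    # within distance 2r, so link every such pair of centres.
    pairs = cKDTree(centres).query_pairs(r=2 * radius, output_type="ndarray")
    adjacency = coo_matrix(
        (np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(m, m)
    )

    # The number of graph components equals the number of connected
    # components of the union of balls.
    n_components, _ = connected_components(adjacency, directed=False)
    return int(n_components)


# Illustrative data: two well-separated Gaussian blobs in the plane.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(200, 2)),
    rng.normal(loc=(10.0, 0.0), scale=0.5, size=(200, 2)),
])
q_hat = estimate_num_clusters(X, c=0.01, radius=1.0)
print("estimated number of clusters:", q_hat)
```

In practice the estimate depends on the level c, the bandwidth of the density estimator, and the radius, so the values used here are only placeholders for the illustration.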