Article

A meta-learning approach for determining the number of clusters with consideration of nearest neighbors

Journal

INFORMATION SCIENCES
Volume 232, Pages 208-224

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2012.12.033

Keywords

Cluster analysis; Number of clusters; Compactness; Disconnectivity; Meta-learning

Funding

  1. Basic Science Research Program through the National Research Foundation of Korea (NRF)
  2. Ministry of Education, Science and Technology [2012R1A1A1012153]
  3. National Research Foundation of Korea [2012R1A1A1012153] (funding source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS))

Abstract

An important and challenging problem in data clustering is determining the best number of clusters. A variety of estimation methods have been proposed over the years to address this problem. Most of these methods depend on several nontrivial assumptions about the data structure, and may thus fail to discover the true clusters in a dataset that does not satisfy those assumptions. We develop a new approach that takes as its starting point the simple and intuitive observation that close objects should fall within the same cluster, whereas distant ones should not. Based on this notion, we use a new measure of clustering quality called disconnectivity together with existing goodness measures, and we embed these measures into a meta-learning approach for estimating the number of clusters. A simulation experiment based on 13 representative models and an application to real-world datasets show the effectiveness of the proposed method. © 2013 Elsevier Inc. All rights reserved.
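The observation that close objects should fall within the same cluster can be illustrated with a small sketch. The abstract does not give the paper's definition of disconnectivity, so the score below is only an assumed stand-in: the fraction of (point, nearest neighbor) pairs that a labeling places in different clusters. The function names `knn_indices` and `nn_disconnectivity`, the neighborhood size `k`, and the toy data are all hypothetical.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i] (Euclidean, excluding i)."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()
    return [j for _, j in dists[:k]]

def nn_disconnectivity(points, labels, k=2):
    """Assumed stand-in score, not the paper's definition:
    fraction of (point, near neighbor) pairs that land in different clusters."""
    split = 0
    total = 0
    for i in range(len(points)):
        for j in knn_indices(points, i, k):
            total += 1
            if labels[i] != labels[j]:
                split += 1
    return split / total

# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

two_clusters = [0, 0, 0, 1, 1, 1]    # keeps every point with its near neighbors
three_clusters = [0, 0, 2, 1, 1, 1]  # splits one tight group

score_two = nn_disconnectivity(points, two_clusters, k=2)
score_three = nn_disconnectivity(points, three_clusters, k=2)
```

On this toy data the two-cluster labeling separates no neighbor pairs, while the three-cluster labeling separates 4 of the 12 pairs. A labeling that puts everything in one cluster would also score zero under this stand-in, which suggests why a neighbor-based measure would be combined with other goodness measures, such as compactness, rather than used alone.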
