Article

Delta-density based clustering with a divide-and-conquer strategy: 3DC clustering

Journal

PATTERN RECOGNITION LETTERS
Volume 73, Pages 52-59

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2016.01.009

Keywords

Divide-and-conquer; DBSCAN; Automatic clustering

Funding

  1. National Natural Science Foundation of China [61173083]

Recently, a delta-density based clustering (DDC) algorithm was proposed to cluster data efficiently by quickly searching for density peaks. The DDC method uses the density and a newly defined criterion, the delta-distance. Examples with anomalously large delta-density values are treated as cluster centers, and the remaining examples are then assigned the same cluster label as their neighbor with higher density. However, the DDC algorithm faces two challenges. First, no rules are available to judge whether density-delta values are anomalously large. Second, the decision graph might produce redundant examples with anomalously large density-delta values, which we define as the decision graph fraud problem. In this paper, an improved and automatic version of the DDC algorithm, named 3DC clustering, is proposed to overcome these difficulties. The 3DC algorithm is motivated by the divide-and-conquer strategy and the density-reachable concept in the DBSCAN framework. It can automatically find the correct number of clusters in a recursive way. Experiments on artificial and real-world data show that the 3DC clustering algorithm performs comparably to the supervised-clustering baselines and outperforms the unsupervised DDCs, which use novelty detection strategies to select the examples with anomalously large density-delta values as cluster centers. (C) 2016 Elsevier B.V. All rights reserved.
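
The baseline DDC procedure summarized in the abstract (density, delta-distance, center selection, label propagation) can be illustrated with a minimal Python sketch. This is not the 3DC algorithm and not the authors' code. The function name `ddc_cluster`, the cutoff radius `d_c`, the cutoff-count density, and the rule of taking the `n_centers` largest density times delta products as centers are all assumptions made for illustration. The hand-supplied `n_centers` is exactly the manual choice that the abstract says 3DC is designed to avoid.

```python
import numpy as np

def ddc_cluster(X, d_c, n_centers):
    """Illustrative density-peaks (DDC) baseline; NOT the 3DC algorithm."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Assumed density: number of other points closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from densest to sparsest (ties broken by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The global density peak gets the largest distance as its delta.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]  # points visited earlier (higher density)
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]  # delta-distance
        parent[i] = j          # neighbor with higher density
    # Assumed center rule: largest rho * delta products.
    gamma = rho * delta
    gamma[order[0]] = np.inf   # the global peak has no parent, so it must be a center
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Propagate labels in density order; each parent is labeled before its children.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta
```

Calling `ddc_cluster(X, d_c, n_centers)` returns the labels together with the density and delta arrays, so the two quantities can also be plotted against each other as a decision graph.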
