4.7 Article

Clustering with a new distance measure based on a dual-rooted tree

Journal

INFORMATION SCIENCES
Volume 251, Issue -, Pages 96-113

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2013.05.040

Keywords

Non-metric clustering; Minimal spanning tree; Prim's algorithm; Affinity measure; Co-association measure; Consensus clustering

Funding

  1. PPF-ISSO Nice Sophia Antipolis University, France
  2. US National Science Foundation [CCR-0325571, CCF-0830490]
  3. Division of Computing and Communication Foundations
  4. Direct For Computer & Info Scie & Enginr [1217880] Funding Source: National Science Foundation

Ask authors/readers for more resources

This paper introduces a novel distance measure for clustering high dimensional data based on the hitting time of two Minimal Spanning Trees (MST) grown sequentially from a pair of points by Prim's algorithm. When the proposed measure is used in conjunction with spectral clustering, we obtain a powerful clustering algorithm that is able to separate neighboring non-convex shaped clusters and to account for local as well as global geometric features of the data set. Remarkably, the new distance measure is a true metric even if the Prim algorithm uses a non-metric dissimilarity measure to compute the edges of the MST. This metric property brings added flexibility to the proposed method. In particular, the method is applied to clustering non Euclidean quantities, such as probability distributions or spectra, using the Kullback-Leibler divergence as a base measure. We reduce computational complexity by applying consensus clustering to a small ensemble of dual rooted MSTs. We show that the resultant consensus spectral clustering with dual rooted MST is competitive with other clustering methods, both in terms of clustering performance and computational complexity. We illustrate the proposed clustering algorithm on public domain benchmark data for which the ground truth is known, on one hand, and on real-world astrophysical data on the other hand. (c) 2013 Elsevier Inc. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available