Article

Model-based clustering of high-dimensional data: A review

Journal

COMPUTATIONAL STATISTICS & DATA ANALYSIS
Volume 71, Pages 52-78

Publisher

ELSEVIER
DOI: 10.1016/j.csda.2012.12.008

Keywords

Model-based clustering; High-dimensional data; Dimension reduction; Regularization; Parsimonious models; Subspace clustering; Variable selection; Software; R package

Model-based clustering is a popular tool renowned for its probabilistic foundations and its flexibility. However, high-dimensional data are increasingly common and, unfortunately, classical model-based clustering techniques perform poorly in high-dimensional spaces. This is mainly because model-based clustering methods are dramatically over-parametrized in this case. Nevertheless, high-dimensional spaces have specific characteristics that are useful for clustering, and recent techniques exploit those characteristics. After recalling the basics of model-based clustering, this review covers dimension reduction approaches, regularization-based techniques, parsimonious modeling, subspace clustering methods, and clustering methods based on variable selection. Existing software for model-based clustering of high-dimensional data is also reviewed, and its practical use is illustrated on real-world data sets. © 2012 Elsevier B.V. All rights reserved.
