Journal
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 105, Issue 490, Pages 713-726
Publisher
AMER STATISTICAL ASSOC
DOI: 10.1198/jasa.2010.tm09415
Keywords
Hierarchical clustering; High-dimensional; K-means clustering; Lasso; Model selection; Sparsity; Unsupervised learning
Funding
- National Defense Science and Engineering Graduate Fellowship
- National Science Foundation [DMS-9971405]
- National Institutes of Health [N01-HV-28183]
- National Heart, Lung, and Blood Institute [R01HL028183]
- National Institute of Biomedical Imaging and Bioengineering [R01EB001988]
We consider the problem of clustering observations using a potentially large set of features. One might expect that the true underlying clusters present in the data differ only with respect to a small fraction of the features, and will be missed if one clusters the observations using the full set of features. We propose a novel framework for sparse clustering, in which one clusters the observations using an adaptively chosen subset of the features. The method uses a lasso-type penalty to select the features. We use this framework to develop simple methods for sparse K-means and sparse hierarchical clustering. A single criterion governs both the selection of the features and the resulting clusters. These approaches are demonstrated on simulated and genomic data.
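The sparse K-means method described above alternates between two steps: cluster the observations using feature weights, then update the weights by soft-thresholding each feature's between-cluster sum of squares under lasso-type (L1 and L2) constraints. The sketch below is an illustrative reimplementation of that idea, not the authors' code; the function names, the plain Lloyd's K-means, and the binary search for the threshold are my own choices.

```python
import numpy as np

def soft_threshold(a, delta):
    # Lasso-style shrinkage toward zero.
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)

def bcss_per_feature(X, labels):
    # Per-feature between-cluster sum of squares = total SS - within-cluster SS.
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for k in np.unique(labels):
        Xk = X[labels == k]
        within += ((Xk - Xk.mean(axis=0)) ** 2).sum(axis=0)
    return total - within

def update_weights(a, s):
    # Maximize w.a subject to ||w||_2 <= 1, ||w||_1 <= s, w >= 0:
    # soft-threshold then renormalize; binary-search the threshold.
    a = np.maximum(a, 0.0)
    w = a / np.linalg.norm(a)
    if np.abs(w).sum() <= s:
        return w
    lo, hi = 0.0, a.max()
    for _ in range(50):
        delta = (lo + hi) / 2.0
        w = soft_threshold(a, delta)
        nw = np.linalg.norm(w)
        if nw == 0.0:
            hi = delta          # thresholded everything away; lower delta
            continue
        w = w / nw
        if w.sum() > s:
            lo = delta          # L1 norm still too large; raise delta
        else:
            hi = delta
    return w

def kmeans(X, K, n_iter=25, seed=0):
    # Minimal Lloyd's algorithm (illustrative; not the paper's implementation).
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), K, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for k in range(K):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(0)
    return labels

def sparse_kmeans(X, K, s, n_outer=10):
    # Alternate: cluster on weighted features, then refit the sparse weights.
    p = X.shape[1]
    w = np.full(p, 1.0 / np.sqrt(p))
    for _ in range(n_outer):
        labels = kmeans(X * np.sqrt(w), K)
        w = update_weights(bcss_per_feature(X, labels), s)
    return labels, w
```

On simulated data where only a few features separate the clusters, the returned weight vector concentrates on those features and shrinks the rest toward zero, which is the feature-selection behavior the abstract describes.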
Authors
- Daniela M. Witten
- Robert Tibshirani