Journal
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 104, Issue 486, Pages 682-693
Publisher
AMER STATISTICAL ASSOC
DOI: 10.1198/jasa.2009.0121
Keywords
Eigenvector estimation; Reduction of dimension; Regularization; Thresholding; Variable selection
Funding
- National Science Foundation [DMS 0505303, DMS 0072661] Funding Source: Medline
- NIBIB NIH HHS [R01 EB001988, R01 EB001988-14] Funding Source: Medline
Abstract
Principal components analysis (PCA) is a classic method for reducing the dimensionality of data in the form of n observations (or cases) of a vector with p variables. Contemporary datasets often have p comparable with, or even much larger than, n. Our main assertions, in such settings, are (a) that some initial reduction in dimensionality is desirable before applying any PCA-type search for principal modes, and (b) that this initial reduction in dimensionality is best achieved by working in a basis in which the signals have a sparse representation. We describe a simple asymptotic model in which the estimate of the leading principal component vector via standard PCA is consistent if and only if p(n)/n -> 0. We provide a simple algorithm for selecting a subset of coordinates with largest sample variances, and show that if PCA is done on the selected subset, then consistency is recovered, even if p(n) >> n.
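The select-then-PCA procedure the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' code: the subset size k is fixed by the caller here (rather than chosen by a data-driven thresholding rule), and the synthetic data with a sparse leading component is purely for demonstration.

```python
import numpy as np

def sparse_pca_leading(X, k):
    """Estimate the leading principal component by first selecting the
    k coordinates with largest sample variance, then running PCA only
    on that subset. k is an assumed tuning parameter for this sketch."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)                # center each variable
    variances = Xc.var(axis=0)             # sample variance per coordinate
    idx = np.argsort(variances)[-k:]       # k largest-variance coordinates
    # PCA on the reduced n-by-k matrix via SVD
    _, _, Vt = np.linalg.svd(Xc[:, idx], full_matrices=False)
    v_hat = np.zeros(p)
    v_hat[idx] = Vt[0]                     # embed estimate back into R^p
    return v_hat

# Synthetic example with p >> n: n=50 observations, p=500 variables,
# true leading component supported on the first 10 coordinates.
rng = np.random.default_rng(0)
u = np.zeros(500)
u[:10] = 1 / np.sqrt(10)                   # sparse unit-norm component
scores = 3.0 * rng.normal(size=(50, 1))    # component scores
X = scores @ u[None, :] + rng.normal(size=(50, 500))
v_hat = sparse_pca_leading(X, k=20)
print(abs(v_hat @ u))                      # should be near 1 if recovery succeeds
```

Running full PCA on all 500 coordinates with only 50 observations would fall in the inconsistent regime (p/n far from 0); restricting attention to the 20 highest-variance coordinates first is what makes the leading-component estimate usable in this sketch.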