Article

On Consistency and Sparsity for Principal Components Analysis in High Dimensions

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 104, Issue 486, Pages 682-693

Publisher

AMER STATISTICAL ASSOC
DOI: 10.1198/jasa.2009.0121

Keywords

Eigenvector estimation; Reduction of dimension; Regularization; Thresholding; Variable selection

Funding

  1. National Science Foundation [DMS 0505303, DMS 0072661] Funding Source: Medline
  2. NIBIB NIH HHS [R01 EB001988, R01 EB001988-14] Funding Source: Medline

Principal components analysis (PCA) is a classic method for reducing the dimensionality of data in the form of n observations (or cases) of a vector with p variables. Contemporary datasets often have p comparable with, or even much larger than, n. Our main assertions, in such settings, are (a) that some initial reduction in dimensionality is desirable before applying any PCA-type search for principal modes, and (b) that this initial reduction is best achieved by working in a basis in which the signals have a sparse representation. We describe a simple asymptotic model in which the estimate of the leading principal component vector via standard PCA is consistent if and only if p(n)/n -> 0. We provide a simple algorithm for selecting a subset of coordinates with largest sample variances, and show that if PCA is done on the selected subset, then consistency is recovered, even if p(n) >> n.
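The two-step procedure described in the abstract (keep only the coordinates with largest sample variance, then run ordinary PCA on that subset) can be sketched as follows. This is a minimal NumPy illustration on a toy spiked-covariance model, not the paper's exact algorithm; the function name, the subset size k, and the simulation parameters are all illustrative choices.

```python
import numpy as np

def pca_on_high_variance_subset(X, k):
    """Hypothetical sketch: select the k coordinates with the largest
    sample variance, run PCA on that subset, and embed the resulting
    leading eigenvector back into R^p."""
    variances = X.var(axis=0)
    idx = np.argsort(variances)[-k:]       # coordinates with largest sample variance
    cov = np.cov(X[:, idx], rowvar=False)  # sample covariance on the subset only
    _, eigvecs = np.linalg.eigh(cov)
    v = np.zeros(X.shape[1])
    v[idx] = eigvecs[:, -1]                # top eigenvector of the subset covariance
    return v / np.linalg.norm(v)

# Toy spiked model with a sparse leading component and p larger than n.
rng = np.random.default_rng(0)
n, p = 200, 500
u = np.zeros(p)
u[:10] = 1 / np.sqrt(10)                  # sparse true principal direction
factors = rng.normal(size=(n, 1))
X = 3.0 * factors @ u[None, :] + rng.normal(size=(n, p))

v_hat = pca_on_high_variance_subset(X, k=25)
print(abs(v_hat @ u))                     # alignment with the true direction, close to 1
```

Because the true component is sparse, its support coordinates have inflated marginal variances, so the variance-based screening recovers them with high probability; PCA on the small subset then estimates the direction well even though p > n.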
