Article

On the distribution of the largest eigenvalue in principal components analysis

Journal

ANNALS OF STATISTICS
Volume 29, Issue 2, Pages 295-327

Publisher

INST MATHEMATICAL STATISTICS-IMS
DOI: 10.1214/aos/1009210544

Keywords

Karhunen-Loeve transform; empirical orthogonal functions; largest eigenvalue; largest singular value; Laguerre ensemble; Laguerre polynomial; Wishart distribution; Plancherel-Rotach asymptotics; Painleve equation; Tracy-Widom distribution; random matrix theory; Fredholm determinant; Liouville-Green method


Let $x_{(1)}$ denote the square of the largest singular value of an $n \times p$ matrix $X$, all of whose entries are independent standard Gaussian variates. Equivalently, $x_{(1)}$ is the largest principal component variance of the covariance matrix $X'X$, or the largest eigenvalue of a $p$-variate Wishart distribution on $n$ degrees of freedom with identity covariance. Consider the limit of large $p$ and $n$ with $n/p = \gamma \ge 1$. When centered by $\mu_p = (\sqrt{n-1} + \sqrt{p})^2$ and scaled by $\sigma_p = (\sqrt{n-1} + \sqrt{p})(1/\sqrt{n-1} + 1/\sqrt{p})^{1/3}$, the distribution of $x_{(1)}$ approaches the Tracy-Widom law of order 1, which is defined in terms of the Painleve II differential equation and can be numerically evaluated and tabulated in software. Simulations show the approximation to be informative for $n$ and $p$ as small as 5. The limit is derived via a corresponding result for complex Wishart matrices using methods from random matrix theory. The result suggests that some aspects of large $p$ multivariate distribution theory may be easier to apply in practice than their fixed $p$ counterparts.
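The centering and scaling in the abstract can be tried out numerically. The following is a minimal sketch, not code from the paper: it assumes NumPy is available, and the function names (`tw_centering_scaling`, `largest_eigenvalue`, `standardized_samples`) are hypothetical names chosen for illustration. It computes $\mu_p$ and $\sigma_p$ as defined above and standardizes simulated values of $x_{(1)}$; it does not evaluate the Tracy-Widom distribution itself.

```python
import numpy as np


def tw_centering_scaling(n, p):
    """Return (mu_p, sigma_p) as defined in the abstract."""
    a = np.sqrt(n - 1)
    b = np.sqrt(p)
    mu = (a + b) ** 2
    sigma = (a + b) * (1.0 / a + 1.0 / b) ** (1.0 / 3.0)
    return mu, sigma


def largest_eigenvalue(X):
    """Square of the largest singular value of X, i.e. the top eigenvalue of X'X."""
    singular_values = np.linalg.svd(X, compute_uv=False)  # sorted in descending order
    return singular_values[0] ** 2


def standardized_samples(n, p, reps, seed=0):
    """Simulate (x_(1) - mu_p) / sigma_p for n x p standard Gaussian matrices."""
    rng = np.random.default_rng(seed)
    mu, sigma = tw_centering_scaling(n, p)
    x = np.array(
        [largest_eigenvalue(rng.standard_normal((n, p))) for _ in range(reps)]
    )
    return (x - mu) / sigma


if __name__ == "__main__":
    # Illustrative sizes only; these values are assumptions, not taken from the paper.
    z = standardized_samples(n=40, p=10, reps=2000)
    print("mean of standardized x_(1):", z.mean())
    print("std of standardized x_(1): ", z.std())
```

A histogram of the returned values could then be compared against a tabulated Tracy-Widom law of order 1, if such a table or implementation is at hand.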

