Article

BI-CROSS-VALIDATION OF THE SVD AND THE NONNEGATIVE MATRIX FACTORIZATION

Journal

ANNALS OF APPLIED STATISTICS
Volume 3, Issue 2, Pages 564-594

Publisher

INST MATHEMATICAL STATISTICS
DOI: 10.1214/08-AOAS227

Keywords

Cross-validation; principal components; random matrix theory; sample reuse; weak factor model

Funding

  1. NSF [DMS-06-04939]

This article presents a form of bi-cross-validation (BCV) for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF). Instead of leaving out a set of rows of the data matrix, we leave out a set of rows and a set of columns, and then predict the left-out entries by low-rank operations on the retained data. We prove a self-consistency result expressing the prediction error as a residual from a low-rank approximation. Random matrix theory and some empirical results suggest that smaller hold-out sets lead to more over-fitting, while larger ones are more prone to under-fitting. In simulated examples we find that a method leaving out half the rows and half the columns performs well.
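
The hold-out scheme described in the abstract can be illustrated with a minimal sketch for the SVD case. It assumes the data matrix is split into a held-out block A, the blocks B and C that share its rows or its columns, and a fully retained block D, and that A is predicted as B D_k^+ C, where D_k^+ is the pseudo-inverse of the rank-k truncated SVD of D. The function names `bcv_error` and `bcv_rank` and the random half-and-half split are illustrative choices, not taken from the article.

```python
import numpy as np

def bcv_error(X, held_rows, held_cols, k):
    """Squared error of predicting the held-out block of X at rank k."""
    held_rows = np.asarray(held_rows, dtype=bool)
    held_cols = np.asarray(held_cols, dtype=bool)
    A = X[np.ix_(held_rows, held_cols)]      # held-out rows and columns
    B = X[np.ix_(held_rows, ~held_cols)]     # held-out rows, retained columns
    C = X[np.ix_(~held_rows, held_cols)]     # retained rows, held-out columns
    D = X[np.ix_(~held_rows, ~held_cols)]    # retained rows and columns
    if k == 0:
        return float(np.sum(A ** 2))         # rank 0 predicts zeros
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # Pseudo-inverse of the rank-k truncation of D: V_k diag(1/s_k) U_k^T.
    Dk_pinv = (Vt[:k].T / s[:k]) @ U[:, :k].T
    A_hat = B @ Dk_pinv @ C
    return float(np.sum((A - A_hat) ** 2))

def bcv_rank(X, max_rank, seed=0):
    """Pick the rank minimizing total error over a random 2 x 2 split."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    row_half = rng.permutation(m) < m // 2   # random half of the rows
    col_half = rng.permutation(n) < n // 2   # random half of the columns
    errors = np.zeros(max_rank + 1)
    for rows in (row_half, ~row_half):
        for cols in (col_half, ~col_half):
            for k in range(max_rank + 1):
                errors[k] += bcv_error(X, rows, cols, k)
    return int(np.argmin(errors)), errors
```

In this sketch each of the four blocks of the 2 x 2 split is held out once, which corresponds to leaving out half the rows and half the columns. `max_rank` must not exceed the smaller dimension of any retained block, and candidate ranks above the numerical rank of D divide by near-zero singular values, so they should be avoided on noiseless data.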

