Article; Proceedings Paper

Persistence in high-dimensional linear predictor selection and the virtue of overparametrization

BERNOULLI (2004)

Journal

BERNOULLI

Volume 10, Issue 6, Pages 971-988

Publisher

INT STATISTICAL INST

DOI: 10.3150/bj/1106314846

Keywords

consistency; lasso; regression; variable selection


Abstract

Let $Z^i = (Y^i, X_1^i, \ldots, X_m^i)$, $i = 1, \ldots, n$, be independent and identically distributed random vectors, $Z^i \sim F$, $F \in \mathcal{F}$. It is desired to predict $Y$ by $\sum_j \beta_j X_j$, where $(\beta_1, \ldots, \beta_m) \in B_n \subseteq \mathbb{R}^m$, under a prediction loss. Suppose that $m = n^\alpha$, $\alpha > 1$, that is, there are many more explanatory variables than observations. We consider sets $B_n$ restricted by the maximal number of non-zero coefficients of their members, or by their $l_1$ radius. We study the following asymptotic question: how 'large' may the set $B_n$ be, so that it is still possible to select empirically a predictor whose risk under $F$ is close to that of the best predictor in the set? Sharp bounds for orders of magnitudes are given under various assumptions on $\mathcal{F}$. Algorithmic complexity of the ensuing procedures is also studied. The main message of this paper and the implications of the orders derived are that under various sparsity assumptions on the optimal predictor there is 'asymptotically no harm' in introducing many more explanatory variables than observations. Furthermore, such practice can be beneficial in comparison with a procedure that screens in advance a small subset of explanatory variables. Another main result is that 'lasso' procedures, that is, optimization under $l_1$ constraints, could be efficient in finding optimal sparse predictors in high dimensions.
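
To make the phrase 'optimization under $l_1$ constraints' concrete, the following is a minimal sketch in plain Python of least squares restricted to an $l_1$ ball, with more explanatory variables than observations. It is an illustration under stated assumptions, not the procedure analysed in the paper: the solver (projected gradient descent), the simulated data, the radius, and all function names are hypothetical choices made for this example.

```python
import random


def project_l1_ball(v, radius):
    """Euclidean projection of v onto {b : sum_j |b_j| <= radius} (sort-based)."""
    if sum(abs(x) for x in v) <= radius:
        return list(v)
    u = sorted((abs(x) for x in v), reverse=True)
    cumsum, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumsum += uk
        t = (cumsum - radius) / k
        if uk - t > 0:  # holds for a prefix of k; keep the last threshold
            theta = t
    # Soft-threshold every coordinate by theta, keeping its sign.
    return [(1.0 if x > 0 else -1.0) * max(abs(x) - theta, 0.0) for x in v]


def empirical_risk(X, y, b):
    """Average squared prediction error of the linear predictor b."""
    n = len(y)
    return sum(
        (yi - sum(xj * bj for xj, bj in zip(xi, b))) ** 2 for xi, yi in zip(X, y)
    ) / n


def l1_constrained_least_squares(X, y, radius, n_iter=300):
    """Projected gradient descent for min empirical_risk(b) s.t. ||b||_1 <= radius.

    Assumption: this solver is chosen only for illustration.
    """
    n, m = len(X), len(X[0])
    # Conservative Lipschitz bound: (2/n) * trace(X'X) >= (2/n) * lambda_max(X'X).
    lipschitz = 2.0 * sum(x * x for row in X for x in row) / n
    step = 1.0 / lipschitz
    b = [0.0] * m
    for _ in range(n_iter):
        resid = [yi - sum(xj * bj for xj, bj in zip(xi, b)) for xi, yi in zip(X, y)]
        grad = [-2.0 / n * sum(X[i][j] * resid[i] for i in range(n)) for j in range(m)]
        b = project_l1_ball([bj - step * gj for bj, gj in zip(b, grad)], radius)
    return b


def simulate(n=20, m=40, seed=0):
    """Hypothetical data: m > n, only the first two coefficients are non-zero."""
    rng = random.Random(seed)
    beta = [1.0, -0.5] + [0.0] * (m - 2)
    X = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, beta)) + rng.gauss(0.0, 0.5) for row in X]
    return X, y, beta


if __name__ == "__main__":
    X, y, beta = simulate()
    b_hat = l1_constrained_least_squares(X, y, radius=1.5)
    print("l1 norm of fit:", sum(abs(b) for b in b_hat))
    print("empirical risk:", empirical_risk(X, y, b_hat))
```

The radius plays the role of the size of the set $B_n$ in the abstract: a larger radius admits more candidate predictors but makes the empirical risk a less reliable guide to the risk under $F$.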
