☆ 4.8 Article

Benign overfitting in linear regression

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2020)

Journal

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

Volume 117, Issue 48, Pages 30063-30070

Publisher

NATL ACAD SCIENCES

DOI: 10.1073/pnas.1907378117

Keywords

statistical learning theory; overfitting; linear regression; interpolation

Funding

NSF [IIS1619362]
Google research award
Spanish Ministry of Economy and Competitiveness [PGC2018-101643-B-I00]
High-dimensional problems in structured probabilistic models Ayudas Fundacion BBVA a Equipos de Investigacion Cientifica 2017
Google Focused Award Algorithms and Learning for AI

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodology: deep neural networks seem to predict well, even with a perfect fit to noisy training data. Motivated by this phenomenon, we consider when a perfect fit to training data in linear regression is compatible with accurate prediction. We give a characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy. The characterization is in terms of two notions of the effective rank of the data covariance. It shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size. By studying examples of data covariance properties that this characterization shows are required for benign overfitting, we find an important role for finite-dimensional data: the accuracy of the minimum norm interpolating prediction rule approaches the best possible accuracy for a much narrower range of properties of the data distribution when the data lie in an infinite-dimensional space vs. when the data lie in a finite-dimensional space with dimension that grows faster than the sample size.

Benign overfitting in linear regression

Journal

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

Publisher

NATL ACAD SCIENCES

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Benign overfitting in linear regression

Journal

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

Publisher

NATL ACAD SCIENCES

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper