Article

The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks

Journal

Journal of Machine Learning Research
Volume 23

Publisher

Microtome Publishing

Keywords

implicit bias; generalization; benign overfitting; interpolation; neural networks; regression

Funding

  1. NSF [DMS-2023505, DMS-2031883]
  2. Simons Foundation [814639]

This paper investigates the phenomenon of benign overfitting in neural network models, deriving bounds on the excess risk of two-layer linear neural networks trained with gradient flow on the squared loss. The study highlights the importance of both the quality of the initialization and the properties of the data covariance matrix in achieving low excess risk.
The recent success of neural network models has shone light on a rather surprising statistical phenomenon: statistical models that perfectly fit noisy data can generalize well to unseen test data. Understanding this phenomenon of benign overfitting has attracted intense theoretical and empirical study. In this paper, we consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk when the covariates satisfy sub-Gaussianity and anti-concentration properties, and the noise is independent and sub-Gaussian. By leveraging recent results that characterize the implicit bias of this estimator, our bounds emphasize the role of both the quality of the initialization and the properties of the data covariance matrix in achieving low excess risk.
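
As a concrete illustration of the setting the abstract describes (a minimal sketch, not the paper's code or experiments), the following trains a two-layer linear network f(x) = v^T W x by full-batch gradient descent, a small-step discretization of gradient flow, on the squared loss over noisy linear-regression data. The dimensions, step size, noise level, and identity-covariance Gaussian data model are all illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: d > n so the noisy labels can be interpolated exactly.
n, d, m = 20, 50, 40          # samples, input dimension, hidden width
sigma = 0.1                   # Gaussian (hence sub-Gaussian) label noise

# Linear target and sub-Gaussian covariates with identity covariance.
w_star = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ w_star + sigma * rng.normal(size=n)

# Small random initialization; the paper's bounds depend on its quality.
scale = 0.1
W = scale * rng.normal(size=(m, d)) / np.sqrt(d)
v = scale * rng.normal(size=m) / np.sqrt(m)

lr = 5e-3                     # small step size approximating gradient flow
for _ in range(200_000):
    r = X @ (W.T @ v) - y     # residuals of the network f(x) = v^T W x
    if np.mean(r**2) < 1e-10: # stop once the network interpolates numerically
        break
    grad_v = W @ (X.T @ r)            # gradient of 0.5*||r||^2 in v
    grad_W = np.outer(v, X.T @ r)     # gradient of 0.5*||r||^2 in W
    v -= lr * grad_v
    W -= lr * grad_W

w_hat = W.T @ v               # the network's effective linear predictor
# With identity covariance, the excess risk of a predictor w is ||w - w_star||^2.
print("train MSE:  ", np.mean((X @ w_hat - y) ** 2))
print("excess risk:", np.sum((w_hat - w_star) ** 2))

# Familiar reference point for the implicit bias: the minimum-l2-norm interpolator.
w_mn = X.T @ np.linalg.solve(X @ X.T, y)
print("min-norm excess risk:", np.sum((w_mn - w_star) ** 2))

Because d > n, the network drives the training error to (numerical) zero, yet the excess risk of its effective predictor w_hat can remain small; the comparison with the minimum-l2-norm interpolator is included only as a familiar reference point for the implicit bias of this parameterization.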

Authors

Niladri S. Chatterji, Philip M. Long, Peter L. Bartlett
