Article

ℓ1 Regularization in Two-Layer Neural Networks

Journal

IEEE Signal Processing Letters
Volume 29, Pages 135-139

Publisher

IEEE - Institute of Electrical and Electronics Engineers, Inc.
DOI: 10.1109/LSP.2021.3129698

Keywords

Generalization error; model complexity; neural network; regularization

Abstract

A crucial problem of neural networks is to select an architecture that strikes appropriate tradeoffs between underfitting and overfitting. This work shows that ℓ1 regularizations for two-layer neural networks can control the generalization error and sparsify the input dimension. In particular, with an appropriate ℓ1 regularization on the output layer, the network can produce a tight statistical risk. Moreover, an appropriate ℓ1 regularization on the input layer leads to a risk bound that does not involve the input data dimension. The results also indicate that training a wide neural network with a suitable regularization provides an alternative bias-variance tradeoff to selecting from a candidate set of neural networks. Our analysis is based on a new integration of dimension-based and norm-based complexity analysis to bound the generalization error.
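To make the scheme concrete, here is a minimal sketch, not the authors' implementation, of a two-layer network trained with ℓ1 penalties on both the input-layer and output-layer weights, as the abstract describes. The layer sizes, the penalty weights lambda_in and lambda_out, and the synthetic data are illustrative assumptions.

import torch
import torch.nn as nn

class TwoLayerNet(nn.Module):
    """Two-layer (one-hidden-layer) ReLU network."""
    def __init__(self, d_in, width):
        super().__init__()
        self.hidden = nn.Linear(d_in, width)  # input layer
        self.output = nn.Linear(width, 1)     # output layer

    def forward(self, x):
        return self.output(torch.relu(self.hidden(x)))

def l1_regularized_loss(model, x, y, lambda_in=1e-3, lambda_out=1e-3):
    # lambda_in and lambda_out are illustrative choices; the paper's theory
    # concerns how such penalties relate to the statistical risk bound.
    mse = nn.functional.mse_loss(model(x), y)
    # l1 on the input-layer weights drives entries to zero; an input
    # dimension whose entire weight column is zeroed is effectively pruned.
    l1_in = model.hidden.weight.abs().sum()
    # l1 on the output-layer weights controls the network's complexity.
    l1_out = model.output.weight.abs().sum()
    return mse + lambda_in * l1_in + lambda_out * l1_out

# Toy usage on synthetic data.
model = TwoLayerNet(d_in=20, width=256)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(128, 20), torch.randn(128, 1)
for _ in range(200):
    opt.zero_grad()
    loss = l1_regularized_loss(model, x, y)
    loss.backward()
    opt.step()

A deliberately large width paired with these penalties mirrors the abstract's point that a suitably regularized wide network offers an alternative to selecting among candidate architectures.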
