☆ 4.8 Article

Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2021)

期刊

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

卷 118, 期 43, 页码 -

出版社

NATL ACAD SCIENCES

DOI: 10.1073/pnas.2103091118

关键词

deep learning; neural collapse; class imbalance

类别

Multidisciplinary Sciences

资金

NIH [RF1AG063481]
NSF [DMS-1847415, CCF-1934876]
Alfred Sloan Research Fellowship
Wharton Dean's Research Fund

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

The Layer-Peeled Model is introduced in this paper as a nonconvex optimization program to better understand deep neural networks. It is shown to inherit characteristics of well-trained neural networks and can help explain and predict common empirical patterns of deep-learning training. The model reveals phenomena such as neural collapse on balanced datasets and Minority Collapse on imbalanced datasets, providing insights into how to mitigate the latter.

In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652- 24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.

Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

期刊

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

出版社

NATL ACAD SCIENCES

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

期刊

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

出版社

NATL ACAD SCIENCES

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文