The dynamics of representation learning in shallow, non-linear autoencoders

JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT (2023)

Journal

JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT

Volume 2023, Issue 11, Pages -

Publisher

IOP Publishing Ltd

DOI: 10.1088/1742-5468/ad01af

Keywords

learning theory; machine learning; nonlinear dynamics

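Automated Summary

Autoencoders are simple neural networks for unsupervised learning, which makes them a convenient setting for studying feature extraction. This study focuses on the dynamics of feature learning in non-linear, shallow autoencoders and derives equations that describe their generalisation dynamics under stochastic gradient descent. The analysis shows that the leading principal components of the inputs are learnt sequentially, explains why sigmoidal autoencoders with tied weights fail to learn, and highlights the importance of training the bias in ReLU autoencoders.

Abstract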

Autoencoders are the simplest neural network for unsupervised learning, and thus an ideal framework for studying feature learning. While a detailed understanding of the dynamics of linear autoencoders has recently been obtained, the study of non-linear autoencoders has been hindered by the technical difficulty of handling training data with non-trivial correlations, a fundamental prerequisite for feature extraction. Here, we study the dynamics of feature learning in non-linear, shallow autoencoders. We derive a set of asymptotically exact equations that describe the generalisation dynamics of autoencoders trained with stochastic gradient descent (SGD) in the limit of high-dimensional inputs. These equations reveal that autoencoders learn the leading principal components of their inputs sequentially. An analysis of the long-time dynamics explains the failure of sigmoidal autoencoders to learn with tied weights, and highlights the importance of training the bias in ReLU autoencoders. Building on previous results for linear networks, we analyse a modification of the vanilla SGD algorithm, which allows learning of the exact principal components. Finally, we show that our equations accurately describe the generalisation dynamics of non-linear autoencoders trained on realistic datasets such as CIFAR10, thus establishing shallow autoencoders as an instance of the recently observed Gaussian universality.
