
On the effective initialisation for restricted Boltzmann machines via duality with Hopfield model

NEURAL NETWORKS (2021)

Journal

NEURAL NETWORKS

Volume 143, Pages 314-326

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.neunet.2021.06.017

Keywords

Hopfield model; Restricted Boltzmann machine; Statistical mechanics

Categories

Computer Science, Artificial Intelligence; Neurosciences

Funding

Sapienza University of Rome, Italy [RM120172B8066CB0]

Abstract

This study leverages the equivalence between restricted Boltzmann machines (RBMs) and Hopfield neural networks (HNNs) to propose an effective weight initialisation method and applies it to a simple auto-encoder model. It also shows that applying Gram-Schmidt orthogonalisation to the patterns yields larger retrieval regions.

Restricted Boltzmann machines (RBMs) with a binary visible layer of size N and a Gaussian hidden layer of size P have been proved to be equivalent to a Hopfield neural network (HNN) made of N binary neurons and storing P patterns ξ, as long as the weights w in the former are identified with the patterns. Here we leverage this equivalence to find effective initialisations for the weights of the RBM when what is available is a set of noisy examples of each pattern, so as to bring the statistical-mechanics background available for HNNs to bear on the study of the RBM's learning and retrieval abilities. In particular, given a set of definite, structureless patterns, we build a sample of blurred examples and prove that the initialisation in which w corresponds to the empirical average of ξ over the sample is a fixed point under stochastic gradient descent. Further, as a toy application of the duality between HNN and RBM, we consider the simplest random auto-encoder (a three-layer network made of two RBMs coupled by their hidden layer) and show that, as long as the parameter setting corresponds to the retrieval region of the dual HNN, reconstruction and denoising can be accomplished trivially, whereas when the system is in the spin-glass phase inference algorithms are necessary. This motivates the search for larger retrieval regions, which we obtain by applying a Gram-Schmidt orthogonalisation to the patterns: this procedure yields a set of patterns devoid of correlations, for which the largest retrieval region can be achieved. Finally, we consider an application of the duality in a structured case: we test this approach on the MNIST dataset and find that the network already achieves approximately 67% successful classifications, suggesting that it can be exploited as a computationally cheap pre-training. © 2021 Elsevier Ltd. All rights reserved.
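The initialisation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. The function names, the flip probability, the sizes N and P, and the choice of zero-temperature parallel Hopfield dynamics with Hebbian couplings built from the weights are all assumptions made for illustration. The sketch builds blurred examples of random binary patterns, sets the weights to the per-pattern empirical average, and checks that a corrupted pattern is recovered by the dual Hopfield network. It does not verify the fixed-point property under stochastic gradient descent. The orthogonalisation step uses a QR decomposition, which spans the same orthonormal basis as Gram-Schmidt up to signs; how the paper handles the resulting non-binary patterns is not stated here.

```python
import numpy as np

def make_blurred_examples(patterns, n_examples, flip_prob, rng):
    """Shape (P, n_examples, N): each entry of a pattern is flipped
    independently with probability flip_prob (assumed noise model)."""
    P, N = patterns.shape
    flips = rng.random((P, n_examples, N)) < flip_prob
    base = patterns[:, None, :]
    return np.where(flips, -base, base)

def empirical_average_init(examples):
    """Initial weights: per-pattern mean over its examples, shape (P, N)."""
    return examples.mean(axis=1)

def hopfield_retrieve(weights, state, n_sweeps=5):
    """Zero-temperature parallel updates with couplings J = W^T W / N."""
    N = weights.shape[1]
    J = weights.T @ weights / N
    np.fill_diagonal(J, 0.0)  # no self-interaction
    s = state.copy()
    for _ in range(n_sweeps):
        s = np.where(J @ s >= 0, 1, -1)
    return s

def gram_schmidt_rows(patterns):
    """Orthonormalise the rows via QR (stand-in for Gram-Schmidt)."""
    q, _ = np.linalg.qr(patterns.astype(float).T)
    return q.T

rng = np.random.default_rng(0)
N, P = 200, 5  # hypothetical sizes, low load P/N
patterns = rng.choice([-1, 1], size=(P, N))
examples = make_blurred_examples(patterns, n_examples=50, flip_prob=0.2, rng=rng)
W = empirical_average_init(examples)

# Corrupt the first pattern and let the dual Hopfield network clean it up.
noisy = np.where(rng.random(N) < 0.1, -patterns[0], patterns[0])
recovered = hopfield_retrieve(W, noisy)
overlap = float(recovered @ patterns[0]) / N  # 1.0 means perfect retrieval
```

An overlap close to 1 indicates that the averaged weights already place the network in the retrieval regime for this low-load setting. Raising P relative to N, or raising flip_prob, is one way to probe where retrieval starts to fail.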
