4.7 Review

Quantifying relevance in learning and inference

Journal: Physics Reports
Publisher: ELSEVIER
DOI: 10.1016/j.physrep.2022.03.001

Keywords: Relevance; Statistical inference; Machine learning; Information theory

Funding:

  1. Kavli Foundation, United States
  2. Norwegian Research Council, Centre of Excellence scheme (Centre for Neural Computation) [223262]

Summary

Learning is a distinctive feature of intelligent behaviour, but our conceptual understanding of learning is still poor. This article reviews recent progress in understanding learning based on the concept of relevance, which quantifies the amount of information that a dataset, or the internal representation of a learning machine, contains about the generative model of the data. The theoretical framework is supported by empirical analysis.

Abstract

Learning is a distinctive feature of intelligent behaviour. High-throughput experimental data and Big Data promise to open new windows on complex systems such as cells, the brain or our societies. Yet the puzzling success of Artificial Intelligence and Machine Learning shows that we still have a poor conceptual understanding of learning. These applications push statistical inference into uncharted territories where data are high-dimensional and scarce, and prior information on the true models is scant if not totally absent. Here we review recent progress on understanding learning, based on the notion of relevance. The relevance, as we define it here, quantifies the amount of information that a dataset or the internal representation of a learning machine contains about the generative model of the data. This allows us to define maximally informative samples on the one hand, and optimal learning machines on the other. These are ideal limits of samples and of machines that contain the maximal amount of information about the unknown generative process at a given resolution (or level of compression). Both ideal limits exhibit critical features in the statistical sense: maximally informative samples are characterised by a power-law frequency distribution (statistical criticality), and optimal learning machines by an anomalously large susceptibility. The trade-off between resolution (i.e. compression) and relevance separates the regime of noisy representations from that of lossy compression; the two regimes are divided by a special point characterised by Zipf's law statistics. This identifies samples obeying Zipf's law as the most compressed lossless representations that are optimal in the sense of maximal relevance. Criticality in optimal learning machines manifests as an exponential degeneracy of energy levels, which leads to unusual thermodynamic properties. This distinctive feature is consistent with the invariance of the classification under coarse graining of the output, which is a desirable property of learning machines. The theoretical framework is corroborated by empirical analysis showing (i) how the concept of relevance can be used to identify relevant variables in high-dimensional inference, and (ii) that widely used machine learning architectures approach the ideal limit of optimal learning machines reasonably well, within the limits of the data on which they are trained. © 2022 The Authors. Published by Elsevier B.V.
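In this line of work, resolution and relevance are usually defined from the empirical frequency distribution of a sample: if a state s occurs k_s times in N observations, the resolution is the entropy H[s] of the state frequencies k_s/N, and the relevance is the entropy H[k] of the frequency-of-frequencies distribution. The minimal sketch below assumes these standard definitions; the function name, the synthetic Zipf-like sample and the numerical choices are illustrative and not taken from the paper.

```python
from collections import Counter
import numpy as np

def resolution_and_relevance(sample):
    """Compute resolution H[s] and relevance H[k] (in nats) of a sample.

    sample: iterable of hashable observations (states).
    Resolution H[s]: entropy of the empirical state frequencies k_s / N.
    Relevance  H[k]: entropy of the probability k * m_k / N that a random
    observation belongs to a state seen exactly k times, where m_k is the
    number of distinct states observed exactly k times.
    """
    counts = Counter(sample)                      # k_s for each observed state s
    N = sum(counts.values())

    # Resolution: H[s] = -sum_s (k_s/N) log(k_s/N)
    p_s = np.array(list(counts.values())) / N
    H_s = -np.sum(p_s * np.log(p_s))

    # Frequency of frequencies: m_k = number of states with count k
    m_k = Counter(counts.values())
    # Relevance: H[k] = -sum_k (k*m_k/N) log(k*m_k/N)
    p_k = np.array([k * m for k, m in m_k.items()]) / N
    H_k = -np.sum(p_k * np.log(p_k))
    return H_s, H_k

# Illustration only: a synthetic sample whose state frequencies roughly follow
# Zipf's law (frequency of the r-th most common state ~ 1/r).
rng = np.random.default_rng(0)
ranks = np.arange(1, 10_001)
probs = (1.0 / ranks) / np.sum(1.0 / ranks)
sample = rng.choice(ranks, size=50_000, p=probs)

H_s, H_k = resolution_and_relevance(sample)
print(f"resolution H[s] = {H_s:.3f} nats, relevance H[k] = {H_k:.3f} nats")
```

Under this construction, the trade-off described in the abstract can be read off the (H[s], H[k]) plane: maximally informative samples maximise the relevance H[k] at a given resolution H[s], and samples whose frequencies follow Zipf's law sit at the special point separating the noisy-representation regime from the lossy-compression regime.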
