Article

Learning Representations for Neural Network-Based Classification Using the Information Bottleneck Principle

Journal

IEEE Transactions on Pattern Analysis and Machine Intelligence
Publisher

IEEE Computer Society
DOI: 10.1109/TPAMI.2019.2909031

Keywords

Training; Task analysis; Robustness; Cost function; Neurons; Neural networks; Deep learning; information bottleneck; representation learning; regularization; classification; stochastic neural networks

Funding

  1. German Federal Ministry of Education and Research
  2. Erwin Schrödinger Fellowship of the Austrian Science Fund [J 3765]
  3. Austrian COMET Program (Competence Centers for Excellent Technologies) under the Austrian Federal Ministry of Transport, Innovation and Technology
  4. Austrian Federal Ministry of Digital and Economic Affairs
  5. State of Styria

Abstract

In this theory paper, we investigate training deep neural networks (DNNs) for classification via minimizing the information bottleneck (IB) functional. We show that the resulting optimization problem suffers from two severe issues: First, for deterministic DNNs, either the IB functional is infinite for almost all values of the network parameters, making the optimization problem ill-posed, or it is piecewise constant, hence not admitting gradient-based optimization methods. Second, the invariance of the IB functional under bijections prevents it from capturing properties of the learned representation that are desirable for classification, such as robustness and simplicity. We argue that these issues are partly resolved for stochastic DNNs, for DNNs that include a (hard or soft) decision rule, or by replacing the IB functional with related but better-behaved cost functions. We conclude that recent successes reported for training DNNs using the IB framework must be attributed to such solutions. As a side effect, our results indicate limitations of the IB framework for the analysis of DNNs. We also note that, rather than trying to repair the inherent problems of the IB functional, a better approach may be to design regularizers on the latent representation that enforce the desired properties directly.
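The piecewise-constancy issue can be illustrated with a toy sketch (not the authors' code, and the setup is an assumption for illustration): for a deterministic "neuron" T = 1[X > θ] applied to a discrete input X, the mutual information I(X;T) depends on θ only through which inputs fall on each side of the threshold, so sweeping θ changes I(X;T) in jumps and its gradient is zero almost everywhere.

```python
import numpy as np

def mutual_information(x, t):
    """Empirical mutual information I(X;T) in bits for discrete samples."""
    xs, x_inv = np.unique(x, return_inverse=True)
    ts, t_inv = np.unique(t, return_inverse=True)
    joint = np.zeros((len(xs), len(ts)))
    for i, j in zip(x_inv, t_inv):
        joint[i, j] += 1
    joint /= joint.sum()                      # empirical joint p(x, t)
    px = joint.sum(axis=1, keepdims=True)     # marginal p(x)
    pt = joint.sum(axis=0, keepdims=True)     # marginal p(t)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ pt)[nz])).sum())

# Deterministic thresholding "neuron": T = 1[X > theta], X uniform on {0,...,7}.
x = np.repeat(np.arange(8), 100)
for theta in [0.5, 0.9, 1.1, 1.5, 3.5]:
    t = (x > theta).astype(int)
    print(f"theta={theta}: I(X;T)={mutual_information(x, t):.3f} bits")
```

Thresholds that separate the inputs identically (e.g. θ = 0.5 and θ = 0.9) yield exactly the same I(X;T); the value only jumps when θ crosses an input point, matching the piecewise-constant behavior described in the abstract. Since T is a deterministic function of X here, I(X;T) equals H(T).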

