4.7 Article

On the number of regions of piecewise linear neural networks

Related References

Note: only a selection of the references is listed here; the full bibliography is available in the original article.
Article Mathematics, Applied

Stable parameterization of continuous and piecewise-linear functions

Alexis Goujon et al.

Summary: This paper investigates an alternative representation of continuous and piecewise-linear (CPWL) functions that uses local hat basis functions, targeting low-dimensional regression problems. It provides a necessary and sufficient condition for the basis functions to form a Riesz basis, which guarantees a stable and unique link between the parameters and the CPWL function, and it also discusses how to estimate the Lipschitz constant of the CPWL mapping. (A minimal numerical sketch of the hat-basis representation follows this entry.)

APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS (2023)
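
To make the hat-basis idea concrete, here is a minimal numpy sketch (my illustration, not code from the paper) of a one-dimensional CPWL function written as a weighted sum of local triangle ("hat") basis functions on a uniform grid; the knots and coefficients below are arbitrary choices.

```python
import numpy as np

def hat(x, center, h):
    """Triangle ("hat") basis function centered at `center` with half-width `h`."""
    return np.maximum(0.0, 1.0 - np.abs(x - center) / h)

t = np.linspace(-2.0, 2.0, 9)    # uniform knots with spacing h
h = t[1] - t[0]
c = np.sin(t)                    # illustrative coefficients; here c_k = f(t_k)

x = np.linspace(-2.0, 2.0, 201)
f = sum(ck * hat(x, tk, h) for ck, tk in zip(c, t))  # the CPWL function

# The hat basis is interpolatory: f(t_k) = c_k, which is the kind of stable,
# unique parameter-to-function link the paper analyzes via the Riesz property.
print(np.allclose(f[::25], c))   # samples at the knots recover the coefficients
```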

Article Mathematics, Applied

Approximation of Lipschitz Functions Using Deep Spline Neural Networks

Sebastian Neumayer et al.

Summary: Although Lipschitz-constrained neural networks are widely used in machine learning, designing and training expressive Lipschitz-constrained networks is challenging. To overcome the shortcomings of rectified linear-unit (ReLU) networks, we propose learnable spline activation functions with at least three linear regions. We prove that this choice is universal among all 1-Lipschitz activation functions and that it can approximate a larger class of functions than other weight-constrained architectures. It is also at least as expressive as the non-componentwise GroupSort activation function for spectral-norm-constrained weights. The theoretical findings align with prior numerical results. (A toy illustration of such an activation follows this entry.)

SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE (2023)
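
As a toy illustration of the proposal (not the authors' implementation), the sketch below builds a piecewise-linear activation with three linear regions whose slopes are clipped to [-1, 1], making it 1-Lipschitz by construction; the knot positions and slope values are arbitrary.

```python
import numpy as np

def spline_act(x, knots=(-1.0, 1.0), slopes=(0.2, 1.0, 0.2), bias=0.0):
    """CPWL activation with three linear regions; every slope is clipped to
    [-1, 1], so the activation is 1-Lipschitz by construction."""
    s = np.clip(np.asarray(slopes), -1.0, 1.0)  # enforce |slope| <= 1
    k0, k1 = knots
    mid = bias + s[1] * x                        # region k0 <= x <= k1
    left = bias + s[1] * k0 + s[0] * (x - k0)    # continuous extension, x < k0
    right = bias + s[1] * k1 + s[2] * (x - k1)   # continuous extension, x > k1
    return np.where(x < k0, left, np.where(x > k1, right, mid))

# Empirical check: no difference quotient exceeds 1.
x = np.linspace(-3.0, 3.0, 1001)
y = spline_act(x)
print(np.max(np.abs(np.diff(y) / np.diff(x))) <= 1.0 + 1e-12)  # True
```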

Article Mathematics, Applied

Sharp Bounds for the Number of Regions of Maxout Networks and Vertices of Minkowski Sums

Guido Montúfar et al.

SIAM JOURNAL ON APPLIED ALGEBRA AND GEOMETRY (2022)

Article Automation & Control Systems

Training Robust Neural Networks Using Lipschitz Bounds

Patricia Pauli et al.

Summary: This study proposes a framework for training multi-layer neural networks with increased robustness. By minimizing the Lipschitz constant through a semidefinite-programming-based training procedure, the framework enhances the robustness of the trained networks, as demonstrated on two examples. (A simple, looser Lipschitz bound is sketched after this entry.)

IEEE CONTROL SYSTEMS LETTERS (2022)
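
The training procedure in the paper enforces robustness through a semidefinite program; as a far simpler (and looser) stand-in, the sketch below computes the classical product-of-spectral-norms upper bound on the Lipschitz constant, which is valid for any feed-forward network with 1-Lipschitz activations such as ReLU. The weights are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random placeholder weights for a 3-layer network mapping R^8 -> R^1.
weights = [rng.standard_normal((16, 8)),
           rng.standard_normal((8, 16)),
           rng.standard_normal((1, 8))]

# With 1-Lipschitz activations, Lip(f) <= prod_k ||W_k||_2,
# where ||.||_2 is the spectral norm (largest singular value).
lip_bound = np.prod([np.linalg.norm(W, 2) for W in weights])
print(f"naive Lipschitz upper bound: {lip_bound:.2f}")
```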

Article Engineering, Electrical & Electronic

Mad Max: Affine Spline Insights Into Deep Learning

Randall Balestriero et al.

Summary: The study establishes a rigorous connection between deep networks (DNs) and approximation theory via spline functions and operators, introducing the concept of max-affine spline operators (MASOs). By analyzing the inner workings of DNs through this lens, and by exploring signal comparison, optimal classification theory, and data-memorization effects, the work provides insight into how DNs organize signals hierarchically. In addition, a penalty term that forces the templates to be orthogonal is proposed to improve classification performance without altering the DN architecture. (A minimal sketch of the MASO view of a ReLU layer follows this entry.)

PROCEEDINGS OF THE IEEE (2021)
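
The MASO viewpoint can be checked numerically in a few lines. The sketch below (my illustration, with random weights) verifies that each output of a ReLU layer is the coordinatewise maximum of two affine mappings, the layer's own affine map and the zero map, i.e. a max-affine spline operator with R = 2 pieces per output.

```python
import numpy as np

rng = np.random.default_rng(1)
W, b = rng.standard_normal((4, 3)), rng.standard_normal(4)
x = rng.standard_normal(3)

relu_out = np.maximum(W @ x + b, 0.0)        # ordinary ReLU layer

# The same output written as a per-coordinate max over R = 2 affine "templates".
A = np.stack([W, np.zeros_like(W)])          # slopes of the two affine pieces
c = np.stack([b, np.zeros_like(b)])          # offsets of the two affine pieces
maso_out = np.max(A @ x + c, axis=0)         # max over pieces, per coordinate

print(np.allclose(relu_out, maso_out))       # True
```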

Article Engineering, Electrical & Electronic

Learning Activation Functions in Deep (Spline) Neural Networks

Pakshal Bohra et al.

IEEE OPEN JOURNAL OF SIGNAL PROCESSING (2020)

Article Computer Science, Information Systems

A Framework for the Construction of Upper Bounds on the Number of Affine Linear Regions of ReLU Feed-Forward Neural Networks

Peter Hinz et al.

IEEE TRANSACTIONS ON INFORMATION THEORY (2019)

Review Automation & Control Systems

Why and When Can Deep-but Not Shallow-networks Avoid the Curse of Dimensionality: A Review

Tomaso Poggio et al.

INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING (2017)

Article Mathematics, Applied

Deep vs. shallow networks: An approximation theory perspective

H. N. Mhaskar et al.

ANALYSIS AND APPLICATIONS (2016)

Review Multidisciplinary Sciences

Deep learning

Yann LeCun et al.

NATURE (2015)

Article Computer Science, Artificial Intelligence

Learning Deep Architectures for AI

Yoshua Bengio

FOUNDATIONS AND TRENDS IN MACHINE LEARNING (2009)

Article Computer Science, Information Systems

Generalization of hinging hyperplanes

Shuning Wang et al.

IEEE TRANSACTIONS ON INFORMATION THEORY (2005)