Article

Convex and concave envelopes of artificial neural network activation functions for deterministic global optimization

Journal

JOURNAL OF GLOBAL OPTIMIZATION
Volume -, Issue -, Pages -

Publisher

SPRINGER
DOI: 10.1007/s10898-022-01228-x

Keywords

Artificial neural networks; Machine learning; Deterministic global optimization; Factorable programming; McCormick relaxations; Envelopes; Julia programming

Funding

  1. National Science Foundation, Directorate for Engineering, Division of Chemical, Bioengineering, Environmental and Transport Systems [1932723]

This study introduces general methods to construct convex/concave relaxations of commonly used activation functions for artificial neural networks (ANNs) to improve optimization performance. The developed library of activation function envelopes leads to tighter relaxations of ANNs, resulting in a significant reduction in computational time required for solving optimization problems.
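For reference, the envelopes in question are the standard objects from convex analysis, restated here for convenience: on an interval domain, the convex envelope of an activation function f is its tightest convex underestimator, and the concave envelope is its tightest concave overestimator.

```latex
% Standard definitions of the convex/concave envelopes of f on X = [x^L, x^U]:
\[
  f_{\mathrm{cv}}(x) = \sup\{\, g(x) : g \text{ convex on } X,\ g \le f \text{ on } X \,\},
  \qquad
  f_{\mathrm{cc}}(x) = \inf\{\, h(x) : h \text{ concave on } X,\ h \ge f \text{ on } X \,\}.
\]
```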
In this work, we present general methods to construct convex/concave relaxations of the activation functions that are commonly chosen for artificial neural networks (ANNs). The choice of these functions is often informed by broader modeling considerations balanced against the need for high computational performance. The direct application of factorable programming techniques to compute bounds and convex/concave relaxations of such functions often leads to weak enclosures due to the dependency problem. Moreover, the piecewise formulations that define several popular activation functions prevent the direct computation of convex/concave relaxations, as they violate the factorable function requirement. To improve the performance of relaxations of ANNs for deterministic global optimization applications, this study develops a library of envelopes of the thoroughly studied rectifier-type and sigmoid activation functions, in addition to the novel self-gated sigmoid-weighted linear unit (SiLU) and Gaussian error linear unit (GELU) activation functions. We demonstrate that the envelopes of activation functions directly lead to tighter relaxations of ANNs on their input domain. In turn, these improvements translate to a dramatic reduction in the CPU runtime required to solve optimization problems involving ANN models to ε-global optimality. We further demonstrate that the factorable programming approach leads to superior computational performance over alternative state-of-the-art approaches.
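To make the envelope construction concrete, below is a minimal, self-contained sketch in Julia (the paper's implementation language per its keywords). The function names relu_cc, sigmoid_cc, and the bisection scheme are illustrative choices for this sketch, not the authors' library API. ReLU is convex, so on a box [xL, xU] its convex envelope is the function itself and its concave envelope is the secant through the endpoints; sigmoid changes curvature at the origin, so its concave envelope follows a tangent line from the left endpoint up to a numerically determined tangent point, then the function itself.

```julia
# Illustrative sketch only (plain Julia, no packages): envelopes of ReLU and
# sigmoid on a box domain [xL, xU]. Names and tolerances are hypothetical,
# not the authors' library API.

sigmoid(x) = 1 / (1 + exp(-x))
dsigmoid(x) = sigmoid(x) * (1 - sigmoid(x))

# ReLU is convex, so its convex envelope is ReLU itself ...
relu_cv(x, xL, xU) = max(x, zero(x))

# ... and its concave envelope is the secant line through the endpoints.
function relu_cc(x, xL, xU)
    fL, fU = max(xL, zero(xL)), max(xU, zero(xU))
    xU > xL ? fL + (fU - fL) * (x - xL) / (xU - xL) : fL
end

# Sigmoid is convex on (-Inf, 0] and concave on [0, Inf). If the domain
# straddles the inflection point, the concave envelope is the line from
# (xL, sigmoid(xL)) tangent to sigmoid at a point p >= 0, followed by the
# function itself. The tangency condition
#     dsigmoid(p) * (p - xL) == sigmoid(p) - sigmoid(xL)
# is solved here by bisection.
function sigmoid_cc(x, xL, xU)
    xL >= 0 && return sigmoid(x)        # purely concave region: envelope = function
    g(p) = dsigmoid(p) * (p - xL) - (sigmoid(p) - sigmoid(xL))
    if g(xU) >= 0                       # tangent point lies beyond xU, so
        return sigmoid(xL) +            # the secant is the envelope
               (sigmoid(xU) - sigmoid(xL)) * (x - xL) / (xU - xL)
    end
    lo, hi = 0.0, Float64(xU)           # g(0) >= 0 > g(xU): bisect for p
    for _ in 1:60
        mid = 0.5 * (lo + hi)
        g(mid) >= 0 ? (lo = mid) : (hi = mid)
    end
    p = 0.5 * (lo + hi)
    x <= p ? sigmoid(xL) + dsigmoid(p) * (x - xL) : sigmoid(x)
end

# Example: on [-4, 4], sigmoid_cc(1.0, -4.0, 4.0) ≈ 0.75, much tighter than
# the constant interval upper bound sigmoid(4.0) ≈ 0.98 at the same point.
```

By the symmetry sigmoid(-x) = 1 - sigmoid(x), a convex counterpart follows as sigmoid_cv(x, xL, xU) = 1 - sigmoid_cc(-x, -xU, -xL); analogous tangent-point constructions (with additional case analysis) underlie the envelopes the paper develops for the SiLU and GELU activations.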

