Journal
MACHINE LEARNING
Volume 110, Issue 2, Pages 393-416
Publisher
SPRINGER
DOI: 10.1007/s10994-020-05929-w
Keywords
Neural networks; Regularisation; Lipschitz continuity
The study investigates the effect of enforcing Lipschitz continuity of neural networks with respect to their inputs, and provides a technique for computing an upper bound on the Lipschitz constant for multiple p-norms. It formulates training with a bounded Lipschitz constant as a constrained optimisation problem and shows that the resulting models outperform those trained with other common regularisers. The study also shows that the hyperparameters are intuitive to tune, that the choice of norm affects the resulting model, and that the performance gains are most noticeable when training data is limited.
We investigate the effect of explicitly enforcing the Lipschitz continuity of neural networks with respect to their inputs. To this end, we provide a simple technique for computing an upper bound to the Lipschitz constant (for multiple p-norms) of a feed-forward neural network composed of commonly used layer types. Our technique is then used to formulate training a neural network with a bounded Lipschitz constant as a constrained optimisation problem that can be solved using projected stochastic gradient methods. Our evaluation study shows that the performance of the resulting models exceeds that of models trained with other common regularisers. We also provide evidence that the hyperparameters are intuitive to tune, demonstrate how the choice of norm for computing the Lipschitz constant impacts the resulting model, and show that the performance gains provided by our method are particularly noticeable when only a small amount of training data is available.
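The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a network made only of fully connected layers with 1-Lipschitz activations (such as ReLU), so that the product of the layers' operator norms is an upper bound on the network's Lipschitz constant. Only the p = 1 and p = infinity norms are covered, because they have closed forms. The function names `op_norm`, `lipschitz_upper_bound` and `project`, and the per-layer limit `lam`, are hypothetical names chosen for illustration.

```python
# Sketch under stated assumptions: dense layers only, 1-Lipschitz activations,
# weights stored as lists of rows. All names here are illustrative.

def op_norm(W, p):
    """Operator norm of matrix W induced by the vector p-norm (p = 1 or inf)."""
    if p == 1:
        # Induced 1-norm: maximum absolute column sum.
        return max(sum(abs(row[j]) for row in W) for j in range(len(W[0])))
    if p == float("inf"):
        # Induced infinity-norm: maximum absolute row sum.
        return max(sum(abs(x) for x in row) for row in W)
    raise ValueError("this sketch only handles p = 1 and p = inf")

def lipschitz_upper_bound(weights, p):
    """Product of per-layer operator norms: an upper bound, not the exact constant."""
    bound = 1.0
    for W in weights:
        bound *= op_norm(W, p)
    return bound

def project(W, p, lam):
    """Rescale W so that its operator norm is at most lam (assumed projection rule)."""
    scale = 1.0 / max(1.0, op_norm(W, p) / lam)
    return [[scale * x for x in row] for row in W]

W1 = [[1.0, -2.0], [3.0, 4.0]]
W2 = [[0.5, 0.5]]
print(lipschitz_upper_bound([W1, W2], 1))             # 3.0
print(lipschitz_upper_bound([W1, W2], float("inf")))  # 7.0
```

In a projected stochastic gradient loop, a step like `project` would be applied to each layer's weights after every gradient update, which keeps the bound on the whole network at or below `lam` raised to the number of layers.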