Article

Implicit Differentiation for Fast Hyperparameter Selection in Non-Smooth Convex Learning

Journal

Journal of Machine Learning Research
Volume 23

Publisher

Microtome Publishing

Keywords

Convex optimization; hyperparameter optimization; hyperparameter selection; bilevel optimization; Lasso; generalized linear models

Funding

  1. ERC Starting Grant SLAB [ERC-StG-676943]
  2. ANR [GraVa ANR-18-CE40-0005]
  3. European Research Council [SLING 819789]

Abstract

This paper studies hyperparameter optimization when the inner problem is convex but non-smooth, and proposes two methods to accelerate the computation of hypergradients. Experiments on regression and classification problems show that this approach has computational advantages for hyperparameter optimization, especially when several hyperparameters must be selected.
Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but non-smooth. We show that the forward-mode differentiation of proximal gradient descent and proximal coordinate descent yields sequences of Jacobians converging toward the exact Jacobian. Using implicit differentiation, we show it is possible to leverage the non-smoothness of the inner problem to speed up the computation. Finally, we provide a bound on the error made on the hypergradient when the inner optimization problem is solved approximately. Results on regression and classification problems reveal computational benefits for hyperparameter optimization, especially when multiple hyperparameters are required.
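
For intuition, the following is a minimal NumPy sketch of the implicit-differentiation idea on the Lasso, not the authors' implementation: because the inner solution is sparse, differentiating the optimality condition on the support gives the Jacobian of the solution with respect to the regularization parameter via a small linear system, and the hypergradient of a validation criterion follows by the chain rule. The function name `hypergradient_lasso` and the choice of held-out mean squared error as the outer criterion are illustrative assumptions.

```python
import numpy as np

def hypergradient_lasso(X_train, y_train, X_val, y_val, beta, lam):
    """Hypothetical sketch: hypergradient of the held-out MSE with respect to
    the Lasso regularization parameter lam, via implicit differentiation.

    Assumes `beta` is an (approximate) solution of
        min_b  1 / (2 n) * ||y_train - X_train @ b||^2 + lam * ||b||_1.
    """
    n = X_train.shape[0]
    support = beta != 0                      # non-smoothness => sparse solution
    X_s = X_train[:, support]

    # Differentiating the optimality condition on the support,
    #   (1/n) X_s^T (X_s beta_s - y_train) + lam * sign(beta_s) = 0,
    # with respect to lam gives a linear system of size |support| for the
    # Jacobian d beta_s / d lam.
    jac_s = np.linalg.solve(X_s.T @ X_s / n, -np.sign(beta[support]))

    # Outer criterion: held-out mean squared error; chain rule with the Jacobian.
    residual_val = X_val @ beta - y_val
    grad_outer_s = X_val[:, support].T @ residual_val / X_val.shape[0]
    return grad_outer_s @ jac_s
```

The point reflected in the sketch is that the linear system only has the size of the support rather than the full feature dimension, which is how the non-smoothness of the inner problem can be leveraged to speed up hypergradient computation; the resulting scalar can then be fed to any first-order outer solver such as gradient descent on lam.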

