4.3 Article

RATE OF CONVERGENCE OF THE NESTEROV ACCELERATED GRADIENT METHOD IN THE SUBCRITICAL CASE α ≤ 3

Publisher

EDP SCIENCES S A
DOI: 10.1051/cocv/2017083

Keywords

Accelerated gradient method; FISTA; inertial forward-backward algorithms; Nesterov method; proximal-based methods; structured convex optimization; subcritical case; vanishing damping.

Ask authors/readers for more resources

In a Hilbert space setting H, given Phi : H -> R a convex continuously differentiable function, and alpha a positive parameter, we consider the inertial dynamic system with Asymptotic Vanishing Damping (AVD)(alpha) (sic)(t) + alpha/t(x) over dot (t) + del Phi(x(t)) = 0. Depending on the value of alpha with respect to 3, we give a complete picture of the convergence properties as t -> +infinity of the trajectories generated by (AVD)(alpha), as well as iterations of the corresponding algorithms. Indeed, as shown by Su-Boyd-Candes, the case alpha = 3 corresponds to a continuous version of the accelerated gradient method of Nesterov, with the rate of convergence Phi(x(t)) - min Phi = O(t(-2)) for alpha >= 3. Our main result concerns the subcritical case alpha <= 3, where we show that Phi(x(t)) - min Phi = O (t(-2/3 alpha)). This overall picture shows a continuous variation of the rate of convergence of the values Phi(x(t)) - min(H) Phi = O (t(-p(alpha))) with respect to alpha > 0: the coefficient p(alpha) increases linearly up to 2 when alpha goes from 0 to 3, then displays a plateau. Then we examine the convergence of trajectories to optimal solutions. As a new result, in the one-dimensional framework, for the critical value alpha = 3, we prove the convergence of the trajectories. In the second part of this paper, we study the convergence properties of the associated forward-backward inertial algorithms. They aim to solve structured convex minimization problems of the form min {Theta := Phi + Psi}, with Phi smooth and Psi nonsmooth. The continuous dynamics serves as a guideline for this study. We obtain a similar rate of convergence for the sequence of iterates (x(k)): for alpha <= 3 we have Theta(x(k)) - min Theta = O(k(-p)) for all p < 2 alpha/3, and for alpha > 3 Theta(x(k)) - min Theta = o(k(-2)). Finally, we show that the results are robust with respect to external perturbations.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.3
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available