
Adaptive Accelerated Gradient Converging Methods under Hölderian Error Bound Condition

Abstract

In this paper, we study the convergence of (proximal) gradient methods and accelerated (proximal) gradient methods for smooth (composite) optimization under a Hölderian error bound (HEB) condition. We first show that the proximal gradient (PG) method is automatically adaptive to HEB, while the accelerated proximal gradient (APG) method can be made adaptive to HEB by restarting, with an improved iteration complexity. However, the number of iterations before each restart of APG hinges on a possibly unknown parameter. To address this issue, we propose adaptive gradient converging methods, which use the magnitude of the gradient as the criterion for restart and termination. We develop adaptive accelerated gradient converging (adaAGC) methods for solving smooth (composite) optimization under the HEB condition with an explicit exponent $\theta$, and establish a better iteration complexity than PG. Furthermore, we demonstrate that these results have important implications and applications in machine learning: (i) if the considered objective function is coercive and semi-algebraic, PG's convergence speed is essentially $o(\frac{1}{t})$, where $t$ is the total number of iterations; (ii) if the objective function consists of an $\ell_1$, $\ell_\infty$, $\ell_{1,\infty}$, or Huber norm regularization and a convex smooth piecewise quadratic loss (which includes the least-squares loss, smoothed hinge loss, and Huber loss), the proposed adaAGC is parameter-free and enjoys faster linear convergence than PG without any additional assumptions (e.g., a restricted eigenvalue condition). Notably, for all the aforementioned problems, our linear convergence results are global rather than local. To the best of our knowledge, these improved results are the first of their kind.
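To illustrate the restart-by-gradient-magnitude idea described in the abstract, the following is a minimal sketch, not the paper's actual adaAGC algorithm: a standard Nesterov accelerated gradient loop that is restarted each time the gradient norm falls below a tolerance, after which the tolerance is halved. The halving schedule, step size `lr`, and helper names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def accelerated_gradient(grad, x0, lr, max_iter, tol):
    """Plain Nesterov accelerated gradient descent.

    Runs until the gradient magnitude drops to `tol` (the gradient
    converging criterion) or `max_iter` iterations are exhausted.
    """
    x = x0.copy()
    y = x0.copy()
    t = 1.0
    for _ in range(max_iter):
        g = grad(y)
        if np.linalg.norm(g) <= tol:
            break
        x_new = y - lr * g
        # Standard Nesterov momentum-parameter update.
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

def ada_agc_sketch(grad, x0, lr, eps, max_restarts=50, inner_iter=1000):
    """Sketch of gradient-magnitude-based restart (hypothetical schedule).

    Each outer stage restarts the accelerated method from the current
    iterate with a tighter gradient-norm tolerance, terminating once
    the tolerance reaches the target accuracy `eps`.
    """
    x = x0.copy()
    tol = np.linalg.norm(grad(x0)) / 2.0
    for _ in range(max_restarts):
        x = accelerated_gradient(grad, x, lr, inner_iter, tol)
        if tol <= eps:
            break
        tol /= 2.0  # tighten the criterion, then restart with fresh momentum
    return x
```

On a simple strongly convex quadratic (e.g. `grad = lambda x: Q @ x - b` with `lr = 1/L`, `L` the largest eigenvalue of `Q`), the sketch drives the gradient norm below `eps` in a sequence of geometrically tightening stages; the paper's contribution is showing that this style of restart achieves improved complexity under HEB without knowing the problem parameters.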
