
arXiv:1809.02341
An Anderson-Chebyshev Mixing Method for Nonlinear Optimization

7 September 2018
Zhize Li
Jian Li
Abstract

Anderson mixing (or Anderson acceleration) is an efficient acceleration method for fixed-point iterations (i.e., $x_{t+1}=G(x_t)$); e.g., gradient descent can be viewed as iteratively applying the operation $G(x) = x-\alpha\nabla f(x)$. It is known that Anderson mixing is quite efficient in practice and can be viewed as an extension of Krylov subspace methods to nonlinear problems. First, we show that Anderson mixing with Chebyshev polynomial parameters can achieve the optimal convergence rate $O(\sqrt{\kappa}\ln\frac{1}{\epsilon})$, which improves the previous result $O(\kappa\ln\frac{1}{\epsilon})$ provided by [Toth and Kelley, 2015] for quadratic functions. Then, we provide a convergence analysis for minimizing general nonlinear problems. Moreover, if the hyperparameters (e.g., the Lipschitz smoothness parameter $L$) are not available, we propose a Guessing Algorithm that estimates them dynamically and prove a similar convergence rate. Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev mixing method converges significantly faster than other algorithms such as vanilla gradient descent (GD) and Nesterov's accelerated GD. These algorithms, combined with the proposed Guessing Algorithm (which estimates the hyperparameters dynamically), also achieve much better performance.
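
To make the fixed-point view concrete, here is a minimal sketch of generic Anderson mixing (a least-squares mixing step over the last few residuals) applied to the gradient-descent map $G(x)=x-\alpha\nabla f(x)$ mentioned in the abstract. This is not the paper's Anderson-Chebyshev method: it uses plain least-squares mixing rather than Chebyshev polynomial parameters, and the function name `anderson_mixing`, the window size `m`, and the quadratic test problem (`A`, `b`, `alpha`) are illustrative assumptions.

```python
import numpy as np

def anderson_mixing(G, x0, m=5, max_iter=100, tol=1e-10):
    """Anderson acceleration of the fixed-point iteration x <- G(x).

    Minimal sketch: keep the last m differences of residuals and of
    G-values, and combine them by least squares to extrapolate the
    next iterate. Not the paper's Anderson-Chebyshev variant.
    """
    x = np.asarray(x0, dtype=float)
    g = G(x)
    f = g - x                       # fixed-point residual G(x) - x
    G_hist, F_hist = [g], [f]
    for _ in range(max_iter):
        if np.linalg.norm(f) < tol:
            break
        k = len(F_hist)
        if k > 1:
            mk = min(m, k - 1)
            # differences of residuals and of G-values over the memory window
            dF = np.column_stack([F_hist[-i] - F_hist[-i - 1] for i in range(1, mk + 1)])
            dG = np.column_stack([G_hist[-i] - G_hist[-i - 1] for i in range(1, mk + 1)])
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = g - dG @ gamma      # Anderson-mixed update
        else:
            x = g                   # plain fixed-point step on the first iteration
        g = G(x)
        f = g - x
        G_hist.append(g)
        F_hist.append(f)
    return x

# Example (illustrative values): gradient descent as the fixed-point map
# G(x) = x - alpha * grad f(x) for the quadratic f(x) = 0.5 x^T A x - b^T x.
A = np.diag([1.0, 10.0, 100.0])
b = np.array([1.0, 1.0, 1.0])
alpha = 1.0 / 100.0                 # roughly 1/L for this quadratic
G = lambda x: x - alpha * (A @ x - b)
x_star = anderson_mixing(G, np.zeros(3))   # should approach A^{-1} b = [1, 0.1, 0.01]
```

On a quadratic like this, the mixed iteration typically reaches the solution in far fewer steps than the plain fixed-point (gradient-descent) iteration; the paper's contribution is to quantify this speedup and to show that choosing the mixing parameters via Chebyshev polynomials attains the optimal $O(\sqrt{\kappa}\ln\frac{1}{\epsilon})$ rate.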
