ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method

21 March 2021
Zhize Li
Abstract

In this paper, we propose a novel accelerated gradient method called ANITA for solving the fundamental finite-sum optimization problems. Concretely, we consider both general convex and strongly convex settings: i) For general convex finite-sum problems, ANITA improves the previous state-of-the-art result given by Varag (Lan et al., 2019). In particular, for large-scale problems or when the convergence error is not very small, i.e., $n \geq \frac{1}{\epsilon^2}$, ANITA obtains the \emph{first} optimal result $O(n)$, matching the lower bound $\Omega(n)$ provided by Woodworth and Srebro (2016), while previous results are $O(n \log \frac{1}{\epsilon})$ of Varag (Lan et al., 2019) and $O(\frac{n}{\sqrt{\epsilon}})$ of Katyusha (Allen-Zhu, 2017). ii) For strongly convex finite-sum problems, we also show that ANITA can achieve the optimal convergence rate $O\big((n+\sqrt{\frac{nL}{\mu}})\log\frac{1}{\epsilon}\big)$, matching the lower bound $\Omega\big((n+\sqrt{\frac{nL}{\mu}})\log\frac{1}{\epsilon}\big)$ provided by Lan and Zhou (2015). Besides, ANITA enjoys a simpler loopless algorithmic structure, unlike previous accelerated algorithms such as Varag (Lan et al., 2019) and Katyusha (Allen-Zhu, 2017), which use double-loop structures. Moreover, we provide a novel \emph{dynamic multi-stage convergence analysis}, which is the key technical part for improving previous results to the optimal rates. We believe that our new theoretical rates and novel convergence analysis for the fundamental finite-sum problem will directly lead to key improvements for many other related problems, such as distributed/federated/decentralized optimization problems (e.g., Li and Richtárik, 2021). Finally, numerical experiments show that ANITA converges faster than the previous state-of-the-art Varag (Lan et al., 2019), validating our theoretical results and confirming the practical superiority of ANITA.
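
The finite-sum problem referred to throughout is $\min_{x \in \mathbb{R}^d} f(x) := \frac{1}{n}\sum_{i=1}^n f_i(x)$. To make the "loopless" structure emphasized in the abstract concrete, below is a minimal sketch of the loopless variance-reduction trick using a simpler, non-accelerated SVRG-style update (not ANITA's actual accelerated update): a single loop that refreshes the full-gradient snapshot with a small probability each iteration, rather than a double loop with fixed-length inner epochs. The function name, step size, and refresh probability here are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def loopless_svrg(grad_i, n, x0, L, n_iters=5000, seed=0):
    """Illustrative sketch of a loopless variance-reduced method.

    Single-loop structure: there are no inner epochs; the full-gradient
    snapshot is refreshed with probability p at each iteration.
    grad_i(i, x) returns the gradient of component f_i at x.
    The step size and refresh probability are assumed illustrative
    choices, not ANITA's tuned parameters.
    """
    rng = np.random.default_rng(seed)
    p = 1.0 / n            # snapshot refresh probability (assumed)
    eta = 1.0 / (6.0 * L)  # step size for L-smooth components (assumed)

    x = x0.copy()
    w = x0.copy()          # snapshot point
    g_full = np.mean([grad_i(j, w) for j in range(n)], axis=0)

    for _ in range(n_iters):
        j = rng.integers(n)
        # Variance-reduced gradient estimator: unbiased, and its
        # variance vanishes as x and w both approach the optimum.
        v = grad_i(j, x) - grad_i(j, w) + g_full
        x = x - eta * v
        # Loopless trick: a coin flip replaces the fixed outer loop.
        if rng.random() < p:
            w = x.copy()
            g_full = np.mean([grad_i(k, w) for k in range(n)], axis=0)
    return x

# Toy usage: least squares, f_i(x) = (a_i^T x - b_i)^2 / 2.
rng = np.random.default_rng(1)
A, b = rng.standard_normal((200, 10)), rng.standard_normal(200)
L = np.max(np.sum(A * A, axis=1))  # per-component smoothness bound
x_hat = loopless_svrg(lambda i, x: (A[i] @ x - b[i]) * A[i],
                      n=200, x0=np.zeros(10), L=L)
```

ANITA builds Nesterov-type acceleration and the dynamic multi-stage analysis on top of this single-loop skeleton; see the paper for the exact update rule and parameter schedule.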
