Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity

25 May 2024
Qihao Zhou
Haishan Ye
Luo Luo
Abstract

This paper considers distributed convex-concave minimax optimization under second-order similarity. We propose the stochastic variance-reduced optimistic gradient sliding (SVOGS) method, which takes advantage of the finite-sum structure of the objective through mini-batch client sampling and variance reduction. We prove that SVOGS achieves an $\varepsilon$-duality gap within $\mathcal{O}(\delta D^2/\varepsilon)$ communication rounds, with communication complexity $\mathcal{O}(n+\sqrt{n}\delta D^2/\varepsilon)$ and $\tilde{\mathcal{O}}(n+(\sqrt{n}\delta+L)D^2/\varepsilon\log(1/\varepsilon))$ local gradient calls, where $n$ is the number of nodes, $\delta$ is the degree of second-order similarity, $L$ is the smoothness parameter, and $D$ is the diameter of the constraint set. All of these complexities (nearly) match the corresponding lower bounds. For the specific $\mu$-strongly-convex-$\mu$-strongly-concave case, our algorithm has upper bounds on communication rounds, communication complexity, and local gradient calls of $\mathcal{O}(\delta/\mu\log(1/\varepsilon))$, $\mathcal{O}((n+\sqrt{n}\delta/\mu)\log(1/\varepsilon))$, and $\tilde{\mathcal{O}}((n+(\sqrt{n}\delta+L)/\mu)\log(1/\varepsilon))$, respectively, which are also nearly tight. Furthermore, we conduct numerical experiments to show the empirical advantages of the proposed method.
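To get a feel for how the three convex-concave bounds scale, the sketch below evaluates only their leading-order terms. It is an illustration, not part of the paper: the function name `svogs_leading_terms` and the sample values are hypothetical, all constants and the logarithmic factors hidden by the tilde notation are dropped, and the gradient-call bound is read as $(D^2/\varepsilon)\cdot\log(1/\varepsilon)$, which is an assumption about how the expression parses.

```python
import math


def svogs_leading_terms(n, delta, L, D, eps):
    """Leading-order terms of the convex-concave bounds quoted in the abstract.

    Constants and hidden log factors are ignored, so the outputs are only
    meaningful for comparing how the quantities scale with the parameters.
    """
    # Communication rounds: O(delta * D^2 / eps)
    rounds = delta * D**2 / eps
    # Communication complexity: O(n + sqrt(n) * delta * D^2 / eps)
    communication = n + math.sqrt(n) * delta * D**2 / eps
    # Local gradient calls: O~(n + (sqrt(n) * delta + L) * D^2 / eps * log(1 / eps))
    # Assumption: the log factor multiplies the whole D^2 / eps term.
    gradient_calls = n + (math.sqrt(n) * delta + L) * D**2 / eps * math.log(1 / eps)
    return rounds, communication, gradient_calls


# Hypothetical sample values, chosen only for illustration.
rounds, comm, grads = svogs_leading_terms(n=100, delta=1.0, L=10.0, D=1.0, eps=0.01)
print(rounds, comm, grads)
```

With these sample values the communication term is a factor of $\sqrt{n}$ larger than the number of rounds (plus the additive $n$), which reflects the mini-batch client sampling described in the abstract: on average only about $\sqrt{n}$ nodes' worth of messages are charged per round rather than all $n$.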
