FedPAGE: A Fast Local Stochastic Gradient Method for Communication-Efficient Federated Learning

10 August 2021
Haoyu Zhao
Zhize Li
Peter Richtárik
Abstract

Federated Averaging (FedAvg, also known as Local-SGD) (McMahan et al., 2017) is a classical federated learning algorithm in which clients run multiple local SGD steps before communicating their update to an orchestrating server. We propose a new federated learning algorithm, FedPAGE, which further reduces the communication complexity by utilizing the recent optimal PAGE method (Li et al., 2021) instead of plain SGD in FedAvg. We show that FedPAGE needs far fewer communication rounds than previous local methods for both federated convex and nonconvex optimization. Concretely, 1) in the convex setting, the number of communication rounds of FedPAGE is $O\big(\frac{N^{3/4}}{S\epsilon}\big)$, improving the best-known result $O\big(\frac{N}{S\epsilon}\big)$ of SCAFFOLD (Karimireddy et al., 2020) by a factor of $N^{1/4}$, where $N$ is the total number of clients (usually very large in federated learning), $S$ is the size of the sampled subset of clients in each communication round, and $\epsilon$ is the target error; 2) in the nonconvex setting, the number of communication rounds of FedPAGE is $O\big(\frac{\sqrt{N}+S}{S\epsilon^2}\big)$, improving the best-known result $O\big(\frac{N^{2/3}}{S^{2/3}\epsilon^2}\big)$ of SCAFFOLD (Karimireddy et al., 2020) by a factor of $N^{1/6}S^{1/3}$ whenever the number of sampled clients satisfies $S \leq \sqrt{N}$. Note that in both settings the communication cost per round is the same for FedPAGE and SCAFFOLD. As a result, FedPAGE achieves new state-of-the-art communication complexity for both federated convex and nonconvex optimization.
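The key ingredient is the PAGE estimator: instead of plain local SGD steps, a client occasionally computes a full (or large-batch) gradient and otherwise recycles its previous gradient estimate with a cheap minibatch correction. The sketch below illustrates this local loop on a toy least-squares problem; it is a minimal illustration under assumed settings, not the authors' reference implementation, and the function names (`stoch_grad`, `run_local_page_steps`) and all hyperparameters are hypothetical. FedPAGE's server-side client sampling and averaging are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy client data: least-squares objective f(x) = ||A x - b||^2 / (2n).
A = rng.normal(size=(256, 10))
b = rng.normal(size=256)

def stoch_grad(x, idx):
    """Stochastic gradient of the least-squares loss on rows `idx`."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

def run_local_page_steps(x, n_steps=20, lr=0.05, p=0.2, batch=16):
    """PAGE-style local loop (hypothetical hyperparameters): with probability
    p, refresh with a full-batch gradient; otherwise reuse the previous
    estimator plus a minibatch gradient-difference correction."""
    g = stoch_grad(x, np.arange(len(b)))  # start from a full-batch gradient
    for _ in range(n_steps):
        x_new = x - lr * g
        if rng.random() < p:
            # Occasional full refresh keeps the estimator unbiased on average.
            g = stoch_grad(x_new, np.arange(len(b)))
        else:
            # Variance-reduced update: g += grad(x_new; B) - grad(x; B)
            # on a small shared minibatch B, much cheaper than a full pass.
            idx = rng.choice(len(b), size=batch, replace=False)
            g = g + stoch_grad(x_new, idx) - stoch_grad(x, idx)
        x = x_new
    return x

x_local = run_local_page_steps(np.zeros(10))
```

Because most local steps only touch a small minibatch, the gradient estimator stays cheap while its variance shrinks as iterates converge, which is what drives the improved round complexity over SCAFFOLD quoted above.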
