
Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis
arXiv:2501.04443 (v3, latest)
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
8 January 2025
Ruichen Luo, Sebastian U. Stich, Samuel Horváth, Martin Takáč
Links: arXiv (abs) · PDF · HTML · GitHub

Papers citing "Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis"

27 papers shown
The Limits and Potentials of Local SGD for Distributed Heterogeneous Learning with Intermittent Communication
Kumar Kshitij Patel, Margalit Glasgow, Ali Zindari, Lingxiao Wang, Sebastian U. Stich, Ziheng Cheng, Nirmit Joshi, Nathan Srebro
19 May 2024

Federated Optimization with Doubly Regularized Drift Correction
Xiaowen Jiang, Anton Rodomanov, Sebastian U. Stich
12 Apr 2024

Stochastic Controlled Averaging for Federated Learning with Communication Compression
International Conference on Learning Representations (ICLR), 2023
Xinmeng Huang, Ping Li, Xiaoyun Li
16 Aug 2023

Lessons from Generalization Error Analysis of Federated Learning: You May Communicate Less Often!
International Conference on Machine Learning (ICML), 2023
Romain Chor, Abdellatif Zaidi, Milad Sefidgaran, Yijun Wan
09 Jun 2023

Understanding Generalization of Federated Learning via Stability: Heterogeneity Matters
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Zhenyu Sun, Xiaochun Niu, Ermin Wei
06 Jun 2023

Why (and When) does Local SGD Generalize Better than SGD?
International Conference on Learning Representations (ICLR), 2023
Xinran Gu, Kaifeng Lyu, Longbo Huang, Sanjeev Arora
02 Mar 2023

Faster federated optimization under second-order similarity
International Conference on Learning Representations (ICLR), 2022
Ahmed Khaled, Chi Jin
06 Sep 2022

ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
International Conference on Machine Learning (ICML), 2022
Konstantin Mishchenko, Grigory Malinovsky, Sebastian U. Stich, Peter Richtárik
18 Feb 2022

Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning
International Conference on Machine Learning (ICML), 2021
Tomoya Murata, Taiji Suzuki
05 Feb 2021

Flower: A Friendly Federated Learning Research Framework
Daniel J. Beutel, Taner Topal, Akhil Mathur, Xinchi Qiu, Javier Fernandez-Marques, ..., Lorenzo Sani, Kwing Hei Li, Titouan Parcollet, Pedro Porto Buarque de Gusmão, Nicholas D. Lane
28 Jul 2020

FedML: A Research Library and Benchmark for Federated Machine Learning
Chaoyang He, Songze Li, Jinhyun So, Xiao Zeng, Mi Zhang, ..., Yang Liu, Ramesh Raskar, Qiang Yang, M. Annavaram, Salman Avestimehr
27 Jul 2020

STL-SGD: Speeding Up Local SGD with Stagewise Communication Period
AAAI Conference on Artificial Intelligence (AAAI), 2020
Shuheng Shen, Yifei Cheng, Jingchang Liu, Linli Xu
11 Jun 2020

Minibatch vs Local SGD for Heterogeneous Distributed Learning
Blake E. Woodworth, Kumar Kshitij Patel, Nathan Srebro
08 Jun 2020

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates
International Conference on Machine Learning (ICML), 2020
Anastasia Koloskova, Nicolas Loizou, Sadra Boreiri, Martin Jaggi, Sebastian U. Stich
23 Mar 2020

Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization
International Conference on Machine Learning (ICML), 2020
Aymeric Dieuleveut, Lin Xiao, Sébastien Bubeck, Francis R. Bach, Laurent Massoulie
25 Feb 2020

Is Local SGD Better than Minibatch SGD?
International Conference on Machine Learning (ICML), 2020
Blake E. Woodworth, Kumar Kshitij Patel, Sebastian U. Stich, Zhen Dai, Brian Bullins, H. B. McMahan, Ohad Shamir, Nathan Srebro
18 Feb 2020

Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning
AAAI Conference on Artificial Intelligence (AAAI), 2018
Hao Yu, Sen Yang, Shenghuo Zhu
17 Jul 2018

Local SGD Converges Fast and Communicates Little
Sebastian U. Stich
24 May 2018

SGD and Hogwild! Convergence Without the Bounded Gradients Assumption
Lam M. Nguyen, Phuong Ha Nguyen, Marten van Dijk, Peter Richtárik, K. Scheinberg, Martin Takáč
11 Feb 2018

On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization
Fan Zhou, Guojing Cong
03 Aug 2017

Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent
Xiangru Lian, Ce Zhang, Huan Zhang, Cho-Jui Hsieh, Wei Zhang, Ji Liu
25 May 2017

Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
15 Jun 2016

Communication-Efficient Learning of Deep Networks from Decentralized Data
H. B. McMahan, Eider Moore, Daniel Ramage, S. Hampson, Blaise Agüera y Arcas
17 Feb 2016

SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives
Neural Information Processing Systems (NeurIPS), 2014
Aaron Defazio, Francis R. Bach, Damien Scieur
01 Jul 2014

Communication Efficient Distributed Optimization using an Approximate Newton-type Method
International Conference on Machine Learning (ICML), 2013
Ohad Shamir, Nathan Srebro, Tong Zhang
30 Dec 2013

Mini-Batch Primal and Dual Methods for SVMs
International Conference on Machine Learning (ICML), 2013
Martin Takáč, A. Bijral, Peter Richtárik, Nathan Srebro
10 Mar 2013

Optimal Distributed Online Prediction using Mini-Batches
O. Dekel, Ran Gilad-Bachrach, Ohad Shamir, Lin Xiao
07 Dec 2010