STL-SGD: Speeding Up Local SGD with Stagewise Communication Period


11 June 2020
Shuheng Shen
Yifei Cheng
Jingchang Liu
Linli Xu
    LRM
arXiv: 2006.06377

Papers citing "STL-SGD: Speeding Up Local SGD with Stagewise Communication Period"

7 / 7 papers shown
Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis
Ruichen Luo
Sebastian U Stich
Samuel Horváth
Martin Takáč
38
0
0
08 Jan 2025
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
Jialiang Cheng
Ning Gao
Yun Yue
Zhiling Ye
Jiadi Jiang
Jian Sha
OffRL
77
0
0
10 Dec 2024
STSyn: Speeding Up Local SGD with Straggler-Tolerant Synchronization
Feng Zhu
Jingjing Zhang
Xin Eric Wang
26
3
0
06 Oct 2022
Threats to Federated Learning: A Survey
Lingjuan Lyu
Han Yu
Qiang Yang
FedML
193
434
0
04 Mar 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
281
2,888
0
15 Sep 2016
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
Hamed Karimi
J. Nutini
Mark W. Schmidt
133
1,198
0
16 Aug 2016
Optimal Distributed Online Prediction using Mini-Batches
O. Dekel
Ran Gilad-Bachrach
Ohad Shamir
Lin Xiao
171
683
0
07 Dec 2010