1808.07217
Cited By
Don't Use Large Mini-Batches, Use Local SGD
22 August 2018
Tao Lin
Sebastian U. Stich
Kumar Kshitij Patel
Martin Jaggi
Papers citing
"Don't Use Large Mini-Batches, Use Local SGD"
50 / 280 papers shown
Title
Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries
Lihan Xu
Yanjie Dong
Gang Wang
Runhao Zeng
Xiaoyi Fan
Xiping Hu
FedML
AAML
172
0
0
04 Nov 2025
DYNAMIX: RL-based Adaptive Batch Size Optimization in Distributed Machine Learning Systems
Yuanjun Dai
Keqiang He
An Wang
80
0
0
09 Oct 2025
MT-DAO: Multi-Timescale Distributed Adaptive Optimizers with Local Updates
Alex Iacob
Andrej Jovanovic
M. Safaryan
Meghdad Kurmanji
Lorenzo Sani
Samuel Horváth
William F. Shen
Xinchi Qiu
Nicholas D. Lane
AI4CE
134
0
0
06 Oct 2025
Understanding Outer Optimizers in Local SGD: Learning Rates, Momentum, and Acceleration
Ahmed Khaled
Satyen Kale
Arthur Douillard
Chi Jin
Rob Fergus
Manzil Zaheer
134
1
0
12 Sep 2025
On Using Large-Batches in Federated Learning
Sahil Tyagi
FedML
90
0
0
05 Sep 2025
Communication Efficient LLM Pre-training with SparseLoCo
Amir Sarfi
Benjamin Thérien
Joel Lidin
Eugene Belilovsky
92
1
0
21 Aug 2025
Cooperative SGD with Dynamic Mixing Matrices
Soumya Sarkar
Shweta Jain
125
0
0
20 Aug 2025
FedEve: On Bridging the Client Drift and Period Drift for Cross-device Federated Learning
Tao Shen
Zexi Li
Didi Zhu
Ziyu Zhao
Chao-Xiang Wu
Fei Wu
FedML
120
0
0
20 Aug 2025
FedMP: Tackling Medical Feature Heterogeneity in Federated Learning from a Manifold Perspective
Zhekai Zhou
Shudong Liu
Zhaokun Zhou
Yang Liu
Qiang Yang
Yuesheng Zhu
Guibo Luo
96
0
0
07 Aug 2025
Communication-Efficient Distributed Training for Collaborative Flat Optima Recovery in Deep Learning
Tolga Dimlioglu
A. Choromańska
FedML
242
1
0
27 Jul 2025
HASFL: Heterogeneity-aware Split Federated Learning over Edge Computing Systems
Zheng Lin
Zhe Chen
Xianhao Chen
Wei Ni
Yue Gao
FedML
195
9
0
10 Jun 2025
MuLoCo: Muon is a practical inner optimizer for DiLoCo
Benjamin Thérien
Xiaolong Huang
Irina Rish
Eugene Belilovsky
MoE
157
6
0
29 May 2025
Sharp Gaussian approximations for Decentralized Federated Learning
Soham Bonnerjee
Sayar Karmakar
Wei Biao Wu
FedML
282
0
0
12 May 2025
Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
Hiroki Naganuma
Xinzhi Zhang
Man-Chung Yue
Ioannis Mitliagkas
Philipp A. Witte
Russell J. Hewett
Yin Tat Lee
440
1
0
25 Apr 2025
Federated Learning for Medical Image Classification: A Comprehensive Benchmark
Zhekai Zhou
Guibo Luo
Mingzhi Chen
Zhenyu Weng
Yuesheng Zhu
FedML
278
8
0
07 Apr 2025
Convergence Analysis of Federated Learning Methods Using Backward Error Analysis
AAAI Conference on Artificial Intelligence (AAAI), 2025
Jinwoo Lim
Suhyun Kim
Soo-Mook Moon
FedML
249
0
0
05 Mar 2025
Tackling Feature and Sample Heterogeneity in Decentralized Multi-Task Learning: A Sheaf-Theoretic Approach
Chaouki Ben Issaid
Praneeth Vepakomma
Mehdi Bennis
433
7
0
03 Feb 2025
FedSat: A Statistical Aggregation Approach for Class Imbalanced Clients in Federated Learning
S. Chowdhury
Raju Halder
FedML
234
2
0
31 Dec 2024
A Unified Analysis of Federated Learning with Arbitrary Client Participation
Neural Information Processing Systems (NeurIPS), 2022
Maroun Touma
Mingyue Ji
FedML
526
76
0
31 Dec 2024
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
International Conference on Learning Representations (ICLR), 2024
Jialiang Cheng
Ning Gao
Yun Yue
Zhiling Ye
Jiadi Jiang
Jian Sha
OffRL
372
1
0
10 Dec 2024
FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning
Pranab Sahoo
Ashutosh Tripathi
Sriparna Saha
S. Mondal
227
1
0
05 Dec 2024
Task Arithmetic Through The Lens Of One-Shot Federated Learning
Zhixu Tao
I. Mason
Sanjeev R. Kulkarni
Xavier Boix
MoMe
FedML
409
9
0
27 Nov 2024
Distributed Sign Momentum with Local Steps for Training Transformers
Shuhua Yu
Ding Zhou
Cong Xie
An Xu
Zhi-Li Zhang
Xin Liu
S. Kar
278
0
0
26 Nov 2024
Photon: Federated LLM Pre-Training
Lorenzo Sani
Alex Iacob
Zeyu Cao
Royson Lee
Bill Marino
...
Dongqi Cai
Zexi Li
Wanru Zhao
Xinchi Qiu
Nicholas D. Lane
AI4CE
296
14
0
05 Nov 2024
Enhancing Federated Learning Convergence with Dynamic Data Queue and Data Entropy-driven Participant Selection
IEEE Internet of Things Journal (IEEE IoT J.), 2024
Charuka Herath
Xiaolan Liu
S. Lambotharan
Y. Rahulamathavan
FedML
177
7
0
23 Oct 2024
SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
Neural Information Processing Systems (NeurIPS), 2024
Jinda Jia
Cong Xie
Hanlin Lu
Daoce Wang
Hao Feng
...
Baixi Sun
Yanghua Peng
Zhi-Li Zhang
Xin Liu
Dingwen Tao
MQ
235
10
0
20 Oct 2024
On the Convergence of (Stochastic) Gradient Descent for Kolmogorov-Arnold Networks
Yihang Gao
Vincent Y. F. Tan
ODL
124
2
0
10 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
International Conference on Learning Representations (ICLR), 2024
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
1.3K
2
0
07 Oct 2024
Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning?
ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), 2024
Peizhong Ju
Haibo Yang
Jia Liu
Yingbin Liang
Ness B. Shroff
FedML
276
1
0
05 Sep 2024
FADAS: Towards Federated Adaptive Asynchronous Optimization
Yujia Wang
Shiqiang Wang
Songtao Lu
Jinghui Chen
FedML
185
11
0
25 Jul 2024
A New Theoretical Perspective on Data Heterogeneity in Federated Optimization
Jiayi Wang
Maroun Touma
Rong-Rong Chen
Mingyue Ji
FedML
179
3
0
22 Jul 2024
Personalized Multi-tier Federated Learning
Sourasekhar Banerjee
Ali Dadras
A. Yurtsever
Monowar Bhuyan
FedML
213
4
0
19 Jul 2024
On the Trade-off between Flatness and Optimization in Distributed Learning
Ying Cao
Zhaoxian Wu
Kun Yuan
Ali H. Sayed
392
3
0
28 Jun 2024
Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
288
3
0
20 Jun 2024
Batch-in-Batch: a new adversarial training framework for initial perturbation and sample selection
Yinting Wu
Pai Peng
Bo Cai
Le Li
AAML
211
0
0
06 Jun 2024
Communication-Efficient Distributed Deep Learning via Federated Dynamic Averaging
Michail Theologitis
Georgios Frangias
Georgios Anestis
V. Samoladas
Antonios Deligiannakis
FedML
386
2
0
31 May 2024
Full-Stack Allreduce on Multi-Rail Networks
Enda Yu
Dezun Dong
Xiangke Liao
GNN
152
1
0
28 May 2024
WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average
Louis Fournier
Adel Nabli
Masih Aminbeidokhti
M. Pedersoli
Eugene Belilovsky
Edouard Oyallon
MoMe
FedML
285
7
0
27 May 2024
Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing
Yongxin Guo
Lin Wang
Xiaoying Tang
Tao Lin
FedML
OOD
345
0
0
25 May 2024
Efficiency for Free: Ideal Data Are Transportable Representations
Neural Information Processing Systems (NeurIPS), 2024
Peng Sun
Yi Jiang
Tao Lin
DD
350
2
0
23 May 2024
Worldwide Federated Training of Language Models
Alexandru Iacob
Lorenzo Sani
Bill Marino
Preslav Aleksandrov
William F. Shen
Nicholas D. Lane
FedML
327
5
0
23 May 2024
The Limits and Potentials of Local SGD for Distributed Heterogeneous Learning with Intermittent Communication
Kumar Kshitij Patel
Margalit Glasgow
Ali Zindari
Lingxiao Wang
Sebastian U. Stich
Ziheng Cheng
Nirmit Joshi
Nathan Srebro
210
11
0
19 May 2024
The Future of Large Language Model Pre-training is Federated
Lorenzo Sani
Alexandru Iacob
Zeyu Cao
Bill Marino
Yan Gao
...
Wanru Zhao
William F. Shen
Preslav Aleksandrov
Xinchi Qiu
Nicholas D. Lane
AI4CE
422
37
0
17 May 2024
AB-Training: A Communication-Efficient Approach for Distributed Low-Rank Learning
D. Coquelin
Katherina Flügel
Marie Weiel
Nicholas Kiefer
Muhammed Öz
Charlotte Debus
Achim Streit
Markus Goetz
304
0
0
02 May 2024
Improved Generalization Bounds for Communication Efficient Federated Learning
Peyman Gholami
H. Seferoglu
FedML
AI4CE
318
6
0
17 Apr 2024
Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang
Zhen Zhang
Haifeng Lu
Victor C. M. Leung
Yanyi Guo
Xiping Hu
GNN
307
20
0
09 Apr 2024
AdaptSFL: Adaptive Split Federated Learning in Resource-constrained Edge Networks
Zhengyi Lin
Guanqiao Qu
Wei Wei
Xianhao Chen
Kin K. Leung
448
76
0
19 Mar 2024
On the Convergence of Federated Learning Algorithms without Data Similarity
Ali Beikmohammadi
Sarit Khirirat
Sindri Magnússon
FedML
268
4
0
29 Feb 2024
Training Neural Networks from Scratch with Parallel Low-Rank Adapters
Minyoung Huh
Brian Cheung
Jeremy Bernstein
Phillip Isola
Pulkit Agrawal
242
14
0
26 Feb 2024
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
Weigao Sun
Zhen Qin
Weixuan Sun
Shidi Li
Dong Li
Xuyang Shen
Yu Qiao
Yiran Zhong
OffRL
222
14
0
29 Jan 2024