2411.06770
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
11 November 2024
Zhijie Chen
Qiaobo Li
A. Banerjee
FedML
Papers citing
"Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis"
50 / 67 papers shown
Title
MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes
Igor Sokolov
Peter Richtárik
295
1
0
22 Dec 2024
Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang
Congliang Chen
Tian Ding
Ziniu Li
Ruoyu Sun
Zhi-Quan Luo
340
76
0
26 Feb 2024
Error Feedback Reloaded: From Quadratic to Arithmetic Mean of Smoothness Constants
Peter Richtárik
Elnur Gasanov
Konstantin Burlachenko
132
5
0
16 Feb 2024
Correlation Aware Sparsified Mean Estimation Using Random Projection
Neural Information Processing Systems (NeurIPS), 2023
Shuli Jiang
Pranay Sharma
Gauri Joshi
284
2
0
29 Oct 2023
Momentum Provably Improves Error Feedback!
Neural Information Processing Systems (NeurIPS), 2023
Ilyas Fatkhullin
Alexander Tyurin
Peter Richtárik
285
37
0
24 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
International Conference on Learning Representations (ICLR), 2023
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
562
216
0
23 May 2023
Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees
International Conference on Machine Learning (ICML), 2023
Anastasia Koloskova
Aymeric Dieuleveut
Sebastian U. Stich
441
85
0
02 May 2023
z-SignFedAvg: A Unified Stochastic Sign-based Compression for Federated Learning
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zhiwei Tang
Yanmeng Wang
Tsung-Hui Chang
FedML
266
22
0
06 Feb 2023
High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance
International Conference on Machine Learning (ICML), 2023
Abdurakhmon Sadiev
Marina Danilova
Eduard A. Gorbunov
Samuel Horváth
Gauthier Gidel
Pavel Dvurechensky
Alexander Gasnikov
Peter Richtárik
198
56
0
02 Feb 2023
Communication-Efficient Federated Learning for Heterogeneous Edge Devices Based on Adaptive Gradient Quantization
IEEE Conference on Computer Communications (INFOCOM), 2022
Heting Liu
Fang He
Guohong Cao
FedML
MQ
160
44
0
16 Dec 2022
Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
International Conference on Machine Learning (ICML), 2022
Zhao Song
Yitan Wang
Zheng Yu
Licheng Zhang
FedML
268
31
0
15 Oct 2022
Taming Fat-Tailed ("Heavier-Tailed" with Potentially Infinite Variance) Noise in Federated Learning
Neural Information Processing Systems (NeurIPS), 2022
Haibo Yang
Pei-Yuan Qiu
Jia Liu
FedML
311
17
0
03 Oct 2022
Efficient-Adam: Communication-Efficient Distributed Adam
IEEE Transactions on Signal Processing (IEEE Trans. Signal Process.), 2022
Congliang Chen
Li Shen
Wei Liu
Jianfeng Yao
143
25
0
28 May 2022
On Distributed Adaptive Optimization with Gradient Compression
International Conference on Learning Representations (ICLR), 2022
Xiaoyun Li
Belhal Karimi
Ping Li
149
31
0
11 May 2022
A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks
Neural Information Processing Systems (NeurIPS), 2022
Mingrui Liu
Zhenxun Zhuang
Yunwei Lei
Chunyang Liao
193
27
0
10 May 2022
ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
International Conference on Machine Learning (ICML), 2022
Konstantin Mishchenko
Grigory Malinovsky
Sebastian U. Stich
Peter Richtárik
273
181
0
18 Feb 2022
3PC: Three Point Compressors for Communication-Efficient Distributed Training and a Better Theory for Lazy Aggregation
International Conference on Machine Learning (ICML), 2022
Peter Richtárik
Igor Sokolov
Ilyas Fatkhullin
Elnur Gasanov
Zhize Li
Eduard A. Gorbunov
186
33
0
02 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
191
11
0
31 Jan 2022
Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Yujia Wang
Lu Lin
Jinghui Chen
293
20
0
01 Nov 2021
Permutation Compressors for Provably Faster Distributed Nonconvex Optimization
Rafal Szlendak
Alexander Tyurin
Peter Richtárik
333
39
0
07 Oct 2021
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
Neural Information Processing Systems (NeurIPS), 2021
Ashok Cutkosky
Harsh Mehta
133
80
0
28 Jun 2021
On Large-Cohort Training for Federated Learning
Neural Information Processing Systems (NeurIPS), 2021
Zachary B. Charles
Zachary Garrett
Zhouyuan Huo
Sergei Shmulyian
Virginia Smith
FedML
214
117
0
15 Jun 2021
EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback
Neural Information Processing Systems (NeurIPS), 2021
Peter Richtárik
Igor Sokolov
Ilyas Fatkhullin
200
175
0
09 Jun 2021
FedNL: Making Newton-Type Methods Applicable to Federated Learning
International Conference on Machine Learning (ICML), 2021
M. Safaryan
Rustem Islamov
Xun Qian
Peter Richtárik
FedML
200
85
0
05 Jun 2021
DRIVE: One-bit Distributed Mean Estimation
Neural Information Processing Systems (NeurIPS), 2021
S. Vargaftik
Ran Ben-Basat
Amit Portnoy
Gal Mendelson
Y. Ben-Itzhak
Michael Mitzenmacher
OOD
FedML
601
58
0
18 May 2021
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed
International Conference on High Performance Computing (HiPC), 2021
Conglong Li
A. A. Awan
Hanlin Tang
Samyam Rajbhandari
Yuxiong He
371
34
0
13 Apr 2021
Hessian Eigenspectra of More Realistic Nonlinear Models
Neural Information Processing Systems (NeurIPS), 2021
Zhenyu Liao
Michael W. Mahoney
293
38
0
02 Mar 2021
Learning Neural Network Subspaces
International Conference on Machine Learning (ICML), 2021
Mitchell Wortsman
Maxwell Horton
Carlos Guestrin
Ali Farhadi
Mohammad Rastegari
UQCV
325
96
0
20 Feb 2021
MARINA: Faster Non-Convex Distributed Learning with Compression
International Conference on Machine Learning (ICML), 2021
Eduard A. Gorbunov
Konstantin Burlachenko
Zhize Li
Peter Richtárik
310
121
0
15 Feb 2021
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
International Conference on Machine Learning (ICML), 2021
Hanlin Tang
Shaoduo Gan
A. A. Awan
Samyam Rajbhandari
Conglong Li
Xiangru Lian
Ji Liu
Ce Zhang
Yuxiong He
AI4CE
271
97
0
04 Feb 2021
Improving Neural Network Training in Low Dimensional Random Bases
Frithjof Gressmann
Zach Eaton-Rosen
Carlo Luschi
180
32
0
09 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
1.3K
54,451
0
22 Oct 2020
A High Probability Analysis of Adaptive SGD with Momentum
Xiaoyun Li
Francesco Orabona
301
75
0
28 Jul 2020
CSER: Communication-efficient SGD with Error Reset
Neural Information Processing Systems (NeurIPS), 2020
Cong Xie
Shuai Zheng
Oluwasanmi Koyejo
Indranil Gupta
Mu Li
Yanghua Peng
254
42
0
26 Jul 2020
FetchSGD: Communication-Efficient Federated Learning with Sketching
International Conference on Machine Learning (ICML), 2020
D. Rothchild
Ashwinee Panda
Enayat Ullah
Nikita Ivkin
Ion Stoica
Vladimir Braverman
Joseph E. Gonzalez
Raman Arora
FedML
200
406
0
15 Jul 2020
Federated Learning with Compression: Unified Analysis and Sharp Guarantees
Farzin Haddadpour
Mohammad Mahdi Kamani
Aryan Mokhtari
M. Mahdavi
FedML
419
310
0
02 Jul 2020
Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping
Eduard A. Gorbunov
Marina Danilova
Alexander Gasnikov
199
141
0
21 May 2020
Quantized Adam with Error Feedback
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2020
Congliang Chen
Li Shen
Haozhi Huang
Wei Liu
ODL
MQ
135
38
0
29 Apr 2020
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
Annual Conference on Computational Learning Theory (COLT), 2020
Wenlong Mou
C. J. Li
Martin J. Wainwright
Peter L. Bartlett
Michael I. Jordan
201
89
0
09 Apr 2020
Adaptive Federated Optimization
International Conference on Learning Representations (ICLR), 2020
Sashank J. Reddi
Zachary B. Charles
Manzil Zaheer
Zachary Garrett
Keith Rush
Jakub Konecný
Sanjiv Kumar
H. B. McMahan
FedML
611
1,741
0
29 Feb 2020
On Biased Compression for Distributed Learning
Journal of Machine Learning Research (JMLR), 2020
Aleksandr Beznosikov
Samuel Horváth
Peter Richtárik
M. Safaryan
288
214
0
27 Feb 2020
PyHessian: Neural Networks Through the Lens of the Hessian
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
ODL
309
335
0
16 Dec 2019
FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization
International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
Amirhossein Reisizadeh
Aryan Mokhtari
Hamed Hassani
Ali Jadbabaie
Ramtin Pedarsani
FedML
676
865
0
28 Sep 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
SIAM International Conference on Data Mining (SDM), 2019
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
192
54
0
24 Jul 2019
Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
IEEE Journal on Selected Areas in Information Theory (JSAIT), 2019
Debraj Basu
Deepesh Data
C. Karakuş
Suhas Diggavi
MQ
244
432
0
06 Jun 2019
Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback
Neural Information Processing Systems (NeurIPS), 2019
Shuai Zheng
Ziyue Huang
James T. Kwok
187
117
0
27 May 2019
Sub-Weibull distributions: generalizing sub-Gaussian and sub-Exponential properties to heavier-tailed distributions
M. Vladimirova
Stéphane Girard
Hien Nguyen
Julyan Arbel
377
105
0
13 May 2019
On the Convergence of Adam and Beyond
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
939
2,760
0
19 Apr 2019
Communication-efficient distributed SGD with Sketching
Nikita Ivkin
D. Rothchild
Enayat Ullah
Vladimir Braverman
Ion Stoica
R. Arora
FedML
255
217
0
12 Mar 2019
Compressing Gradient Optimizers via Count-Sketches
International Conference on Machine Learning (ICML), 2019
Ryan Spring
Anastasios Kyrillidis
Vijai Mohan
Anshumali Shrivastava
132
38
0
01 Feb 2019