ResearchTrend.AI

Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis

11 November 2024
Zhijie Chen, Qiaobo Li, A. Banerjee
FedML

Papers citing "Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis"

17 / 67 papers shown
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Behrooz Ghorbani, Shankar Krishnan, Ying Xiao
ODL
343 · 363 · 0
29 Jan 2019

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban
473 · 287 · 0
18 Jan 2019

Tight Analyses for Non-Smooth Stochastic Gradient Descent
Nicholas J. A. Harvey, Christopher Liaw, Y. Plan, Sikander Randhawa
187 · 152 · 0
13 Dec 2018

Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD
Jianyu Wang, Gauri Joshi
FedML
202 · 245 · 0
19 Oct 2018

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
VLM SSL SSeg
2.9K · 108,326 · 0
11 Oct 2018

The Convergence of Sparsified Gradient Methods (NeurIPS 2018)
Dan Alistarh, Torsten Hoefler, M. Johansson, Sarit Khirirat, Nikola Konstantinov, Cédric Renggli
350 · 527 · 0
27 Sep 2018

Sparsified SGD with Memory
Sebastian U. Stich, Jean-Baptiste Cordonnier, Martin Jaggi
357 · 832 · 0
20 Sep 2018

Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization
Jiaxiang Wu, Weidong Huang, Junzhou Huang, Tong Zhang
226 · 245 · 0
21 Jun 2018

Local SGD Converges Fast and Communicates Little
Sebastian U. Stich
FedML
1.1K · 1,187 · 0
24 May 2018

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM
1.7K · 8,043 · 0
20 Apr 2018

Group Normalization
Yuxin Wu, Kaiming He
581 · 4,115 · 0
22 Mar 2018

signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar
FedML ODL
531 · 1,178 · 0
13 Feb 2018

Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Yujun Lin, Song Han, Huizi Mao, Yu Wang, W. Dally
579 · 1,533 · 0
05 Dec 2017

Gradient Sparsification for Communication-Efficient Distributed Optimization (NeurIPS 2017)
Jianqiao Wangni, Jialei Wang, Ji Liu, Tong Zhang
282 · 572 · 0
26 Oct 2017

Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun, Léon Bottou, Yann LeCun
UQCV
279 · 256 · 0
22 Nov 2016

Adam: A Method for Stochastic Optimization (ICLR 2015)
Diederik P. Kingma, Jimmy Ba
ODL
4.7K · 161,471 · 0
22 Dec 2014

Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization (ICML 2012)
Alexander Rakhlin, Ohad Shamir, Karthik Sridharan
827 · 797 · 0
26 Sep 2011