The Convergence of Sparsified Gradient Methods. Neural Information Processing Systems (NeurIPS), 2018.
Gradient Sparsification for Communication-Efficient Distributed Optimization. Neural Information Processing Systems (NeurIPS), 2018.
Adam: A Method for Stochastic Optimization. Diederik P. Kingma, Jimmy Ba. International Conference on Learning Representations (ICLR), 2015.
Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization. International Conference on Machine Learning (ICML), 2012.