1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
arXiv:2102.02888 · 4 February 2021

Hanlin Tang, Shaoduo Gan, A. A. Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He

Tags: AI4CE
Papers citing "1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed" (18 of 18 papers shown)

| Title | Authors | Tags | Metrics | Date |
| --- | --- | --- | --- | --- |
| Striving for Simplicity: Simple Yet Effective Prior-Aware Pseudo-Labeling for Semi-Supervised Ultrasound Image Segmentation | Yaxiong Chen, Yujie Wang, Zixuan Zheng, Jingliang Hu, Yilei Shi, Shengwu Xiong, Xiao Xiang Zhu, Lichao Mou | — | 52 / 0 / 0 | 18 Mar 2025 |
| Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis | Zhijie Chen, Qiaobo Li, A. Banerjee | FedML | 28 / 0 / 0 | 11 Nov 2024 |
| Ordered Momentum for Asynchronous SGD | Chang-Wei Shi, Yi-Rui Yang, Wu-Jun Li | ODL | 52 / 0 / 0 | 27 Jul 2024 |
| Investigation of Energy-efficient AI Model Architectures and Compression Techniques for "Green" Fetal Brain Segmentation | Szymon Mazurek, M. Pytlarz, Sylwia Malec, A. Crimi | — | 19 / 0 / 0 | 03 Apr 2024 |
| DropCompute: simple and more robust distributed synchronous training via compute variance reduction | Niv Giladi, Shahar Gottlieb, Moran Shkolnik, A. Karnieli, Ron Banner, Elad Hoffer, Kfir Y. Levy, Daniel Soudry | — | 23 / 2 / 0 | 18 Jun 2023 |
| On Efficient Training of Large-Scale Deep Learning Models: A Literature Review | Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, Dacheng Tao | VLM | 24 / 39 / 0 | 07 Apr 2023 |
| Matching-based Term Semantics Pre-training for Spoken Patient Query Understanding | Zefa Hu, Xiuyi Chen, Hao Wu, Minglun Han, Ziyi Ni, Jing Shi, Shuang Xu, Bo Xu | — | 48 / 4 / 0 | 02 Mar 2023 |
| SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient | Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov | MoE | 22 / 31 / 0 | 27 Jan 2023 |
| Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression | Jaeyong Song, Jinkyu Yim, Jaewon Jung, Hongsun Jang, H. Kim, Youngsok Kim, Jinho Lee | GNN | 8 / 25 / 0 | 24 Jan 2023 |
| Federated Averaging Langevin Dynamics: Toward a unified theory and new algorithms | Vincent Plassier, Alain Durmus, Eric Moulines | FedML | 14 / 6 / 0 | 31 Oct 2022 |
| Communication-Efficient Adam-Type Algorithms for Distributed Data Mining | Wenhan Xian, Feihu Huang, Heng Huang | FedML | 25 / 0 / 0 | 14 Oct 2022 |
| Communication-Efficient Adaptive Federated Learning | Yujia Wang, Lu Lin, Jinghui Chen | FedML | 11 / 69 / 0 | 05 May 2022 |
| Survey on Large Scale Neural Network Training | Julia Gusak, Daria Cherniuk, Alena Shilova, A. Katrutsa, Daniel Bershatsky, ..., Lionel Eyraud-Dubois, Oleg Shlyazhko, Denis Dimitrov, Ivan V. Oseledets, Olivier Beaumont | — | 22 / 10 / 0 | 21 Feb 2022 |
| Benchmark Assessment for DeepSpeed Optimization Library | G. Liang, I. Alsmadi | — | 24 / 3 / 0 | 12 Feb 2022 |
| Large-Scale Deep Learning Optimizations: A Comprehensive Survey | Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You | — | 22 / 14 / 0 | 01 Nov 2021 |
| Linearly Converging Error Compensated SGD | Eduard A. Gorbunov, D. Kovalev, Dmitry Makarenko, Peter Richtárik | — | 163 / 77 / 0 | 23 Oct 2020 |
| A new regret analysis for Adam-type algorithms | Ahmet Alacaoglu, Yura Malitsky, P. Mertikopoulos, V. Cevher | ODL | 40 / 42 / 0 | 21 Mar 2020 |
| GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 294 / 6,943 / 0 | 20 Apr 2018 |