ResearchTrend.AI

Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam

12 February 2022
Yucheng Lu, Conglong Li, Minjia Zhang, Christopher De Sa, Yuxiong He
OffRL, AI4CE

Papers citing "Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam"

8 papers shown
Distributed Sign Momentum with Local Steps for Training Transformers
Shuhua Yu, Ding Zhou, Cong Xie, An Xu, Zhi-Li Zhang, Xin Liu, S. Kar
26 Nov 2024
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, Dacheng Tao
VLM
07 Apr 2023
Fast Adaptive Federated Bilevel Optimization
Feihu Huang
FedML
02 Nov 2022
Sign-MAML: Efficient Model-Agnostic Meta-Learning by SignSGD
Chen Fan, Parikshit Ram, Sijia Liu
FedML
15 Sep 2021
A new regret analysis for Adam-type algorithms
Ahmet Alacaoglu, Yura Malitsky, P. Mertikopoulos, V. Cevher
ODL
21 Mar 2020
A Simple Convergence Proof of Adam and Adagrad
Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier
05 Mar 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
17 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM
20 Apr 2018