Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training

arXiv:2305.19982 · 31 May 2023

Yijia Zhang, Yibo Han, Shijie Cao, Guohao Dai, Youshan Miao, Ting Cao, Fan Yang, Ningyi Xu
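The listing gives no details of the method beyond its title, so the following is a rough illustration only: a minimal PyTorch sketch of the general idea the title names, folding gradient accumulation directly into Adam's moment buffers so that each micro-batch gradient can be freed immediately instead of being held in a separate full-size accumulation buffer. The function name, state layout, and averaging scheme are hypothetical reconstructions, not the paper's algorithm.

```python
import torch

def adam_accumulation_step(param, micro_grads, state,
                           lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
    # Decay the moment buffers once per optimizer step ...
    m, v = state["m"], state["v"]
    m.mul_(betas[0])
    v.mul_(betas[1])
    # ... then fold each micro-batch gradient in as it arrives; in a real
    # training loop each `g` would come from one backward pass and could be
    # freed right after these two in-place updates.
    n = len(micro_grads)
    for g in micro_grads:
        g = g / n                            # average across micro-batches
        m.add_(g, alpha=1 - betas[0])
        v.add_(g * g, alpha=1 - betas[1])    # sum of squared micro-gradients
                                             # stands in for the squared sum
    state["t"] += 1
    t = state["t"]
    m_hat = m / (1 - betas[0] ** t)          # standard Adam bias correction
    v_hat = v / (1 - betas[1] ** t)
    param.data.addcdiv_(m_hat, v_hat.sqrt().add_(eps), value=-lr)

# Toy usage: 8 micro-batch gradients, no full-size accumulation buffer kept.
p = torch.zeros(4)
state = {"m": torch.zeros_like(p), "v": torch.zeros_like(p), "t": 0}
adam_accumulation_step(p, [torch.randn(4) for _ in range(8)], state)
```

The point of the sketch is that the moment updates are linear (or elementwise) in the incoming gradient, so they can be applied per micro-batch rather than after a full accumulation pass.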

Papers citing "Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training"

Showing 5 of 5 citing papers:

1. HopGNN: Boosting Distributed GNN Training Efficiency via Feature-Centric Model Migration
   Weijian Chen, Shuibing He, Haoyang Qu, Xuechen Zhang
   Topic: GNN · Citations: 0 · 01 Sep 2024

2. DeMansia: Mamba Never Forgets Any Tokens
   Ricky Fang
   Topic: Mamba · Citations: 0 · 04 Aug 2024

3. Batch size invariant Adam
   Xi Wang, Laurence Aitchison
   Citations: 2 · 29 Feb 2024

4. ZeRO-Offload: Democratizing Billion-Scale Model Training
   Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyang Yang, Minjia Zhang, Dong Li, Yuxiong He
   Topic: MoE · Citations: 413 · 18 Jan 2021

5. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
   Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
   Topic: ELM · Citations: 6,950 · 20 Apr 2018