ResearchTrend.AI
Large-Scale Deep Learning Optimizations: A Comprehensive Survey


1 November 2021
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You

Papers citing "Large-Scale Deep Learning Optimizations: A Comprehensive Survey"

12 / 12 papers shown
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
Jiangfei Duan
Shuo Zhang
Zerui Wang
Lijuan Jiang
Wenwen Qu
...
Dahua Lin
Yonggang Wen
Xin Jin
Tianwei Zhang
Peng Sun
69
7
0
29 Jul 2024
Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization
Xiumei Deng
Jun Li
Kang Wei
Long Shi
Zehui Xiong
Ming Ding
Wen Chen
Shi Jin
H. Vincent Poor
FedML
35
0
0
28 May 2024
Accurate and Fast Fischer-Tropsch Reaction Microkinetics using PINNs
Harshil Patel
Aniruddha Panda
T. Nikolaienko
Stanislav Jaso
Alejandro Lopez
Kaushic Kalyanaraman
23
2
0
17 Nov 2023
CAME: Confidence-guided Adaptive Memory Efficient Optimization
Yang Luo
Xiaozhe Ren
Zangwei Zheng
Zhuo Jiang
Xin Jiang
Yang You
ODL
13
16
0
05 Jul 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
24
39
0
07 Apr 2023
Refiner: Data Refining against Gradient Leakage Attacks in Federated Learning
Mingyuan Fan
Cen Chen
Chengyu Wang
Ximeng Liu
Wenmeng Zhou
Jun Huang
AAML
FedML
11
0
0
05 Dec 2022
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Xingyu Xie
Pan Zhou
Huan Li
Zhouchen Lin
Shuicheng Yan
ODL
22
145
0
13 Aug 2022
CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU
Zangwei Zheng
Peng Xu
Xuan Zou
Da Tang
Zhen Li
...
Xiangzhuo Ding
Fuzhao Xue
Ziheng Qin
Youlong Cheng
Yang You
VLM
26
7
0
13 Apr 2022
Benchmark Assessment for DeepSpeed Optimization Library
G. Liang
I. Alsmadi
24
3
0
12 Feb 2022
NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization
Ali Ramezani-Kebrya
Fartash Faghri
Ilya Markov
V. Aksenov
Dan Alistarh
Daniel M. Roy
MQ
57
30
0
28 Apr 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,724
0
26 Sep 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,878
0
15 Sep 2016