Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform
arXiv:1809.02839 · 8 September 2018
Chi-Chung Chen, Chia-Lin Yang, Hsiang-Yun Cheng

Papers citing "Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform" (19 / 19 papers shown)

Nesterov Method for Asynchronous Pipeline Parallel Optimization
Thalaiyasingam Ajanthan, Sameera Ramasinghe, Yan Zuo, Gil Avraham, Alexander Long (02 May 2025)

PipeOptim: Ensuring Effective 1F1B Schedule with Optimizer-Dependent Weight Prediction
Lei Guan, Dongsheng Li, Jiye Liang, Wenjian Wang, Xicheng Lu (01 Dec 2023)

A Survey From Distributed Machine Learning to Distributed Deep Learning
Mohammad Dehghani, Zahra Yazdanparast (11 Jul 2023)

DISCO: Distributed Inference with Sparse Communications
Minghai Qin, Chaowen Sun, Jaco A. Hofmann, D. Vučinić (22 Feb 2023) [FedML]

Weight Prediction Boosts the Convergence of AdamW
Lei Guan (01 Feb 2023)

Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
Jaeyong Song, Jinkyu Yim, Jaewon Jung, Hongsun Jang, H. Kim, Youngsok Kim, Jinho Lee (24 Jan 2023) [GNN]

LOFT: Finding Lottery Tickets through Filter-wise Training
Qihan Wang, Chen Dun, Fangshuo Liao, C. Jermaine, Anastasios Kyrillidis (28 Oct 2022)

RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network
Vitaliy Chiley, Vithursan Thangarasa, Abhay Gupta, Anshul Samar, Joel Hestness, D. DeCoste (28 Jun 2022)

FuncPipe: A Pipelined Serverless Framework for Fast and Cost-efficient Training of Deep Learning Models
Yunzhuo Liu, Bo Jiang, Tian Guo, Zimeng Huang, Wen-ping Ma, Xinbing Wang, Chenghu Zhou (28 Apr 2022)

Efficient Pipeline Planning for Expedited Distributed DNN Training
Ziyue Luo, Xiaodong Yi, Guoping Long, Shiqing Fan, Chuan Wu, Jun Yang, Wei Lin (22 Apr 2022)

DistrEdge: Speeding up Convolutional Neural Network Inference on Distributed Edge Devices
Xueyu Hou, Yongjie Guan, Tao Han, Ning Zhang (03 Feb 2022)

ResIST: Layer-Wise Decomposition of ResNets for Distributed Training
Chen Dun, Cameron R. Wolfe, C. Jermaine, Anastasios Kyrillidis (02 Jul 2021)

Parareal Neural Networks Emulating a Parallel-in-time Algorithm
Zhanyu Ma, Jiyang Xie, Jingyi Yu (16 Mar 2021) [AI4CE]

A Study of Checkpointing in Large Scale Training of Deep Neural Networks
Elvis Rojas, A. Kahira, Esteban Meneses, L. Bautista-Gomez, Rosa M. Badia (01 Dec 2020)

Integrating Deep Learning in Domain Sciences at Exascale
Rick Archibald, E. Chow, E. D'Azevedo, Jack J. Dongarra, M. Eisenbach, ..., Florent Lopez, Daniel Nichols, S. Tomov, Kwai Wong, Junqi Yin (23 Nov 2020) [PINN]

DAPPLE: A Pipelined Data Parallel Approach for Training Large Models
Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, ..., Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin (02 Jul 2020)

HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism
Jay H. Park, Gyeongchan Yun, Chang Yi, N. T. Nguyen, Seungmin Lee, Jaesik Choi, S. Noh, Young-ri Choi (28 May 2020) [MoE]

Pipelined Backpropagation at Scale: Training Large Models without Batches
Atli Kosson, Vitaliy Chiley, Abhinav Venigalla, Joel Hestness, Urs Koster (25 Mar 2020)

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016) [ODL]