Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs
arXiv:1711.05979
16 November 2017
S. Shi, Qiang-qiang Wang, Xiaowen Chu
Papers citing "Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs" (17 papers shown)
FedSlate: A Federated Deep Reinforcement Learning Recommender System
Yongxin Deng, Xihe Qiu, Xiaoyu Tan, Yaochu Jin
FedML
23 Sep 2024
A Generic Performance Model for Deep Learning in a Distributed Environment
Tulasi Kavarakuntla, Liangxiu Han, H. Lloyd, Annabel Latham, Anthony Kleerekoper, S. Akintoye
19 May 2023
Byzantine Fault Tolerance in Distributed Machine Learning: A Survey
Djamila Bouhata, Hamouma Moumen, Moumen Hamouma, Ahcène Bounceur
AI4CE
05 May 2022
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
S. Shi, Lin Zhang, Bo-wen Li
14 Jul 2021
ResIST: Layer-Wise Decomposition of ResNets for Distributed Training
Chen Dun, Cameron R. Wolfe, C. Jermaine, Anastasios Kyrillidis
02 Jul 2021
Distributed Machine Learning for Wireless Communication Networks: Techniques, Architectures, and Applications
Shuyan Hu, Xiaojing Chen, Wei Ni, E. Hossain, Xin Wang
AI4CE
02 Dec 2020
Characterizing and Modeling Distributed Training with Transient Cloud GPU Servers
Shijian Li, R. Walls, Tian Guo
07 Apr 2020
Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection
Zhenheng Tang, S. Shi, X. Chu
FedML
22 Feb 2020
MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning
S. Shi, X. Chu, Bo Li
FedML
18 Dec 2019
On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning
Aritra Dutta, El Houcine Bergou, A. Abdelmoniem, Chen-Yu Ho, Atal Narayan Sahu, Marco Canini, Panos Kalnis
19 Nov 2019
Throughput Prediction of Asynchronous SGD in TensorFlow
Zhuojin Li, Wumo Yan, Marco Paolieri, L. Golubchik
12 Nov 2019
Characterizing Deep Learning Training Workloads on Alibaba-PAI
Mengdi Wang, Chen Meng, Guoping Long, Chuan Wu, Jun Yang, Wei Lin, Yangqing Jia
14 Oct 2019
Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training
Saptadeep Pal, Eiman Ebrahimi, A. Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, D. Nellans, Puneet Gupta
30 Jul 2019
Priority-based Parameter Propagation for Distributed DNN Training
Anand Jayarajan, Jinliang Wei, Garth A. Gibson, Alexandra Fedorova, Gennady Pekhimenko
AI4CE
10 May 2019
A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
S. Shi, Qiang-qiang Wang, Kaiyong Zhao, Zhenheng Tang, Yuxin Wang, Xiang Huang, Xiaowen Chu
14 Jan 2019
Parallax: Sparsity-aware Data Parallel Training of Deep Neural Networks
Soojeong Kim, Gyeong-In Yu, Hojin Park, Sungwoo Cho, Eunji Jeong, Hyeonmin Ha, Sanha Lee, Joo Seong Jeong, Byung-Gon Chun
08 Aug 2018
Stochastic Nonconvex Optimization with Large Minibatches
Weiran Wang, Nathan Srebro
25 Sep 2017