Cited By
Scaling Deep Learning on GPU and Knights Landing clusters (arXiv:1708.02983)
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2017
9 August 2017
Yang You, A. Buluç, J. Demmel
Links: ArXiv (abs) · PDF · HTML
Papers citing "Scaling Deep Learning on GPU and Knights Landing clusters" (18 of 18 papers shown)
Estudio de la eficiencia en la escalabilidad de GPUs para el entrenamiento de Inteligencia Artificial [Study of the efficiency of GPU scalability for Artificial Intelligence training]
David Cortes, Carlos Juiz, Belen Bermejo
03 Sep 2025
DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining
IEEE International Conference on Distributed Computing Systems (ICDCS), 2023
Lin Zhang, Shaoshuai Shi, Xiaowen Chu, Wei Wang, Yue Liu, Chengjian Liu
24 Feb 2023
Towards Efficient Communications in Federated Learning: A Contemporary Survey
Journal of the Franklin Institute (JFI), 2022
Zihao Zhao, Yuzhu Mao, Yang Liu, Linqi Song, Ouyang Ye, Xinlei Chen, Wenbo Ding
02 Aug 2022
Reconfigurable Cyber-Physical System for Lifestyle Video-Monitoring via Deep Learning
Daniel Deniz, Francisco Barranco, J. Isern, Eduardo Ros
07 Oct 2020
Reducing Data Motion to Accelerate the Training of Deep Neural Networks
Sicong Zhuang, Cristiano Malossi, Marc Casas
05 Apr 2020
Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Zhenheng Tang, Shaoshuai Shi, Wei Wang, Yue Liu, Xiaowen Chu
10 Mar 2020
Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs
Qiang-qiang Wang, Shaoshuai Shi, Canhui Wang, Xiaowen Chu
24 Feb 2020
Adaptive Gradient Sparsification for Efficient Federated Learning: An Online Learning Approach
IEEE International Conference on Distributed Computing Systems (ICDCS), 2020
Pengchao Han, Maroun Touma, K. Leung
14 Jan 2020
A Survey on Distributed Machine Learning
ACM Computing Surveys (ACM CSUR), 2019
Joost Verbraeken, Matthijs Wolting, Jonathan Katzy, Jeroen Kloppenburg, Tim Verbelen, Jan S. Rellermeyer
20 Dec 2019
MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2019
Shaoshuai Shi, Xiaowen Chu, Bo Li
18 Dec 2019
Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
European Conference on Artificial Intelligence (ECAI), 2019
Shaoshuai Shi, Zhenheng Tang, Qiang-qiang Wang, Kaiyong Zhao, Xiaowen Chu
20 Nov 2019
AI Enabling Technologies: A Survey
V. Gadepally, Justin A. Goodwin, J. Kepner, Albert Reuther, Hayley Reynolds, S. Samsi, Jonathan Su, David Martinez
08 May 2019
On Linear Learning with Manycore Processors
International Conference on High Performance Computing (HiPC), 2019
Eliza Wszola, Celestine Mendler-Dünner, Martin Jaggi, Markus Püschel
02 May 2019
MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Shaoshuai Shi, Xiaowen Chu, Bo Li
27 Nov 2018
Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training
Jiawen Liu, Dong Li, Gokcen Kestor, Jeffrey S. Vetter
21 Oct 2018
GeneSys: Enabling Continuous Learning through Neural Network Evolution in Hardware
A. Samajdar, Parth Mannan, K. Garg, T. Krishna
03 Aug 2018
GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent
J. Daily, Abhinav Vishnu, Charles Siegel, T. Warfel, Vinay C. Amatya
15 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
ACM Computing Surveys (CSUR), 2018
Tal Ben-Nun, Torsten Hoefler
26 Feb 2018