Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds (arXiv:1903.12650)
29 March 2019
Masafumi Yamazaki, Akihiko Kasagi, Akihiro Tabuchi, Takumi Honda, Masahiro Miwa, Naoto Fukumoto, Tsuguchika Tabaru, Atsushi Ike, Kohta Nakashima
Papers citing "Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds" (30 of 30 papers shown)
- Phasor-Driven Acceleration for FFT-based CNNs (01 Jun 2024). Eduardo Reis, Thangarajah Akilan, Mohammed Khalid.
- Revisiting LARS for Large Batch Training Generalization of Neural Networks (25 Sep 2023). K. Do, Duong Nguyen, Hoa Nguyen, Long Tran-Thanh, Nguyen-Hoang Tran, Quoc-Viet Pham. [AI4CE, ODL]
- ABS: Adaptive Bounded Staleness Converges Faster and Communicates Less (21 Jan 2023). Qiao Tan, Feng Zhu, Jingjing Zhang.
- Analyzing I/O Performance of a Hierarchical HPC Storage System for Distributed Deep Learning (04 Jan 2023). Takaaki Fukai, Kento Sato, Takahiro Hirofuchi.
- Leveraging Computer Vision Application in Visual Arts: A Case Study on the Use of Residual Neural Network to Classify and Analyze Baroque Paintings (27 Oct 2022). Daniel Kvak.
- Large-batch Optimization for Dense Visual Predictions (20 Oct 2022). Zeyue Xue, Jianming Liang, Guanglu Song, Zhuofan Zong, Liang Chen, Yu Liu, Ping Luo. [VLM]
- Large-Scale Deep Learning Optimizations: A Comprehensive Survey (01 Nov 2021). Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You.
- Stochastic Training is Not Necessary for Generalization (29 Sep 2021). Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein.
- Concurrent Adversarial Learning for Large-Batch Training (01 Jun 2021). Yong Liu, Xiangning Chen, Minhao Cheng, Cho-Jui Hsieh, Yang You. [ODL]
- Tesseract: Parallelize the Tensor Parallelism Efficiently (30 May 2021). Boxiang Wang, Qifan Xu, Zhengda Bian, Yang You. [VLM, GNN]
- An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks (19 Apr 2021). A. Kahira, Truong Thao Nguyen, L. Bautista-Gomez, Ryousei Takano, Rosa M. Badia, Mohamed Wahib.
- Crossover-SGD: A gossip-based communication in distributed deep learning for alleviating large mini-batch problem and enhancing scalability (30 Dec 2020). Sangho Yeo, Minho Bae, Minjoong Jeong, Oh-Kyoung Kwon, Sangyoon Oh.
- Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning (27 Aug 2020). Aurick Qiao, Sang Keun Choe, Suhas Jayaram Subramanya, Willie Neiswanger, Qirong Ho, Hao Zhang, G. Ganger, Eric Xing. [VLM]
- Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA (26 Aug 2020). Mohamed Wahib, Haoyu Zhang, Truong Thao Nguyen, Aleksandr Drozd, Jens Domke, Lingqi Zhang, Ryousei Takano, Satoshi Matsuoka. [OODD]
- The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism (25 Jul 2020). Yosuke Oyama, N. Maruyama, Nikoli Dryden, Erin McCarthy, P. Harrington, J. Balewski, Satoshi Matsuoka, Peter Nugent, B. Van Essen. [3DV, AI4CE]
- DEED: A General Quantization Scheme for Communication Efficiency in Bits (19 Jun 2020). Tian-Chun Ye, Peijun Xiao, Ruoyu Sun. [FedML, MQ]
- The Limit of the Batch Size (15 Jun 2020). Yang You, Yuhui Wang, Huan Zhang, Zhao-jie Zhang, J. Demmel, Cho-Jui Hsieh.
- TResNet: High Performance GPU-Dedicated Architecture (30 Mar 2020). T. Ridnik, Hussam Lawen, Asaf Noy, Emanuel Ben-Baruch, Gilad Sharir, Itamar Friedman. [OOD]
- The Future of Digital Health with Federated Learning (18 Mar 2020). Nicola Rieke, Jonny Hancox, Wenqi Li, Fausto Milletari, H. Roth, ..., Ronald M. Summers, Andrew Trask, Daguang Xu, Maximilian Baust, M. Jorge Cardoso. [OOD]
- Communication optimization strategies for distributed deep neural network training: A survey (06 Mar 2020). Shuo Ouyang, Dezun Dong, Yemao Xu, Liquan Xiao.
- Scalable and Practical Natural Gradient for Large-Scale Deep Learning (13 Feb 2020). Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Chuan-Sheng Foo, Rio Yokota.
- Optimization for deep learning: theory and algorithms (19 Dec 2019). Ruoyu Sun. [ODL]
- Progressive Compressed Records: Taking a Byte out of Deep Learning Data (01 Nov 2019). Michael Kuchnik, George Amvrosiadis, Virginia Smith.
- Gap Aware Mitigation of Gradient Staleness (24 Sep 2019). Saar Barkai, Ido Hakimi, Assaf Schuster.
- Heterogeneity-Aware Asynchronous Decentralized Training (17 Sep 2019). Qinyi Luo, Jiaao He, Youwei Zhuo, Xuehai Qian.
- Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training (30 Jul 2019). Saptadeep Pal, Eiman Ebrahimi, A. Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, D. Nellans, Puneet Gupta.
- Taming Momentum in a Distributed Asynchronous Environment (26 Jul 2019). Ido Hakimi, Saar Barkai, Moshe Gabel, Assaf Schuster.
- Large Batch Optimization for Deep Learning: Training BERT in 76 minutes (01 Apr 2019). Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh. [ODL]
- Inefficiency of K-FAC for Large Batch Size Training (14 Mar 2019). Linjian Ma, Gabe Montague, Jiayu Ye, Z. Yao, A. Gholami, Kurt Keutzer, Michael W. Mahoney.
- SparCML: High-Performance Sparse Communication for Machine Learning (22 Feb 2018). Cédric Renggli, Saleh Ashkboos, Mehdi Aghagolzadeh, Dan Alistarh, Torsten Hoefler.