1712.02029
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks
6 December 2017
Aditya Devarakonda
Maxim Naumov
M. Garland
ODL
Papers citing
"AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks"
15 / 15 papers shown
Title
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
103
0
0
30 Dec 2024
Taming Resource Heterogeneity In Distributed ML Training With Dynamic Batching
S. Tyagi
Prateek Sharma
14
22
0
20 May 2023
Resource-aware Deep Learning for Wireless Fingerprinting Localization
Gregor Cerar
Blaž Bertalanič
Carolina Fortuna
HAI
8
2
0
12 Oct 2022
Dynamic Batch Adaptation
Cristian Simionescu
George Stoica
Robert Herscovici
ODL
11
1
0
01 Aug 2022
Towards Sustainable Deep Learning for Wireless Fingerprinting Localization
Anže Pirnat
Blaž Bertalanič
Gregor Cerar
M. Mohorčič
Marko Meza
Carolina Fortuna
HAI
6
6
0
22 Jan 2022
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
22
14
0
01 Nov 2021
Concurrent Adversarial Learning for Large-Batch Training
Yong Liu
Xiangning Chen
Minhao Cheng
Cho-Jui Hsieh
Yang You
ODL
21
13
0
01 Jun 2021
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
15
79
0
17 Sep 2020
AdaScale SGD: A User-Friendly Algorithm for Distributed Training
Tyler B. Johnson
Pulkit Agrawal
Haijie Gu
Carlos Guestrin
ODL
6
37
0
09 Jul 2020
Multi-Precision Policy Enforced Training (MuPPET): A precision-switching strategy for quantised fixed-point training of CNNs
A. Rajagopal
D. A. Vink
Stylianos I. Venieris
C. Bouganis
MQ
8
14
0
16 Jun 2020
Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources
Haibin Lin
Hang Zhang
Yifei Ma
Tong He
Zhi-Li Zhang
Sheng Zha
Mu Li
17
23
0
26 Apr 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
11
978
0
01 Apr 2019
Parameter Re-Initialization through Cyclical Batch Size Schedules
Norman Mu
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
ODL
8
8
0
04 Dec 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,878
0
15 Sep 2016
Parallelizing Word2Vec in Shared and Distributed Memory
Shihao Ji
N. Satish
Sheng R. Li
Pradeep Dubey
VLM
MoE
14
72
0
15 Apr 2016