Big Batch SGD: Automated Inference using Adaptive Batch Sizes

Soham De, A. Yadav, David Jacobs, Tom Goldstein
18 October 2016 · arXiv:1610.05792 · ODL

Papers citing "Big Batch SGD: Automated Inference using Adaptive Batch Sizes"

9 / 9 papers shown
  1. Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
     Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar · 30 Dec 2024

  2. Adaptive Batch Size for Privately Finding Second-Order Stationary Points
     Daogao Liu, Kunal Talwar · 10 Oct 2024

  3. Flexible numerical optimization with ensmallen
     Ryan R. Curtin, Marcus Edel, Rahul Prabhu, S. Basak, Zhihao Lou, Conrad Sanderson · 09 Mar 2020

  4. History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms
     Kaiyi Ji, Zhe Wang, Bowen Weng, Yi Zhou, Wei Zhang, Yingbin Liang · ODL · 21 Oct 2019

  5. AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks
     Aditya Devarakonda, Maxim Naumov, M. Garland · ODL · 06 Dec 2017

  6. Advances in Variational Inference
     Cheng Zhang, Judith Butepage, Hedvig Kjellström, Stephan Mandt · BDL · 15 Nov 2017

  7. Coupling Adaptive Batch Sizes with Learning Rates
     Lukas Balles, Javier Romero, Philipp Hennig · ODL · 15 Dec 2016

  8. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
     N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · ODL · 15 Sep 2016

  9. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
     Hamed Karimi, J. Nutini, Mark W. Schmidt · 16 Aug 2016