Study on the Large Batch Size Training of Neural Networks Based on the Second Order Gradient

16 December 2020

Papers citing "Study on the Large Batch Size Training of Neural Networks Based on the Second Order Gradient"


Title
Distributed Training of Deep Neural Networks: Theoretical and Practical Limits of Parallel Scalability; J. Keuper, Franz-Josef Pfreundt; GNN; 47; 97; 0; 22 Sep 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima; N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang; ODL; 281; 2,888; 0; 15 Sep 2016
The Effects of Hyperparameters on SGD Training of Neural Networks; Thomas Breuel; 64; 63; 0; 12 Aug 2015