The Effect of Network Width on the Performance of Large-batch Training

11 June 2018

Papers citing "The Effect of Network Width on the Performance of Large-batch Training"

Title | Authors | Tag | Counts | Date
A New Perspective for Understanding Generalization Gap of Deep Neural Networks Trained with Large Batch Sizes | O. Oyedotun, Konstantinos Papadopoulos, Djamila Aouada | AI4CE | 32 11 0 | 21 Oct 2022
Principal Component Networks: Parameter Reduction Early in Training | R. Waleffe, Theodoros Rekatsinas | 3DPC | 19 9 0 | 23 Jun 2020
SparCML: High-Performance Sparse Communication for Machine Learning | Cédric Renggli, Saleh Ashkboos, Mehdi Aghagolzadeh, Dan Alistarh, Torsten Hoefler | - | 29 126 0 | 22 Feb 2018
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition | Zifeng Wu, Chunhua Shen, Anton Van Den Hengel | SSeg | 260 1,495 0 | 30 Nov 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 310 2,896 0 | 15 Sep 2016
Optimal Distributed Online Prediction using Mini-Batches | O. Dekel, Ran Gilad-Bachrach, Ohad Shamir, Lin Xiao | - | 182 683 0 | 07 Dec 2010