On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima

15 September 2016
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL

Papers citing "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"

50 / 1,554 papers shown
Online Knowledge Distillation with Diverse Peers
Defang Chen, Jian-Ping Mei, Can Wang, Yan Feng, Chun-Yen Chen
FedML · 87 · 302 · 0 · 01 Dec 2019

A Reparameterization-Invariant Flatness Measure for Deep Neural Networks
Henning Petzka, Linara Adilova, Michael Kamp, C. Sminchisescu
ODL · 55 · 8 · 0 · 29 Nov 2019

On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
Umut Simsekli, Mert Gurbuzbalaban, T. H. Nguyen, G. Richard, Levent Sagun
88 · 59 · 0 · 29 Nov 2019

Auto-Precision Scaling for Distributed Deep Learning
Ruobing Han, J. Demmel, Yang You
43 · 5 · 0 · 20 Nov 2019

Information-Theoretic Local Minima Characterization and Regularization
Zhiwei Jia, Hao Su
73 · 19 · 0 · 19 Nov 2019

Signed Input Regularization
Saeid Asgari Taghanaki, Kumar Abhishek, Ghassan Hamarneh
AAML · 43 · 1 · 0 · 16 Nov 2019

Information-Theoretic Perspective of Federated Learning
Linara Adilova, Julia Rosenzweig, Michael Kamp
FedML · 13 · 4 · 0 · 15 Nov 2019

Optimal Mini-Batch Size Selection for Fast Gradient Descent
M. Perrone, Haidar Khan, Changhoan Kim, Anastasios Kyrillidis, Jerry Quinn, V. Salapura
38 · 9 · 0 · 15 Nov 2019

MindTheStep-AsyncPSGD: Adaptive Asynchronous Parallel Stochastic Gradient Descent
Karl Bäckström, Marina Papatriantafilou, P. Tsigas
58 · 12 · 0 · 08 Nov 2019

Small-GAN: Speeding Up GAN Training Using Core-sets
Samarth Sinha, Hang Zhang, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Augustus Odena
GAN · 99 · 77 · 0 · 29 Oct 2019

E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings
Yue Wang, Ziyu Jiang, Xiaohan Chen, Pengfei Xu, Yang Zhao, Yingyan Lin, Zhangyang Wang
MQ · 107 · 83 · 0 · 29 Oct 2019

Neural Density Estimation and Likelihood-free Inference
George Papamakarios
BDL, DRL · 95 · 47 · 0 · 29 Oct 2019

A Simple Dynamic Learning Rate Tuning Algorithm For Automated Training of DNNs
Koyel Mukherjee, Alind Khare, Ashish Verma
74 · 15 · 0 · 25 Oct 2019

Diametrical Risk Minimization: Theory and Computations
Matthew Norton, J. Royset
57 · 19 · 0 · 24 Oct 2019

Explicitly Bayesian Regularizations in Deep Learning
Xinjie Lan, Kenneth Barner
UQCV, BDL, AI4CE · 103 · 1 · 0 · 22 Oct 2019

Robust Learning Rate Selection for Stochastic Optimization via Splitting Diagnostic
Matteo Sordello, Niccolò Dalmasso, Hangfeng He, Weijie Su
50 · 7 · 0 · 18 Oct 2019

On Warm-Starting Neural Network Training
Jordan T. Ash, Ryan P. Adams
AI4CE · 58 · 21 · 0 · 18 Oct 2019

Improving the convergence of SGD through adaptive batch sizes
Scott Sievert, Zachary B. Charles
ODL · 63 · 8 · 0 · 18 Oct 2019

KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment
Vlad Hosu, Hanhe Lin, T. Szirányi, Dietmar Saupe
123 · 582 · 0 · 14 Oct 2019

Emergent properties of the local geometry of neural loss landscapes
Stanislav Fort, Surya Ganguli
120 · 51 · 0 · 14 Oct 2019

Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei, Tengyu Ma
AAML, OOD · 72 · 85 · 0 · 09 Oct 2019

Parallelizing Training of Deep Generative Models on Massive Scientific Datasets
S. A. Jacobs, B. Van Essen, D. Hysom, Jae-Seung Yeom, Tim Moon, ..., J. Gaffney, Tom Benson, Peter B. Robinson, L. Peterson, B. Spears
BDL, AI4CE · 67 · 17 · 0 · 05 Oct 2019

Distributed Learning of Deep Neural Networks using Independent Subnet Training
John Shelton Hyatt, Cameron R. Wolfe, Michael Lee, Yuxin Tang, Anastasios Kyrillidis, Christopher M. Jermaine
OOD · 92 · 39 · 0 · 04 Oct 2019

Generalization Bounds for Convolutional Neural Networks
Shan Lin, Jingwei Zhang
MLT · 60 · 35 · 0 · 03 Oct 2019

Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory
Micah Goldblum, Jonas Geiping, Avi Schwarzschild, Michael Moeller, Tom Goldstein
103 · 34 · 0 · 01 Oct 2019

How noise affects the Hessian spectrum in overparameterized neural networks
Ming-Bo Wei, D. Schwab
85 · 28 · 0 · 01 Oct 2019

At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry
80 · 22 · 0 · 26 Sep 2019

GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks
Avraam Chatzimichailidis, Franz-Josef Pfreundt, N. Gauger, J. Keuper
52 · 10 · 0 · 26 Sep 2019

Towards Understanding the Transferability of Deep Representations
Hong Liu, Mingsheng Long, Jianmin Wang, Michael I. Jordan
66 · 26 · 0 · 26 Sep 2019

A Closer Look at Domain Shift for Deep Learning in Histopathology
Karin Stacke, Gabriel Eilertsen, Jonas Unger, Claes Lundström
OOD · 63 · 62 · 0 · 25 Sep 2019

EEG-Based Driver Drowsiness Estimation Using Feature Weighted Episodic Training
Yuqi Cui, Yifan Xu, Dongrui Wu
64 · 63 · 0 · 25 Sep 2019

Decentralized Markov Chain Gradient Descent
Tao Sun, Dongsheng Li
BDL · 91 · 11 · 0 · 23 Sep 2019

Scale MLPerf-0.6 models on Google TPU-v3 Pods
Sameer Kumar, Victor Bitorff, Dehao Chen, Chi-Heng Chou, Blake A. Hechtman, ..., Peter Mattson, Shibo Wang, Tao Wang, Yuanzhong Xu, Zongwei Zhou
67 · 39 · 0 · 21 Sep 2019

Understanding and Robustifying Differentiable Architecture Search
Arber Zela, T. Elsken, Tonmoy Saikia, Yassine Marrakchi, Thomas Brox, Frank Hutter
OOD, AAML · 154 · 375 · 0 · 20 Sep 2019

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE · 358 · 1,922 · 0 · 17 Sep 2019

Visualizing Movement Control Optimization Landscapes
Perttu Hämäläinen, Juuso Toikka, Amin Babadi, Karen Liu
56 · 7 · 0 · 17 Sep 2019

Addressing Algorithmic Bottlenecks in Elastic Machine Learning with Chicle
Michael Kaufmann, K. Kourtis, Celestine Mendler-Dünner, Adrian Schüpbach, Thomas Parnell
13 · 0 · 0 · 11 Sep 2019

Towards Understanding the Importance of Noise in Training Neural Networks
Mo Zhou, Tianyi Liu, Yan Li, Dachao Lin, Enlu Zhou, T. Zhao
MLT · 92 · 26 · 0 · 07 Sep 2019

Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
Konstantinos Pitas
58 · 8 · 0 · 06 Sep 2019

LCA: Loss Change Allocation for Neural Network Training
Janice Lan, Rosanne Liu, Hattie Zhou, J. Yosinski
73 · 25 · 0 · 03 Sep 2019

Hybrid Data-Model Parallel Training for Sequence-to-Sequence Recurrent Neural Network Machine Translation
Junya Ono, Masao Utiyama, Eiichiro Sumita
AIMat, AI4CE · 38 · 7 · 0 · 02 Sep 2019

Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective
Guan-Horng Liu, Evangelos A. Theodorou
AI4CE · 118 · 72 · 0 · 28 Aug 2019

Towards Better Generalization: BP-SVRG in Training Deep Neural Networks
Hao Jin, Dachao Lin, Zhihua Zhang
ODL · 35 · 2 · 0 · 18 Aug 2019

Regularizing CNN Transfer Learning with Randomised Regression
Yang Zhong, A. Maki
117 · 13 · 0 · 16 Aug 2019

Visualizing and Understanding the Effectiveness of BERT
Y. Hao, Li Dong, Furu Wei, Ke Xu
150 · 186 · 0 · 15 Aug 2019

Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency
Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry
113 · 20 · 0 · 12 Aug 2019

Instance Enhancement Batch Normalization: an Adaptive Regulator of Batch Noise
Senwei Liang, Zhongzhan Huang, Mingfu Liang, Haizhao Yang
94 · 59 · 0 · 12 Aug 2019

Progressive Transfer Learning
Zhengxu Yu, Long Wei, Zhongming Jin, Jianqiang Huang, Deng Cai, Xiansheng Hua
VLM · 58 · 10 · 0 · 07 Aug 2019

How Does Learning Rate Decay Help Modern Neural Networks?
Kaichao You, Mingsheng Long, Jianmin Wang, Michael I. Jordan
66 · 4 · 0 · 05 Aug 2019

On the Existence of Simpler Machine Learning Models
Lesia Semenova, Cynthia Rudin, Ronald E. Parr
117 · 87 · 0 · 05 Aug 2019