Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
L. Smith, Nicholay Topin · arXiv:1708.07120 · 23 August 2017 · AI4CE

Papers citing "Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates"

30 / 130 papers shown
Title
Single-partition adaptive Q-learning
    J. Araújo, Mário A. T. Figueiredo, M. Botto · OffRL · 14 Jul 2020
CenterNet3D: An Anchor Free Object Detector for Point Cloud
    Guojun Wang, Jian Wu, Bin Wang, Siyu Teng, Long Chen, Dongpu Cao · 3DPC · 13 Jul 2020
Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
    Robin M. Schmidt, Frank Schneider, Philipp Hennig · ODL · 03 Jul 2020
Hippo: Taming Hyper-parameter Optimization of Deep Learning with Stage Trees
    Ahnjae Shin, Do Yoon Kim, Joo Seong Jeong, Byung-Gon Chun · 22 Jun 2020
MOSQUITO-NET: A deep learning based CADx system for malaria diagnosis along with model interpretation using GradCam and class activation maps
    Aayush Kumar, Sanat B Singh, S. Satapathy, M. Rout · 17 Jun 2020
Monotone operator equilibrium networks
    Ezra Winston, J. Zico Kolter · 15 Jun 2020
Parsimonious Computing: A Minority Training Regime for Effective Prediction in Large Microarray Expression Data Sets
    Shailesh Sridhar, Snehanshu Saha, Azhar Shaikh, Rahul Yedida, S. Saha · 18 May 2020
F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning
    Wenhao Li, Bo Jin, Xiangfeng Wang, Junchi Yan, H. Zha · 17 Apr 2020
Editable Neural Networks
    A. Sinitsin, Vsevolod Plokhotnyuk, Dmitriy V. Pyrkin, Sergei Popov, Artem Babenko · KELM · 01 Apr 2020
PointTrackNet: An End-to-End Network For 3-D Object Detection and Tracking From Point Clouds
    Sukai Wang, Yuxiang Sun, Chengju Liu, Ming Liu · VOT, 3DPC · 26 Feb 2020
The Two Regimes of Deep Network Training
    Guillaume Leclerc, A. Madry · 24 Feb 2020
Fast is better than free: Revisiting adversarial training
    Eric Wong, Leslie Rice, J. Zico Kolter · AAML, OOD · 12 Jan 2020
Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
    Qi Chen, Lin Sun, Zhixin Wang, Kui Jia, Alan Yuille · 3DPC · 30 Dec 2019
Optimization for deep learning: theory and algorithms
    Ruoyu Sun · ODL · 19 Dec 2019
ResNetX: a more disordered and deeper network architecture
    Wenfeng Feng, Xin Zhang, Guangpeng Zhao · 18 Dec 2019
Linear Mode Connectivity and the Lottery Ticket Hypothesis
    Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin · MoMe · 11 Dec 2019
Semi-Supervised Learning for Cancer Detection of Lymph Node Metastases
    Amit Kumar Jaiswal, Ivan Panshin, D. Shulkin, Nagender Aneja, Samuel Abramov · SSL, MedIm · 23 Jun 2019
AI Feynman: a Physics-Inspired Method for Symbolic Regression
    S. Udrescu, Max Tegmark · 27 May 2019
Accurate Visual Localization for Automotive Applications
    Eli Brosh, Matan Friedmann, I. Kadar, Lev Yitzhak Lavy, Elad Levi, S. Rippa, Y. Lempert, Bruno Fernandez-Ruiz, Roei Herzig, Trevor Darrell · 01 May 2019
Forget the Learning Rate, Decay Loss
    Jiakai Wei · 27 Apr 2019
Learning representations of irregular particle-detector geometry with distance-weighted graph networks
    S. Qasim, J. Kieseler, Y. Iiyama, M. Pierini · 21 Feb 2019
Image Classification at Supercomputer Scale
    Chris Ying, Sameer Kumar, Dehao Chen, Tao Wang, Youlong Cheng · VLM · 16 Nov 2018
Robust Learning of Tactile Force Estimation through Robot Interaction
    Balakumar Sundaralingam, Alexander Lambert, Ankur Handa, Byron Boots, Tucker Hermans, Stan Birchfield, Nathan D. Ratliff, Dieter Fox · OOD · 15 Oct 2018
A Survey of Modern Object Detection Literature using Deep Learning
    K. Chahal, Kuntal Dey · ObjD · 22 Aug 2018
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
    Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry · FedML, MLT · 05 Jun 2018
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
    Cody Coleman, Daniel Kang, Deepak Narayanan, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, K. Olukotun, Christopher Ré, Matei A. Zaharia · 04 Jun 2018
Understanding Batch Normalization
    Johan Bjorck, Carla P. Gomes, B. Selman, Kilian Q. Weinberger · 01 Jun 2018
A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay
    L. Smith · 26 Mar 2018
A Walk with SGD
    Chen Xing, Devansh Arpit, Christos Tsirigotis, Yoshua Bengio · 24 Feb 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
    N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · ODL · 15 Sep 2016