SGDR: Stochastic Gradient Descent with Warm Restarts

13 August 2016

Papers citing "SGDR: Stochastic Gradient Descent with Warm Restarts"

23 / 1,273 papers shown

Title
DeepTAM: Deep Tracking and Mapping Huizhong Zhou Benjamin Ummenhofer Thomas Brox 3DV 19 227 0 06 Aug 2018
Actor-Centric Relation Network Chen Sun Abhinav Shrivastava Carl Vondrick Kevin Patrick Murphy Rahul Sukthankar Cordelia Schmid 36 220 0 28 Jul 2018
Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition Chun-Fu Chen Quanfu Fan Neil Rohit Mallinar Tom Sercu Rogerio Feris 17 96 0 10 Jul 2018
Path-Level Network Transformation for Efficient Architecture Search Han Cai Jiacheng Yang Weinan Zhang Song Han Yong Yu 24 210 0 07 Jun 2018
Neural Architecture Search using Deep Neural Networks and Monte Carlo Tree Search Linnan Wang Yiyang Zhao Yuu Jinnai Yuandong Tian Rodrigo Fonseca BDL 17 50 0 18 May 2018
Born Again Neural Networks Tommaso Furlanello Zachary Chase Lipton Michael Tschannen Laurent Itti Anima Anandkumar 13 1,020 0 12 May 2018
SdcNet: A Computation-Efficient CNN for Object Recognition Yunlong Ma Chunyan Wang 16 3 0 03 May 2018
Efficient Multi-objective Neural Architecture Search via Lamarckian Evolution T. Elsken J. H. Metzen Frank Hutter 117 498 0 24 Apr 2018
Understanding Actors and Evaluating Personae with Gaussian Embeddings Hannah Kim Denys Katerenchuk Daniel Billet Jun Huan Haesun Park Boyang Albert Li 13 4 0 06 Apr 2018
Averaging Weights Leads to Wider Optima and Better Generalization Pavel Izmailov Dmitrii Podoprikhin T. Garipov Dmitry Vetrov A. Wilson FedML MoMe 32 1,616 0 14 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis Tal Ben-Nun Torsten Hoefler GNN 22 701 0 26 Feb 2018
Training wide residual networks for deployment using a single bit for each weight Mark D. McDonnell MQ 22 71 0 23 Feb 2018
AOGNets: Compositional Grammatical Architectures for Deep Learning Xilai Li Xi Song Tianfu Wu 29 25 0 15 Nov 2017
Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train V. Codreanu Damian Podareanu V. Saletore 28 54 0 12 Nov 2017
Neural Optimizer Search with Reinforcement Learning Irwan Bello Barret Zoph Vijay Vasudevan Quoc V. Le ODL 25 382 0 21 Sep 2017
On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization Fan Zhou Guojing Cong 21 232 0 03 Aug 2017
Learned Primal-dual Reconstruction J. Adler Ozan Oktem MedIm 19 747 0 20 Jul 2017
Optimization Beyond the Convolution: Generalizing Spatial Relations with End-to-End Metric Learning P. Jund Andreas Eitel N. Abdo Wolfram Burgard 3DPC 16 19 0 04 Jul 2017
FreezeOut: Accelerate Training by Progressively Freezing Layers Andrew Brock Theodore Lim J. Ritchie Nick Weston 14 123 0 15 Jun 2017
Snapshot Ensembles: Train 1, get M for free Gao Huang Yixuan Li Geoff Pleiss Zhuang Liu J. Hopcroft Kilian Q. Weinberger OOD FedML UQCV 27 935 0 01 Apr 2017
An Empirical Study of Language CNN for Image Captioning Jiuxiang Gu G. Wang Jianfei Cai Tsuhan Chen 17 132 0 21 Dec 2016
Deep Q-Networks for Accelerating the Training of Deep Neural Networks Jie Fu AI4CE 18 11 0 05 Jun 2016
The Loss Surfaces of Multilayer Networks A. Choromańska Mikael Henaff Michaël Mathieu Gerard Ben Arous Yann LeCun ODL 177 1,185 0 30 Nov 2014