1711.05101
Cited By
v1
v2
v3 (latest)
Decoupled Weight Decay Regularization
14 November 2017
I. Loshchilov
Frank Hutter
OffRL
Github (275★)
Papers citing
"Decoupled Weight Decay Regularization"
16 / 1,216 papers shown
M2U-Net: Effective and Efficient Retinal Vessel Segmentation for Resource-Constrained Environments
Tim Laibacher
Tillman Weyde
Sepehr Jalali
153
37
0
19 Nov 2018
Minimum weight norm models do not always generalize well for over-parameterized problems
Vatsal Shah
Anastasios Kyrillidis
Sujay Sanghavi
320
21
0
16 Nov 2018
Three Mechanisms of Weight Decay Regularization
Guodong Zhang
Chaoqi Wang
Bowen Xu
Roger C. Grosse
204
275
0
29 Oct 2018
How to train your MAML
Antreas Antoniou
Harrison Edwards
Amos Storkey
324
854
0
22 Oct 2018
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
362
145
0
16 Oct 2018
Dynamic Compositionality in Recursive Neural Networks with Structure-aware Tag Representations
Taeuk Kim
Jihun Choi
Daniel Edmiston
Sanghwan Bae
Sang-goo Lee
176
24
0
07 Sep 2018
Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods
Neural Information Processing Systems (NeurIPS), 2018
Laurence Aitchison
ODL
385
21
0
19 Jul 2018
Improving on Q & A Recurrent Neural Networks Using Noun-Tagging
E. Partridge
J. Sklar
Omar El-lakany
40
0
0
12 Jul 2018
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen
Dongruo Zhou
Yiqi Tang
Ziyan Yang
Yuan Cao
Quanquan Gu
ODL
357
208
0
18 Jun 2018
Do Better ImageNet Models Transfer Better?
Simon Kornblith
Jonathon Shlens
Quoc V. Le
OOD
MLT
473
1,446
0
23 May 2018
High throughput quantitative metallography for complex microstructures using deep learning: A case study in ultrahigh carbon steel
Brian L. DeCost
Bo Lei
T. Francis
Elizabeth A. Holm
AI4CE
143
158
0
04 May 2018
Intracranial Error Detection via Deep Learning
M. Völker
Jiří Hammer
R. Schirrmeister
Joos Behncke
L. Fiederer
A. Schulze-Bonhage
Petr Marusič
Wolfram Burgard
T. Ball
150
10
0
04 May 2018
SFace: An Efficient Network for Face Detection in Large Scale Variations
Jianfeng Wang
Ye Yuan
Boxun Li
Gang Yu
Jian Sun
CVBM
219
23
0
18 Apr 2018
signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein
Yu-Xiang Wang
Kamyar Azizzadenesheli
Anima Anandkumar
FedML
ODL
547
1,183
0
13 Feb 2018
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
245
566
0
20 Dec 2017
Normalized Direction-preserving Adam
Zijun Zhang
Lin Ma
Zongpeng Li
Chuan Wu
ODL
186
30
0
13 Sep 2017