Decoupled Weight Decay Regularization

14 November 2017
I. Loshchilov
Frank Hutter
    OffRL
ArXiv (abs) · PDF · HTML · GitHub (275★)

Papers citing "Decoupled Weight Decay Regularization"

16 / 1,216 papers shown
M2U-Net: Effective and Efficient Retinal Vessel Segmentation for Resource-Constrained Environments
Tim Laibacher
Tillman Weyde
Sepehr Jalali
153
37
0
19 Nov 2018
Minimum weight norm models do not always generalize well for over-parameterized problems
Vatsal Shah
Anastasios Kyrillidis
Sujay Sanghavi
320
21
0
16 Nov 2018
Three Mechanisms of Weight Decay Regularization
Guodong Zhang
Chaoqi Wang
Bowen Xu
Roger C. Grosse
204
275
0
29 Oct 2018
How to train your MAML
Antreas Antoniou
Harrison Edwards
Amos Storkey
324
854
0
22 Oct 2018
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
362
145
0
16 Oct 2018
Dynamic Compositionality in Recursive Neural Networks with Structure-aware Tag Representations
Taeuk Kim
Jihun Choi
Daniel Edmiston
Sanghwan Bae
Sang-goo Lee
176
24
0
07 Sep 2018
Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods
Neural Information Processing Systems (NeurIPS), 2018
Laurence Aitchison
ODL
385
21
0
19 Jul 2018
Improving on Q & A Recurrent Neural Networks Using Noun-Tagging
E. Partridge
J. Sklar
Omar El-lakany
40
0
0
12 Jul 2018
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen
Dongruo Zhou
Yiqi Tang
Ziyan Yang
Yuan Cao
Quanquan Gu
ODL
357
208
0
18 Jun 2018
Do Better ImageNet Models Transfer Better?
Simon Kornblith
Jonathon Shlens
Quoc V. Le
OOD MLT
473
1,446
0
23 May 2018
High throughput quantitative metallography for complex microstructures using deep learning: A case study in ultrahigh carbon steel
Brian L. DeCost
Bo Lei
T. Francis
Elizabeth A. Holm
AI4CE
143
158
0
04 May 2018
Intracranial Error Detection via Deep Learning
M. Völker
Jiří Hammer
R. Schirrmeister
Joos Behncke
L. Fiederer
A. Schulze-Bonhage
Petr Marusič
Wolfram Burgard
T. Ball
150
10
0
04 May 2018
SFace: An Efficient Network for Face Detection in Large Scale Variations
Jianfeng Wang
Ye Yuan
Boxun Li
Gang Yu
Sun Jian
CVBM
219
23
0
18 Apr 2018
signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein
Yu Wang
Kamyar Azizzadenesheli
Anima Anandkumar
FedML ODL
547
1,183
0
13 Feb 2018
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
245
566
0
20 Dec 2017
Normalized Direction-preserving Adam
Zijun Zhang
Lin Ma
Zongpeng Li
Chuan Wu
ODL
186
30
0
13 Sep 2017