arXiv: 1711.05101

Decoupled Weight Decay Regularization

14 November 2017

Ilya Loshchilov

Frank Hutter

Papers citing "Decoupled Weight Decay Regularization"

16 / 1,216 papers shown

M2U-Net: Effective and Efficient Retinal Vessel Segmentation for Resource-Constrained Environments

153

37

0

19 Nov 2018

Minimum weight norm models do not always generalize well for over-parameterized problems

Anastasios Kyrillidis

Sujay Sanghavi

320

21

0

16 Nov 2018

Three Mechanisms of Weight Decay Regularization

Roger C. Grosse

204

275

0

29 Oct 2018

How to train your MAML

Antreas Antoniou

Harrison Edwards

Amos Storkey

324

854

0

22 Oct 2018

Quasi-hyperbolic momentum and Adam for deep learning

362

145

0

16 Oct 2018

Dynamic Compositionality in Recursive Neural Networks with Structure-aware Tag Representations

Daniel Edmiston

176

24

0

07 Sep 2018

Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods

Neural Information Processing Systems (NeurIPS), 2018

Laurence Aitchison

385

21

0

19 Jul 2018

Improving on Q & A Recurrent Neural Networks Using Noun-Tagging

40

0

0

12 Jul 2018

Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks

Quanquan Gu

357

208

0

18 Jun 2018

Do Better ImageNet Models Transfer Better?

Simon Kornblith

Jonathon Shlens

473

1,446

0

23 May 2018

High throughput quantitative metallography for complex microstructures using deep learning: A case study in ultrahigh carbon steel

Brian L. DeCost

Elizabeth A. Holm

143

158

0

04 May 2018

Intracranial Error Detection via Deep Learning

R. Schirrmeister

A. Schulze-Bonhage

Wolfram Burgard

150

10

0

04 May 2018

SFace: An Efficient Network for Face Detection in Large Scale Variations

219

23

0

18 Apr 2018

signSGD: Compressed Optimisation for Non-Convex Problems

Jeremy Bernstein

Kamyar Azizzadenesheli

Anima Anandkumar

547

1,183

0

13 Feb 2018

Improving Generalization Performance by Switching from Adam to SGD

245

566

0

20 Dec 2017

Normalized Direction-preserving Adam

186

30

0

13 Sep 2017
