1711.05101
Cited By
Decoupled Weight Decay Regularization
14 November 2017
I. Loshchilov
Frank Hutter
OffRL
Papers citing "Decoupled Weight Decay Regularization"
20 / 320 papers shown
Title
Machine-Learning-Based Diagnostics of EEG Pathology
Lukas A. W. Gemein
R. Schirrmeister
P. Chrabaszcz
Daniel Wilson
Joschka Boedecker
A. Schulze-Bonhage
Frank Hutter
T. Ball
19
153
0
11 Feb 2020
Faster On-Device Training Using New Federated Momentum Algorithm
Zhouyuan Huo
Qian Yang
Bin Gu
Heng-Chiao Huang
FedML
14
47
0
06 Feb 2020
Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
Qi Chen
Lin Sun
Zhixin Wang
K. Jia
Alan Yuille
3DPC
161
169
0
30 Dec 2019
An Adaptive and Momental Bound Method for Stochastic Learning
Jianbang Ding
Xuancheng Ren
Ruixuan Luo
Xu Sun
ODL
9
46
0
27 Oct 2019
Demon: Improved Neural Network Training with Momentum Decay
John Chen
Cameron R. Wolfe
Zhaoqi Li
Anastasios Kyrillidis
ODL
13
15
0
11 Oct 2019
Meta-Learning Deep Energy-Based Memory Models
Sergey Bartunov
Jack W. Rae
Simon Osindero
Timothy Lillicrap
26
34
0
07 Oct 2019
A Deep Learning Based Attack for The Chaos-based Image Encryption
Chen He
Kan Ming
Yongwei Wang
Z. J. Wang
AAML
11
16
0
29 Jul 2019
Lookahead Optimizer: k steps forward, 1 step back
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
25
717
0
19 Jul 2019
Fetal Pose Estimation in Volumetric MRI using a 3D Convolution Neural Network
Junshen Xu
Molin Zhang
Esra Abaci Turk
Larry Zhang
P. E. Grant
K. Ying
Polina Golland
E. Adalsteinsson
3DH
8
31
0
10 Jul 2019
S3: A Spectral-Spatial Structure Loss for Pan-Sharpening Networks
Jae-Seok Choi
Yongwoo Kim
Munchurl Kim
6
15
0
13 Jun 2019
Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization
K. Helwegen
James Widdicombe
Lukas Geiger
Zechun Liu
K. Cheng
Roeland Nusselder
MQ
24
110
0
05 Jun 2019
Stochastic Gradients for Large-Scale Tensor Decomposition
T. Kolda
David Hong
28
55
0
04 Jun 2019
Learning Raw Image Denoising with Bayer Pattern Unification and Bayer Preserving Augmentation
Jiaming Liu
Chihao Wu
Yuzhi Wang
Qin Xu
Yuqian Zhou
...
Chuan Wang
Shaofan Cai
Yifan Ding
Haoqiang Fan
Jue Wang
23
67
0
29 Apr 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
28
978
0
01 Apr 2019
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
76
129
0
16 Oct 2018
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen
Dongruo Zhou
Yiqi Tang
Ziyan Yang
Yuan Cao
Quanquan Gu
ODL
19
192
0
18 Jun 2018
Do Better ImageNet Models Transfer Better?
Simon Kornblith
Jonathon Shlens
Quoc V. Le
OOD
MLT
52
1,308
0
23 May 2018
SFace: An Efficient Network for Face Detection in Large Scale Variations
Jianfeng Wang
Ye Yuan
Boxun Li
Gang Yu
Sun Jian
CVBM
8
22
0
18 Apr 2018
signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein
Yu-Xiang Wang
Kamyar Azizzadenesheli
Anima Anandkumar
FedML
ODL
35
1,018
0
13 Feb 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
281
2,888
0
15 Sep 2016