1711.05101
Decoupled Weight Decay Regularization
14 November 2017
I. Loshchilov
Frank Hutter
OffRL
ArXiv (abs)
PDF
HTML
Github (275★)
Papers citing "Decoupled Weight Decay Regularization"
50 / 1,216 papers shown
Machine-Learning-Based Diagnostics of EEG Pathology
NeuroImage, 2020
Lukas A. W. Gemein
R. Schirrmeister
P. Chrabaszcz
Daniel Wilson
Joschka Boedecker
A. Schulze-Bonhage
Frank Hutter
T. Ball
194
184
0
11 Feb 2020
Faster On-Device Training Using New Federated Momentum Algorithm
Zhouyuan Huo
Qian Yang
Bin Gu
Heng-Chiao Huang
FedML
310
54
0
06 Feb 2020
AdvectiveNet: An Eulerian-Lagrangian Fluidic reservoir for Point Cloud Processing
International Conference on Learning Representations (ICLR), 2020
Xingzhe He
Helen Lu Cao
Bo Zhu
3DPC
133
10
0
01 Feb 2020
Generation-Distillation for Efficient Natural Language Understanding in Low-Data Settings
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Luke Melas-Kyriazi
George Han
Celine Liang
103
13
0
25 Jan 2020
Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
European Conference on Computer Vision (ECCV), 2019
Qi Chen
Lin Sun
Zhixin Wang
Kui Jia
Alan Yuille
3DPC
412
196
0
30 Dec 2019
End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models
John Giorgi
Xindi Wang
Nicola Sahar
W. Shin
Gary D. Bader
Bo Wang
164
40
0
20 Dec 2019
Analysis of Video Feature Learning in Two-Stream CNNs on the Example of Zebrafish Swim Bout Classification
International Conference on Learning Representations (ICLR), 2019
Bennet Breier
A. Onken
81
4
0
20 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2019
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM
ObjD
314
499
0
05 Dec 2019
Learning scale-variant features for robust iris authentication with deep learning based ensemble framework
Siming Zheng
R. Rahmat
F. Khalid
Nurul Amelina Nasharuddin
125
3
0
02 Dec 2019
Granular Motor State Monitoring of Free Living Parkinson's Disease Patients via Deep Learning
K. Yuksel
Jann Goschenhofer
H. V. Varma
U. Fietzek
Franz MJ Pfister
OOD
104
0
0
15 Nov 2019
Understanding the Disharmony between Weight Normalization Family and Weight Decay: $\epsilon$-shifted $L_2$ Regularizer
Li Xiang
Chen Shuo
Xia Yan
Yang Jian
122
3
0
14 Nov 2019
E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT
Findings, 2019
Nina Poerner
Ulli Waltinger
Hinrich Schütze
433
163
0
09 Nov 2019
An Adaptive and Momental Bound Method for Stochastic Learning
Jianbang Ding
Xuancheng Ren
Ruixuan Luo
Xu Sun
ODL
95
52
0
27 Oct 2019
Emergent Properties of Finetuned Language Representation Models
Alexandre Matton
Luke de Oliveira
SSL
94
2
0
23 Oct 2019
Finding New Diagnostic Information for Detecting Glaucoma using Neural Networks
Erfan Noury
Suria S. Mannil
R. Chang
A. Ran
C. Cheung
...
M. Riyazuddin
Dolly Chang
Sriharsha Nagaraj
Clement C. Tham
R. Zadeh
137
4
0
14 Oct 2019
On Empirical Comparisons of Optimizers for Deep Learning
Dami Choi
Christopher J. Shallue
Zachary Nado
Jaehoon Lee
Chris J. Maddison
George E. Dahl
446
289
0
11 Oct 2019
Demon: Improved Neural Network Training with Momentum Decay
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
John Chen
Cameron R. Wolfe
Zhaoqi Li
Anastasios Kyrillidis
ODL
294
20
0
11 Oct 2019
Meta-Learning Deep Energy-Based Memory Models
International Conference on Learning Representations (ICLR), 2019
Sergey Bartunov
Jack W. Rae
Simon Osindero
Timothy Lillicrap
314
35
0
07 Oct 2019
Ouroboros: On Accelerating Training of Transformer-Based Language Models
Neural Information Processing Systems (NeurIPS), 2019
Qian Yang
Zhouyuan Huo
Wenlin Wang
Heng-Chiao Huang
Lawrence Carin
138
9
0
14 Sep 2019
Extracting and Learning a Dependency-Enhanced Type Lexicon for Dutch
Konstantinos Kogkalidis
132
0
0
06 Sep 2019
Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection
Benjin Zhu
Zhengkai Jiang
Xiangxin Zhou
Zeming Li
Gang Yu
3DPC
510
551
0
26 Aug 2019
On the Variance of the Adaptive Learning Rate and Beyond
International Conference on Learning Representations (ICLR), 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
683
2,124
0
08 Aug 2019
A Deep Learning Based Attack for The Chaos-based Image Encryption
Chen He
Kan Ming
Yongwei Wang
Z. J. Wang
AAML
95
16
0
29 Jul 2019
signADAM: Learning Confidences for Deep Neural Networks
Dong Wang
Yicheng Liu
Wenwo Tang
Fanhua Shang
Hongying Liu
Qigong Sun
Licheng Jiao
ODL
FedML
92
1
0
21 Jul 2019
Lookahead Optimizer: k steps forward, 1 step back
Neural Information Processing Systems (NeurIPS), 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
474
809
0
19 Jul 2019
Fetal Pose Estimation in Volumetric MRI using a 3D Convolution Neural Network
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2019
Junshen Xu
Molin Zhang
Esra Abaci Turk
Larry Zhang
P. E. Grant
K. Ying
Polina Golland
E. Adalsteinsson
3DH
115
36
0
10 Jul 2019
Seismic data denoising and deblending using deep learning
Alan Richardson
C. Feller
70
27
0
02 Jul 2019
A Survey of Optimization Methods from a Machine Learning Perspective
IEEE Transactions on Cybernetics (IEEE Trans. Cybern.), 2019
Shiliang Sun
Zehui Cao
Han Zhu
Jing Zhao
222
625
0
17 Jun 2019
Modeling the Dynamics of PDE Systems with Physics-Constrained Deep Auto-Regressive Networks
Journal of Computational Physics (JCP), 2019
N. Geneva
N. Zabaras
AI4CE
357
310
0
13 Jun 2019
Training Neural Networks for and by Interpolation
International Conference on Machine Learning (ICML), 2019
Leonard Berrada
Andrew Zisserman
M. P. Kumar
3DH
206
69
0
13 Jun 2019
S3: A Spectral-Spatial Structure Loss for Pan-Sharpening Networks
IEEE Geoscience and Remote Sensing Letters (GRSL), 2019
Jae-Seok Choi
Yongwoo Kim
Munchurl Kim
152
16
0
13 Jun 2019
Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization
Neural Information Processing Systems (NeurIPS), 2019
K. Helwegen
James Widdicombe
Lukas Geiger
Zechun Liu
K. Cheng
Roeland Nusselder
MQ
241
119
0
05 Jun 2019
Stochastic Gradients for Large-Scale Tensor Decomposition
SIAM Journal on Mathematics of Data Science (SIMODS), 2019
T. Kolda
David Hong
326
65
0
04 Jun 2019
Implicit Filter Sparsification In Convolutional Neural Networks
Dushyant Mehta
K. Kim
Christian Theobalt
116
2
0
13 May 2019
MixMatch: A Holistic Approach to Semi-Supervised Learning
Neural Information Processing Systems (NeurIPS), 2019
David Berthelot
Nicholas Carlini
Ian Goodfellow
Nicolas Papernot
Avital Oliver
Colin Raffel
521
3,365
0
06 May 2019
Learning Raw Image Denoising with Bayer Pattern Unification and Bayer Preserving Augmentation
Jiaming Liu
Chihao Wu
Yuzhi Wang
Qin Xu
Yuqian Zhou
...
Chuan Wang
Shaofan Cai
Yifan Ding
Haoqiang Fan
Jue Wang
140
74
0
29 Apr 2019
Jasper: An End-to-End Convolutional Neural Acoustic Model
Jason Chun Lok Li
Vitaly Lavrukhin
Boris Ginsburg
Ryan Leary
Oleksii Kuchaiev
Jonathan M. Cohen
Huyen Nguyen
R. Gadde
DRL
VLM
AuLLM
252
277
0
05 Apr 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
870
1,110
0
01 Apr 2019
To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks
Matthew E. Peters
Sebastian Ruder
Noah A. Smith
384
465
0
14 Mar 2019
Connection Sensitive Attention U-NET for Accurate Retinal Vessel Segmentation
Ruirui Li
Mingming Li
Jiacheng Li
Yating Zhou
144
46
0
13 Mar 2019
DeepOBS: A Deep Learning Optimizer Benchmark Suite
Frank Schneider
Lukas Balles
Philipp Hennig
ODL
355
75
0
13 Mar 2019
CIA-Net: Robust Nuclei Instance Segmentation with Contour-aware Information Aggregation
Yanning Zhou
O. F. Onder
Qi Dou
E. Tsougenis
Hao Chen
Pheng-Ann Heng
145
221
0
13 Mar 2019
Dense Classification and Implanting for Few-Shot Learning
Yann Lifchitz
Yannis Avrithis
Sylvaine Picard
Andrei Bursuc
VLM
180
202
0
12 Mar 2019
ParticleNet: Jet Tagging via Particle Clouds
H. Qu
L. Gouskos
3DPC
MU
317
280
0
22 Feb 2019
Combining learning rate decay and weight decay with complexity gradient descent - Part I
Pierre Harvey Richemond
Wenhan Luo
102
4
0
07 Feb 2019
ICLR Reproducibility Challenge Report (Padam : Closing The Generalization Gap Of Adaptive Gradient Methods in Training Deep Neural Networks)
Harshal Mittal
Kartikey Pandey
Yash Kant
ODL
53
5
0
28 Jan 2019
Multi-style Generative Reading Comprehension
Kyosuke Nishida
Itsumi Saito
Kosuke Nishida
Kazutoshi Shinoda
Atsushi Otsuka
Hisako Asano
J. Tomita
244
71
0
08 Jan 2019
Deep Speech Enhancement for Reverberated and Noisy Signals using Wide Residual Networks
D. González
Jorge Llombart
A. Miguel
Luis Vicente
164
16
0
03 Jan 2019
Adam Induces Implicit Weight Sparsity in Rectifier Neural Networks
A. Yaguchi
Taiji Suzuki
Wataru Asano
Shuhei Nitta
Y. Sakata
A. Tanizawa
90
19
0
19 Dec 2018
On Implicit Filter Level Sparsity in Convolutional Neural Networks
Dushyant Mehta
K. Kim
Christian Theobalt
179
28
0
29 Nov 2018