On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
    ODL

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown
Title
Extrapolatable Relational Reasoning With Comparators in Low-Dimensional Manifolds
Duo Wang
M. Jamnik
Pietro Lio
OOD
29
1
0
15 Jun 2020
The Limit of the Batch Size
Yang You
Yuhui Wang
Huan Zhang
Zhao-jie Zhang
J. Demmel
Cho-Jui Hsieh
121
15
0
15 Jun 2020
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights
Byeongho Heo
Sanghyuk Chun
Seong Joon Oh
Dongyoon Han
Sangdoo Yun
Gyuwan Kim
Youngjung Uh
Jung-Woo Ha
ODL
375
27
0
15 Jun 2020
BI-MAML: Balanced Incremental Approach for Meta Learning
Yang Zheng
Jinlin Xiang
Kun Su
Eli Shlizerman
CLL
48
10
0
12 Jun 2020
Getting a CLUE: A Method for Explaining Uncertainty Estimates
Javier Antorán
Umang Bhatt
T. Adel
Adrian Weller
José Miguel Hernández-Lobato
UQCV, BDL
110
116
0
11 Jun 2020
Adaptive Gradient Methods Converge Faster with Over-Parameterization (but you should do a line-search)
Sharan Vaswani
I. Laradji
Frederik Kunstner
S. Meng
Mark Schmidt
Simon Lacoste-Julien
142
27
0
11 Jun 2020
AdaS: Adaptive Scheduling of Stochastic Gradients
Mahdi S. Hosseini
Konstantinos N. Plataniotis
ODL
80
12
0
11 Jun 2020
Deep learning of contagion dynamics on complex networks
Charles Murphy
Edward Laurence
Antoine Allard
GNN, AI4CE
22
70
0
09 Jun 2020
Detecting structural perturbations from time series with deep learning
Edward Laurence
Charles Murphy
G. St‐Onge
Xavier Roy-Pomerleau
Vincent Thibeault
37
3
0
09 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
187
363
0
08 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
189
2,770
0
05 Jun 2020
sEMG Gesture Recognition with a Simple Model of Attention
David Josephs
Carson Drake
Andrew Heroy
John Santerre
64
48
0
05 Jun 2020
Novel Object Viewpoint Estimation through Reconstruction Alignment
Mohamed El Banani
Jason J. Corso
David Fouhey
122
14
0
05 Jun 2020
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
ODL
152
287
0
01 Jun 2020
Universal Lesion Detection by Learning from Multiple Heterogeneously Labeled Datasets
K. Yan
Jinzheng Cai
Adam P. Harrison
D. Jin
Jing Xiao
Le Lu
63
18
0
28 May 2020
Detecting Scatteredly-Distributed, Small, and Critically Important Objects in 3D Oncology Imaging via Decision Stratification
Zhuotun Zhu
K. Yan
D. Jin
Jinzheng Cai
T. Ho
...
Chun-Hung Chao
X. Ye
Jing Xiao
Alan Yuille
Le Lu
63
10
0
27 May 2020
Few-shot Compositional Font Generation with Dual Memory
Junbum Cha
Sanghyuk Chun
Gayoung Lee
Bado Lee
Seonghyeon Kim
Hwalsuk Lee
70
80
0
21 May 2020
ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition
Jing Pan
Joshua Shapiro
Jeremy Wohlwend
Kyu Jeong Han
Tao Lei
T. Ma
72
22
0
21 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
T. Okamoto
Hisashi Kawai
Tomoki Toda
73
4
0
18 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment
D. Lim
Won Jang
Gyeonghwan O
Heayoung Park
Bongwan Kim
Jaesam Yoon
68
37
0
15 May 2020
Neural Networks Versus Conventional Filters for Inertial-Sensor-based Attitude Estimation
Daniel Weber
C. Gühmann
Thomas Seel
37
35
0
14 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
70
32
0
12 May 2020
Benchmark Tests of Convolutional Neural Network and Graph Convolutional Network on HorovodRunner Enabled Spark Clusters
Jing Pan
Wendao Liu
Jing Zhou
GNN, BDL
14
2
0
12 May 2020
2kenize: Tying Subword Sequences for Chinese Script Conversion
Pranav A
Isabelle Augenstein
66
1
0
07 May 2020
Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering
Yanlin Feng
Xinyue Chen
Bill Yuchen Lin
Peifeng Wang
Jun Yan
Xiang Ren
LRM, KELM
79
246
0
01 May 2020
BlackBox: Generalizable Reconstruction of Extremal Values from Incomplete Spatio-Temporal Data
T. Ivek
Domagoj Vlah
66
4
0
30 Apr 2020
How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization
P. D'Oro
Wojciech Jaśkowski
OffRL
92
27
0
29 Apr 2020
Learning Neural-Symbolic Descriptive Planning Models via Cube-Space Priors: The Voyage Home (to STRIPS)
Masataro Asai
Christian Muise
64
8
0
27 Apr 2020
Five Points to Check when Comparing Visual Perception in Humans and Machines
Christina M. Funke
Judy Borowski
Karolina Stosio
Wieland Brendel
Thomas S. A. Wallis
Matthias Bethge
71
33
0
20 Apr 2020
Adversarial Training for Large Neural Language Models
Xiaodong Liu
Hao Cheng
Pengcheng He
Weizhu Chen
Yu Wang
Hoifung Poon
Jianfeng Gao
AAML
94
186
0
20 Apr 2020
Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search
Dazhou Guo
D. Jin
Zhuotun Zhu
T. Ho
Adam P. Harrison
Chun-Hung Chao
Jing Xiao
Alan Yuille
Chien-Yu Lin
Le Lu
85
59
0
17 Apr 2020
Understanding the Difficulty of Training Transformers
Liyuan Liu
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
Jiawei Han
AI4CE
85
258
0
17 Apr 2020
A Cross-Stitch Architecture for Joint Registration and Segmentation in Adaptive Radiotherapy
Laurens Beljaards
Mohamed S. Elmahdy
F. Verbeek
Marius Staring
3DV
60
15
0
17 Apr 2020
Show Us the Way: Learning to Manage Dialog from Demonstrations
Gabriel Gordon-Hall
P. Gorinski
Gerasimos Lampouras
Ignacio Iacobacci
OffRL
109
11
0
17 Apr 2020
An Adaptive Intelligence Algorithm for Undersampled Knee MRI Reconstruction
Nicola Pezzotti
Sahar Yousefi
Mohamed S. Elmahdy
J. V. Gemert
C. Schulke
...
Sergey Kastryulin
B. Lelieveldt
M. Osch
E. Weerdt
Marius Staring
70
100
0
15 Apr 2020
Self6D: Self-Supervised Monocular 6D Object Pose Estimation
Gu Wang
Fabian Manhardt
Jianzhun Shao
Xiangyang Ji
Nassir Navab
Federico Tombari
SSL, MDE
100
136
0
14 Apr 2020
Interview: A Large-Scale Open-Source Corpus of Media Dialog
Bodhisattwa Prasad Majumder
Shuyang Li
Jianmo Ni
Julian McAuley
AuLLM
58
4
0
07 Apr 2020
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences
Andis Draguns
Emīls Ozoliņš
A. Sostaks
Matiss Apinis
Kārlis Freivalds
56
8
0
06 Apr 2020
Applying Cyclical Learning Rate to Neural Machine Translation
Choon Meng Lee
Jianfeng Liu
Wei Peng
ODL
27
2
0
06 Apr 2020
Towards Lifelong Self-Supervision For Unpaired Image-to-Image Translation
Victor Schmidt
Makesh Narsimhan Sreedhar
M. Elaraby
Irina Rish
SSL, CLL
45
2
0
31 Mar 2020
Nonconvex sparse regularization for deep neural networks and its optimality
Ilsang Ohn
Yongdai Kim
71
19
0
26 Mar 2020
iTAML: An Incremental Task-Agnostic Meta-learning Approach
Jathushan Rajasegaran
Salman Khan
Munawar Hayat
Fahad Shahbaz Khan
M. Shah
CLL, OOD
146
157
0
25 Mar 2020
Multi-Plateau Ensemble for Endoscopic Artefact Segmentation and Detection
Suyog Jadhav
Udbhav Bamba
Arnav Chavan
Rishabh Tiwari
A. Raj
38
3
0
23 Mar 2020
End-to-End Deep Diagnosis of X-ray Images
Kudaibergen Urinbayev
Yerassyl Orazbek
Yernur Nurambek
A. Mirzakhmetov
H. A. Varol
MedIm
30
12
0
19 Mar 2020
Getting to 99% Accuracy in Interactive Segmentation
Marco Forte
Brian L. Price
Scott D. Cohen
N. Xu
François Pitié
155
39
0
17 Mar 2020
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang
Yukun Zhu
Bradley Green
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
3DPC
136
674
0
17 Mar 2020
$F$, $B$, Alpha Matting
Marco Forte
François Pitié
106
85
0
17 Mar 2020
PLOP: Probabilistic poLynomial Objects trajectory Planning for autonomous driving
Thibault Buhet
É. Wirbel
Andrei Bursuc
Xavier Perrotton
76
33
0
09 Mar 2020
Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule
Nikhil Iyer
V. Thejas
Nipun Kwatra
Ramachandran Ramjee
Muthian Sivathanu
89
29
0
09 Mar 2020
Colored Noise Injection for Training Adversarially Robust Neural Networks
Evgenii Zheltonozhskii
Chaim Baskin
Yaniv Nemcovsky
Brian Chmiel
A. Mendelson
A. Bronstein
AAML
32
5
0
04 Mar 2020