1908.03265
Cited By
v4 (latest)
On the Variance of the Adaptive Learning Rate and Beyond
8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
ArXiv (abs)
PDF
HTML
Github (2548★)
Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"
50 / 864 papers shown
Title
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang
Minshuo Chen
T. Zhao
Molei Tao
AI4CE
142
42
0
07 Oct 2021
GNN is a Counter? Revisiting GNN for Question Answering
Kuan-Chieh Wang
Yuyu Zhang
Diyi Yang
Le Song
Tao Qin
LMTD
74
31
0
07 Oct 2021
A Hybrid Spatial-temporal Deep Learning Architecture for Lane Detection
Yongqi Dong
S. Patil
B. Arem
Haneen Farah
111
41
0
05 Oct 2021
Multilingual AMR Parsing with Noisy Knowledge Distillation
Deng Cai
Xin Li
Jackie Chun-Sing Ho
Lidong Bing
W. Lam
56
18
0
30 Sep 2021
Generative Probabilistic Image Colorization
Chie Furusawa
S. Kitaoka
Michael Li
Yuri Odagiri
DiffM
215
4
0
29 Sep 2021
AdaInject: Injection Based Adaptive Gradient Descent Optimizers for Convolutional Neural Networks
S. Dubey
S. H. Shabbeer Basha
S. Singh
B. B. Chaudhuri
ODL
110
9
0
26 Sep 2021
Long-Range Feature Propagating for Natural Image Matting
Qinglin Liu
Haozhe Xie
Shengping Zhang
Bineng Zhong
Rongrong Ji
115
34
0
25 Sep 2021
Inequality Constrained Stochastic Nonlinear Optimization via Active-Set Sequential Quadratic Programming
Sen Na
M. Anitescu
Mladen Kolar
77
35
0
23 Sep 2021
Commonsense Knowledge in Word Associations and ConceptNet
Chunhua Liu
Trevor Cohn
Lea Frermann
87
8
0
20 Sep 2021
Towards Joint Intent Detection and Slot Filling via Higher-order Attention
Dongsheng Chen
Zhiqi Huang
Xian Wu
Shen Ge
Yuexian Zou
75
22
0
18 Sep 2021
TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions and U-GRUs for skeletal pedestrian crossing prediction
Joseph Gesnouin
Steve Pechberti
B. Stanciulescu
Fabien Moutarde
92
24
0
02 Sep 2021
SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models
Yogesh Kumar
Alexander Ilin
H. Salo
S. Kulathinal
M. Leinonen
Pekka Marttinen
AI4TS
MedIm
32
0
0
31 Aug 2021
Iterative Filter Adaptive Network for Single Image Defocus Deblurring
Junyong Lee
Hyeongseok Son
Jaesung Rim
Sunghyun Cho
Seungyong Lee
98
125
0
31 Aug 2021
Rethinking Deep Image Prior for Denoising
Yeonsik Jo
S. Chun
Jonghyun Choi
AI4CE
77
56
0
29 Aug 2021
RPR-Net: A Point Cloud-based Rotation-aware Large Scale Place Recognition Network
Zhaoxin Fan
Zhenbo Song
Wenping Zhang
Hongyan Liu
Jun He
Xiaoyong Du
3DPC
149
8
0
29 Aug 2021
HAN: Higher-order Attention Network for Spoken Language Understanding
Dongsheng Chen
Zhiqi Huang
Yuexian Zou
47
1
0
26 Aug 2021
Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation
Jiaqi Gu
Hanqing Zhu
Chenghao Feng
Mingjie Liu
Zixuan Jiang
Ray T. Chen
David Z. Pan
44
4
0
25 Aug 2021
MimicBot: Combining Imitation and Reinforcement Learning to win in Bot Bowl
Nicola Pezzotti
60
1
0
21 Aug 2021
SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation
Yan Di
Fabian Manhardt
Gu Wang
Xiangyang Ji
Nassir Navab
Federico Tombari
58
135
0
18 Aug 2021
Logit Attenuating Weight Normalization
Aman Gupta
R. Ramanath
Jun Shi
Anika Ramachandran
Sirou Zhou
Mingzhou Zhou
S. Keerthi
75
1
0
12 Aug 2021
Gates Are Not What You Need in RNNs
Ronalds Zakovskis
Andis Draguns
Eliza Gaile
Emīls Ozoliņš
Kārlis Freivalds
48
1
0
01 Aug 2021
Transformer-based deep imitation learning for dual-arm robot manipulation
Heecheol Kim
Yoshiyuki Ohmura
Yasuo Kuniyoshi
150
52
0
01 Aug 2021
Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels
Jizong Peng
Ping Wang
Christian Desrosiers
M. Pedersoli
SSL
89
65
0
29 Jul 2021
Demonstration-Guided Reinforcement Learning with Learned Skills
Karl Pertsch
Youngwoon Lee
Yue Wu
Joseph J. Lim
OffRL
67
85
0
21 Jul 2021
3D fluorescence microscopy data synthesis for segmentation and benchmarking
Dennis Eschweiler
Malte Rethwisch
Mareike Jarchow
Simon Koppers
Johannes Stegmaier
3DV
MedIm
81
16
0
21 Jul 2021
A New Adaptive Gradient Method with Gradient Decomposition
Zhou Shao
Tong Lin
ODL
39
0
0
18 Jul 2021
TGIF: Tree-Graph Integrated-Format Parser for Enhanced UD with Two-Stage Generic- to Individual-Language Finetuning
Tianze Shi
Lillian Lee
64
7
0
14 Jul 2021
1st Place Solution for ICDAR 2021 Competition on Mathematical Formula Detection
Yuxiang Zhong
Xianbiao Qi
Shanjun Li
Dengyi Gu
Yihao Chen
Peiyang Ning
Rong Xiao
27
6
0
12 Jul 2021
Delta Sampling R-BERT for limited data and low-light action recognition
Sanchit Hira
Ritwik Das
Abhinav Modi
D. Pakhomov
109
17
0
12 Jul 2021
BEV-MODNet: Monocular Camera based Bird's Eye View Moving Object Detection for Autonomous Driving
Hazem Rashed
Mariam Essam
M. Mohamed
Ahmad El-Sallab
S. Yogamani
50
12
0
11 Jul 2021
REX: Revisiting Budgeted Training with an Improved Schedule
John Chen
Cameron R. Wolfe
Anastasios Kyrillidis
59
9
0
09 Jul 2021
KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan
Sepehr Sameni
L. Cerkezi
Givi Meishvili
Adam Bielski
Paolo Favaro
ODL
172
3
0
07 Jul 2021
Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons
Zuowei Shen
Haizhao Yang
Shijun Zhang
183
38
0
06 Jul 2021
Morphological Classification of Galaxies in S-PLUS using an Ensemble of Convolutional Networks
N. M. Cardoso
G. B. O. Schwarz
L. O. Dias
C. R. Bom
L. Sodré
C. Mendes de Oliveira
49
0
0
05 Jul 2021
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation
Ryo Masumura
Daiki Okamura
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
57
7
0
04 Jul 2021
AdaL: Adaptive Gradient Transformation Contributes to Convergences and Generalizations
Hongwei Zhang
Weidong Zou
Hongbo Zhao
Qi Ming
Tijin Yan
Yuanqing Xia
Weipeng Cao
ODL
22
0
0
04 Jul 2021
Classical Planning in Deep Latent Space
Masataro Asai
Hiroshi Kajino
A. Fukunaga
Christian Muise
VLM
82
19
0
30 Jun 2021
An Integrated Framework for Two-pass Personalized Voice Trigger
Dexin Liao
Jing Li
Yiming Zhi
Song Li
Q. Hong
Lin Li
57
1
0
30 Jun 2021
N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement
Gyeong-Hoon Lee
Tae-Woo Kim
Hanbin Bae
Min-Ji Lee
Young-Ik Kim
Hoon-Young Cho
VLM
79
20
0
29 Jun 2021
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
Ashok Cutkosky
Harsh Mehta
83
62
0
28 Jun 2021
Ranger21: a synergistic deep learning optimizer
Less Wright
Nestor Demeure
ODL
AI4CE
104
88
0
25 Jun 2021
Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens
Mana Ihori
Naoki Makishima
Tomohiro Tanaka
Akihiko Takashima
Shota Orihashi
Ryo Masumura
26
3
0
23 Jun 2021
Probabilistic Attention for Interactive Segmentation
Prasad Gabbur
Manjot Bilkhu
J. Movellan
103
13
0
23 Jun 2021
Rethinking Adam: A Twofold Exponential Moving Average Approach
Yizhou Wang
Yue Kang
Can Qin
Huan Wang
Yi Xu
Yulun Zhang
Y. Fu
ODL
70
7
0
22 Jun 2021
It's FLAN time! Summing feature-wise latent representations for interpretability
An-phi Nguyen
María Rodríguez Martínez
FAtt
32
0
0
18 Jun 2021
Multi-head or Single-head? An Empirical Comparison for Transformer Training
Liyuan Liu
Jialu Liu
Jiawei Han
71
33
0
17 Jun 2021
Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
Haoxiang Wang
Han Zhao
Yue Liu
112
90
0
16 Jun 2021
SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients
Feihu Huang
Junyi Li
Heng-Chiao Huang
ODL
115
42
0
15 Jun 2021
A Clinically Inspired Approach for Melanoma classification
Prathyusha Akundi
So Gun
J. Sivaswamy
13
0
0
15 Jun 2021
BoolNet: Minimizing The Energy Consumption of Binary Neural Networks
Nianhui Guo
Joseph Bethge
Haojin Yang
Kai Zhong
Xuefei Ning
Christoph Meinel
Yu Wang
MQ
63
11
0
13 Jun 2021