On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
    ODL
ArXiv (abs) | PDF | HTML | GitHub (2548★)

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang
Minshuo Chen
T. Zhao
Molei Tao
AI4CE
142
42
0
07 Oct 2021
GNN is a Counter? Revisiting GNN for Question Answering
Kuan-Chieh Wang
Yuyu Zhang
Diyi Yang
Le Song
Tao Qin
LMTD
74
31
0
07 Oct 2021
A Hybrid Spatial-temporal Deep Learning Architecture for Lane Detection
Yongqi Dong
S. Patil
B. Arem
Haneen Farah
111
41
0
05 Oct 2021
Multilingual AMR Parsing with Noisy Knowledge Distillation
Deng Cai
Xin Li
Jackie Chun-Sing Ho
Lidong Bing
W. Lam
56
18
0
30 Sep 2021
Generative Probabilistic Image Colorization
Chie Furusawa
S. Kitaoka
Michael Li
Yuri Odagiri
DiffM
215
4
0
29 Sep 2021
AdaInject: Injection Based Adaptive Gradient Descent Optimizers for Convolutional Neural Networks
S. Dubey
S. H. Shabbeer Basha
S. Singh
B. B. Chaudhuri
ODL
110
9
0
26 Sep 2021
Long-Range Feature Propagating for Natural Image Matting
Qinglin Liu
Haozhe Xie
Shengping Zhang
Bineng Zhong
Rongrong Ji
115
34
0
25 Sep 2021
Inequality Constrained Stochastic Nonlinear Optimization via Active-Set Sequential Quadratic Programming
Sen Na
M. Anitescu
Mladen Kolar
77
35
0
23 Sep 2021
Commonsense Knowledge in Word Associations and ConceptNet
Chunhua Liu
Trevor Cohn
Lea Frermann
87
8
0
20 Sep 2021
Towards Joint Intent Detection and Slot Filling via Higher-order Attention
Dongsheng Chen
Zhiqi Huang
Xian Wu
Shen Ge
Yuexian Zou
75
22
0
18 Sep 2021
TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions and U-GRUs for skeletal pedestrian crossing prediction
Joseph Gesnouin
Steve Pechberti
B. Stanciulescu
Fabien Moutarde
92
24
0
02 Sep 2021
SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models
Yogesh Kumar
Alexander Ilin
H. Salo
S. Kulathinal
M. Leinonen
Pekka Marttinen
AI4TS, MedIm
32
0
0
31 Aug 2021
Iterative Filter Adaptive Network for Single Image Defocus Deblurring
Junyong Lee
Hyeongseok Son
Jaesung Rim
Sunghyun Cho
Seungyong Lee
98
125
0
31 Aug 2021
Rethinking Deep Image Prior for Denoising
Yeonsik Jo
S. Chun
Jonghyun Choi
AI4CE
77
56
0
29 Aug 2021
RPR-Net: A Point Cloud-based Rotation-aware Large Scale Place Recognition Network
Zhaoxin Fan
Zhenbo Song
Wenping Zhang
Hongyan Liu
Jun He
Xiaoyong Du
3DPC
149
8
0
29 Aug 2021
HAN: Higher-order Attention Network for Spoken Language Understanding
Dongsheng Chen
Zhiqi Huang
Yuexian Zou
47
1
0
26 Aug 2021
Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation
Jiaqi Gu
Hanqing Zhu
Chenghao Feng
Mingjie Liu
Zixuan Jiang
Ray T. Chen
David Z. Pan
44
4
0
25 Aug 2021
MimicBot: Combining Imitation and Reinforcement Learning to win in Bot Bowl
Nicola Pezzotti
60
1
0
21 Aug 2021
SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation
Yan Di
Fabian Manhardt
Gu Wang
Xiangyang Ji
Nassir Navab
Federico Tombari
58
135
0
18 Aug 2021
Logit Attenuating Weight Normalization
Aman Gupta
R. Ramanath
Jun Shi
Anika Ramachandran
Sirou Zhou
Mingzhou Zhou
S. Keerthi
75
1
0
12 Aug 2021
Gates Are Not What You Need in RNNs
Ronalds Zakovskis
Andis Draguns
Eliza Gaile
Emīls Ozoliņš
Kārlis Freivalds
48
1
0
01 Aug 2021
Transformer-based deep imitation learning for dual-arm robot manipulation
Heecheol Kim
Yoshiyuki Ohmura
Yasuo Kuniyoshi
150
52
0
01 Aug 2021
Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels
Jizong Peng
Ping Wang
Christian Desrosiers
M. Pedersoli
SSL
89
65
0
29 Jul 2021
Demonstration-Guided Reinforcement Learning with Learned Skills
Karl Pertsch
Youngwoon Lee
Yue Wu
Joseph J. Lim
OffRL
67
85
0
21 Jul 2021
3D fluorescence microscopy data synthesis for segmentation and benchmarking
Dennis Eschweiler
Malte Rethwisch
Mareike Jarchow
Simon Koppers
Johannes Stegmaier
3DV, MedIm
81
16
0
21 Jul 2021
A New Adaptive Gradient Method with Gradient Decomposition
Zhou Shao
Tong Lin
ODL
39
0
0
18 Jul 2021
TGIF: Tree-Graph Integrated-Format Parser for Enhanced UD with Two-Stage Generic- to Individual-Language Finetuning
Tianze Shi
Lillian Lee
64
7
0
14 Jul 2021
1st Place Solution for ICDAR 2021 Competition on Mathematical Formula Detection
Yuxiang Zhong
Xianbiao Qi
Shanjun Li
Dengyi Gu
Yihao Chen
Peiyang Ning
Rong Xiao
27
6
0
12 Jul 2021
Delta Sampling R-BERT for limited data and low-light action recognition
Sanchit Hira
Ritwik Das
Abhinav Modi
D. Pakhomov
109
17
0
12 Jul 2021
BEV-MODNet: Monocular Camera based Bird's Eye View Moving Object Detection for Autonomous Driving
Hazem Rashed
Mariam Essam
M. Mohamed
Ahmad El-Sallab
S. Yogamani
50
12
0
11 Jul 2021
REX: Revisiting Budgeted Training with an Improved Schedule
John Chen
Cameron R. Wolfe
Anastasios Kyrillidis
59
9
0
09 Jul 2021
KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan
Sepehr Sameni
L. Cerkezi
Givi Meishvili
Adam Bielski
Paolo Favaro
ODL
172
3
0
07 Jul 2021
Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons
Zuowei Shen
Haizhao Yang
Shijun Zhang
183
38
0
06 Jul 2021
Morphological Classification of Galaxies in S-PLUS using an Ensemble of Convolutional Networks
N. M. Cardoso
G. B. O. Schwarz
L. O. Dias
C. R. Bom
L. Sodré
C. Mendes de Oliveira
49
0
0
05 Jul 2021
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation
Ryo Masumura
Daiki Okamura
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
57
7
0
04 Jul 2021
AdaL: Adaptive Gradient Transformation Contributes to Convergences and Generalizations
Hongwei Zhang
Weidong Zou
Hongbo Zhao
Qi Ming
Tijin Yan
Yuanqing Xia
Weipeng Cao
ODL
22
0
0
04 Jul 2021
Classical Planning in Deep Latent Space
Masataro Asai
Hiroshi Kajino
A. Fukunaga
Christian Muise
VLM
82
19
0
30 Jun 2021
An Integrated Framework for Two-pass Personalized Voice Trigger
Dexin Liao
Jing Li
Yiming Zhi
Song Li
Q. Hong
Lin Li
57
1
0
30 Jun 2021
N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement
Gyeong-Hoon Lee
Tae-Woo Kim
Hanbin Bae
Min-Ji Lee
Young-Ik Kim
Hoon-Young Cho
VLM
79
20
0
29 Jun 2021
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
Ashok Cutkosky
Harsh Mehta
83
62
0
28 Jun 2021
Ranger21: a synergistic deep learning optimizer
Less Wright
Nestor Demeure
ODL, AI4CE
104
88
0
25 Jun 2021
Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens
Mana Ihori
Naoki Makishima
Tomohiro Tanaka
Akihiko Takashima
Shota Orihashi
Ryo Masumura
26
3
0
23 Jun 2021
Probabilistic Attention for Interactive Segmentation
Prasad Gabbur
Manjot Bilkhu
J. Movellan
103
13
0
23 Jun 2021
Rethinking Adam: A Twofold Exponential Moving Average Approach
Yizhou Wang
Yue Kang
Can Qin
Huan Wang
Yi Xu
Yulun Zhang
Y. Fu
ODL
70
7
0
22 Jun 2021
It's FLAN time! Summing feature-wise latent representations for interpretability
An-phi Nguyen
María Rodríguez Martínez
FAtt
32
0
0
18 Jun 2021
Multi-head or Single-head? An Empirical Comparison for Transformer Training
Liyuan Liu
Jialu Liu
Jiawei Han
71
33
0
17 Jun 2021
Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
Haoxiang Wang
Han Zhao
Yue Liu
112
90
0
16 Jun 2021
SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients
Feihu Huang
Junyi Li
Heng-Chiao Huang
ODL
115
42
0
15 Jun 2021
A Clinically Inspired Approach for Melanoma classification
Prathyusha Akundi
So Gun
J. Sivaswamy
13
0
0
15 Jun 2021
BoolNet: Minimizing The Energy Consumption of Binary Neural Networks
Nianhui Guo
Joseph Bethge
Haojin Yang
Kai Zhong
Xuefei Ning
Christoph Meinel
Yu Wang
MQ
63
11
0
13 Jun 2021