arXiv: 1908.03265, v4 (latest)

On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
    ODL
ArXiv (abs) | PDF | HTML | GitHub (2548★)

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown
Weight Prediction Boosts the Convergence of AdamW
Lei Guan
106
19
0
01 Feb 2023
Deep networks for system identification: a Survey
G. Pillonetto
Aleksandr Aravkin
Daniel Gedon
L. Ljung
Antônio H. Ribeiro
Thomas B. Schon
OOD
105
45
0
30 Jan 2023
On Enhancing Expressive Power via Compositions of Single Fixed-Size ReLU Network
Shijun Zhang
Jianfeng Lu
Hongkai Zhao
CoGe
108
4
0
29 Jan 2023
What Decreases Editing Capability? Domain-Specific Hybrid Refinement for Improved GAN Inversion
Pu Cao
Lu Yang
Dongxu Liu
Zhiwei Liu
Shan Li
Q. Song
112
7
0
28 Jan 2023
Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models
Matthew Muckley
Alaaeldin El-Nouby
Karen Ullrich
Hervé Jégou
Jakob Verbeek
121
55
0
26 Jan 2023
FewShotTextGCN: K-hop neighborhood regularization for few-shot learning on graphs
Niels van der Heijden
Ekaterina Shutova
H. Yannakoudakis
87
0
0
25 Jan 2023
Read the Signs: Towards Invariance to Gradient Descent's Hyperparameter Initialization
Davood Wadi
M. Fredette
S. Sénécal
ODL AI4CE
37
0
0
24 Jan 2023
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan-George Pasca
Alexey Gavryushin
Muhammad Hamza
Yen-Ling Kuo
Kaichun Mo
Luc Van Gool
Otmar Hilliges
Xi Wang
169
14
0
22 Jan 2023
Multi-fidelity surrogate modeling for temperature field prediction using deep convolution neural network
Yunyang Zhang
Zhiqiang Gong
Weien Zhou
Xiaoyu Zhao
Xiaohu Zheng
Wen Yao
AI4CE
56
25
0
17 Jan 2023
Padding Module: Learning the Padding in Deep Neural Networks
Fahad Alrasheedi
Agnibh Dasgupta
Pei-Chi Huang
KELM VLM
35
17
0
11 Jan 2023
Maximizing Use-Case Specificity through Precision Model Tuning
Pranjal Awasthi
David Recio-Mitter
Yosuke Kyle Sugi
LM&MA
17
1
0
29 Dec 2022
Cramming: Training a Language Model on a Single GPU in One Day
Jonas Geiping
Tom Goldstein
MoE
117
91
0
28 Dec 2022
From Xception to NEXcepTion: New Design Decisions and Neural Architecture Search
Hadar Shavit
Filip Jatelnicki
Pol Mor-Puigventós
W. Kowalczyk
45
2
0
16 Dec 2022
Convolution-enhanced Evolving Attention Networks
Yujing Wang
Yaming Yang
Zhuowan Li
Jiangang Bai
Mingliang Zhang
Xiangtai Li
Jiahao Yu
Ce Zhang
Gao Huang
Yu Tong
ViT
102
6
0
16 Dec 2022
Integrating Multimodal Data for Joint Generative Modeling of Complex Dynamics
Manuela Brenner
Florian Hess
G. Koppe
Daniel Durstewitz
293
11
0
15 Dec 2022
Cross-Domain Transfer via Semantic Skill Imitation
Karl Pertsch
Ruta Desai
Vikash Kumar
Franziska Meier
Joseph J. Lim
Dhruv Batra
Akshara Rai
LM&Ro
69
19
0
14 Dec 2022
Improving Depression estimation from facial videos with face alignment, training optimization and scheduling
Manuel Lage Cañellas
Constantino Álvarez Casado
L. Nguyen
Miguel Bordallo López
CVBM
49
3
0
13 Dec 2022
Punctuation Restoration for Singaporean Spoken Languages: English, Malay, and Mandarin
Abhinav Rao
Ho Thi-Nga
Chng Eng Siong
72
3
0
10 Dec 2022
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
54
17
0
06 Dec 2022
Semantic Role Labeling Meets Definition Modeling: Using Natural Language to Describe Predicate-Argument Structures
Simone Conia
Edoardo Barba
Alessandro Sciré
Roberto Navigli
76
7
0
02 Dec 2022
The Vanishing Decision Boundary Complexity and the Strong First Component
Hengshuai Yao
UQCV
66
0
0
25 Nov 2022
Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo
Joseph Bethge
Christoph Meinel
Haojin Yang
MQ
112
20
0
23 Nov 2022
β-Multivariational Autoencoder for Entangled Representation Learning in Video Frames
F. Nouri
R. Bergevin
58
0
0
22 Nov 2022
Uncertainty-aware Vision-based Metric Cross-view Geolocalization
F. Fervers
Sebastian Bullinger
C. Bodensteiner
Michael Arens
Rainer Stiefelhagen
63
41
0
22 Nov 2022
GAN Inversion for Image Editing via Unsupervised Domain Adaptation
Siyu Xing
Chen Gong
Hewei Guo
Xiaoyi Zhang
Xinwen Hou
Yu Liu
110
6
0
22 Nov 2022
Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint
Hongyu Liu
Yibing Song
Qifeng Chen
DiffM
96
21
0
21 Nov 2022
Novel transfer learning schemes based on Siamese networks and synthetic data
Dominik Stallmann
Philip Kenneweg
Barbara Hammer
54
6
0
21 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
138
60
0
17 Nov 2022
Empirical Study on Optimizer Selection for Out-of-Distribution Generalization
Hiroki Naganuma
Kartik Ahuja
S. Takagi
Tetsuya Motokawa
Rio Yokota
Kohta Ishikawa
I. Sato
Ioannis Mitliagkas
OOD
93
7
0
15 Nov 2022
Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization
Yiyang Chen
Zhedong Zheng
Wei Ji
Leigang Qu
Tat-Seng Chua
149
45
0
14 Nov 2022
ALBERT with Knowledge Graph Encoder Utilizing Semantic Similarity for Commonsense Question Answering
Byeongmin Choi
Yong-Sook Lee
Yeunwoong Kyung
Eunchan Kim
55
10
0
14 Nov 2022
Predicting Companies' ESG Ratings from News Articles Using Multivariate Timeseries Analysis
Tanja Aue
Adam Jatowt
Michael Färber
10
11
0
13 Nov 2022
Online Phase Reconstruction via DNN-based Phase Differences Estimation
Yoshiki Masuyama
Kohei Yatabe
Kento Nagatomo
Yasuhiro Oikawa
3DV
79
8
0
12 Nov 2022
ABCAS: Adaptive Bound Control of spectral norm as Automatic Stabilizer
Shota Hirose
Shiori Maki
Naoki Wada
Heming Sun
J. Katto
45
0
0
12 Nov 2022
Learning to Follow Instructions in Text-Based Games
Mathieu Tuli
Andrew C. Li
Pashootan Vaezipoor
Toryn Q. Klassen
Scott Sanner
Sheila A. McIlraith
79
13
0
08 Nov 2022
Can neural networks extrapolate? Discussion of a theorem by Pedro Domingos
Adrien Courtois
Jean-Michel Morel
Pablo Arias
43
6
0
07 Nov 2022
Circling Back to Recurrent Models of Language
Gábor Melis
89
0
0
03 Nov 2022
SIMD-size aware weight regularization for fast neural vocoding on CPU
Hiroki Kanagawa
Yusuke Ijima
115
0
0
02 Nov 2022
Neural Fourier Shift for Binaural Speech Rendering
Jinkyu Lee
Kyogu Lee
80
8
0
02 Nov 2022
Radically Lower Data-Labeling Costs for Visually Rich Document Extraction Models
Yichao Zhou
James Bradley Wendt
Navneet Potti
Jing Xie
Sandeep Tata
VLM
57
1
0
28 Oct 2022
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis
Yuma Shirahata
Ryuichi Yamamoto
Eunwoo Song
Ryo Terashima
Jae-Min Kim
Kentaro Tachibana
86
11
0
28 Oct 2022
Visual Semantic Parsing: From Images to Abstract Meaning Representation
M. A. Abdelsalam
Zhan Shi
Federico Fancellu
Kalliopi Basioti
Dhaivat Bhatt
Vladimir Pavlovic
Afsaneh Fazly
GNN
85
4
0
26 Oct 2022
Dual-Pixel Raindrop Removal
Yizhou Li
Yusuke Monno
Masatoshi Okutomi
82
6
0
24 Oct 2022
On the optimization and pruning for Bayesian deep learning
X. Ke
Yanan Fan
BDL UQCV
79
1
0
24 Oct 2022
How to Sift Out a Clean Data Subset in the Presence of Data Poisoning?
Yi Zeng
Minzhou Pan
Himanshu Jahagirdar
Ming Jin
Lingjuan Lyu
R. Jia
AAML
85
21
0
12 Oct 2022
AdaNorm: Adaptive Gradient Norm Correction based Optimizer for CNNs
S. Dubey
S. Singh
B. B. Chaudhuri
ODL
60
8
0
12 Oct 2022
Efficient Bayesian Updates for Deep Learning via Laplace Approximations
Denis Huseljic
M. Herde
Lukas Rauch
Paul Hahn
Zhixin Huang
D. Kottke
S. Vogt
Bernhard Sick
BDL
69
0
0
12 Oct 2022
ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters
Heyuan Yao
Zhenhua Song
Bin Chen
Libin Liu
DRL VGen
78
42
0
12 Oct 2022
Towards Theoretically Inspired Neural Initialization Optimization
Yibo Yang
Hong Wang
Haobo Yuan
Zhouchen Lin
73
11
0
12 Oct 2022
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding
Shangbin Feng
Zhaoxuan Tan
Wenqian Zhang
Zhenyu Lei
Yulia Tsvetkov
KELM VLM
106
10
0
08 Oct 2022