1908.03265
Cited By
On the Variance of the Adaptive Learning Rate and Beyond
8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
Github (2548★)
Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"
50 / 864 papers shown
ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks
Ziji Shi
Jialin Li
Yang You
54
1
0
06 Nov 2024
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
Atli Kosson
Bettina Messmer
Martin Jaggi
AI4CE
71
5
0
31 Oct 2024
Consistency Diffusion Bridge Models
Guande He
Kaiwen Zheng
Jianfei Chen
Fan Bao
Jun-Jie Zhu
DiffM
124
5
0
30 Oct 2024
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
Luca Jiang-Tao Yu
Running Zhao
Sijie Ji
Edith C.H. Ngai
Chenshu Wu
52
0
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
482
0
0
29 Oct 2024
Contrastive Learning with Auxiliary User Detection for Identifying Activities
Wen Ge
Guanyi Mou
Emmanuel O. Agu
Kyumin Lee
AAML
65
0
0
21 Oct 2024
Fully Explicit Dynamic Gaussian Splatting
Junoh Lee
Chang-Yeon Won
Hyunjun Jung
Inhwan Bae
Hae-Gon Jeon
3DGS
108
12
0
21 Oct 2024
Generalized Probabilistic Attention Mechanism in Transformers
DongNyeong Heo
Heeyoul Choi
98
1
0
21 Oct 2024
Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in Dynamical Systems Reconstruction
Manuel Brenner
Christoph Jürgen Hemmer
Zahra Monfared
Daniel Durstewitz
AI4CE
80
4
0
18 Oct 2024
A Mirror Descent Perspective of Smoothed Sign Descent
Shuyang Wang
Diego Klabjan
75
1
0
18 Oct 2024
Heterogeneous Graph Generation: A Hierarchical Approach using Node Feature Pooling
Hritaban Ghosh
Chen Changyu
Arunesh Sinha
Shamik Sural
62
0
0
15 Oct 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
141
7
0
14 Oct 2024
Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics
Junyi Cao
Shanyan Guan
Yanhao Ge
Wei Li
Xiaokang Yang
Chao Ma
AI4CE
87
8
0
10 Oct 2024
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning
Siyuan Li
Juanxi Tian
Zedong Wang
Luyuan Zhang
Zicheng Liu
Weiyang Jin
Yang Liu
Baigui Sun
Stan Z. Li
95
0
0
08 Oct 2024
ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering
Francesco Maria Molfese
Simone Conia
Riccardo Orlando
Roberto Navigli
ReLM
LRM
RALM
65
3
0
07 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao
Francesco A. Aprile
Barbara Bravi
77
0
0
07 Oct 2024
Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
Manuel Brenner
Elias Weber
G. Koppe
Daniel Durstewitz
AI4TS
AI4CE
118
8
0
07 Oct 2024
Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
Grant Wardle
Teo Susnjak
LRM
108
6
0
04 Oct 2024
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
Xi Chen
Kaituo Feng
Changsheng Li
Xunhao Lai
Xiangyu Yue
Ye Yuan
Guoren Wang
94
15
0
02 Oct 2024
Question-guided Knowledge Graph Re-scoring and Injection for Knowledge Graph Question Answering
Yu Zhang
Kehai Chen
Xuefeng Bai
Zhao Kang
Quanjiang Guo
Min Zhang
120
12
0
02 Oct 2024
Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images
Bahri Batuhan Bilecen
Ahmet Berke Gokmen
Aysegül Dündar
84
2
0
30 Sep 2024
Simple and Fast Distillation of Diffusion Models
Zhenyu Zhou
Defang Chen
Can Wang
Chun Chen
Siwei Lyu
DiffM
75
8
0
29 Sep 2024
Deep Heterogeneous Contrastive Hyper-Graph Learning for In-the-Wild Context-Aware Human Activity Recognition
Wen Ge
Guanyi Mou
Emmanuel O. Agu
Kyumin Lee
HAI
BDL
62
6
0
27 Sep 2024
Tackling fluffy clouds: robust field boundary delineation across global agricultural landscapes with Sentinel-1 and Sentinel-2 Time Series
F. Diakogiannis
Zheng-Shu Zhou
Jeff Wang
Gonzalo Mata
Dave Henry
...
Jonathan Richetti
Kathryn Batchelor
Chris Herrmann
Andrew Toovey
John Taylor
77
0
0
20 Sep 2024
Convergence of Sharpness-Aware Minimization Algorithms using Increasing Batch Size and Decaying Learning Rate
Hinata Harada
Hideaki Iiduka
59
1
0
16 Sep 2024
Stable Language Model Pre-training by Reducing Embedding Variability
Woojin Chung
Jiwoo Hong
Na Min An
James Thorne
Se-Young Yun
78
3
0
12 Sep 2024
Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization
Gentiana Rashiti
G. Karunaratne
Mrinmaya Sachan
Abu Sebastian
Abbas Rahimi
RALM
230
0
0
12 Sep 2024
Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm
Jinwei Zhao
Marco Gori
Alessandro Betti
S. Melacci
Hongtao Zhang
Jiedong Liu
Xinhong Hei
86
0
0
10 Sep 2024
Reducing Bias in Deep Learning Optimization: The RSGDM Approach
Honglin Qin
Hongye Zheng
Bingxing Wang
Zhizhong Wu
Bingyao Liu
Yuanfang Yang
57
8
0
05 Sep 2024
Does Data-Efficient Generalization Exacerbate Bias in Foundation Models?
Dilermando Queiroz
Anderson Carlos
Maíra Fatoretto
Luis Filipe Nakayama
André Anjos
Lilian Berton
138
0
0
28 Aug 2024
Bidirectional Awareness Induction in Autoregressive Seq2Seq Models
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
BDL
95
0
0
25 Aug 2024
Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!
Stefano Perrella
Lorenzo Proietti
Alessandro Sciré
Edoardo Barba
Roberto Navigli
85
4
0
25 Aug 2024
cc-DRL: a Convex Combined Deep Reinforcement Learning Flight Control Design for a Morphing Quadrotor
Tao Yang
Huai-Ning Wu
Jun-Wei Wang
152
0
0
23 Aug 2024
Scaling Law with Learning Rate Annealing
Howe Tissue
Venus Wang
Lu Wang
108
9
0
20 Aug 2024
NeRF-US: Removing Ultrasound Imaging Artifacts from Neural Radiance Fields in the Wild
Rishit Dagli
Atsuhiro Hibi
R. Krishnan
Pascal N Tyrrell
96
6
0
13 Aug 2024
Fast-and-Frugal Text-Graph Transformers are Effective Link Predictors
Andrei Catalin Coman
Christos Theodoropoulos
Marie-Francine Moens
James Henderson
88
0
0
13 Aug 2024
Music2Latent: Consistency Autoencoders for Latent Audio Compression
Marco Pasini
Stefan Lattner
George Fazekas
70
8
0
12 Aug 2024
Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages
Carlos Mullov
Ngoc-Quan Pham
Alexander Waibel
97
1
0
05 Aug 2024
DeMansia: Mamba Never Forgets Any Tokens
Ricky Fang
Mamba
48
0
0
04 Aug 2024
Deep Learning Framework for History Matching CO2 Storage with 4D Seismic and Monitoring Well Data
Ekta U. Samani
A. Banerjee
103
0
0
02 Aug 2024
What comes after transformers? -- A selective survey connecting ideas in deep learning
Johannes Schneider
AI4CE
112
2
0
01 Aug 2024
ReLiK: Retrieve and LinK, Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget
Adam Gould
Pere-Lluis Huguet-Cabot
S. Dadhania
Francesca Toni
180
9
0
31 Jul 2024
dopanim: A Dataset of Doppelganger Animals with Noisy Annotations from Multiple Humans
M. Herde
Denis Huseljic
Lukas Rauch
Bernhard Sick
85
1
0
30 Jul 2024
No learning rates needed: Introducing SALSA -- Stable Armijo Line Search Adaptation
Philip Kenneweg
Tristan Kenneweg
Fabian Fumagalli
Barbara Hammer
ODL
59
0
0
30 Jul 2024
Robust VAEs via Generating Process of Noise Augmented Data
Hiroo Irobe
Wataru Aoki
Kimihiro Yamazaki
Yuhui Zhang
Takumi Nakagawa
Hiroki Waida
Yuichiro Wada
Takafumi Kanamori
AAML
56
0
0
26 Jul 2024
Amortized Active Learning for Nonparametric Functions
Cen-You Li
Marc Toussaint
Barbara Rakitsch
Christoph Zimmer
53
0
0
25 Jul 2024
Lymphoid Infiltration Assessment of the Tumor Margins in H&E Slides
Zhuxian Guo
Amine Marzouki
Jean-François Emile
Henning Muller
Camille Kurtz
Nicolas Loménie
33
0
0
23 Jul 2024
Monocular pose estimation of articulated surgical instruments in open surgery
Robert Spektor
Tom Friedman
Itay Or
Gil Bolotin
S. Laufer
93
0
0
16 Jul 2024
Finding Meaning in Points: Weakly Supervised Semantic Segmentation for Event Cameras
Hoonhee Cho
Sung-Hoon Yoon
H. Kweon
Kuk-Jin Yoon
77
2
0
15 Jul 2024
Sensorimotor Attention and Language-based Regressions in Shared Latent Variables for Integrating Robot Motion Learning and LLM
Kanata Suzuki
Tetsuya Ogata
82
2
0
12 Jul 2024