1908.03265
On the Variance of the Adaptive Learning Rate and Beyond
8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
Github (2548★)
Papers citing
"On the Variance of the Adaptive Learning Rate and Beyond"
50 / 864 papers shown
Title
Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines
Camilo Luciano Fosco
Emilie Josephs
A. Andonian
Allen Lee
93
4
0
01 Jul 2025
Relating Events and Frames Based on Self-Supervised Learning and Uncorrelated Conditioning for Unsupervised Domain Adaptation
Mohammad Rostami
Dayuan Jian
Ruitong Sun
81
0
0
01 Jul 2025
ITO-Master: Inference-Time Optimization for Audio Effects Modeling of Music Mastering Processors
Junghyun Koo
Marco A. Martínez-Ramírez
Wei-Hsiang Liao
Giorgio Fabbro
Michele Mancusi
Yuki Mitsufuji
20
0
0
20 Jun 2025
Rethinking Losses for Diffusion Bridge Samplers
Sebastian Sanokowski
Lukas Gruber
Christoph Bartmann
Sepp Hochreiter
Sebastian Lehner
DiffM
131
0
0
12 Jun 2025
An Adaptive Method Stabilizing Activations for Enhanced Generalization
Hyunseok Seung
Jaewoo Lee
Hyunsuk Ko
ODL
28
0
0
10 Jun 2025
Rapid training of Hamiltonian graph networks without gradient descent
Atamert Rahma
Chinmay Datar
Ana Cukarska
Felix Dietrich
AI4CE
19
0
0
06 Jun 2025
Adaptive Preconditioners Trigger Loss Spikes in Adam
Zhiwei Bai
Zhangchen Zhou
Jiajie Zhao
Xiaolong Li
Zhiyu Li
Feiyu Xiong
Hongkang Yang
Yaoyu Zhang
Z. Xu
ODL
100
0
0
05 Jun 2025
Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network
Kanami Imamura
Tomohiko Nakamura
Norihiro Takamune
Kohei Yatabe
Hiroshi Saruwatari
59
0
0
04 Jun 2025
Taming LLMs by Scaling Learning Rates with Gradient Grouping
Siyuan Li
Juanxi Tian
Zedong Wang
Xin Jin
Zicheng Liu
Wentao Zhang
Dan Xu
42
0
0
01 Jun 2025
Efficient Neural and Numerical Methods for High-Quality Online Speech Spectrogram Inversion via Gradient Theorem
Andres Fernandez
Juan Azcarreta
Cagdas Bilen
Jesus Monge Alvarez
32
0
0
30 May 2025
A Physics-Inspired Optimizer: Velocity Regularized Adam
Pranav Vaidhyanathan
Lucas Schorling
Natalia Ares
Michael A. Osborne
ODL
66
0
0
19 May 2025
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Christoph Jürgen Hemmer
Daniel Durstewitz
AI4TS
SyDa
AI4CE
297
1
0
19 May 2025
LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades
Yanan Li
Fanxu Meng
Muhan Zhang
Shiai Zhu
Shangguang Wang
Mengwei Xu
MoMe
80
0
0
17 May 2025
Fixing Incomplete Value Function Decomposition for Multi-Agent Reinforcement Learning
Andrea Baisero
Rupali Bhati
Shuo Liu
Aathira Pillai
Christopher Amato
51
0
0
15 May 2025
ICE-Pruning: An Iterative Cost-Efficient Pruning Pipeline for Deep Neural Networks
Wenhao Hu
Paul Henderson
José Cano
124
0
0
12 May 2025
TopicVD: A Topic-Based Dataset of Video-Guided Multimodal Machine Translation for Documentaries
Jinze Lv
Jian Chen
Zi Long
Xianghua Fu
Yin Chen
VGen
132
0
0
09 May 2025
MRI motion correction via efficient residual-guided denoising diffusion probabilistic models
Mojtaba Safari
Shansong Wang
Qiang Li
Zach Eidex
Richard L. J. Qiu
Chih-Wei Chang
H. Mao
Xiaofeng Yang
DiffM
MedIm
76
0
0
06 May 2025
End-to-end fully-binarized network design: from Generic Learned Thermometer to Block Pruning
Thien Nguyen
William Guicquero
MQ
63
0
0
05 May 2025
Financial Data Analysis with Robust Federated Logistic Regression
Kun Yang
Nikhil Krishnan
Sanjeev Kulkarni
FedML
101
0
0
28 Apr 2025
Fault Diagnosis in New Wind Turbines using Knowledge from Existing Turbines by Generative Domain Adaptation
S. Jonas
Angela Meyer
AI4CE
81
0
0
24 Apr 2025
crowd-hpo: Realistic Hyperparameter Optimization and Benchmarking for Learning from Crowds with Noisy Labels
M. Herde
Lukas Lührs
Denis Huseljic
Bernhard Sick
84
0
0
12 Apr 2025
Semantically Encoding Activity Labels for Context-Aware Human Activity Recognition
Wen Ge
Guanyi Mou
Emmanuel O. Agu
Kyumin Lee
66
0
0
10 Apr 2025
Benchmarking Convolutional Neural Network and Graph Neural Network based Surrogate Models on a Real-World Car External Aerodynamics Dataset
Sam Jacob Jacob
Markus Mrosek
C. Othmer
Harald Köstler
GNN
150
0
0
09 Apr 2025
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
158
0
0
03 Apr 2025
NeuralFoil: An Airfoil Aerodynamics Analysis Tool Using Physics-Informed Machine Learning
Peter Sharpe
R. John Hansman
67
4
0
20 Mar 2025
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification
Jianwei Zhao
Xin Li
Fan Yang
Qiang Zhai
Ao Luo
Yang Zhao
Hong-wei Cheng
Huazhu Fu
DiffM
MedIm
75
0
0
16 Mar 2025
FUSE: First-Order and Second-Order Unified SynthEsis in Stochastic Optimization
Zhanhong Jiang
Md Zahid Hasan
Aditya Balu
Joshua R. Waite
Genyi Huang
Soumik Sarkar
84
0
0
06 Mar 2025
Path-Adaptive Matting for Efficient Inference Under Various Computational Cost Constraints
Qinglin Liu
Zonglin Li
Xiaoqian Lv
Xin Sun
Ru Li
Shengping Zhang
78
0
0
05 Mar 2025
MRI super-resolution reconstruction using efficient diffusion probabilistic model with residual shifting
Mojtaba Safari
Shansong Wang
Zach Eidex
Qiang Li
Erik H. Middlebrooks
D. Yu
Xiaofeng Yang
MedIm
141
1
0
03 Mar 2025
Hierarchical Semantic Compression for Consistent Image Semantic Restoration
Shengxi Li
Zifu Zhang
Mai Xu
Lai Jiang
Yufan Liu
Ce Zhu
DiffM
76
0
0
24 Feb 2025
AquaNeRF: Neural Radiance Fields in Underwater Media with Distractor Removal
Luca Gough
Adrian Azzarelli
Fan Zhang
Nantheera Anantrasirichai
111
2
0
22 Feb 2025
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Shane Bergsma
Nolan Dey
Gurpreet Gosal
Gavia Gray
Daria Soboleva
Joel Hestness
109
8
0
21 Feb 2025
Carefully Blending Adversarial Training, Purification, and Aggregation Improves Adversarial Robustness
Emanuele Ballarin
A. Ansuini
Luca Bortolussi
AAML
184
0
0
20 Feb 2025
Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent
Hikaru Umeda
Hideaki Iiduka
155
2
0
17 Feb 2025
Preconditioned Inexact Stochastic ADMM for Deep Model
Shenglong Zhou
Ouya Wang
Ziyan Luo
Yongxu Zhu
Geoffrey Ye Li
86
0
0
15 Feb 2025
Amortized Safe Active Learning for Real-Time Data Acquisition: Pretrained Neural Policies from Simulated Nonparametric Functions
Cen-You Li
Marc Toussaint
Barbara Rakitsch
Christoph Zimmer
OffRL
453
0
0
26 Jan 2025
Celo: Training Versatile Learned Optimizers on a Compute Diet
A. Moudgil
Boris Knyazev
Guillaume Lajoie
Eugene Belilovsky
446
0
0
22 Jan 2025
Benchmarking YOLOv8 for Optimal Crack Detection in Civil Infrastructure
Woubishet Zewdu Taffese
Ritesh Sharma
Mohammad Hossein Afsharmovahed
Gunasekaran Manogaran
Genda Chen
98
0
0
12 Jan 2025
ReFlow6D: Refraction-Guided Transparent Object 6D Pose Estimation via Intermediate Representation Learning
Hrishikesh Gupta
S. Thalhammer
Jean-Baptiste Weibel
Alexander Haberl
Markus Vincze
106
0
0
31 Dec 2024
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints
Alberto Maté
Mariella Dimiccoli
AI4TS
90
1
0
27 Dec 2024
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
Benjamin Ellis
Matthew Jackson
Andrei Lupu
Alexander David Goldie
Mattie Fellows
Shimon Whiteson
Jakob Foerster
142
3
0
22 Dec 2024
From Histopathology Images to Cell Clouds: Learning Slide Representations with Hierarchical Cell Transformer
Zijiang Yang
Zhongwei Qiu
Tiancheng Lin
Hanqing Chao
Wanxing Chang
...
Dakai Jin
K. Yan
Le Lu
Hui Jiang
Yun Bian
DiffM
120
0
0
21 Dec 2024
Bag of Tricks for Multimodal AutoML with Image, Text, and Tabular Data
Zhiqiang Tang
Zihan Zhong
Tong He
Gerald Friedland
166
1
0
19 Dec 2024
No More Adam: Learning Rate Scaling at Initialization is All You Need
Minghao Xu
Lichuan Xiang
Xu Cai
Hongkai Wen
125
3
0
16 Dec 2024
Improving Source Extraction with Diffusion and Consistency Models
Tornike Karchkhadze
M. Izadi
Shuo Zhang
DiffM
147
1
0
09 Dec 2024
Automatic Differentiation-based Full Waveform Inversion with Flexible Workflows
Feng Liu
Haipeng Li
Guangyuan Zou
Junlun Li
140
3
0
30 Nov 2024
CAdam: Confidence-Based Optimization for Online Learning
Shaowen Wang
Anan Liu
Jian Xiao
Huan Liu
Yuekui Yang
...
Suncong Zheng
Wei-Qiang Zhang
Di Wang
Jie Jiang
Jian Li
117
0
0
29 Nov 2024
Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization
Corrado Coppola
Lorenzo Papa
Irene Amerini
L. Palagi
ODL
125
0
0
24 Nov 2024
Learning state and proposal dynamics in state-space models using differentiable particle filters and neural networks
Benjamin Cox
Santiago Segarra
Victor Elvira
154
0
0
23 Nov 2024
Physics Informed Distillation for Diffusion Models
Joshua Tian Jin Tee
Kang Zhang
Hee Suk Yoon
Dhananjaya N. Gowda
Chanwoo Kim
Chang D. Yoo
DiffM
98
6
0
13 Nov 2024