On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
    ODL
arXiv: 1908.03265 | GitHub (2548★)

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown
Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines
Camilo Luciano Fosco
Emilie Josephs
A. Andonian
Allen Lee
93
4
0
01 Jul 2025
Relating Events and Frames Based on Self-Supervised Learning and Uncorrelated Conditioning for Unsupervised Domain Adaptation
Mohammad Rostami
Dayuan Jian
Ruitong Sun
81
0
0
01 Jul 2025
ITO-Master: Inference-Time Optimization for Audio Effects Modeling of Music Mastering Processors
Junghyun Koo
Marco A. Martínez-Ramírez
Wei-Hsiang Liao
Giorgio Fabbro
Michele Mancusi
Yuki Mitsufuji
20
0
0
20 Jun 2025
Rethinking Losses for Diffusion Bridge Samplers
Sebastian Sanokowski
Lukas Gruber
Christoph Bartmann
Sepp Hochreiter
Sebastian Lehner
DiffM
131
0
0
12 Jun 2025
An Adaptive Method Stabilizing Activations for Enhanced Generalization
Hyunseok Seung
Jaewoo Lee
Hyunsuk Ko
ODL
28
0
0
10 Jun 2025
Rapid training of Hamiltonian graph networks without gradient descent
Atamert Rahma
Chinmay Datar
Ana Cukarska
Felix Dietrich
AI4CE
19
0
0
06 Jun 2025
Adaptive Preconditioners Trigger Loss Spikes in Adam
Zhiwei Bai
Zhangchen Zhou
Jiajie Zhao
Xiaolong Li
Zhiyu Li
Feiyu Xiong
Hongkang Yang
Yaoyu Zhang
Z. Xu
ODL
100
0
0
05 Jun 2025
Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network
Kanami Imamura
Tomohiko Nakamura
Norihiro Takamune
Kohei Yatabe
Hiroshi Saruwatari
59
0
0
04 Jun 2025
Taming LLMs by Scaling Learning Rates with Gradient Grouping
Siyuan Li
Juanxi Tian
Zedong Wang
Xin Jin
Zicheng Liu
Wentao Zhang
Dan Xu
42
0
0
01 Jun 2025
Efficient Neural and Numerical Methods for High-Quality Online Speech Spectrogram Inversion via Gradient Theorem
Andres Fernandez
Juan Azcarreta
Cagdas Bilen
Jesus Monge Alvarez
32
0
0
30 May 2025
A Physics-Inspired Optimizer: Velocity Regularized Adam
Pranav Vaidhyanathan
Lucas Schorling
Natalia Ares
Michael A. Osborne
ODL
66
0
0
19 May 2025
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Christoph Jürgen Hemmer
Daniel Durstewitz
AI4TS, SyDa, AI4CE
297
1
0
19 May 2025
LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades
Yanan Li
Fanxu Meng
Muhan Zhang
Shiai Zhu
Shangguang Wang
Mengwei Xu
MoMe
80
0
0
17 May 2025
Fixing Incomplete Value Function Decomposition for Multi-Agent Reinforcement Learning
Andrea Baisero
Rupali Bhati
Shuo Liu
Aathira Pillai
Christopher Amato
51
0
0
15 May 2025
ICE-Pruning: An Iterative Cost-Efficient Pruning Pipeline for Deep Neural Networks
Wenhao Hu
Paul Henderson
José Cano
124
0
0
12 May 2025
TopicVD: A Topic-Based Dataset of Video-Guided Multimodal Machine Translation for Documentaries
Jinze Lv
Jian Chen
Zi Long
Xianghua Fu
Yin Chen
VGen
132
0
0
09 May 2025
MRI motion correction via efficient residual-guided denoising diffusion probabilistic models
Mojtaba Safari
Shansong Wang
Qiang Li
Zach Eidex
Richard L. J. Qiu
Chih-Wei Chang
H. Mao
Xiaofeng Yang
DiffM, MedIm
76
0
0
06 May 2025
End-to-end fully-binarized network design: from Generic Learned Thermometer to Block Pruning
Thien Nguyen
William Guicquero
MQ
63
0
0
05 May 2025
Financial Data Analysis with Robust Federated Logistic Regression
Kun Yang
Nikhil Krishnan
Sanjeev Kulkarni
FedML
101
0
0
28 Apr 2025
Fault Diagnosis in New Wind Turbines using Knowledge from Existing Turbines by Generative Domain Adaptation
S. Jonas
Angela Meyer
AI4CE
81
0
0
24 Apr 2025
crowd-hpo: Realistic Hyperparameter Optimization and Benchmarking for Learning from Crowds with Noisy Labels
M. Herde
Lukas Lührs
Denis Huseljic
Bernhard Sick
84
0
0
12 Apr 2025
Semantically Encoding Activity Labels for Context-Aware Human Activity Recognition
Wen Ge
Guanyi Mou
Emmanuel O. Agu
Kyumin Lee
66
0
0
10 Apr 2025
Benchmarking Convolutional Neural Network and Graph Neural Network based Surrogate Models on a Real-World Car External Aerodynamics Dataset
Sam Jacob Jacob
Markus Mrosek
C. Othmer
Harald Köstler
GNN
150
0
0
09 Apr 2025
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
158
0
0
03 Apr 2025
NeuralFoil: An Airfoil Aerodynamics Analysis Tool Using Physics-Informed Machine Learning
Peter Sharpe
R. John Hansman
67
4
0
20 Mar 2025
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification
Jianwei Zhao
Xin Li
Fan Yang
Qiang Zhai
Ao Luo
Yang Zhao
Hong-wei Cheng
Huazhu Fu
DiffM, MedIm
75
0
0
16 Mar 2025
FUSE: First-Order and Second-Order Unified SynthEsis in Stochastic Optimization
Zhanhong Jiang
Md Zahid Hasan
Aditya Balu
Joshua R. Waite
Genyi Huang
Soumik Sarkar
84
0
0
06 Mar 2025
Path-Adaptive Matting for Efficient Inference Under Various Computational Cost Constraints
Qinglin Liu
Zonglin Li
Xiaoqian Lv
Xin Sun
Ru Li
Shengping Zhang
78
0
0
05 Mar 2025
MRI super-resolution reconstruction using efficient diffusion probabilistic model with residual shifting
Mojtaba Safari
Shansong Wang
Zach Eidex
Qiang Li
Erik H. Middlebrooks
D. Yu
Xiaofeng Yang
MedIm
141
1
0
03 Mar 2025
Hierarchical Semantic Compression for Consistent Image Semantic Restoration
Shengxi Li
Zifu Zhang
Mai Xu
Lai Jiang
Yufan Liu
Ce Zhu
DiffM
76
0
0
24 Feb 2025
AquaNeRF: Neural Radiance Fields in Underwater Media with Distractor Removal
Luca Gough
Adrian Azzarelli
Fan Zhang
Nantheera Anantrasirichai
111
2
0
22 Feb 2025
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Shane Bergsma
Nolan Dey
Gurpreet Gosal
Gavia Gray
Daria Soboleva
Joel Hestness
109
8
0
21 Feb 2025
Carefully Blending Adversarial Training, Purification, and Aggregation Improves Adversarial Robustness
Emanuele Ballarin
A. Ansuini
Luca Bortolussi
AAML
184
0
0
20 Feb 2025
Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent
Hikaru Umeda
Hideaki Iiduka
155
2
0
17 Feb 2025
Preconditioned Inexact Stochastic ADMM for Deep Model
Shenglong Zhou
Ouya Wang
Ziyan Luo
Yongxu Zhu
Geoffrey Ye Li
86
0
0
15 Feb 2025
Amortized Safe Active Learning for Real-Time Data Acquisition: Pretrained Neural Policies from Simulated Nonparametric Functions
Cen-You Li
Marc Toussaint
Barbara Rakitsch
Christoph Zimmer
OffRL
453
0
0
26 Jan 2025
Celo: Training Versatile Learned Optimizers on a Compute Diet
A. Moudgil
Boris Knyazev
Guillaume Lajoie
Eugene Belilovsky
446
0
0
22 Jan 2025
Benchmarking YOLOv8 for Optimal Crack Detection in Civil Infrastructure
Woubishet Zewdu Taffese
Ritesh Sharma
Mohammad Hossein Afsharmovahed
Gunasekaran Manogaran
Genda Chen
98
0
0
12 Jan 2025
ReFlow6D: Refraction-Guided Transparent Object 6D Pose Estimation via Intermediate Representation Learning
Hrishikesh Gupta
S. Thalhammer
Jean-Baptiste Weibel
Alexander Haberl
Markus Vincze
106
0
0
31 Dec 2024
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints
Alberto Maté
Mariella Dimiccoli
AI4TS
90
1
0
27 Dec 2024
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
Benjamin Ellis
Matthew Jackson
Andrei Lupu
Alexander David Goldie
Mattie Fellows
Shimon Whiteson
Jakob Foerster
142
3
0
22 Dec 2024
From Histopathology Images to Cell Clouds: Learning Slide Representations with Hierarchical Cell Transformer
Zijiang Yang
Zhongwei Qiu
Tiancheng Lin
Hanqing Chao
Wanxing Chang
...
Dakai Jin
K. Yan
Le Lu
Hui Jiang
Yun Bian
DiffM
120
0
0
21 Dec 2024
Bag of Tricks for Multimodal AutoML with Image, Text, and Tabular Data
Zhiqiang Tang
Zihan Zhong
Tong He
Gerald Friedland
166
1
0
19 Dec 2024
No More Adam: Learning Rate Scaling at Initialization is All You Need
Minghao Xu
Lichuan Xiang
Xu Cai
Hongkai Wen
125
3
0
16 Dec 2024
Improving Source Extraction with Diffusion and Consistency Models
Tornike Karchkhadze
M. Izadi
Shuo Zhang
DiffM
147
1
0
09 Dec 2024
Automatic Differentiation-based Full Waveform Inversion with Flexible Workflows
Feng Liu
Haipeng Li
Guangyuan Zou
Junlun Li
140
3
0
30 Nov 2024
CAdam: Confidence-Based Optimization for Online Learning
Shaowen Wang
Anan Liu
Jian Xiao
Huan Liu
Yuekui Yang
...
Suncong Zheng
Wei-Qiang Zhang
Di Wang
Jie Jiang
Jian Li
117
0
0
29 Nov 2024
Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization
Corrado Coppola
Lorenzo Papa
Irene Amerini
L. Palagi
ODL
125
0
0
24 Nov 2024
Learning state and proposal dynamics in state-space models using differentiable particle filters and neural networks
Benjamin Cox
Santiago Segarra
Victor Elvira
154
0
0
23 Nov 2024
Physics Informed Distillation for Diffusion Models
Joshua Tian Jin Tee
Kang Zhang
Hee Suk Yoon
Dhananjaya N. Gowda
Chanwoo Kim
Chang D. Yoo
DiffM
98
6
0
13 Nov 2024