1808.02941
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
8 August 2018
Xiangyi Chen
Sijia Liu
Tian Ding
Mingyi Hong
Papers citing
"On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization"
50 / 209 papers shown
HVAdam: A Full-Dimension Adaptive Optimizer
AAAI Conference on Artificial Intelligence (AAAI), 2025
Yiheng Zhang
Shaowu Wu
Yuanzhuo Xu
Jiajun Wu
Shang Xu
Steve Drew
Xiaoguang Niu
120
0
0
25 Nov 2025
Stragglers Can Contribute More: Uncertainty-Aware Distillation for Asynchronous Federated Learning
Yujia Wang
Fenglong Ma
Jinghui Chen
FedML
197
0
0
25 Nov 2025
Exploring Landscapes for Better Minima along Valleys
Tong Zhao
Jiacheng Li
Yuanchang Zhou
Guangming Tan
Weile Jia
58
0
0
31 Oct 2025
Muon Outperforms Adam in Tail-End Associative Memory Learning
Shuche Wang
Fengzhuo Zhang
Jiaxiang Li
Cunxiao Du
C. Du
Tianyu Pang
Zhuoran Yang
Mingyi Hong
Vincent Y. F. Tan
64
2
0
30 Sep 2025
End-to-End Deep Learning for Predicting Metric Space-Valued Outputs
Yidong Zhou
Su I Iao
Hans-Georg Müller
60
1
0
28 Sep 2025
On the Convergence of Muon and Beyond
Da Chang
Yongxiang Liu
Ganzhao Yuan
257
2
0
19 Sep 2025
Globally aware optimization with resurgence
Wei Bu
36
0
0
01 Sep 2025
Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization
Wei Liu
Anweshit Panda
Ujwal Pandey
Christopher Brissette
Yikang Shen
George M. Slota
Naigang Wang
Jie Chen
Yangyang Xu
65
0
0
07 Aug 2025
Neighbor-Sampling Based Momentum Stochastic Methods for Training Graph Neural Networks
Molly Noel
Gabriel Mancino-Ball
Yangyang Xu
GNN
117
0
0
01 Aug 2025
An Adaptive Method Stabilizing Activations for Enhanced Generalization
Hyunseok Seung
Jaewoo Lee
Hyunsuk Ko
ODL
230
0
0
10 Jun 2025
Adaptive Preconditioners Trigger Loss Spikes in Adam
Zhiwei Bai
Zhangchen Zhou
Jiajie Zhao
Xiaolong Li
Zhiyu Li
Feiyu Xiong
Hongkang Yang
Yaoyu Zhang
Z. Xu
ODL
239
0
0
05 Jun 2025
Unified Scaling Laws for Compressed Representations
Andrei Panferov
Alexandra Volkova
Ionut-Vlad Modoranu
Vage Egiazarian
M. Safaryan
Dan Alistarh
145
1
0
02 Jun 2025
LightSAM: Parameter-Agnostic Sharpness-Aware Minimization
Yifei Cheng
Li Shen
Hao Sun
Nan Yin
Xiaochun Cao
Enhong Chen
AAML
182
0
0
30 May 2025
On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm
Huan Li
Yiming Dong
Zhouchen Lin
334
0
0
17 May 2025
Sharp higher order convergence rates for the Adam optimizer
Steffen Dereich
Arnulf Jentzen
Adrian Riekert
ODL
194
1
0
28 Apr 2025
A Langevin sampling algorithm inspired by the Adam optimizer
Benedict Leimkuhler
René Lohmann
Peter Whalley
346
2
0
26 Apr 2025
Chemical knowledge-informed framework for privacy-aware retrosynthesis learning
Nature Communications (Nat Commun), 2025
Guikun Chen
Xu Zhang
Yue Yang
Yong Liu
Yi Yang
Wenguan Wang
234
0
0
26 Feb 2025
Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks
Naoki Sato
Koshiro Izumi
Hideaki Iiduka
ODL
213
1
0
16 Dec 2024
Enhancing and Accelerating Diffusion-Based Inverse Problem Solving through Measurements Optimization
Tianyu Chen
Zhendong Wang
Mingyuan Zhou
218
1
0
05 Dec 2024
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
International Conference on Learning Representations (ICLR), 2024
Xianliang Li
Jun Luo
Zhiwei Zheng
Hanxiao Wang
Li Luo
Lingkun Wen
Linlong Wu
Sheng Xu
445
4
0
29 Nov 2024
Understanding Adam Requires Better Rotation Dependent Assumptions
Tianyue H. Zhang
Lucas Maes
Alexia Jolicoeur-Martineau
Damien Scieur
Simon Lacoste-Julien
Charles Guille-Escuret
222
6
0
25 Oct 2024
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
International Conference on Learning Representations (ICLR), 2024
Thomas Robert
M. Safaryan
Ionut-Vlad Modoranu
Dan Alistarh
ODL
334
18
0
21 Oct 2024
Faster Adaptive Decentralized Learning Algorithms
International Conference on Machine Learning (ICML), 2024
Feihu Huang
Jianyu Zhao
194
4
0
19 Aug 2024
Attack Anything: Blind DNNs via Universal Background Adversarial Attack
Jiawei Lian
Shaohui Mei
X. Wang
Yi Wang
L. Wang
Yingjie Lu
Mingyang Ma
Lap-Pui Chau
AAML
429
3
0
17 Aug 2024
An Adaptive CSI Feedback Model Based on BiLSTM for Massive MIMO-OFDM Systems
Hongrui Shen
Long Zhao
Kan Zheng
Yuhua Cao
Pingzhi Fan
152
2
0
26 Jul 2024
FADAS: Towards Federated Adaptive Asynchronous Optimization
Yujia Wang
Shiqiang Wang
Songtao Lu
Jinghui Chen
FedML
181
9
0
25 Jul 2024
The Implicit Bias of Adam on Separable Data
Neural Information Processing Systems (NeurIPS), 2024
Chenyang Zhang
Difan Zou
Yuan Cao
AI4CE
214
19
0
15 Jun 2024
Provable Complexity Improvement of AdaGrad over SGD: Upper and Lower Bounds in Stochastic Non-Convex Optimization
Annual Conference Computational Learning Theory (COLT), 2024
Devyani Maladkar
Ruichen Jiang
Aryan Mokhtari
345
6
0
07 Jun 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
487
5
0
26 May 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
188
1
0
12 Apr 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
206
37
0
05 Apr 2024
Conjugate-Gradient-like Based Adaptive Moment Estimation Optimization Algorithm for Deep Learning
Jiawu Tian
Liwei Xu
Xiaowei Zhang
Yongqi Li
ODL
303
0
0
02 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
336
11
0
01 Apr 2024
Regularized DeepIV with Model Selection
Zihao Li
Hui Lan
Vasilis Syrgkanis
Mengdi Wang
Masatoshi Uehara
194
4
0
07 Mar 2024
Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang
Congliang Chen
Tian Ding
Ziniu Li
Zhimin Luo
288
75
0
26 Feb 2024
Towards Quantifying the Preconditioning Effect of Adam
Rudrajit Das
Naman Agarwal
Sujay Sanghavi
Inderjit S. Dhillon
82
8
0
11 Feb 2024
AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size
P. Ostroukhov
Aigerim Zhumabayeva
Chulu Xiang
Alexander Gasnikov
Martin Takáč
Dmitry Kamzolov
ODL
158
2
0
07 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Neural Information Processing Systems (NeurIPS), 2024
Yusu Hong
Junhong Lin
311
16
0
06 Feb 2024
Momentum Does Not Reduce Stochastic Noise in Stochastic Gradient Descent
Naoki Sato
Hideaki Iiduka
ODL
385
1
0
04 Feb 2024
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Kwangjun Ahn
Zhiyu Zhang
Yunbum Kook
Yan Dai
249
20
0
02 Feb 2024
AdamL: A fast adaptive gradient method incorporating loss function
Lu Xia
Stefano Massei
ODL
136
3
0
23 Dec 2023
AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix
Neural Information Processing Systems (NeurIPS), 2023
Yun Yue
Zhiling Ye
Jiadi Jiang
Yongchao Liu
Ke Zhang
ODL
168
3
0
04 Dec 2023
Adam-like Algorithm with Smooth Clipping Attains Global Minima: Analysis Based on Ergodicity of Functional SDEs
Keisuke Suzuki
133
0
0
29 Nov 2023
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
Naoki Sato
Hideaki Iiduka
312
4
0
15 Nov 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao
Guisong Chang
Jiaqi Zhang
Qi Zhang
Dazhou Li
Yu Zhang
ODL
512
0
0
06 Nov 2023
Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity
Neural Information Processing Systems (NeurIPS), 2023
Bohan Wang
Jingwen Fu
Huishuai Zhang
Nanning Zheng
Wei Chen
206
20
0
27 Oct 2023
FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data
Hao Sun
Li Shen
Shi-Yong Chen
Jingwei Sun
Jing Li
Guangzhong Sun
Dacheng Tao
FedML
163
2
0
18 Sep 2023
A Theoretical and Empirical Study on the Convergence of Adam with an "Exact" Constant Step Size in Non-Convex Settings
Alokendu Mazumder
Rishabh Sabharwal
Manan Tayal
Bhartendu Kumar
Punit Rathore
243
0
0
15 Sep 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Machine-mediated learning (ML), 2023
Meixuan He
Yuqing Liang
Jinlan Liu
Dongpo Xu
195
13
0
20 Jul 2023
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
International Conference on Machine Learning (ICML), 2023
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
143
8
0
02 Jul 2023