1908.10525
Cited By
Linear Convergence of Adaptive Stochastic Gradient Descent
International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
28 August 2019
Yuege Xie
Xiaoxia Wu
Rachel A. Ward
Papers citing
"Linear Convergence of Adaptive Stochastic Gradient Descent"
33 / 33 papers shown
A regret minimization approach to fixed-point iterations
Joon Kwon
170
0
0
25 Sep 2025
On the Convergence of Muon and Beyond
Da Chang
Yongxiang Liu
Ganzhao Yuan
425
7
0
19 Sep 2025
Adaptive Preconditioners Trigger Loss Spikes in Adam
Zhiwei Bai
Zhangchen Zhou
Jiajie Zhao
Xiaolong Li
Zhiyu Li
Feiyu Xiong
Hongkang Yang
Yaoyu Zhang
Z. Xu
ODL
385
3
0
05 Jun 2025
ASGO: Adaptive Structured Gradient Optimization
Kang An
Yuxing Liu
Boyao Wang
Shiqian Ma
Tong Zhang
ODL
542
37
0
26 Mar 2025
Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization
Corrado Coppola
Lorenzo Papa
Irene Amerini
L. Palagi
ODL
495
0
0
24 Nov 2024
The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms
Elizabeth Collins-Woodfin
Inbar Seroussi
Begona García Malaxechebarría
Andrew W. Mackenzie
Elliot Paquette
Courtney Paquette
205
2
0
30 May 2024
Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
Sayantan Choudhury
N. Tupitsa
Nicolas Loizou
Samuel Horváth
Martin Takáč
Eduard A. Gorbunov
468
7
0
05 Mar 2024
Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction
Neural Information Processing Systems (NeurIPS), 2023
Xiao-Yan Jiang
Sebastian U. Stich
315
33
0
11 Aug 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Machine-mediated learning (ML), 2023
Meixuan He
Yuqing Liang
Jinlan Liu
Dongpo Xu
299
18
0
20 Jul 2023
Relaxing the Additivity Constraints in Decentralized No-Regret High-Dimensional Bayesian Optimization
International Conference on Learning Representations (ICLR), 2023
Anthony Bardou
Patrick Thiran
Thomas Begin
422
10
0
31 May 2023
Statistical Analysis of Fixed Mini-Batch Gradient Descent Estimator
Journal of Computational And Graphical Statistics (JCGS), 2023
Haobo Qi
Feifei Wang
Hansheng Wang
272
17
0
13 Apr 2023
TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization
International Conference on Learning Representations (ICLR), 2022
Xiang Li
Junchi Yang
Niao He
261
13
0
31 Oct 2022
On the Convergence of AdaGrad(Norm) on $\mathbb{R}^{d}$: Beyond Convexity, Non-Asymptotic Rate and Acceleration
Zijian Liu
Ta Duy Nguyen
Alina Ene
Huy Le Nguyen
433
13
0
29 Sep 2022
Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion
Neural Information Processing Systems (NeurIPS), 2022
G. Zhang
Hong-Ming Chiu
Richard Y. Zhang
361
12
0
24 Aug 2022
Improved Policy Optimization for Online Imitation Learning
J. Lavington
Sharan Vaswani
Mark Schmidt
OffRL
322
7
0
29 Jul 2022
Adaptive Gradient Methods at the Edge of Stability
Jeremy M. Cohen
Behrooz Ghorbani
Shankar Krishnan
Naman Agarwal
Sourabh Medapati
...
Daniel Suo
David E. Cardoze
Zachary Nado
George E. Dahl
Justin Gilmer
ODL
309
74
0
29 Jul 2022
Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
Neural Information Processing Systems (NeurIPS), 2022
Junchi Yang
Xiang Li
Niao He
ODL
304
26
0
01 Jun 2022
Optimal Algorithms for Stochastic Multi-Level Compositional Optimization
International Conference on Machine Learning (ICML), 2022
Wei Jiang
Bokun Wang
Yibo Wang
Lijun Zhang
Tianbao Yang
530
23
0
15 Feb 2022
Local Quadratic Convergence of Stochastic Gradient Descent with Adaptive Step Size
Adityanarayanan Radhakrishnan
M. Belkin
Caroline Uhler
ODL
122
0
0
30 Dec 2021
Stationary Behavior of Constant Stepsize SGD Type Algorithms: An Asymptotic Characterization
Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), 2021
Zaiwei Chen
Shancong Mou
S. T. Maguluri
162
16
0
11 Nov 2021
AdaLoss: A computationally-efficient and provably convergent adaptive gradient method
Xiaoxia Wu
Yuege Xie
S. Du
Rachel A. Ward
ODL
175
7
0
17 Sep 2021
On Faster Convergence of Scaled Sign Gradient Descent
Xiuxian Li
Kuo-Yi Lin
Li Li
Yiguang Hong
Jie-bin Chen
ODL
215
21
0
04 Sep 2021
Stochastic gradient descent with noise of machine learning type. Part I: Discrete time analysis
Journal of nonlinear science (J. Nonlinear Sci.), 2021
Stephan Wojtowytsch
298
58
0
04 May 2021
Neurons learn slower than they think
I. Kulikovskikh
189
0
0
02 Apr 2021
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
International Conference on Learning Representations (ICLR), 2021
Jeremy M. Cohen
Simran Kaur
Yuanzhi Li
J. Zico Kolter
Ameet Talwalkar
ODL
540
373
0
26 Feb 2021
Convergence of stochastic gradient descent schemes for Lojasiewicz-landscapes
Journal of Machine Learning (JML), 2021
Steffen Dereich
Sebastian Kassing
402
35
0
16 Feb 2021
Painless step size adaptation for SGD
I. Kulikovskikh
Tarzan Legović
205
0
0
01 Feb 2021
Sequential convergence of AdaGrad algorithm for smooth convex optimization
Operations Research Letters (ORL), 2020
Cheik Traoré
Edouard Pauwels
247
30
0
24 Nov 2020
Linear Convergence of Generalized Mirror Descent with Time-Dependent Mirrors
Adityanarayanan Radhakrishnan
M. Belkin
Caroline Uhler
238
10
0
18 Sep 2020
A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms
Mathematical and Scientific Machine Learning (MSML), 2020
Chao Ma
Lei Wu
Weinan E
ODL
190
37
0
14 Sep 2020
Adaptive Gradient Methods Converge Faster with Over-Parameterization (but you should do a line-search)
Sharan Vaswani
I. Laradji
Frederik Kunstner
S. Meng
Mark Schmidt
Damien Scieur
370
30
0
11 Jun 2020
Choosing the Sample with Lowest Loss makes SGD Robust
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Vatsal Shah
Xiaoxia Wu
Sujay Sanghavi
386
50
0
10 Jan 2020
Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Non Convex Optimization
Anas Barakat
Pascal Bianchi
216
13
0
18 Nov 2019