On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes

Xiaoyun Li, Francesco Orabona
21 May 2018 · arXiv:1805.08114
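For context on the listing below: the paper studies SGD whose stepsize adapts to the gradients observed so far, in the style of AdaGrad. The following is a minimal sketch, assuming the global AdaGrad-norm rule eta_t = alpha / sqrt(beta + sum_{i<t} ||g_i||^2) in its "delayed" form, where the current stochastic gradient does not enter its own stepsize; the function and parameter names are illustrative, not taken from the paper.

    import numpy as np

    def sgd_adagrad_norm(grad_fn, x0, alpha=1.0, beta=1.0, n_steps=1000, seed=0):
        """SGD with a global AdaGrad-norm stepsize (illustrative sketch only)."""
        rng = np.random.default_rng(seed)
        x = np.array(x0, dtype=float)
        acc = 0.0  # running sum of squared stochastic-gradient norms
        for _ in range(n_steps):
            g = grad_fn(x, rng)                # stochastic gradient oracle
            eta = alpha / np.sqrt(beta + acc)  # uses past gradients only ("delayed")
            x = x - eta * g
            acc += float(g @ g)                # accumulate after the step
        return x

    # Toy usage: noisy quadratic f(x) = 0.5 * ||x||^2 with Gaussian gradient noise.
    grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
    x_final = sgd_adagrad_norm(grad, x0=np.ones(5), n_steps=2000)

Many of the citing papers below vary exactly this recipe: per-coordinate accumulators, momentum, Polyak-type numerators, or high-probability rather than in-expectation guarantees.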

Papers citing "On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes"

Showing 50 of 54 citing papers.
- "Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework" (17 Jun 2024). Siyuan Yu, Wei Chen, H. V. Poor.
- "Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance" (06 Jun 2024). Dimitris Oikonomou, Nicolas Loizou.
- "Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad" (05 Mar 2024). Sayantan Choudhury, N. Tupitsa, Nicolas Loizou, Samuel Horváth, Martin Takáč, Eduard A. Gorbunov.
- "On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions" (06 Feb 2024). Yusu Hong, Junhong Lin.
- "How Free is Parameter-Free Stochastic Optimization?" (05 Feb 2024). Amit Attia, Tomer Koren. [ODL]
- "A simple uniformly optimal method without line search for convex optimization" (16 Oct 2023). Tianjiao Li, Guanghui Lan.
- "Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup" (30 Jul 2023). Yan Sun, Li Shen, Hao Sun, Liang Ding, Dacheng Tao. [FedML]
- "Convergence of AdaGrad for Non-convex Objectives: Simple Proofs and Relaxed Assumptions" (29 May 2023). Bo Wang, Huishuai Zhang, Zhirui Ma, Wei Chen.
- "Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods" (21 May 2023). Junchi Yang, Xiang Li, Ilyas Fatkhullin, Niao He.
- "SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance" (17 Feb 2023). Amit Attia, Tomer Koren. [ODL]
- "Improved Representation of Asymmetrical Distances with Interval Quasimetric Embeddings" (28 Nov 2022). Tongzhou Wang, Phillip Isola.
- "Adaptive Stochastic Optimisation of Nonconvex Composite Objectives" (21 Nov 2022). Weijia Shao, F. Sivrikaya, S. Albayrak.
- "TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization" (31 Oct 2022). Xiang Li, Junchi Yang, Niao He.
- "Stability and Generalization for Markov Chain Stochastic Gradient Methods" (16 Sep 2022). Puyu Wang, Yunwen Lei, Yiming Ying, Ding-Xuan Zhou.
- "Differentially Private Stochastic Gradient Descent with Low-Noise" (09 Sep 2022). Puyu Wang, Yunwen Lei, Yiming Ying, Ding-Xuan Zhou. [FedML]
- "Optimistic Optimisation of Composite Objective with Exponentiated Update" (08 Aug 2022). Weijia Shao, F. Sivrikaya, S. Albayrak.
- "Grad-GradaGrad? A Non-Monotone Adaptive Stochastic Gradient Method" (14 Jun 2022). Aaron Defazio, Baoyu Zhou, Lin Xiao. [ODL]
- "Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization" (01 Jun 2022). Junchi Yang, Xiang Li, Niao He. [ODL]
- "High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize" (06 Apr 2022). Ali Kavis, Kfir Y. Levy, V. Cevher.
- "A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima" (21 Mar 2022). Tae-Eon Ko, Xiantao Li.
- "Adaptive Gradient Methods with Local Guarantees" (02 Mar 2022). Zhou Lu, Wenhan Xia, Sanjeev Arora, Elad Hazan. [ODL]
- "Understanding AdamW through Proximal Methods and Scale-Freeness" (31 Jan 2022). Zhenxun Zhuang, Mingrui Liu, Ashok Cutkosky, Francesco Orabona.
- "A Stochastic Bundle Method for Interpolating Networks" (29 Jan 2022). Alasdair Paren, Leonard Berrada, Rudra P. K. Poudel, M. P. Kumar.
- "Asymptotics of $\ell_2$ Regularized Network Embeddings" (05 Jan 2022). A. Davison.
- "A Novel Convergence Analysis for Algorithms of the Adam Family" (07 Dec 2021). Zhishuai Guo, Yi Tian Xu, W. Yin, R. L. Jin, Tianbao Yang.
- "Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize" (28 Oct 2021). Ryan D'Orazio, Nicolas Loizou, I. Laradji, Ioannis Mitliagkas.
- "Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent" (21 Oct 2021). Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad.
- "Training Deep Neural Networks with Adaptive Momentum Inspired by the Quadratic Optimization" (18 Oct 2021). Tao Sun, Huaming Ling, Zuoqiang Shi, Dongsheng Li, Bao Wang. [ODL]
- "Variance Reduction based Experience Replay for Policy Optimization" (17 Oct 2021). Hua Zheng, Wei Xie, M. Feng. [OffRL]
- "Adaptive Differentially Private Empirical Risk Minimization" (14 Oct 2021). Xiaoxia Wu, Lingxiao Wang, Irina Cristali, Quanquan Gu, Rebecca Willett.
- "On the Convergence of Decentralized Adaptive Gradient Methods" (07 Sep 2021). Xiangyi Chen, Belhal Karimi, Weijie Zhao, Ping Li.
- "Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective" (31 May 2021). Kushal Chakrabarti, Nikhil Chopra. [ODL, AI4CE]
- "Sequential convergence of AdaGrad algorithm for smooth convex optimization" (24 Nov 2020). Cheik Traoré, Edouard Pauwels.
- "A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms" (14 Sep 2020). Chao Ma, Lei Wu, E. Weinan. [ODL]
- "A High Probability Analysis of Adaptive SGD with Momentum" (28 Jul 2020). Xiaoyun Li, Francesco Orabona.
- "Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities" (17 Jul 2020). Alina Ene, Huy Le Nguyen, Adrian Vladu. [ODL]
- "Incremental Without Replacement Sampling in Nonconvex Optimization" (15 Jul 2020). Edouard Pauwels.
- "Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems" (02 Jul 2020). Zhan Gao, Alec Koppel, Alejandro Ribeiro.
- "Stopping Criteria for, and Strong Convergence of, Stochastic Gradient Descent on Bottou-Curtis-Nocedal Functions" (01 Apr 2020). V. Patel.
- "A new regret analysis for Adam-type algorithms" (21 Mar 2020). Ahmet Alacaoglu, Yura Malitsky, P. Mertikopoulos, V. Cevher. [ODL]
- "Adaptive Federated Optimization" (29 Feb 2020). Sashank J. Reddi, Zachary B. Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konecný, Sanjiv Kumar, H. B. McMahan. [FedML]
- "Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence" (24 Feb 2020). Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien.
- "Online Learning with Imperfect Hints" (11 Feb 2020). Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar, Manish Purohit.
- "Better Theory for SGD in the Nonconvex World" (09 Feb 2020). Ahmed Khaled, Peter Richtárik.
- "SoftAdapt: Techniques for Adaptive Loss Weighting of Neural Networks with Multi-Part Loss Functions" (27 Dec 2019). A. Heydari, Craig Thompson, A. Mehmood.
- "Stochastic gradient descent for hybrid quantum-classical optimization" (02 Oct 2019). R. Sweke, Frederik Wilde, Johannes Jakob Meyer, Maria Schuld, Paul K. Fährmann, Barthélémy Meynard-Piganeau, Jens Eisert.
- "Why gradient clipping accelerates training: A theoretical justification for adaptivity" (28 May 2019). Junzhe Zhang, Tianxing He, S. Sra, Ali Jadbabaie.
- "Stochastic Iterative Hard Thresholding for Graph-structured Sparsity Optimization" (09 May 2019). Baojian Zhou, F. Chen, Yiming Ying.
- "Theoretical Analysis of Auto Rate-Tuning by Batch Normalization" (10 Dec 2018). Sanjeev Arora, Zhiyuan Li, Kaifeng Lyu.
- "A Sufficient Condition for Convergences of Adam and RMSProp" (23 Nov 2018). Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu.