Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence

24 February 2020
Nicolas Loizou
Sharan Vaswani
I. Laradji
Simon Lacoste-Julien
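For readers skimming the citation list, the rule the cited paper introduces is the SPS_max step size: at iteration t, with sampled loss f_i, SGD takes γ_t = min{ (f_i(x_t) − f_i*) / (c ‖∇f_i(x_t)‖²), γ_b }. Below is a minimal NumPy sketch of that rule on an over-parameterized least-squares problem; the problem setup, the constants c and γ_b, and the interpolation assumption f_i* = 0 are illustrative choices, not taken from the paper's experiments.

```python
import numpy as np

# Minimal sketch of SGD with the stochastic Polyak step size (SPS_max).
# Illustrative setup: over-parameterized least squares with a consistent
# system, so interpolation holds and each f_i* = 0; the SPS numerator
# then reduces to the current per-sample loss f_i(x).
rng = np.random.default_rng(0)
n, d = 50, 100                          # n samples, d parameters (d > n)
A, x_true = rng.normal(size=(n, d)), rng.normal(size=d)
b = A @ x_true                          # consistent system => interpolation

def f_i(x, i):                          # per-sample loss 0.5 * (a_i^T x - b_i)^2
    return 0.5 * (A[i] @ x - b[i]) ** 2

def grad_i(x, i):                       # its gradient
    return (A[i] @ x - b[i]) * A[i]

c, gamma_max = 0.5, 10.0                # illustrative SPS constants
x = np.zeros(d)
for t in range(2000):
    i = rng.integers(n)
    g = grad_i(x, i)
    g2 = g @ g
    if g2 == 0.0:
        continue                        # sample already fit exactly; skip
    # SPS_max: gamma_t = min{ (f_i(x) - f_i*) / (c * ||g||^2), gamma_max }
    gamma = min(f_i(x, i) / (c * g2), gamma_max)
    x -= gamma * g

print("final mean loss:", np.mean(0.5 * (A @ x - b) ** 2))
```

Because the step size is computed from quantities SGD already evaluates (the sampled loss and its gradient), the rule adds essentially no overhead per iteration, which is what makes it attractive as a drop-in learning-rate schedule.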

Papers citing "Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence"

50 / 112 papers shown
Adaptive Federated Learning with Auto-Tuned Clients
J. Kim
Taha Toghani
César A. Uribe
Anastasios Kyrillidis
FedML
40
6
0
19 Jun 2023
Prodigy: An Expeditiously Adaptive Parameter-Free Learner
Konstantin Mishchenko
Aaron Defazio
ODL
28
55
0
09 Jun 2023
BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization
Chen Fan
Gaspard Choné-Ducasse
Mark W. Schmidt
Christos Thrampoulidis
19
3
0
30 May 2023
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method
Ahmed Khaled
Konstantin Mishchenko
Chi Jin
ODL
22
22
0
25 May 2023
Layer-wise Adaptive Step-Sizes for Stochastic First-Order Methods for Deep Learning
Achraf Bahamou
D. Goldfarb
ODL
31
0
0
23 May 2023
MoMo: Momentum Models for Adaptive Learning Rates
Fabian Schaipp
Ruben Ohana
Michael Eickenberg
Aaron Defazio
Robert Mansel Gower
30
10
0
12 May 2023
Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition
Chen Fan
Christos Thrampoulidis
Mark W. Schmidt
20
2
0
02 Apr 2023
Single-Call Stochastic Extragradient Methods for Structured Non-monotone Variational Inequalities: Improved Analysis under Weaker Conditions
S. Choudhury
Eduard A. Gorbunov
Nicolas Loizou
25
13
0
27 Feb 2023
Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits
Yue Kang
Cho-Jui Hsieh
T. C. Lee
24
1
0
18 Feb 2023
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Maor Ivgi
Oliver Hinder
Y. Carmon
ODL
24
56
0
08 Feb 2023
Target-based Surrogates for Stochastic Optimization
J. Lavington
Sharan Vaswani
Reza Babanezhad
Mark W. Schmidt
Nicolas Le Roux
46
5
0
06 Feb 2023
FedExP: Speeding Up Federated Averaging via Extrapolation
Divyansh Jhunjhunwala
Shiqiang Wang
Gauri Joshi
FedML
19
52
0
23 Jan 2023
A Stochastic Proximal Polyak Step Size
Fabian Schaipp
Robert Mansel Gower
M. Ulbrich
14
12
0
12 Jan 2023
Optimizing the Performative Risk under Weak Convexity Assumptions
Yulai Zhao
19
5
0
02 Sep 2022
Critical Batch Size Minimizes Stochastic First-Order Oracle Complexity of Deep Learning Optimizer using Hyperparameters Close to One
Hideaki Iiduka
ODL
27
4
0
21 Aug 2022
Adaptive Learning Rates for Faster Stochastic Gradient Methods
Samuel Horváth
Konstantin Mishchenko
Peter Richtárik
ODL
33
7
0
10 Aug 2022
Improved Policy Optimization for Online Imitation Learning
J. Lavington
Sharan Vaswani
Mark W. Schmidt
OffRL
15
6
0
29 Jul 2022
SP2: A Second Order Stochastic Polyak Method
Shuang Li
W. Swartworth
Martin Takáč
Deanna Needell
Robert Mansel Gower
21
13
0
17 Jul 2022
Theoretical analysis of Adam using hyperparameters close to one without Lipschitz smoothness
Hideaki Iiduka
15
5
0
27 Jun 2022
Grad-GradaGrad? A Non-Monotone Adaptive Stochastic Gradient Method
Aaron Defazio
Baoyu Zhou
Lin Xiao
ODL
14
5
0
14 Jun 2022
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen
Trang H. Tran
32
2
0
13 Jun 2022
Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
Junchi Yang
Xiang Li
Niao He
ODL
27
22
0
01 Jun 2022
Making SGD Parameter-Free
Y. Carmon
Oliver Hinder
17
41
0
04 May 2022
An Adaptive Incremental Gradient Method With Support for Non-Euclidean Norms
Binghui Xie
Chen Jin
Kaiwen Zhou
James Cheng
Wei Meng
35
1
0
28 Apr 2022
Learning to Accelerate by the Methods of Step-size Planning
Hengshuai Yao
21
0
0
01 Apr 2022
Amortized Proximal Optimization
Juhan Bae
Paul Vicol
Jeff Z. HaoChen
Roger C. Grosse
ODL
25
14
0
28 Feb 2022
Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance
Nuri Mert Vural
Lu Yu
Krishnakumar Balasubramanian
S. Volgushev
Murat A. Erdogdu
15
23
0
23 Feb 2022
A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren
Leonard Berrada
Rudra P. K. Poudel
M. P. Kumar
24
4
0
29 Jan 2022
Minimization of Stochastic First-order Oracle Complexity of Adaptive Methods for Nonconvex Optimization
Hideaki Iiduka
13
0
0
14 Dec 2021
Randomized Stochastic Gradient Descent Ascent
Othmane Sebbouh
Marco Cuturi
Gabriel Peyré
118
7
0
25 Nov 2021
Convergence Rates for the MAP of an Exponential Family and Stochastic Mirror Descent -- an Open Problem
Rémi Le Priol
Frederik Kunstner
Damien Scieur
Simon Lacoste-Julien
11
1
0
12 Nov 2021
Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize
Ryan D'Orazio
Nicolas Loizou
I. Laradji
Ioannis Mitliagkas
31
30
0
28 Oct 2021
Accelerated Almost-Sure Convergence Rates for Nonconvex Stochastic Gradient Descent using Stochastic Learning Rates
Theodoros Mamalis
D. Stipanović
R. Tao
21
2
0
25 Oct 2021
Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Sharan Vaswani
Benjamin Dubois-Taine
Reza Babanezhad
48
11
0
21 Oct 2021
Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information
Majid Jahani
S. Rusakov
Zheng Shi
Peter Richtárik
Michael W. Mahoney
Martin Takáč
ODL
8
25
0
11 Sep 2021
The Number of Steps Needed for Nonconvex Optimization of a Deep Learning Optimizer is a Rational Function of Batch Size
Hideaki Iiduka
13
1
0
26 Aug 2021
Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity
Nicolas Loizou
Hugo Berard
Gauthier Gidel
Ioannis Mitliagkas
Simon Lacoste-Julien
21
53
0
30 Jun 2021
On the Convergence of Stochastic Extragradient for Bilinear Games using Restarted Iteration Averaging
C. J. Li
Yaodong Yu
Nicolas Loizou
Gauthier Gidel
Yi-An Ma
Nicolas Le Roux
Michael I. Jordan
23
22
0
30 Jun 2021
Stochastic Polyak Stepsize with a Moving Target
Robert Mansel Gower
Aaron Defazio
Michael G. Rabbat
24
17
0
22 Jun 2021
Adaptive Learning Rate and Momentum for Training Deep Neural Networks
Zhiyong Hao
Yixuan Jiang
Huihua Yu
H. Chiang
ODL
14
9
0
22 Jun 2021
Comment on Stochastic Polyak Step-Size: Performance of ALI-G
Leonard Berrada
Andrew Zisserman
M. P. Kumar
13
4
0
20 May 2021
Scale Invariant Monte Carlo under Linear Function Approximation with Curvature based step-size
Rahul Madhavan
Hemant Makwana
11
0
0
15 Apr 2021
Multi-modal anticipation of stochastic trajectories in a dynamic environment with Conditional Variational Autoencoders
Albert Dulian
J. Murray
26
4
0
05 Mar 2021
A Probabilistically Motivated Learning Rate Adaptation for Stochastic Optimization
Filip de Roos
Carl Jidling
A. Wills
Thomas B. Schön
Philipp Hennig
15
3
0
22 Feb 2021
AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods
Zheng Shi
Abdurakhmon Sadiev
Nicolas Loizou
Peter Richtárik
Martin Takáč
ODL
32
13
0
19 Feb 2021
SVRG Meets AdaGrad: Painless Variance Reduction
Benjamin Dubois-Taine
Sharan Vaswani
Reza Babanezhad
Mark W. Schmidt
Simon Lacoste-Julien
13
17
0
18 Feb 2021
On the Convergence of Step Decay Step-Size for Stochastic Optimization
Xiaoyu Wang
Sindri Magnússon
M. Johansson
55
23
0
18 Feb 2021
An Adaptive Stochastic Sequential Quadratic Programming with Differentiable Exact Augmented Lagrangians
Sen Na
M. Anitescu
Mladen Kolar
12
41
0
10 Feb 2021
Recent Theoretical Advances in Non-Convex Optimization
Marina Danilova
Pavel Dvurechensky
Alexander Gasnikov
Eduard A. Gorbunov
Sergey Guminov
Dmitry Kamzolov
Innokentiy Shibaev
23
76
0
11 Dec 2020
Counting Cows: Tracking Illegal Cattle Ranching From High-Resolution Satellite Imagery
I. Laradji
Pau Rodríguez López
F. Kalaitzis
David Vazquez
Ross Young
E. Davey
Alexandre Lacoste
19
19
0
14 Nov 2020