Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien
arXiv:2002.10542 · 24 February 2020
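For readers arriving from the citation list, the cited paper's core idea is to set the SGD learning rate from the current stochastic loss and gradient. Below is a minimal sketch of the bounded variant (SPS_max) in Python; the closure names `loss_fn` and `grad_fn`, the per-sample optimal value `f_i_star` (often taken as 0 for interpolating models), and the defaults `c=0.5` and `gamma_max=1.0` are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def sps_max_step(x, loss_fn, grad_fn, f_i_star=0.0, c=0.5, gamma_max=1.0):
    """One SGD update with the bounded stochastic Polyak step-size (SPS_max):

        gamma = min( (f_i(x) - f_i*) / (c * ||grad f_i(x)||^2), gamma_max )

    All names here are illustrative; the paper analyses several variants.
    """
    loss = loss_fn(x)   # stochastic loss f_i(x) on the sampled mini-batch
    grad = grad_fn(x)   # stochastic gradient of f_i at x
    grad_norm_sq = float(np.dot(grad, grad)) + 1e-12  # guard against zero gradient
    gamma = min((loss - f_i_star) / (c * grad_norm_sq), gamma_max)
    return x - gamma * grad
```

In practice, `loss_fn` and `grad_fn` would be closures over the indices sampled at the current iteration, so the step size adapts to each mini-batch.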
Papers citing "Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence" (50 of 112 papers shown)
Entropic Mirror Descent for Linear Systems: Polyak's Stepsize and Implicit Bias · Yura Malitsky, Alexander Posch · 05 May 2025
Analysis of an Idealized Stochastic Polyak Method and its Application to Black-Box Model Distillation · Robert M. Gower, Guillaume Garrigos, Nicolas Loizou, Dimitris Oikonomou, Konstantin Mishchenko, Fabian Schaipp · 02 Apr 2025
Personalized Convolutional Dictionary Learning of Physiological Time Series · Axel Roques, Samuel Gruffaz, Kyurae Kim, Alain Durmus, Laurent Oudre · 10 Mar 2025
A Novel Unified Parametric Assumption for Nonconvex Optimization · Artem Riabinin, Ahmed Khaled, Peter Richtárik · 17 Feb 2025
Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent · Hikaru Umeda, Hideaki Iiduka · 17 Feb 2025
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints · Alberto Maté, Mariella Dimiccoli · [AI4TS] · 27 Dec 2024
MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes · Igor Sokolov, Peter Richtárik · 22 Dec 2024
Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks · Naoki Sato, Koshiro Izumi, Hideaki Iiduka · [ODL] · 16 Dec 2024
Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization · Corrado Coppola, Lorenzo Papa, Irene Amerini, L. Palagi · [ODL] · 24 Nov 2024
Effectively Leveraging Momentum Terms in Stochastic Line Search Frameworks for Fast Optimization of Finite-Sum Problems · Matteo Lapucci, Davide Pucci · [ODL] · 11 Nov 2024
Tuning-free coreset Markov chain Monte Carlo · Naitong Chen, Jonathan H. Huggins, Trevor Campbell · 24 Oct 2024
Loss Landscape Characterization of Neural Networks without Over-Parametrization · Rustem Islamov, Niccolò Ajroldi, Antonio Orvieto, Aurélien Lucchi · 16 Oct 2024
Convergence of Sharpness-Aware Minimization Algorithms using Increasing Batch Size and Decaying Learning Rate · Hinata Harada, Hideaki Iiduka · 16 Sep 2024
Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models · Matteo Lapucci, Davide Pucci · 06 Aug 2024
Stepping on the Edge: Curvature Aware Learning Rate Tuners · Vincent Roulet, Atish Agarwala, Jean-Bastien Grill, Grzegorz Swirszcz, Mathieu Blondel, Fabian Pedregosa · 08 Jul 2024
An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes · Antonio Orvieto, Lin Xiao · 05 Jul 2024
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance · Dimitris Oikonomou, Nicolas Loizou · 06 Jun 2024
The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms · Elizabeth Collins-Woodfin, Inbar Seroussi, Begona García Malaxechebarría, Andrew W. Mackenzie, Elliot Paquette, Courtney Paquette · 30 May 2024
Towards Stability of Parameter-free Optimization · Yijiang Pang, Shuyang Yu, Hoang Bao, Jiayu Zhou · 07 May 2024
Enhancing Policy Gradient with the Polyak Step-Size Adaption · Yunxiang Li, Rui Yuan, Chen Fan, Mark W. Schmidt, Samuel Horváth, Robert Mansel Gower, Martin Takáč · 11 Apr 2024
Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation · Aaron Mishkin, Mert Pilanci, Mark Schmidt · 03 Apr 2024
Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad · Sayantan Choudhury, N. Tupitsa, Nicolas Loizou, Samuel Horváth, Martin Takáč, Eduard A. Gorbunov · 05 Mar 2024
Level Set Teleportation: An Optimization Perspective · Aaron Mishkin, A. Bietti, Robert Mansel Gower · 05 Mar 2024
On the Convergence of Federated Learning Algorithms without Data Similarity · Ali Beikmohammadi, Sarit Khirirat, Sindri Magnússon · [FedML] · 29 Feb 2024
Iteration and Stochastic First-order Oracle Complexities of Stochastic Gradient Descent using Constant and Decaying Learning Rates · Kento Imaizumi, Hideaki Iiduka · 23 Feb 2024
Corridor Geometry in Gradient-Based Optimization · Benoit Dherin, M. Rosca · 13 Feb 2024
Implicit Bias and Fast Convergence Rates for Self-attention · Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis · 08 Feb 2024
AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size · P. Ostroukhov, Aigerim Zhumabayeva, Chulu Xiang, Alexander Gasnikov, Martin Takáč, Dmitry Kamzolov · [ODL] · 07 Feb 2024
Optimal sampling for stochastic and natural gradient descent · Robert Gruhlke, A. Nouy, Philipp Trunschke · 05 Feb 2024
MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters · Arsalan Sharifnassab, Saber Salehkaleybar, Richard Sutton · 04 Feb 2024
(Accelerated) Noise-adaptive Stochastic Heavy-Ball Momentum · Anh Dang, Reza Babanezhad, Sharan Vaswani · 12 Jan 2024
Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization · Min-Kook Suh, Seung-Woo Seo · [ODL] · 06 Jan 2024
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms · Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Robert Mansel Gower, Martin Takáč · 28 Dec 2023
Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise · Rui Pan, Yuxing Liu, Xiaoyu Wang, Tong Zhang · 22 Dec 2023
On the Convergence of Loss and Uncertainty-based Active Learning Algorithms · Daniel Haimovich, Dima Karamshuk, Fridolin Linder, Niek Tax, Milan Vojnovic · 21 Dec 2023
On the Interplay Between Stepsize Tuning and Progressive Sharpening · Vincent Roulet, Atish Agarwala, Fabian Pedregosa · 30 Nov 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization · Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee · [AAML] · 29 Nov 2023
Adaptive Step Sizes for Preconditioned Stochastic Gradient Descent · Frederik Köhne, Leonie Kreis, Anton Schiela, Roland A. Herzog · 28 Nov 2023
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling · Naoki Sato, Hideaki Iiduka · 15 Nov 2023
Parameter-Agnostic Optimization under Relaxed Smoothness · Florian Hübler, Junchi Yang, Xiang Li, Niao He · 06 Nov 2023
Demystifying the Myths and Legends of Nonconvex Convergence of SGD · Aritra Dutta, El Houcine Bergou, Soumia Boucherouite, Nicklas Werge, M. Kandemir, Xin Li · 19 Oct 2023
Stochastic Gradient Descent with Preconditioned Polyak Step-size · Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Martin Takáč · 03 Oct 2023
Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction · Xiao-Yan Jiang, Sebastian U. Stich · 11 Aug 2023
A new Gradient TD Algorithm with only One Step-size: Convergence Rate Analysis using $L$-$\lambda$ Smoothness · Hengshuai Yao · 29 Jul 2023
Function Value Learning: Adaptive Learning Rates Based on the Polyak Stepsize and Function Splitting in ERM · Guillaume Garrigos, Robert Mansel Gower, Fabian Schaipp · 26 Jul 2023
Relationship between Batch Size and Number of Steps Needed for Nonconvex Optimization of Stochastic Gradient Descent using Armijo Line Search · Yuki Tsukada, Hideaki Iiduka · 25 Jul 2023
Variational Inference with Gaussian Score Matching · Chirag Modi, C. Margossian, Yuling Yao, Robert Mansel Gower, David M. Blei, Lawrence K. Saul · 15 Jul 2023
Locally Adaptive Federated Learning · Sohom Mukherjee, Nicolas Loizou, Sebastian U. Stich · [FedML] · 12 Jul 2023
Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models · Leonardo Galli, Holger Rauhut, Mark W. Schmidt · 22 Jun 2023
No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths · Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas · 20 Jun 2023