Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence (arXiv:2002.10542)

24 February 2020
Nicolas Loizou
Sharan Vaswani
I. Laradji
Simon Lacoste-Julien
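For context on the method the papers below cite, here is a minimal sketch of the bounded stochastic Polyak step size (SPS_max) proposed in the paper, γ_k = min{(f_i(x_k) − f_i*) / (c‖∇f_i(x_k)‖²), γ_b}. The PyTorch training loop, the helper names, and the default f_i* = 0 (a common choice for non-negative losses under interpolation) are illustrative assumptions, not the authors' reference implementation.

```python
import torch

def sps_max(loss_value, grad_norm_sq, f_star=0.0, c=0.5, gamma_b=1.0, eps=1e-12):
    """Bounded stochastic Polyak step size:
    gamma = min((f_i(x) - f_star) / (c * ||grad f_i(x)||^2), gamma_b).
    f_star=0.0 is an assumption suited to non-negative, interpolating losses."""
    return min((loss_value - f_star) / (c * grad_norm_sq + eps), gamma_b)

def sgd_with_sps(model, loss_fn, data_loader, c=0.5, gamma_b=1.0):
    """Plain SGD loop whose learning rate is set per step by SPS_max.
    model, loss_fn, and data_loader are user-supplied placeholders."""
    for x, y in data_loader:
        model.zero_grad()
        loss = loss_fn(model(x), y)              # f_i(x_k): mini-batch loss
        loss.backward()
        grad_sq = sum(p.grad.pow(2).sum().item()
                      for p in model.parameters() if p.grad is not None)
        gamma = sps_max(loss.item(), grad_sq, c=c, gamma_b=gamma_b)
        with torch.no_grad():
            for p in model.parameters():
                if p.grad is not None:
                    p.sub_(gamma * p.grad)       # x_{k+1} = x_k - gamma_k * grad
```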

Papers citing "Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence"

50 / 112 papers shown

Entropic Mirror Descent for Linear Systems: Polyak's Stepsize and Implicit Bias
Yura Malitsky
Alexander Posch
05 May 2025

Analysis of an Idealized Stochastic Polyak Method and its Application to Black-Box Model Distillation
Robert M. Gower
Guillaume Garrigos
Nicolas Loizou
Dimitris Oikonomou
Konstantin Mishchenko
Fabian Schaipp
02 Apr 2025

Personalized Convolutional Dictionary Learning of Physiological Time Series
Axel Roques
Samuel Gruffaz
Kyurae Kim
Alain Durmus
Laurent Oudre
10 Mar 2025

A Novel Unified Parametric Assumption for Nonconvex Optimization
Artem Riabinin
Ahmed Khaled
Peter Richtárik
17 Feb 2025

Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent
Hikaru Umeda
Hideaki Iiduka
17 Feb 2025

Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints
Alberto Maté
Mariella Dimiccoli
AI4TS
27 Dec 2024

MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes
Igor Sokolov
Peter Richtárik
22 Dec 2024

Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks
Naoki Sato
Koshiro Izumi
Hideaki Iiduka
ODL
16 Dec 2024

Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization
Corrado Coppola
Lorenzo Papa
Irene Amerini
L. Palagi
ODL
24 Nov 2024

Effectively Leveraging Momentum Terms in Stochastic Line Search Frameworks for Fast Optimization of Finite-Sum Problems
Matteo Lapucci
Davide Pucci
ODL
11 Nov 2024

Tuning-free coreset Markov chain Monte Carlo
Naitong Chen
Jonathan H. Huggins
Trevor Campbell
24 Oct 2024

Loss Landscape Characterization of Neural Networks without Over-Parametrization
Rustem Islamov
Niccolò Ajroldi
Antonio Orvieto
Aurélien Lucchi
16 Oct 2024

Convergence of Sharpness-Aware Minimization Algorithms using Increasing Batch Size and Decaying Learning Rate
Hinata Harada
Hideaki Iiduka
16 Sep 2024

Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models
Matteo Lapucci
Davide Pucci
06 Aug 2024

Stepping on the Edge: Curvature Aware Learning Rate Tuners
Vincent Roulet
Atish Agarwala
Jean-Bastien Grill
Grzegorz Swirszcz
Mathieu Blondel
Fabian Pedregosa
08 Jul 2024

An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Antonio Orvieto
Lin Xiao
05 Jul 2024

Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou
Nicolas Loizou
06 Jun 2024

The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms
Elizabeth Collins-Woodfin
Inbar Seroussi
Begoña García Malaxechebarría
Andrew W. Mackenzie
Elliot Paquette
Courtney Paquette
30 May 2024

Towards Stability of Parameter-free Optimization
Yijiang Pang
Shuyang Yu
Hoang Bao
Jiayu Zhou
07 May 2024

Enhancing Policy Gradient with the Polyak Step-Size Adaption
Yunxiang Li
Rui Yuan
Chen Fan
Mark W. Schmidt
Samuel Horváth
Robert Mansel Gower
Martin Takáč
11 Apr 2024

Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation
Aaron Mishkin
Mert Pilanci
Mark Schmidt
03 Apr 2024

Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
Sayantan Choudhury
N. Tupitsa
Nicolas Loizou
Samuel Horváth
Martin Takáč
Eduard A. Gorbunov
05 Mar 2024

Level Set Teleportation: An Optimization Perspective
Aaron Mishkin
A. Bietti
Robert Mansel Gower
05 Mar 2024

On the Convergence of Federated Learning Algorithms without Data Similarity
Ali Beikmohammadi
Sarit Khirirat
Sindri Magnússon
FedML
29 Feb 2024

Iteration and Stochastic First-order Oracle Complexities of Stochastic Gradient Descent using Constant and Decaying Learning Rates
Kento Imaizumi
Hideaki Iiduka
23 Feb 2024

Corridor Geometry in Gradient-Based Optimization
Benoit Dherin
M. Rosca
13 Feb 2024

Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
08 Feb 2024

AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size
P. Ostroukhov
Aigerim Zhumabayeva
Chulu Xiang
Alexander Gasnikov
Martin Takáč
Dmitry Kamzolov
ODL
07 Feb 2024

Optimal sampling for stochastic and natural gradient descent
Robert Gruhlke
A. Nouy
Philipp Trunschke
05 Feb 2024

MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters
Arsalan Sharifnassab
Saber Salehkaleybar
Richard Sutton
04 Feb 2024

(Accelerated) Noise-adaptive Stochastic Heavy-Ball Momentum
Anh Dang
Reza Babanezhad
Sharan Vaswani
12 Jan 2024

Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization
Min-Kook Suh
Seung-Woo Seo
ODL
06 Jan 2024

SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms
Farshed Abdukhakimov
Chulu Xiang
Dmitry Kamzolov
Robert Mansel Gower
Martin Takáč
28 Dec 2023

Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise
Rui Pan
Yuxing Liu
Xiaoyu Wang
Tong Zhang
22 Dec 2023

On the Convergence of Loss and Uncertainty-based Active Learning Algorithms
Daniel Haimovich
Dima Karamshuk
Fridolin Linder
Niek Tax
Milan Vojnovic
21 Dec 2023

On the Interplay Between Stepsize Tuning and Progressive Sharpening
Vincent Roulet
Atish Agarwala
Fabian Pedregosa
30 Nov 2023

Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
29 Nov 2023

Adaptive Step Sizes for Preconditioned Stochastic Gradient Descent
Frederik Köhne
Leonie Kreis
Anton Schiela
Roland A. Herzog
28 Nov 2023

Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
Naoki Sato
Hideaki Iiduka
15 Nov 2023

Parameter-Agnostic Optimization under Relaxed Smoothness
Florian Hübler
Junchi Yang
Xiang Li
Niao He
06 Nov 2023

Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Aritra Dutta
El Houcine Bergou
Soumia Boucherouite
Nicklas Werge
M. Kandemir
Xin Li
19 Oct 2023

Stochastic Gradient Descent with Preconditioned Polyak Step-size
Farshed Abdukhakimov
Chulu Xiang
Dmitry Kamzolov
Martin Takáč
03 Oct 2023

Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction
Xiao-Yan Jiang
Sebastian U. Stich
11 Aug 2023

A new Gradient TD Algorithm with only One Step-size: Convergence Rate Analysis using $L$-$λ$ Smoothness
Hengshuai Yao
29 Jul 2023

Function Value Learning: Adaptive Learning Rates Based on the Polyak Stepsize and Function Splitting in ERM
Guillaume Garrigos
Robert Mansel Gower
Fabian Schaipp
26 Jul 2023

Relationship between Batch Size and Number of Steps Needed for Nonconvex Optimization of Stochastic Gradient Descent using Armijo Line Search
Yuki Tsukada
Hideaki Iiduka
25 Jul 2023

Variational Inference with Gaussian Score Matching
Chirag Modi
C. Margossian
Yuling Yao
Robert Mansel Gower
David M. Blei
Lawrence K. Saul
15 Jul 2023

Locally Adaptive Federated Learning
Sohom Mukherjee
Nicolas Loizou
Sebastian U. Stich
FedML
12 Jul 2023

Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models
Leonardo Galli
Holger Rauhut
Mark W. Schmidt
22 Jun 2023

No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
Charles Guille-Escuret
Hiroki Naganuma
Kilian Fatras
Ioannis Mitliagkas
20 Jun 2023