Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods

21 May 2023
Junchi Yang, Xiang Li, Ilyas Fatkhullin, Niao He

Papers citing "Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods"

17 papers shown

Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization
Corrado Coppola, Lorenzo Papa, Irene Amerini, L. Palagi
24 Nov 2024

From Gradient Clipping to Normalization for Heavy Tailed SGD
Florian Hübler, Ilyas Fatkhullin, Niao He
17 Oct 2024

An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Antonio Orvieto, Lin Xiao
05 Jul 2024

Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions
Wei Jiang, Sifan Yang, Yibo Wang, Lijun Zhang
04 Jun 2024

The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms
Elizabeth Collins-Woodfin, Inbar Seroussi, Begona García Malaxechebarría, Andrew W. Mackenzie, Elliot Paquette, Courtney Paquette
30 May 2024

Towards Stability of Parameter-free Optimization
Yijiang Pang, Shuyang Yu, Hoang Bao, Jiayu Zhou
07 May 2024

On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond
Bohan Wang, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhi-Ming Ma, Wei-Neng Chen
22 Mar 2024

Taming Nonconvex Stochastic Mirror Descent with General Bregman Divergence
Ilyas Fatkhullin, Niao He
27 Feb 2024

Problem-Parameter-Free Decentralized Nonconvex Stochastic Optimization
Jiaxiang Li, Xuxing Chen, Shiqian Ma, Mingyi Hong
13 Feb 2024

Parameter-Agnostic Optimization under Relaxed Smoothness
Florian Hübler, Junchi Yang, Xiang Li, Niao He
06 Nov 2023

Complexity of Single Loop Algorithms for Nonlinear Programming with Stochastic Objective and Constraints
Ahmet Alacaoglu, Stephen J. Wright
01 Nov 2023

Convergence of AdaGrad for Non-convex Objectives: Simple Proofs and Relaxed Assumptions
Bo Wang, Huishuai Zhang, Zhirui Ma, Wei Chen
29 May 2023

Momentum Provably Improves Error Feedback!
Ilyas Fatkhullin, A. Tyurin, Peter Richtárik
24 May 2023

Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad
21 Oct 2021

On the Convergence of Step Decay Step-Size for Stochastic Optimization
Xiaoyu Wang, Sindri Magnússon, M. Johansson
18 Feb 2021

A High Probability Analysis of Adaptive SGD with Momentum
Xiaoyun Li, Francesco Orabona
28 Jul 2020

A Simple Convergence Proof of Adam and Adagrad
Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier
05 Mar 2020