Momentum via Primal Averaging: Theoretical Insights and Learning Rate
Schedules for Non-Convex Optimization

1 October 2020

Papers citing "Momentum via Primal Averaging: Theoretical Insights and Learning Rate Schedules for Non-Convex Optimization"

12 / 12 papers shown

| Title | Authors | Tag | | Date |
|---|---|---|---|---|
| The Road Less Scheduled | Aaron Defazio, Xingyu Alice Yang, Harsh Mehta, Konstantin Mishchenko, Ahmed Khaled, Ashok Cutkosky | | 120 60 0 | 24 May 2024 |
| (Accelerated) Noise-adaptive Stochastic Heavy-Ball Momentum | Anh Dang, Reza Babanezhad, Sharan Vaswani | | 63 0 0 | 12 Jan 2024 |
| A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale | Hao-Jun Michael Shi, Tsung-Hsien Lee, Shintaro Iwasaki, Jose Gallego-Posada, Zhijing Li, Kaushik Rangadurai, Dheevatsa Mudigere, Michael Rabbat | ODL | 98 27 0 | 12 Sep 2023 |
| The Marginal Value of Momentum for Small Learning Rate SGD | Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li | ODL | 82 9 0 | 27 Jul 2023 |
| When and Why Momentum Accelerates SGD: An Empirical Study | Jingwen Fu, Bohan Wang, Huishuai Zhang, Zhizheng Zhang, Wei Chen, Na Zheng | | 60 10 0 | 15 Jun 2023 |
| Momentum Provably Improves Error Feedback! | Ilyas Fatkhullin, Alexander Tyurin, Peter Richtárik | | 116 23 0 | 24 May 2023 |
| Learning-Rate-Free Learning by D-Adaptation | Aaron Defazio, Konstantin Mishchenko | | 106 85 0 | 18 Jan 2023 |
| Low-Variance Forward Gradients using Direct Feedback Alignment and Momentum | Florian Bacho, Dominique F. Chu | | 58 8 0 | 14 Dec 2022 |
| Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data | Yuki Takezawa, Hang Bao, Kenta Niwa, Ryoma Sato, Makoto Yamada | | 76 20 0 | 30 Sep 2022 |
| A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases | James Harrison, Luke Metz, Jascha Narain Sohl-Dickstein | | 112 21 0 | 22 Sep 2022 |
| High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize | Ali Kavis, Kfir Y. Levy, Volkan Cevher | | 76 42 0 | 06 Apr 2022 |
| A Simple Convergence Proof of Adam and Adagrad | Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier | | 140 159 0 | 05 Mar 2020 |