Momentum via Primal Averaging: Theoretical Insights and Learning Rate Schedules for Non-Convex Optimization

1 October 2020
Aaron Defazio

Papers citing "Momentum via Primal Averaging: Theoretical Insights and Learning Rate Schedules for Non-Convex Optimization"

12 / 12 papers shown
The Road Less Scheduled
Aaron Defazio
Xingyu Yang
Harsh Mehta
Konstantin Mishchenko
Ahmed Khaled
Ashok Cutkosky
120
60
0
24 May 2024
(Accelerated) Noise-adaptive Stochastic Heavy-Ball Momentum
Anh Dang
Reza Babanezhad
Sharan Vaswani
63
0
0
12 Jan 2024
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Hao-Jun Michael Shi
Tsung-Hsien Lee
Shintaro Iwasaki
Jose Gallego-Posada
Zhijing Li
Kaushik Rangadurai
Dheevatsa Mudigere
Michael Rabbat
ODL
98
27
0
12 Sep 2023
The Marginal Value of Momentum for Small Learning Rate SGD
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
82
9
0
27 Jul 2023
When and Why Momentum Accelerates SGD: An Empirical Study
Jingwen Fu
Bohan Wang
Huishuai Zhang
Zhizheng Zhang
Wei Chen
Na Zheng
60
10
0
15 Jun 2023
Momentum Provably Improves Error Feedback!
Ilyas Fatkhullin
Alexander Tyurin
Peter Richtárik
116
23
0
24 May 2023
Learning-Rate-Free Learning by D-Adaptation
Aaron Defazio
Konstantin Mishchenko
106
85
0
18 Jan 2023
Low-Variance Forward Gradients using Direct Feedback Alignment and Momentum
Florian Bacho
Dominique F. Chu
58
8
0
14 Dec 2022
Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data
Yuki Takezawa
Hang Bao
Kenta Niwa
Ryoma Sato
Makoto Yamada
76
20
0
30 Sep 2022
A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases
James Harrison
Luke Metz
Jascha Narain Sohl-Dickstein
112
21
0
22 Sep 2022
High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize
Ali Kavis
Kfir Y. Levy
Volkan Cevher
76
42
0
06 Apr 2022
A Simple Convergence Proof of Adam and Adagrad
Alexandre Défossez
Léon Bottou
Francis R. Bach
Nicolas Usunier
140
159
0
05 Mar 2020