On the insufficiency of existing momentum schemes for Stochastic Optimization
Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham Kakade
arXiv:1803.05591 · 15 March 2018 · [ODL]
Papers citing "On the insufficiency of existing momentum schemes for Stochastic Optimization" (29 papers shown):
1. On the Performance Analysis of Momentum Method: A Frequency Domain Perspective. Xianliang Li, Jun Luo, Zhiwei Zheng, Hanxiao Wang, Li Luo, Lingkun Wen, Linlong Wu, Sheng Xu. 29 Nov 2024.
2. Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum. H. Cai, Sulaiman A. Alghunaim, Ali H. Sayed. 18 Jun 2024.
3. Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance. Dimitris Oikonomou, Nicolas Loizou. 06 Jun 2024.
4. Does SGD really happen in tiny subspaces? Minhak Song, Kwangjun Ahn, Chulhee Yun. 25 May 2024.
5. Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks. Hristo Papazov, Scott Pesme, Nicolas Flammarion. 08 Mar 2024.
6. From Optimization to Control: Quasi Policy Iteration. Mohammad Amin Sharifi Kolarijani, Peyman Mohajerin Esfahani. 18 Nov 2023.
7. Acceleration of stochastic gradient descent with momentum by averaging: finite-sample rates and asymptotic normality. Kejie Tang, Weidong Liu, Yichen Zhang, Xi Chen. 28 May 2023.
8. First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities. Aleksandr Beznosikov, S. Samsonov, Marina Sheshukova, Alexander Gasnikov, A. Naumov, Eric Moulines. 25 May 2023.
9. On the fast convergence of minibatch heavy ball momentum. Raghu Bollapragada, Tyler Chen, Rachel A. Ward. 15 Jun 2022.
10. Feedback Gradient Descent: Efficient and Stable Optimization with Orthogonality for DNNs. Fanchen Bu, D. Chang. 12 May 2022.
11. Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting. Shi-Wee Deng, Yuhang Li, Shanghang Zhang, Shi Gu. 24 Feb 2022.
12. Convergence and Stability of the Stochastic Proximal Point Algorithm with Momentum. J. Kim, Panos Toulis, Anastasios Kyrillidis. 11 Nov 2021.
13. An Asymptotic Analysis of Minibatch-Based Momentum Methods for Linear Regression Models. Yuan Gao, Xuening Zhu, Haobo Qi, Guodong Li, Riquan Zhang, Hansheng Wang. 02 Nov 2021.
14. Does Momentum Help? A Sample Complexity Analysis. Swetha Ganesh, Rohan Deb, Gugan Thoppe, A. Budhiraja. 29 Oct 2021.
15. Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization. Jun-Kun Wang, Jacob D. Abernethy. 04 Oct 2020.
16. Almost sure convergence rates for Stochastic Gradient Descent and Stochastic Heavy Ball. Othmane Sebbouh, Robert Mansel Gower, Aaron Defazio. 14 Jun 2020.
17. On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings. Mahmoud Assran, Michael G. Rabbat. 27 Feb 2020.
18. Statistical Adaptive Stochastic Gradient Methods. Pengchuan Zhang, Hunter Lang, Qiang Liu, Lin Xiao. [ODL] 25 Feb 2020.
19. The Two Regimes of Deep Network Training. Guillaume Leclerc, A. Madry. 24 Feb 2020.
20. Optimization for deep learning: theory and algorithms. Ruoyu Sun. [ODL] 19 Dec 2019.
21. Understanding the Role of Momentum in Stochastic Gradient Methods. Igor Gitman, Hunter Lang, Pengchuan Zhang, Lin Xiao. 30 Oct 2019.
22. Demon: Improved Neural Network Training with Momentum Decay. John Chen, Cameron R. Wolfe, Zhaoqi Li, Anastasios Kyrillidis. [ODL] 11 Oct 2019.
23. The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares. Rong Ge, Sham Kakade, Rahul Kidambi, Praneeth Netrapalli. 29 Apr 2019.
24. A Selective Overview of Deep Learning. Jianqing Fan, Cong Ma, Yiqiao Zhong. [BDL, VLM] 10 Apr 2019.
25. On the Ineffectiveness of Variance Reduced Optimization for Deep Learning. Aaron Defazio, Léon Bottou. [UQCV, DRL] 11 Dec 2018.
26. Quasi-hyperbolic momentum and Adam for deep learning. Jerry Ma, Denis Yarats. [ODL] 16 Oct 2018.
27. Optimal Adaptive and Accelerated Stochastic Gradient Descent. Qi Deng, Yi Cheng, Guanghui Lan. 01 Oct 2018.
28. Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks. Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu. [ODL] 18 Jun 2018.
29. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. [ODL] 15 Sep 2016.