arXiv: 1604.03257 (v2, latest)
Unified Convergence Analysis of Stochastic Momentum Methods for Convex and Non-convex Optimization
12 April 2016
Tianbao Yang
Qihang Lin
Zhe Li
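The paper analyzes stochastic momentum methods such as the stochastic heavy-ball (Polyak momentum) update in a unified framework. As context for the citing papers below, here is a minimal sketch of the textbook stochastic heavy-ball iteration; the quadratic objective, noise model, step size, and momentum parameter are illustrative choices, not taken from the paper.

```python
import numpy as np

def stochastic_heavy_ball(grad, x0, alpha=0.01, beta=0.9, steps=500, seed=0):
    """Stochastic heavy-ball (Polyak momentum):
        x_{t+1} = x_t - alpha * g_t + beta * (x_t - x_{t-1}),
    where g_t is a stochastic gradient evaluated at x_t."""
    rng = np.random.default_rng(seed)
    x_prev = np.array(x0, dtype=float)
    x = x_prev.copy()
    for _ in range(steps):
        g = grad(x, rng)  # stochastic gradient oracle
        x, x_prev = x - alpha * g + beta * (x - x_prev), x
    return x

# Toy example: minimize f(x) = 0.5 * ||x||^2 with additive gradient noise.
noisy_grad = lambda x, rng: x + 0.01 * rng.standard_normal(x.shape)
x_final = stochastic_heavy_ball(noisy_grad, np.ones(5))
```

With beta = 0 this reduces to plain SGD; the momentum term beta * (x_t - x_{t-1}) reuses the previous displacement, which is what the unified analyses of this line of work study for both convex and non-convex objectives.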
Papers citing "Unified Convergence Analysis of Stochastic Momentum Methods for Convex and Non-convex Optimization" (showing 50 of 72):

- Both Asymptotic and Non-Asymptotic Convergence of Quasi-Hyperbolic Momentum using Increasing Batch Size. Kento Imaizumi, Hideaki Iiduka. 30 Jun 2025.
- Attack Anything: Blind DNNs via Universal Background Adversarial Attack. Jiawei Lian, Shaohui Mei, X. Wang, Yi Wang, L. Wang, Yingjie Lu, Mingyang Ma, Lap-Pui Chau. [AAML] 17 Aug 2024.
- Towards Exact Gradient-based Training on Analog In-memory Computing. Zhaoxian Wu, Tayfun Gokmen, Malte J. Rasch, Tianyi Chen. NeurIPS 2024. 18 Jun 2024.
- Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum. H. Cai, Sulaiman A. Alghunaim, Ali H. Sayed. 18 Jun 2024.
- Almost sure convergence rates of stochastic gradient methods under gradient domination. Simon Weissmann, Sara Klein, Waïss Azizian, Leif Döring. 22 May 2024.
- Revisiting Convergence of AdaGrad with Relaxed Assumptions. Yusu Hong, Junhong Lin. 21 Feb 2024.
- AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix. Yun Yue, Zhiling Ye, Jiadi Jiang, Yongchao Liu, Ke Zhang. NeurIPS 2023. [ODL] 04 Dec 2023.
- From Optimization to Control: Quasi Policy Iteration. Mohammad Amin Sharifi Kolarijani, Peyman Mohajerin Esfahani. 18 Nov 2023.
- Demystifying the Myths and Legends of Nonconvex Convergence of SGD. Aritra Dutta, El Houcine Bergou, Soumia Boucherouite, Nicklas Werge, M. Kandemir, Xin Li. 19 Oct 2023.
- Acceleration of stochastic gradient descent with momentum by averaging: finite-sample rates and asymptotic normality. Kejie Tang, Weidong Liu, Yichen Zhang, Xi Chen. 28 May 2023.
- Momentum Provably Improves Error Feedback! Ilyas Fatkhullin, Alexander Tyurin, Peter Richtárik. NeurIPS 2023. 24 May 2023.
- AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks. Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shi-Yong Chen, Jingwei Sun, Jing Li, Guangzhong Sun, Dacheng Tao. Neural Networks, 2023. 01 Mar 2023.
- Differentiable Arbitrating in Zero-sum Markov Games. Jing Wang, Meichen Song, Feng Gao, Boyi Liu, Zhaoran Wang, Yi Wu. AAMAS 2023. 20 Feb 2023.
- Building Decision Forest via Deep Reinforcement Learning. Guixuan Wen, Kaigui Wu. 01 Apr 2022.
- On Almost Sure Convergence Rates of Stochastic Gradient Methods. Jun Liu, Ye Yuan. COLT 2022. 09 Feb 2022.
- On the Convergence of mSGD and AdaGrad for Stochastic Optimization. Ruinan Jin, Yu Xing, Xingkang He. ICLR 2022. 26 Jan 2022.
- Accelerated Gradient Flow: Risk, Stability, and Implicit Regularization. Yue Sheng, Alnur Ali. 20 Jan 2022.
- A Novel Convergence Analysis for Algorithms of the Adam Family. Zhishuai Guo, Yi Tian Xu, W. Yin, Rong Jin, Tianbao Yang. 07 Dec 2021.
- AutoDrop: Training Deep Learning Models with Automatic Learning Rate Drop. Yunfei Teng, Jing Wang, A. Choromańska. 30 Nov 2021.
- Training Generative Adversarial Networks with Adaptive Composite Gradient. Huiqing Qi, Fang Li, Shengli Tan, Xiangyun Zhang. Data Intelligence, 2021. [GAN] 10 Nov 2021.
- EF21 with Bells & Whistles: Six Algorithmic Extensions of Modern Error Feedback. Ilyas Fatkhullin, Igor Sokolov, Eduard A. Gorbunov, Zhize Li, Peter Richtárik. 07 Oct 2021.
- Accelerate Distributed Stochastic Descent for Nonconvex Optimization with Momentum. Guojing Cong, Tianyi Liu. 01 Oct 2021.
- Momentum Accelerates the Convergence of Stochastic AUPRC Maximization. Guanghui Wang, Minghao Yang, Lijun Zhang, Tianbao Yang. 02 Jul 2021.
- Escaping Saddle Points Faster with Stochastic Momentum. Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy. ICLR 2020. [ODL] 05 Jun 2021.
- Scale Invariant Monte Carlo under Linear Function Approximation with Curvature based step-size. Rahul Madhavan, Hemant Makwana. 15 Apr 2021.
- Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems. Stefano Sarao Mannelli, Pierfrancesco Urbani. NeurIPS 2021. 23 Feb 2021.
- The Role of Momentum Parameters in the Optimal Convergence of Adaptive Polyak's Heavy-ball Methods. Wei Tao, Sheng Long, Gao-wei Wu, Qing Tao. ICLR 2021. 15 Feb 2021.
- On the Last Iterate Convergence of Momentum Methods. Xiaoyun Li, Mingrui Liu, Francesco Orabona. ALT 2021. 13 Feb 2021.
- Accelerating Training of Batch Normalization: A Manifold Perspective. Mingyang Yi. UAI 2021. 08 Jan 2021.
- Adaptive Gradient Quantization for Data-Parallel SGD. Fartash Faghri, Iman Tabrizian, I. Markov, Dan Alistarh, Daniel M. Roy, Ali Ramezani-Kebrya. NeurIPS 2020. [MQ] 23 Oct 2020.
- Decentralized Deep Learning using Momentum-Accelerated Consensus. Aditya Balu, Zhanhong Jiang, Sin Yong Tan, Chinmay Hegde, Young M. Lee, Soumik Sarkar. [FedML] 21 Oct 2020.
- A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network. Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy. ICML 2020. 04 Oct 2020.
- Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization. Jun-Kun Wang, Jacob D. Abernethy. 04 Oct 2020.
- Momentum via Primal Averaging: Theoretical Insights and Learning Rate Schedules for Non-Convex Optimization. Aaron Defazio. 01 Oct 2020.
- Federated Learning with Nesterov Accelerated Gradient. Zhengjie Yang, Wei Bao, Dong Yuan, Nguyen H. Tran, Albert Y. Zomaya. IEEE TPDS, 2020. [FedML] 18 Sep 2020.
- Online Algorithms for Estimating Change Rates of Web Pages. Konstantin Avrachenkov, Kishor P. Patil, Gugan Thoppe. 17 Sep 2020.
- Effective Federated Adaptive Gradient Methods with Non-IID Decentralized Data. Qianqian Tong, Guannan Liang, J. Bi. [FedML] 14 Sep 2020.
- Understanding and Detecting Convergence for Stochastic Gradient Descent with Momentum. Jerry Chee, Ping Li. 27 Aug 2020.
- Differentially Private Accelerated Optimization Algorithms. Nurdan Kuru, Ş. İlker Birbil, Mert Gurbuzbalaban, S. Yıldırım. 05 Aug 2020.
- A High Probability Analysis of Adaptive SGD with Momentum. Xiaoyun Li, Francesco Orabona. 28 Jul 2020.
- A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning. Samuel Horváth, Peter Richtárik. 19 Jun 2020.
- Almost sure convergence rates for Stochastic Gradient Descent and Stochastic Heavy Ball. Othmane Sebbouh, Robert Mansel Gower, Aaron Defazio. 14 Jun 2020.
- An Analysis of the Adaptation Speed of Causal Models. Rémi Le Priol, Reza Babanezhad Harikandeh, Yoshua Bengio, Damien Scieur. [CML] 18 May 2020.
- MixML: A Unified Analysis of Weakly Consistent Parallel Learning. Yucheng Lu, J. Nash, Christopher De Sa. [FedML] 14 May 2020.
- A Simple Convergence Proof of Adam and Adagrad. Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier. 05 Mar 2020.
- On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings. Mahmoud Assran, Michael G. Rabbat. ICML 2020. 27 Feb 2020.
- Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth Non-Convex Optimization. Vien V. Mai, M. Johansson. ICML 2020. 13 Feb 2020.
- Faster On-Device Training Using New Federated Momentum Algorithm. Zhouyuan Huo, Qian Yang, Bin Gu, Heng-Chiao Huang. [FedML] 06 Feb 2020.
- Large Batch Training Does Not Need Warmup. Zhouyuan Huo, Bin Gu, Heng-Chiao Huang. [AI4CE, ODL] 04 Feb 2020.
- A Rule for Gradient Estimator Selection, with an Application to Variational Inference. Tomas Geffner, Justin Domke. AISTATS 2019. 05 Nov 2019.