ResearchTrend.AI
Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
arXiv:1706.05507
17 June 2017
Mahesh Chandra Mukkamala
Matthias Hein

Papers citing "Variants of RMSProp and Adagrad with Logarithmic Regret Bounds"

50 papers shown.

Preconditioned Inexact Stochastic ADMM for Deep Model
Shenglong Zhou, Ouya Wang, Ziyan Luo, Yongxu Zhu, Geoffrey Ye Li
15 Feb 2025

An Improved Empirical Fisher Approximation for Natural Gradient Descent
Xiaodong Wu, Wenyi Yu, Chao Zhang, Philip Woodland
10 Jun 2024

Bridging Classical and Quantum Machine Learning: Knowledge Transfer From Classical to Quantum Neural Networks Using Knowledge Distillation
Mohammad Junayed Hasan, M.R.C. Mahdy
23 Nov 2023

Boosting Diffusion Models with an Adaptive Momentum Sampler
Xiyu Wang, Anh-Dung Dinh, Daochang Liu, Chang Xu
23 Aug 2023

Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Meixuan He, Yuqing Liang, Jinlan Liu, Dongpo Xu
20 Jul 2023

Time Optimal Ergodic Search
Dayi Dong, Henry Berger, Ian Abraham
19 May 2023

Skeleton-based Human Action Recognition via Convolutional Neural Networks (CNN)
Ayman Ali, Ekkasit Pinyoanuntapong, Pu Wang, Mohsen Dorodchi
31 Jan 2023

Estimation of Sea State Parameters from Ship Motion Responses Using Attention-based Neural Networks
Denis Selimovic, Franko Hržić, J. Prpić-Oršić, J. Lerga
21 Jan 2023

Federated Coordinate Descent for Privacy-Preserving Multiparty Linear Regression
Xinlin Leng, Chenxu Li, Weifeng Xu, Yuyan Sun, Hongtao Wang
16 Sep 2022

Dynamic Regret of Adaptive Gradient Methods for Strongly Convex Problems
Parvin Nazari, E. Khorram
04 Sep 2022

Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
Junchi Yang, Xiang Li, Niao He
01 Jun 2022

AnoMili: Spoofing Prevention and Explainable Anomaly Detection for the 1553 Military Avionic Bus
Efrat Levy, Nadav Maman, A. Shabtai, Yuval Elovici
14 Feb 2022

Private Adaptive Optimization with Side Information
Tian Li, Manzil Zaheer, Sashank J. Reddi, Virginia Smith
12 Feb 2022

A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren, Leonard Berrada, Rudra P. K. Poudel, M. P. Kumar
29 Jan 2022

On the One-sided Convergence of Adam-type Algorithms in Non-convex Non-concave Min-max Optimization
Zehao Dou, Yuanzhi Li
29 Sep 2021

AdaLoss: A computationally-efficient and provably convergent adaptive gradient method
Xiaoxia Wu, Yuege Xie, S. Du, Rachel A. Ward
17 Sep 2021

KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan, Sepehr Sameni, L. Cerkezi, Givi Meishvili, Adam Bielski, Paolo Favaro
07 Jul 2021

FastAdaBelief: Improving Convergence Rate for Belief-based Adaptive Optimizers by Exploiting Strong Convexity
Yangfan Zhou, Kaizhu Huang, Cheng Cheng, Xuguang Wang, Amir Hussain, Xin Liu
28 Apr 2021

Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent
Guangzeng Xie, Hao Jin, Dachao Lin, Zhihua Zhang
12 Apr 2021

The Role of Momentum Parameters in the Optimal Convergence of Adaptive Polyak's Heavy-ball Methods
Wei Tao, Sheng Long, Gao-wei Wu, Qing Tao
15 Feb 2021

Gravity Optimizer: a Kinematic Approach on Optimization in Deep Learning
Dariush Bahrami, Sadegh Pouriyan Zadeh
22 Jan 2021

Efficient Semi-Implicit Variational Inference
Vincent Moens, Hang Ren, A. Maraval, Rasul Tutunov, Jun Wang, H. Ammar
15 Jan 2021

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration
Congliang Chen, Li Shen, Fangyu Zou, Wei Liu
14 Jan 2021

Adam revisited: a weighted past gradients perspective
Hui Zhong, Zaiyi Chen, Chuan Qin, Zai Huang, V. Zheng, Tong Xu, Enhong Chen
01 Jan 2021

Gradient Descent Averaging and Primal-dual Averaging for Strongly Convex Optimization
Wei Tao, Wei Li, Zhisong Pan, Qing Tao
29 Dec 2020

On Generalization of Adaptive Methods for Over-parameterized Linear Regression
Vatsal Shah, Soumya Basu, Anastasios Kyrillidis, Sujay Sanghavi
28 Nov 2020

A Comparison of Optimization Algorithms for Deep Learning
Derya Soydaner
28 Jul 2020

Towards Learning Convolutions from Scratch
Behnam Neyshabur
27 Jul 2020

Binary Search and First Order Gradient Based Method for Stochastic Optimization
V. Pandey
27 Jul 2020

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt, Frank Schneider, Philipp Hennig
03 Jul 2020

Gradient-only line searches to automatically determine learning rates for a variety of stochastic training algorithms
D. Kafka, D. Wilke
29 Jun 2020

Compositional ADAM: An Adaptive Compositional Solver
Rasul Tutunov, Minne Li, Alexander I. Cowen-Rivers, Jun Wang, Haitham Bou-Ammar
10 Feb 2020

Bregman Proximal Framework for Deep Linear Neural Networks
Mahesh Chandra Mukkamala, Felix Westerkamp, Emanuel Laude, Zorah Lähner, Peter Ochs
08 Oct 2019

A CNN-based approach to classify cricket bowlers based on their bowling actions
M. N. A. Islam, Tanzil Bin Hassan, Siamul Karim Khan
03 Sep 2019

Linear Convergence of Adaptive Stochastic Gradient Descent
Yuege Xie, Xiaoxia Wu, Rachel A. Ward
28 Aug 2019

DeepDA: LSTM-based Deep Data Association Network for Multi-Targets Tracking in Clutter
Huajun Liu, Hui Zhang, Christoph Mertz
16 Jul 2019

Training Neural Networks for and by Interpolation
Leonard Berrada, Andrew Zisserman, M. P. Kumar
13 Jun 2019

Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms
Mahesh Chandra Mukkamala, Peter Ochs
22 May 2019

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
Xiaoxia Wu, S. Du, Rachel A. Ward
19 Feb 2019

Escaping Saddle Points with Adaptive Gradient Methods
Matthew Staib, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar, S. Sra
26 Jan 2019

A Sufficient Condition for Convergences of Adam and RMSProp
Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu
23 Nov 2018

Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization
Anas Barakat, Pascal Bianchi
04 Oct 2018

On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Dongruo Zhou, Yiqi Tang, Yuan Cao, Ziyan Yang, Quanquan Gu
16 Aug 2018

Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration
Soham De, Anirbit Mukherjee, Enayat Ullah
18 Jul 2018

Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu
18 Jun 2018

AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
Rachel A. Ward, Xiaoxia Wu, Léon Bottou
05 Jun 2018

Phocas: dimensional Byzantine-resilient stochastic gradient descent
Cong Xie, Oluwasanmi Koyejo, Indranil Gupta
23 May 2018

Generalized Byzantine-tolerant SGD
Cong Xie, Oluwasanmi Koyejo, Indranil Gupta
27 Feb 2018

BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
Ziming Zhang, Yuanwei Wu, Guanghui Wang
19 Nov 2017

Sequence Prediction with Neural Segmental Models
Hao Tang
05 Sep 2017