ResearchTrend.AI
Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
arXiv:1706.05507
17 June 2017
Mahesh Chandra Mukkamala
Matthias Hein

Papers citing "Variants of RMSProp and Adagrad with Logarithmic Regret Bounds"

50 papers shown.

Preconditioned Inexact Stochastic ADMM for Deep Model
Shenglong Zhou, Ouya Wang, Ziyan Luo, Yongxu Zhu, Geoffrey Ye Li
15 Feb 2025

An Improved Empirical Fisher Approximation for Natural Gradient Descent
Xiaodong Wu, Wenyi Yu, Chao Zhang, Philip Woodland
10 Jun 2024

Bridging Classical and Quantum Machine Learning: Knowledge Transfer From Classical to Quantum Neural Networks Using Knowledge Distillation
Mohammad Junayed Hasan, M.R.C. Mahdy
23 Nov 2023

Boosting Diffusion Models with an Adaptive Momentum Sampler
Xiyu Wang, Anh-Dung Dinh, Daochang Liu, Chang Xu
23 Aug 2023

Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Meixuan He, Yuqing Liang, Jinlan Liu, Dongpo Xu
20 Jul 2023

Time Optimal Ergodic Search
Dayi Dong, Henry Berger, Ian Abraham
19 May 2023

Skeleton-based Human Action Recognition via Convolutional Neural Networks (CNN)
Ayman Ali, Ekkasit Pinyoanuntapong, Pu Wang, Mohsen Dorodchi
31 Jan 2023

Estimation of Sea State Parameters from Ship Motion Responses Using Attention-based Neural Networks
Denis Selimovic, Franko Hržić, J. Prpić-Oršić, J. Lerga
21 Jan 2023

Federated Coordinate Descent for Privacy-Preserving Multiparty Linear Regression
Xinlin Leng, Chenxu Li, Weifeng Xu, Yuyan Sun, Hongtao Wang
16 Sep 2022

Dynamic Regret of Adaptive Gradient Methods for Strongly Convex Problems
Parvin Nazari, E. Khorram
04 Sep 2022

Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
Junchi Yang, Xiang Li, Niao He
01 Jun 2022

AnoMili: Spoofing Prevention and Explainable Anomaly Detection for the 1553 Military Avionic Bus
Efrat Levy, Nadav Maman, A. Shabtai, Yuval Elovici
14 Feb 2022

Private Adaptive Optimization with Side Information
Tian Li, Manzil Zaheer, Sashank J. Reddi, Virginia Smith
12 Feb 2022

A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren, Leonard Berrada, Rudra P. K. Poudel, M. P. Kumar
29 Jan 2022

On the One-sided Convergence of Adam-type Algorithms in Non-convex Non-concave Min-max Optimization
Zehao Dou, Yuanzhi Li
29 Sep 2021

AdaLoss: A computationally-efficient and provably convergent adaptive gradient method
Xiaoxia Wu, Yuege Xie, S. Du, Rachel A. Ward
17 Sep 2021

KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan, Sepehr Sameni, L. Cerkezi, Givi Meishvili, Adam Bielski, Paolo Favaro
07 Jul 2021

FastAdaBelief: Improving Convergence Rate for Belief-based Adaptive Optimizers by Exploiting Strong Convexity
Yangfan Zhou, Kaizhu Huang, Cheng Cheng, Xuguang Wang, Amir Hussain, Xin Liu
28 Apr 2021

Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent
Guangzeng Xie, Hao Jin, Dachao Lin, Zhihua Zhang
12 Apr 2021

The Role of Momentum Parameters in the Optimal Convergence of Adaptive Polyak's Heavy-ball Methods
Wei Tao, Sheng Long, Gao-wei Wu, Qing Tao
15 Feb 2021

Gravity Optimizer: a Kinematic Approach on Optimization in Deep Learning
Dariush Bahrami, Sadegh Pouriyan Zadeh
22 Jan 2021

Efficient Semi-Implicit Variational Inference
Vincent Moens, Hang Ren, A. Maraval, Rasul Tutunov, Jun Wang, H. Ammar
15 Jan 2021

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration
Congliang Chen, Li Shen, Fangyu Zou, Wei Liu
14 Jan 2021

Adam revisited: a weighted past gradients perspective
Hui Zhong, Zaiyi Chen, Chuan Qin, Zai Huang, V. Zheng, Tong Xu, Enhong Chen
01 Jan 2021

Gradient Descent Averaging and Primal-dual Averaging for Strongly Convex Optimization
Wei Tao, Wei Li, Zhisong Pan, Qing Tao
29 Dec 2020

On Generalization of Adaptive Methods for Over-parameterized Linear Regression
Vatsal Shah, Soumya Basu, Anastasios Kyrillidis, Sujay Sanghavi
28 Nov 2020

A Comparison of Optimization Algorithms for Deep Learning
Derya Soydaner
28 Jul 2020

Towards Learning Convolutions from Scratch
Behnam Neyshabur
27 Jul 2020

Binary Search and First Order Gradient Based Method for Stochastic Optimization
V. Pandey
27 Jul 2020

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt, Frank Schneider, Philipp Hennig
03 Jul 2020

Gradient-only line searches to automatically determine learning rates for a variety of stochastic training algorithms
D. Kafka, D. Wilke
29 Jun 2020

Compositional ADAM: An Adaptive Compositional Solver
Rasul Tutunov, Minne Li, Alexander I. Cowen-Rivers, Jun Wang, Haitham Bou-Ammar
10 Feb 2020

Bregman Proximal Framework for Deep Linear Neural Networks
Mahesh Chandra Mukkamala, Felix Westerkamp, Emanuel Laude, Zorah Lähner, Peter Ochs
08 Oct 2019

A CNN-based approach to classify cricket bowlers based on their bowling actions
M. N. A. Islam, Tanzil Bin Hassan, Siamul Karim Khan
03 Sep 2019

Linear Convergence of Adaptive Stochastic Gradient Descent
Yuege Xie, Xiaoxia Wu, Rachel A. Ward
28 Aug 2019

DeepDA: LSTM-based Deep Data Association Network for Multi-Targets Tracking in Clutter
Huajun Liu, Hui Zhang, Christoph Mertz
16 Jul 2019

Training Neural Networks for and by Interpolation
Leonard Berrada, Andrew Zisserman, M. P. Kumar
13 Jun 2019

Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms
Mahesh Chandra Mukkamala, Peter Ochs
22 May 2019

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
Xiaoxia Wu, S. Du, Rachel A. Ward
19 Feb 2019

Escaping Saddle Points with Adaptive Gradient Methods
Matthew Staib, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar, S. Sra
26 Jan 2019

A Sufficient Condition for Convergences of Adam and RMSProp
Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu
23 Nov 2018

Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization
Anas Barakat, Pascal Bianchi
04 Oct 2018

On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Dongruo Zhou, Yiqi Tang, Yuan Cao, Ziyan Yang, Quanquan Gu
16 Aug 2018

Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration
Soham De, Anirbit Mukherjee, Enayat Ullah
18 Jul 2018

Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu
18 Jun 2018

AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
Rachel A. Ward, Xiaoxia Wu, Léon Bottou
05 Jun 2018

Phocas: dimensional Byzantine-resilient stochastic gradient descent
Cong Xie, Oluwasanmi Koyejo, Indranil Gupta
23 May 2018

Generalized Byzantine-tolerant SGD
Cong Xie, Oluwasanmi Koyejo, Indranil Gupta
27 Feb 2018

BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
Ziming Zhang, Yuanwei Wu, Guanghui Wang
19 Nov 2017

Sequence Prediction with Neural Segmental Models
Hao Tang
05 Sep 2017