1803.02865
Cited By
WNGrad: Learn the Learning Rate in Gradient Descent
7 March 2018
Xiaoxia Wu
Rachel A. Ward
Léon Bottou
Papers citing
"WNGrad: Learn the Learning Rate in Gradient Descent"
15 / 15 papers shown
Recent Advances in Non-convex Smoothness Conditions and Applicability to Deep Linear Neural Networks
Vivak Patel
Christian Varner
28
0
0
20 Sep 2024
A Novel Gradient Methodology with Economical Objective Function Evaluations for Data Science Applications
Christian Varner
Vivak Patel
21
2
0
19 Sep 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
Yuan Cao
Difan Zou
Yuan-Fang Li
Quanquan Gu
MLT
29
5
0
20 Jun 2023
On the Weight Dynamics of Deep Normalized Networks
Christian H. X. Ali Mehmeti-Göpel
Michael Wand
32
1
0
01 Jun 2023
Robust Implicit Regularization via Weight Normalization
H. Chou
Holger Rauhut
Rachel A. Ward
28
7
0
09 May 2023
Adaptive Gradient Methods with Local Guarantees
Zhou Lu
Wenhan Xia
Sanjeev Arora
Elad Hazan
ODL
22
9
0
02 Mar 2022
A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren
Leonard Berrada
Rudra P. K. Poudel
M. P. Kumar
24
4
0
29 Jan 2022
KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan
Sepehr Sameni
L. Cerkezi
Givi Meishvili
Adam Bielski
Paolo Favaro
ODL
53
2
0
07 Jul 2021
Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
Kushal Chakrabarti
Nikhil Chopra
ODL
AI4CE
31
9
0
31 May 2021
Flexible numerical optimization with ensmallen
Ryan R. Curtin
Marcus Edel
Rahul Prabhu
S. Basak
Zhihao Lou
Conrad Sanderson
18
1
0
09 Mar 2020
LOSSGRAD: automatic learning rate in gradient descent
B. Wójcik
Lukasz Maziarka
Jacek Tabor
ODL
32
4
0
20 Feb 2019
Theoretical Analysis of Auto Rate-Tuning by Batch Normalization
Sanjeev Arora
Zhiyuan Li
Kaifeng Lyu
28
130
0
10 Dec 2018
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
Rachel A. Ward
Xiaoxia Wu
Léon Bottou
ODL
19
358
0
05 Jun 2018
On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes
Xiaoyun Li
Francesco Orabona
32
290
0
21 May 2018
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir
Tong Zhang
101
570
0
08 Dec 2012