The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares
arXiv:1904.12838, 29 April 2019
Rong Ge, Sham Kakade, Rahul Kidambi, Praneeth Netrapalli
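For context, the schedule named in the title keeps the step size constant within a phase and shrinks it geometrically at phase boundaries. The snippet below is a minimal sketch of that idea only, not the paper's exact procedure; the function name step_decay_lr, the initial rate of 0.1, the decay factor of 0.5, and the roughly log2(T) phase count are illustrative assumptions.

```python
import math

def step_decay_lr(step, total_steps, lr0=0.1, decay=0.5):
    """Illustrative step decay: split the horizon into ~log2(total_steps)
    equal-length phases and multiply the rate by `decay` at each phase
    boundary. All constants are placeholders, not values from the paper."""
    num_phases = max(1, int(math.log2(total_steps)))
    phase_len = max(1, total_steps // num_phases)
    phase = min(step // phase_len, num_phases - 1)
    return lr0 * decay ** phase

# Example: inspect the schedule over a 1024-step run.
for t in (0, 128, 256, 512, 1023):
    print(t, step_decay_lr(t, 1024))
```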
Papers citing "The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares" (19 of 69 shown)
Adaptive Hierarchical Hyper-gradient Descent. Renlong Jie, Junbin Gao, A. Vasnev, Minh-Ngoc Tran. 17 Aug 2020.
MLR-SNet: Transferable LR Schedules for Heterogeneous Tasks. Jun Shu, Yanwen Zhu, Qian Zhao, Zongben Xu, Deyu Meng. 29 Jul 2020.
EfficientHRNet: Efficient Scaling for Lightweight High-Resolution Multi-Person Pose Estimation. Christopher Neff, A. Sheth, Steven Furgurson, Hamed Tabkhi. 16 Jul 2020.
Double-Loop Unadjusted Langevin Algorithm. Paul Rolland, Armin Eftekhari, Ali Kavis, V. Cevher. 02 Jul 2020.
Guarantees for Tuning the Step Size using a Learning-to-Learn Approach. Xiang Wang, Shuai Yuan, Chenwei Wu, Rong Ge. 30 Jun 2020.
On the Generalization Benefit of Noise in Stochastic Gradient Descent. Samuel L. Smith, Erich Elsen, Soham De. 26 Jun 2020.
Understanding the Role of Training Regimes in Continual Learning. Seyed Iman Mirzadeh, Mehrdad Farajtabar, Razvan Pascanu, H. Ghasemzadeh. 12 Jun 2020.
Warwick Electron Microscopy Datasets. Jeffrey M. Ede. 02 Mar 2020.
Disentangling Adaptive Gradient Methods from Learning Rates. Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang. 26 Feb 2020.
A Second look at Exponential and Cosine Step Sizes: Simplicity, Adaptivity, and Performance. Xiaoyun Li, Zhenxun Zhuang, Francesco Orabona. 12 Feb 2020.
Using Statistics to Automate Stochastic Optimization. Hunter Lang, Pengchuan Zhang, Lin Xiao. 21 Sep 2019.
Learning an Adaptive Learning Rate Schedule. Zhen Xu, Andrew M. Dai, Jonas Kemp, Luke Metz. 20 Sep 2019.
From low probability to high confidence in stochastic convex optimization. Damek Davis, Dmitriy Drusvyatskiy, Lin Xiao, Junyu Zhang. 31 Jul 2019.
Stochastic algorithms with geometric step decay converge linearly on sharp functions. Damek Davis, Dmitriy Drusvyatskiy, Vasileios Charisopoulos. 22 Jul 2019.
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks. Yuanzhi Li, Colin Wei, Tengyu Ma. 10 Jul 2019.
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks. Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu. 18 Jun 2018.
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method. Simon Lacoste-Julien, Mark W. Schmidt, Francis R. Bach. 10 Dec 2012.
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes. Ohad Shamir, Tong Zhang. 08 Dec 2012.
Optimal Distributed Online Prediction using Mini-Batches. O. Dekel, Ran Gilad-Bachrach, Ohad Shamir, Lin Xiao. 07 Dec 2010.