L4: Practical loss-based stepsize adaptation for deep learning

14 February 2018
Michal Rolínek
Georg Martius
    ODL
arXiv:1802.05074

Papers citing "L4: Practical loss-based stepsize adaptation for deep learning"

30 / 30 papers shown
An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Antonio Orvieto
Lin Xiao
05 Jul 2024

Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou
Nicolas Loizou
06 Jun 2024

Single-Call Stochastic Extragradient Methods for Structured Non-monotone Variational Inequalities: Improved Analysis under Weaker Conditions
Neural Information Processing Systems (NeurIPS), 2023
S. Choudhury
Eduard A. Gorbunov
Nicolas Loizou
27 Feb 2023

QLABGrad: a Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning
AAAI Conference on Artificial Intelligence (AAAI), 2023
Minghan Fu
Fang-Xiang Wu
ODL
01 Feb 2023

Making SGD Parameter-Free
Annual Conference Computational Learning Theory (COLT), 2022
Y. Carmon
Oliver Hinder
04 May 2022

Amortized Proximal Optimization
Neural Information Processing Systems (NeurIPS), 2022
Juhan Bae
Paul Vicol
Jeff Z. HaoChen
Roger C. Grosse
ODL
28 Feb 2022

A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren
Leonard Berrada
Rudra P. K. Poudel
M. P. Kumar
29 Jan 2022

Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize
Ryan D'Orazio
Nicolas Loizou
I. Laradji
Ioannis Mitliagkas
28 Oct 2021

Using a one dimensional parabolic model of the full-batch loss to estimate learning rates during training
Max Mutschler
Kevin Laube
A. Zell
ODL
31 Aug 2021

KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan
Sepehr Sameni
L. Cerkezi
Givi Meishvili
Adam Bielski
Paolo Favaro
ODL
07 Jul 2021

LRTuner: A Learning Rate Tuner for Deep Neural Networks
Nikhil Iyer
V. Thejas
Nipun Kwatra
Ramachandran Ramjee
Muthian Sivathanu
ODL
30 May 2021

Empirically explaining SGD from a line search perspective
International Conference on Artificial Neural Networks (ICANN), 2021
Max Mutschler
A. Zell
ODL, LRM
31 Mar 2021

How to decay your learning rate
Aitor Lewkowycz
23 Mar 2021

A Probabilistically Motivated Learning Rate Adaptation for Stochastic Optimization
Filip de Roos
Carl Jidling
A. Wills
Thomas B. Schon
Philipp Hennig
22 Feb 2021

Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering
Ricky T. Q. Chen
Dami Choi
Lukas Balles
David Duvenaud
Philipp Hennig
ODL
09 Nov 2020

A straightforward line search approach on the expected empirical loss for stochastic deep learning problems
Max Mutschler
A. Zell
02 Oct 2020

Adaptive Hierarchical Hyper-gradient Descent
Renlong Jie
Junbin Gao
A. Vasnev
Minh-Ngoc Tran
17 Aug 2020

MLR-SNet: Transferable LR Schedules for Heterogeneous Tasks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Jun Shu
Yanwen Zhu
Qian Zhao
Zongben Xu
Deyu Meng
29 Jul 2020

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt
Frank Schneider
Philipp Hennig
ODL
03 Jul 2020

SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower
Othmane Sebbouh
Nicolas Loizou
18 Jun 2020

AdaS: Adaptive Scheduling of Stochastic Gradients
Mahdi S. Hosseini
Konstantinos N. Plataniotis
ODL
11 Jun 2020

Generalized Reinforcement Meta Learning for Few-Shot Optimization
R. Anantha
S. Pulman
Srinivas Chappidi
OffRL
04 May 2020

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Nicolas Loizou
Sharan Vaswani
I. Laradji
Damien Scieur
24 Feb 2020

Training Neural Networks for and by Interpolation
International Conference on Machine Learning (ICML), 2019
Leonard Berrada
Andrew Zisserman
M. P. Kumar
3DH
13 Jun 2019

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Neural Information Processing Systems (NeurIPS), 2019
Sharan Vaswani
Aaron Mishkin
I. Laradji
Mark Schmidt
Gauthier Gidel
Damien Scieur
ODL
24 May 2019

Parabolic Approximation Line Search for DNNs
Max Mutschler
A. Zell
ODL
28 Mar 2019

DeepOBS: A Deep Learning Optimizer Benchmark Suite
Frank Schneider
Lukas Balles
Philipp Hennig
ODL
13 Mar 2019

LOSSGRAD: automatic learning rate in gradient descent
B. Wójcik
Lukasz Maziarka
Jacek Tabor
ODL
20 Feb 2019

Collaborative Sampling in Generative Adversarial Networks
Yuejiang Liu
Parth Kothari
Alexandre Alahi
TTA
02 Feb 2019

Step Size Matters in Deep Learning
Kamil Nar
S. Shankar Sastry
22 May 2018