DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Maor Ivgi, Oliver Hinder, Y. Carmon · 8 February 2023 · arXiv:2302.12022 · ODL
Papers citing "DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule" (50 of 51 shown)
Reasoning without Regret · Tarun Chitra · 14 Apr 2025 · OffRL, LRM
Analysis of an Idealized Stochastic Polyak Method and its Application to Black-Box Model Distillation · Robert M. Gower, Guillaume Garrigos, Nicolas Loizou, Dimitris Oikonomou, Konstantin Mishchenko, Fabian Schaipp · 02 Apr 2025
Benefits of Learning Rate Annealing for Tuning-Robustness in Stochastic Optimization · Amit Attia, Tomer Koren · 13 Mar 2025
Towards hyperparameter-free optimization with differential privacy · Zhiqi Bu, Ruixuan Liu · 02 Mar 2025
A Hessian-informed hyperparameter optimization for differential learning rate · Shiyun Xu, Zhiqi Bu, Yiliang Zhang, Ian J. Barnett · 12 Jan 2025
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints · Alberto Maté, Mariella Dimiccoli · 27 Dec 2024 · AI4TS
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate · Zhiqi Bu, Xiaomeng Jin, Bhanukiran Vinzamuri, Anil Ramakrishna, Kai-Wei Chang, V. Cevher, Mingyi Hong · 29 Oct 2024 · MU
Tuning-free coreset Markov chain Monte Carlo · Naitong Chen, Jonathan H. Huggins, Trevor Campbell · 24 Oct 2024
Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning · Lequan Lin, Dai Shi, Andi Han, Zhiyong Wang, Junbin Gao · 08 Oct 2024
Old Optimizer, New Norm: An Anthology · Jeremy Bernstein, Laker Newhouse · 30 Sep 2024 · ODL
State-free Reinforcement Learning · Mingyu Chen, Aldo Pacchiano, Xuezhou Zhang · 27 Sep 2024
Stepping on the Edge: Curvature Aware Learning Rate Tuners · Vincent Roulet, Atish Agarwala, Jean-Bastien Grill, Grzegorz Swirszcz, Mathieu Blondel, Fabian Pedregosa · 08 Jul 2024
Dreamguider: Improved Training free Diffusion-based Conditional Generation · Nithin Gopalakrishnan Nair, Vishal M. Patel · 04 Jun 2024
Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds · Daniel Dodd, Louis Sharrock, Christopher Nemeth · 04 Jun 2024
Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions · Wei Jiang, Sifan Yang, Yibo Wang, Lijun Zhang · 04 Jun 2024
Fully Unconstrained Online Learning · Ashok Cutkosky, Zakaria Mhammedi · 30 May 2024 · CLL
The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms · Elizabeth Collins-Woodfin, Inbar Seroussi, Begoña García Malaxechebarría, Andrew W. Mackenzie, Elliot Paquette, Courtney Paquette · 30 May 2024
Scalable Optimization in the Modular Norm · Tim Large, Yang Liu, Minyoung Huh, Hyojin Bahng, Phillip Isola, Jeremy Bernstein · 23 May 2024
Unleash Graph Neural Networks from Heavy Tuning · Lequan Lin, Dai Shi, Andi Han, Zhiyong Wang, Junbin Gao · 21 May 2024 · AI4CE
Towards Stability of Parameter-free Optimization · Yijiang Pang, Shuyang Yu, Hoang Bao, Jiayu Zhou · 07 May 2024
On the Last-Iterate Convergence of Shuffling Gradient Methods · Zijian Liu, Zhengyuan Zhou · 12 Mar 2024
The Price of Adaptivity in Stochastic Convex Optimization · Y. Carmon, Oliver Hinder · 16 Feb 2024
Problem-Parameter-Free Decentralized Nonconvex Stochastic Optimization · Jiaxiang Li, Xuxing Chen, Shiqian Ma, Mingyi Hong · 13 Feb 2024 · ODL
Corridor Geometry in Gradient-Based Optimization · Benoit Dherin, M. Rosca · 13 Feb 2024
Tuning-Free Stochastic Optimization · Ahmed Khaled, Chi Jin · 12 Feb 2024
How Free is Parameter-Free Stochastic Optimization? · Amit Attia, Tomer Koren · 05 Feb 2024 · ODL
Discounted Adaptive Online Learning: Towards Better Regularization · Zhiyu Zhang, David Bombara, Heng Yang · 05 Feb 2024
MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters · Arsalan Sharifnassab, Saber Salehkaleybar, Richard Sutton · 04 Feb 2024
Stochastic Weakly Convex Optimization Beyond Lipschitz Continuity · Wenzhi Gao, Qi Deng · 25 Jan 2024
Embedded Hyperspectral Band Selection with Adaptive Optimization for Image Semantic Segmentation · Yaniv Zimmer, Oren Glickman · 21 Jan 2024
Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization · Min-Kook Suh, Seung-Woo Seo · 06 Jan 2024 · ODL
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms · Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Robert Mansel Gower, Martin Takáč · 28 Dec 2023
On the convergence of adaptive first order methods: proximal gradient and alternating minimization algorithms · P. Latafat, Andreas Themelis, Panagiotis Patrinos · 30 Nov 2023
Locally Optimal Descent for Dynamic Stepsize Scheduling · Gilad Yehudai, Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain · 23 Nov 2023
A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions · Y. Carmon, A. Jambulapati, Yujia Jin, Aaron Sidford · 17 Nov 2023
A simple uniformly optimal method without line search for convex optimization · Tianjiao Li, Guanghui Lan · 16 Oct 2023
Small-scale proxies for large-scale Transformer training instabilities · Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, A. Alemi, ..., Jascha Narain Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith · 25 Sep 2023
ELRA: Exponential learning rate adaption gradient descent optimization method · Alexander Kleinsorge, Stefan Kupper, Alexander Fauck, Felix Rothe · 12 Sep 2023 · ODL
Normalized Gradients for All · Francesco Orabona · 10 Aug 2023
Adaptive Proximal Gradient Method for Convex Optimization · Yura Malitsky, Konstantin Mishchenko · 04 Aug 2023
Prodigy: An Expeditiously Adaptive Parameter-Free Learner · Konstantin Mishchenko, Aaron Defazio · 09 Jun 2023 · ODL
Mechanic: A Learning Rate Tuner · Ashok Cutkosky, Aaron Defazio, Harsh Mehta · 31 May 2023 · OffRL
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method · Ahmed Khaled, Konstantin Mishchenko, Chi Jin · 25 May 2023 · ODL
Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises: High-Probability Bound, In-Expectation Rate and Initial Distance Adaptation · Zijian Liu, Zhengyuan Zhou · 22 Mar 2023
Learning-Rate-Free Learning by D-Adaptation · Aaron Defazio, Konstantin Mishchenko · 18 Jan 2023
On the distance between two neural networks and the stability of learning · Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu · 09 Feb 2020 · ODL
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding · Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman · 20 Apr 2018 · ELM
L4: Practical loss-based stepsize adaptation for deep learning · Michal Rolínek, Georg Martius · 14 Feb 2018 · ODL
Densely Connected Convolutional Networks · Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger · 25 Aug 2016 · PINN, 3DV
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition · Hamed Karimi, J. Nutini, Mark W. Schmidt · 16 Aug 2016