DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
arXiv:2302.12022 · 8 February 2023
Maor Ivgi, Oliver Hinder, Yair Carmon
[ODL]
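For context, the paper's DoG ("Distance over Gradients") rule replaces a hand-tuned SGD learning rate with a dynamic step size built from two running statistics: the largest distance traveled from the initial point, and the cumulative squared gradient norm. Below is a minimal sketch of that rule in plain NumPy; grad_fn is a hypothetical stochastic-gradient oracle and r_eps stands in for the paper's small initial movement parameter (these names are illustrative, not the paper's):

```python
import numpy as np

def dog_sgd(grad_fn, x0, steps=1000, r_eps=1e-6):
    """Sketch of the DoG step size: eta_t = rbar_t / sqrt(sum_i ||g_i||^2).

    grad_fn is a hypothetical oracle returning a stochastic gradient at x;
    r_eps seeds the initial "distance" in place of a tuned learning rate.
    """
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    rbar = r_eps        # max distance from x0 seen so far (rbar_0 = r_eps)
    g_sq_sum = 0.0      # running sum of squared gradient norms
    for _ in range(steps):
        g = grad_fn(x)
        g_sq_sum += float(np.dot(g, g))
        eta = rbar / np.sqrt(g_sq_sum)   # parameter-free step size
        x = x - eta * g
        rbar = max(rbar, float(np.linalg.norm(x - x0)))
    return x
```

Note that the paper's convergence guarantees are stated for a weighted average of the iterates; returning the last iterate here is a simplification to keep the sketch short.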

Papers citing "DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule" (50 of 51 shown):
- Reasoning without Regret (Tarun Chitra; 14 Apr 2025) [OffRL, LRM]
- Analysis of an Idealized Stochastic Polyak Method and its Application to Black-Box Model Distillation (Robert M. Gower, Guillaume Garrigos, Nicolas Loizou, Dimitris Oikonomou, Konstantin Mishchenko, Fabian Schaipp; 02 Apr 2025)
- Benefits of Learning Rate Annealing for Tuning-Robustness in Stochastic Optimization (Amit Attia, Tomer Koren; 13 Mar 2025)
- Towards hyperparameter-free optimization with differential privacy (Zhiqi Bu, Ruixuan Liu; 02 Mar 2025)
- A Hessian-informed hyperparameter optimization for differential learning rate (Shiyun Xu, Zhiqi Bu, Yiliang Zhang, Ian J. Barnett; 12 Jan 2025)
- Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints (Alberto Maté, Mariella Dimiccoli; 27 Dec 2024) [AI4TS]
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate (Zhiqi Bu, Xiaomeng Jin, Bhanukiran Vinzamuri, Anil Ramakrishna, Kai-Wei Chang, V. Cevher, Mingyi Hong; 29 Oct 2024) [MU]
- Tuning-free coreset Markov chain Monte Carlo (Naitong Chen, Jonathan H. Huggins, Trevor Campbell; 24 Oct 2024)
- Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning (Lequan Lin, Dai Shi, Andi Han, Zhiyong Wang, Junbin Gao; 08 Oct 2024)
- Old Optimizer, New Norm: An Anthology (Jeremy Bernstein, Laker Newhouse; 30 Sep 2024) [ODL]
- State-free Reinforcement Learning (Mingyu Chen, Aldo Pacchiano, Xuezhou Zhang; 27 Sep 2024)
- Stepping on the Edge: Curvature Aware Learning Rate Tuners (Vincent Roulet, Atish Agarwala, Jean-Bastien Grill, Grzegorz Swirszcz, Mathieu Blondel, Fabian Pedregosa; 08 Jul 2024)
- Dreamguider: Improved Training free Diffusion-based Conditional Generation (Nithin Gopalakrishnan Nair, Vishal M. Patel; 04 Jun 2024)
- Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds (Daniel Dodd, Louis Sharrock, Christopher Nemeth; 04 Jun 2024)
- Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions (Wei Jiang, Sifan Yang, Yibo Wang, Lijun Zhang; 04 Jun 2024)
- Fully Unconstrained Online Learning (Ashok Cutkosky, Zakaria Mhammedi; 30 May 2024) [CLL]
- The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms (Elizabeth Collins-Woodfin, Inbar Seroussi, Begona García Malaxechebarría, Andrew W. Mackenzie, Elliot Paquette, Courtney Paquette; 30 May 2024)
- Scalable Optimization in the Modular Norm (Tim Large, Yang Liu, Minyoung Huh, Hyojin Bahng, Phillip Isola, Jeremy Bernstein; 23 May 2024)
- Unleash Graph Neural Networks from Heavy Tuning (Lequan Lin, Dai Shi, Andi Han, Zhiyong Wang, Junbin Gao; 21 May 2024) [AI4CE]
- Towards Stability of Parameter-free Optimization (Yijiang Pang, Shuyang Yu, Hoang Bao, Jiayu Zhou; 07 May 2024)
- On the Last-Iterate Convergence of Shuffling Gradient Methods (Zijian Liu, Zhengyuan Zhou; 12 Mar 2024)
- The Price of Adaptivity in Stochastic Convex Optimization (Y. Carmon, Oliver Hinder; 16 Feb 2024)
- Problem-Parameter-Free Decentralized Nonconvex Stochastic Optimization (Jiaxiang Li, Xuxing Chen, Shiqian Ma, Mingyi Hong; 13 Feb 2024) [ODL]
- Corridor Geometry in Gradient-Based Optimization (Benoit Dherin, M. Rosca; 13 Feb 2024)
- Tuning-Free Stochastic Optimization (Ahmed Khaled, Chi Jin; 12 Feb 2024)
- How Free is Parameter-Free Stochastic Optimization? (Amit Attia, Tomer Koren; 05 Feb 2024) [ODL]
- Discounted Adaptive Online Learning: Towards Better Regularization (Zhiyu Zhang, David Bombara, Heng Yang; 05 Feb 2024)
- MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters (Arsalan Sharifnassab, Saber Salehkaleybar, Richard Sutton; 04 Feb 2024)
- Stochastic Weakly Convex Optimization Beyond Lipschitz Continuity (Wenzhi Gao, Qi Deng; 25 Jan 2024)
- Embedded Hyperspectral Band Selection with Adaptive Optimization for Image Semantic Segmentation (Yaniv Zimmer, Oren Glickman; 21 Jan 2024)
- Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization (Min-Kook Suh, Seung-Woo Seo; 06 Jan 2024) [ODL]
- SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms (Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Robert Mansel Gower, Martin Takáč; 28 Dec 2023)
- On the convergence of adaptive first order methods: proximal gradient and alternating minimization algorithms (P. Latafat, Andreas Themelis, Panagiotis Patrinos; 30 Nov 2023)
- Locally Optimal Descent for Dynamic Stepsize Scheduling (Gilad Yehudai, Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain; 23 Nov 2023)
- A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions (Y. Carmon, A. Jambulapati, Yujia Jin, Aaron Sidford; 17 Nov 2023)
- A simple uniformly optimal method without line search for convex optimization (Tianjiao Li, Guanghui Lan; 16 Oct 2023)
- Small-scale proxies for large-scale Transformer training instabilities (Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, A. Alemi, ..., Jascha Narain Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith; 25 Sep 2023)
- ELRA: Exponential learning rate adaption gradient descent optimization method (Alexander Kleinsorge, Stefan Kupper, Alexander Fauck, Felix Rothe; 12 Sep 2023) [ODL]
- Normalized Gradients for All (Francesco Orabona; 10 Aug 2023)
- Adaptive Proximal Gradient Method for Convex Optimization (Yura Malitsky, Konstantin Mishchenko; 04 Aug 2023)
- Prodigy: An Expeditiously Adaptive Parameter-Free Learner (Konstantin Mishchenko, Aaron Defazio; 09 Jun 2023) [ODL]
- Mechanic: A Learning Rate Tuner (Ashok Cutkosky, Aaron Defazio, Harsh Mehta; 31 May 2023) [OffRL]
- DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method (Ahmed Khaled, Konstantin Mishchenko, Chi Jin; 25 May 2023) [ODL]
- Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises: High-Probability Bound, In-Expectation Rate and Initial Distance Adaptation (Zijian Liu, Zhengyuan Zhou; 22 Mar 2023)
- Learning-Rate-Free Learning by D-Adaptation (Aaron Defazio, Konstantin Mishchenko; 18 Jan 2023)
- On the distance between two neural networks and the stability of learning (Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu; 09 Feb 2020) [ODL]
- GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman; 20 Apr 2018) [ELM]
- L4: Practical loss-based stepsize adaptation for deep learning (Michal Rolínek, Georg Martius; 14 Feb 2018) [ODL]
- Densely Connected Convolutional Networks (Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger; 25 Aug 2016) [PINN, 3DV]
- Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition (Hamed Karimi, J. Nutini, Mark W. Schmidt; 16 Aug 2016)