Where Do Large Learning Rates Lead Us?

Neural Information Processing Systems (NeurIPS), 2024
29 October 2024
Ildus Sadrtdinov
M. Kodryan
Eduard Pokonechny
E. Lobacheva
Dmitry Vetrov
    AI4CE

Papers citing "Where Do Large Learning Rates Lead Us?"

Can Training Dynamics of Scale-Invariant Neural Networks Be Explained by the Thermodynamics of an Ideal Gas?
Ildus Sadrtdinov
E. Lobacheva
Ivan Klimov
Mikhail I. Katsnelson
Dmitry Vetrov
AI4CE
248
0
0
10 Nov 2025
How does the optimizer implicitly bias the model merging loss landscape?
Chenxiang Zhang
Alexander Theus
Damien Teney
Antonio Orvieto
Jun Pang
S. Mauw
MoMe
238
1
0
06 Oct 2025
Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region
Shuang Liang
Guido Montúfar
308
3
0
29 Sep 2025
SFT Doesn't Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs
J. Lin
Zhongruo Wang
Kun Qian
Tian Wang
Arvind Srinivasan
...
Weiqi Zhang
Sujay Sanghavi
C. L. P. Chen
Hyokun Yun
Lihong Li
CLL
443
7
0
25 Sep 2025
SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
Ildus Sadrtdinov
Ivan Klimov
E. Lobacheva
Dmitry Vetrov
290
3
0
29 May 2025
What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Neural Information Processing Systems (NeurIPS), 2020
Vitaly Feldman
Chiyuan Zhang
TDI
720
602
0
09 Aug 2020