Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function

Neural Information Processing Systems (NeurIPS), 2020
14 February 2020
Lingkai Kong, Molei Tao

Papers citing "Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function"

15 / 15 papers shown

Improving Chain-of-Thought Efficiency for Autoregressive Image Generation
Zeqi Gu, Markos Georgopoulos, Xiaoliang Dai, Marjan Ghazvininejad, Chu Wang, ..., Zecheng He, Zijian He, Jiawei Zhou, Abe Davis, Jialiang Wang
07 Oct 2025

Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region
Shuang Liang, Guido Montúfar
29 Sep 2025

High-Precision Modal Analysis of Multimode Waveguides from Amplitudes via Large-Step Nonconvex Optimization
Jingtong Li, Dongting Huang, Minhui Xiong, Mingzhi Li
16 Jul 2025

Leveraging chaotic transients in the training of artificial neural networks
Pedro Jiménez-González, Miguel C. Soriano, Lucas Lacasa
10 Jun 2025

Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett
05 Apr 2025

The boundary of neural network trainability is fractal
Jascha Narain Sohl-Dickstein
09 Feb 2024

On the generalization of learning algorithms that do not converge
Neural Information Processing Systems (NeurIPS), 2022
N. Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka
16 Aug 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Neural Information Processing Systems (NeurIPS), 2022
Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora
14 Jun 2022

Chaotic Regularization and Heavy-Tailed Limits for Deterministic Gradient Descent
Neural Information Processing Systems (NeurIPS), 2022
Soon Hoe Lim, Yijun Wan, Umut Şimşekli
23 May 2022

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Journal of Machine Learning (JML), 2022
Chao Ma, D. Kunin, Lei Wu, Lexing Ying
24 Apr 2022

Gradients are Not All You Need
Luke Metz, C. Freeman, S. Schoenholz, Tal Kachman
10 Nov 2021

Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang, Minshuo Chen, T. Zhao, Molei Tao
07 Oct 2021

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
29 Sep 2021

Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
Liam Hodgkinson, Umut Şimşekli, Rajiv Khanna, Michael W. Mahoney
02 Aug 2021

A novel multi-scale loss function for classification problems in machine learning
Journal of Computational Physics (JCP), 2021
L. Berlyand, Robert Creese, P. Jabin
04 Jun 2021