The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion

19 July 2021
D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins

Papers citing "The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion"

13 citing papers
Weight fluctuations in (deep) linear neural networks and a derivation of the inverse-variance flatness relation
Markus Gross, A. Raulf, Christoph Räth
23 Nov 2023
Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances
Marcel Kühn, B. Rosenow
08 Jun 2023
Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks
F. Chen, D. Kunin, Atsushi Yamamura, Surya Ganguli
07 Jun 2023
Machine learning in and out of equilibrium
Shishir Adhikari, Alkan Kabakçıoğlu, A. Strang, Deniz Yuret, M. Hinczewski
06 Jun 2023
Identifying Equivalent Training Dynamics
William T. Redman, J. M. Bello-Rivas, M. Fonoberova, Ryan Mohr, Ioannis G. Kevrekidis, Igor Mezić
17 Feb 2023
Flatter, faster: scaling momentum for optimal speedup of SGD
Aditya Cowsik, T. Can, Paolo Glorioso
28 Oct 2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, D. Kunin, Lei Wu, Lexing Ying
24 Apr 2022
Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization
G. Luca, E. Silverstein
26 Jan 2022
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
J. Zhang, Haochuan Li, S. Sra, Ali Jadbabaie
12 Oct 2021
Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
29 Sep 2021
Revisiting the Characteristics of Stochastic Gradient Noise and Dynamics
Yixin Wu, Rui Luo, Chen Zhang, Jun Wang, Yaodong Yang
20 Sep 2021
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
D. Kunin, Javier Sagastuy-Breña, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka
08 Dec 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
15 Sep 2016