ResearchTrend.AI
Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent

7 December 2020
Kangqiao Liu, Liu Ziyin, Masahito Ueda
MLT
arXiv: 2012.03636 (abs / PDF / HTML)

Papers citing "Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent"

27 / 27 papers shown
LLM-Assisted Modeling of Semantic Web-Enabled Multi-Agents Systems with AJAN
Hacane Hechehouche, Andre Antakli, Matthias Klusch
LLMAG, 3DV
08 Oct 2025
SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
Ildus Sadrtdinov, Ivan Klimov, E. Lobacheva, Dmitry Vetrov
29 May 2025
Homeostatic Ubiquity of Hebbian Dynamics in Regularized Learning Rules
David Koplow, Tomaso Poggio, Liu Ziyin
MLT, FedML
23 May 2025
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training. International Conference on Learning Representations (ICLR), 2024
Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan
AAML
14 Oct 2024
How Learning Dynamics Drive Adversarially Robust Generalization?
Yuelin Xu, Xiao Zhang
AAML
10 Oct 2024
Formation of Representations in Neural Networks. International Conference on Learning Representations (ICLR), 2024
Liu Ziyin, Isaac Chuang, Tomer Galanti, T. Poggio
03 Oct 2024
Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum
H. Cai, Sulaiman A. Alghunaim, Ali H. Sayed
18 Jun 2024
Do Parameters Reveal More than Loss for Membership Inference?
Anshuman Suri, Xiao Zhang, David Evans
MIACV, MIALM, AAML
17 Jun 2024
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie, Guy Gur-Ari, Zohar Ringel
07 Feb 2024
Weight fluctuations in (deep) linear neural networks and a derivation of the inverse-variance flatness relation. Physical Review Research (Phys. Rev. Res.), 2023
Markus Gross, A. Raulf, Christoph Räth
23 Nov 2023
A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang, Lei Wu
01 Oct 2023
Exact Mean Square Linear Stability Analysis for SGD. Annual Conference on Computational Learning Theory (COLT), 2023
Rotem Mulayoff, T. Michaeli
MLT
13 Jun 2023
Anti-Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances in Flat Directions
Marcel Kühn, B. Rosenow
08 Jun 2023
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent. International Conference on Machine Learning (ICML), 2023
Tongtian Zhu, Fengxiang He, Kaixuan Chen, Weilong Dai, Dacheng Tao
05 Jun 2023
Enhance Diffusion to Improve Robust Generalization. Knowledge Discovery and Data Mining (KDD), 2023
Jianhui Sun, Sanchit Sinha, Aidong Zhang
05 Jun 2023
The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent. International Conference on Machine Learning (ICML), 2023
Lei Wu, Weijie J. Su
MLT
27 May 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin
03 Feb 2023
On the Lipschitz Constant of Deep Networks and Double Descent. British Machine Vision Conference (BMVC), 2023
Matteo Gamba, Hossein Azizpour, Mårten Björkman
28 Jan 2023
On the Overlooked Structure of Stochastic Gradients. Neural Information Processing Systems (NeurIPS), 2022
Zeke Xie, Qian-Yuan Tang, Mingming Sun, P. Li
05 Dec 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States. Conference on Uncertainty in Artificial Intelligence (UAI), 2022
Ziqiao Wang, Yongyi Mao
19 Nov 2022
Exact Solutions of a Deep Linear Network. Neural Information Processing Systems (NeurIPS), 2022
Liu Ziyin, Botao Li, Xiangming Meng
ODL
10 Feb 2022
Stochastic Neural Networks with Infinite Width are Deterministic
Liu Ziyin, Hanlin Zhang, Xiangming Meng, Yuting Lu, Eric P. Xing, Masahito Ueda
30 Jan 2022
SGD with a Constant Large Learning Rate Can Converge to Local Maxima
Liu Ziyin, Botao Li, James B. Simon, Masahito Ueda
25 Jul 2021
The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion. Neural Computation (Neural Comput.), 2021
D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins
19 Jul 2021
Power-law escape rate of SGD. International Conference on Machine Learning (ICML), 2021
Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda
20 May 2021
On the Distributional Properties of Adaptive Gradients. Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Z. Zhiyi, Liu Ziyin
15 May 2021
Strength of Minibatch Noise in SGD. International Conference on Learning Representations (ICLR), 2021
Liu Ziyin, Kangqiao Liu, Takashi Mori, Masahito Ueda
ODL, MLT
10 Feb 2021