arXiv:2012.03636 · v4 (latest)
Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
7 December 2020
Kangqiao Liu, Liu Ziyin, Masahito Ueda
MLT
Papers citing "Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent" (27 / 27 papers shown)
LLM-Assisted Modeling of Semantic Web-Enabled Multi-Agents Systems with AJAN
Hacane Hechehouche, Andre Antakli, Matthias Klusch
LLMAG, 3DV · 249 · 1 · 0 · 08 Oct 2025
SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
Ildus Sadrtdinov, Ivan Klimov, E. Lobacheva, Dmitry Vetrov
290 · 3 · 0 · 29 May 2025
Homeostatic Ubiquity of Hebbian Dynamics in Regularized Learning Rules
David Koplow, Tomaso Poggio, Liu Ziyin
MLT, FedML · 403 · 1 · 0 · 23 May 2025
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
International Conference on Learning Representations (ICLR), 2024
Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan
AAML · 601 · 14 · 0 · 14 Oct 2024
How Learning Dynamics Drive Adversarially Robust Generalization?
Yuelin Xu, Xiao Zhang
AAML · 572 · 0 · 0 · 10 Oct 2024
Formation of Representations in Neural Networks
International Conference on Learning Representations (ICLR), 2024
Liu Ziyin, Isaac Chuang, Tomer Galanti, T. Poggio
603 · 13 · 0 · 03 Oct 2024
Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum
H. Cai, Sulaiman A. Alghunaim, Ali H. Sayed
503 · 1 · 0 · 18 Jun 2024
Do Parameters Reveal More than Loss for Membership Inference?
Anshuman Suri, Xiao Zhang, David Evans
MIACV, MIALM, AAML · 524 · 9 · 0 · 17 Jun 2024
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie, Guy Gur-Ari, Zohar Ringel
383 · 10 · 0 · 07 Feb 2024
Weight fluctuations in (deep) linear neural networks and a derivation of the inverse-variance flatness relation
Physical Review Research (Phys. Rev. Res.), 2023
Markus Gross, A. Raulf, Christoph Räth
611 · 0 · 0 · 23 Nov 2023
A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang, Lei Wu
521 · 5 · 0 · 01 Oct 2023
Exact Mean Square Linear Stability Analysis for SGD
Annual Conference Computational Learning Theory (COLT), 2023
Rotem Mulayoff, T. Michaeli
MLT · 319 · 4 · 0 · 13 Jun 2023
Anti-Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances in Flat Directions
Marcel Kühn, B. Rosenow
422 · 5 · 0 · 08 Jun 2023
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
International Conference on Machine Learning (ICML), 2023
Tongtian Zhu, Fengxiang He, Kaixuan Chen, Weilong Dai, Dacheng Tao
774 · 21 · 0 · 05 Jun 2023
Enhance Diffusion to Improve Robust Generalization
Knowledge Discovery and Data Mining (KDD), 2023
Jianhui Sun, Sanchit Sinha, Aidong Zhang
355 · 4 · 0 · 05 Jun 2023
The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent
International Conference on Machine Learning (ICML), 2023
Lei Wu, Weijie J. Su
MLT · 376 · 39 · 0 · 27 May 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin
506 · 14 · 0 · 03 Feb 2023
On the Lipschitz Constant of Deep Networks and Double Descent
British Machine Vision Conference (BMVC), 2023
Matteo Gamba, Hossein Azizpour, Mårten Björkman
616 · 13 · 0 · 28 Jan 2023
On the Overlooked Structure of Stochastic Gradients
Neural Information Processing Systems (NeurIPS), 2022
Zeke Xie, Qian-Yuan Tang, Mingming Sun, P. Li
339 · 14 · 0 · 05 Dec 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Conference on Uncertainty in Artificial Intelligence (UAI), 2022
Ziqiao Wang, Yongyi Mao
383 · 12 · 0 · 19 Nov 2022
Exact Solutions of a Deep Linear Network
Neural Information Processing Systems (NeurIPS), 2022
Liu Ziyin, Botao Li, Xiangming Meng
ODL · 701 · 27 · 0 · 10 Feb 2022
Stochastic Neural Networks with Infinite Width are Deterministic
Liu Ziyin, Hanlin Zhang, Xiangming Meng, Yuting Lu, Eric P. Xing, Masahito Ueda
335 · 3 · 0 · 30 Jan 2022
SGD with a Constant Large Learning Rate Can Converge to Local Maxima
Liu Ziyin, Botao Li, James B. Simon, Masahito Ueda
316 · 10 · 0 · 25 Jul 2021
The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion
Neural Computation (Neural Comput.), 2021
D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins
603 · 20 · 0 · 19 Jul 2021
Power-law escape rate of SGD
International Conference on Machine Learning (ICML), 2021
Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda
265 · 25 · 0 · 20 May 2021
On the Distributional Properties of Adaptive Gradients
Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Z. Zhiyi, Liu Ziyin
207 · 4 · 0 · 15 May 2021
Strength of Minibatch Noise in SGD
International Conference on Learning Representations (ICLR), 2021
Liu Ziyin, Kangqiao Liu, Takashi Mori, Masahito Ueda
ODL, MLT · 403 · 44 · 0 · 10 Feb 2021