2006.04740
Cited By
v5 (latest)
The Heavy-Tail Phenomenon in SGD
8 June 2020
Mert Gurbuzbalaban
Umut Simsekli
Lingjiong Zhu
Papers citing
"The Heavy-Tail Phenomenon in SGD"
39 / 39 papers shown
Title
Variational Learning Finds Flatter Solutions at the Edge of Stability
Avrajit Ghosh
Bai Cong
Rio Yokota
S. Ravishankar
Rongrong Wang
Molei Tao
Mohammad Emtiyaz Khan
Thomas Möllenhoff
MLT
16
0
0
15 Jun 2025
Complexity of normalized stochastic first-order methods with momentum under heavy-tailed noise
Chuan He
Zhaosong Lu
Defeng Sun
Zhanwang Deng
20
0
0
12 Jun 2025
Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias
Yuanzhe Hu
Kinshuk Goel
Vlad Killiakov
Yaoqing Yang
43
2
0
06 Jun 2025
Models of Heavy-Tailed Mechanistic Universality
Liam Hodgkinson
Zhichao Wang
Michael W. Mahoney
63
1
0
04 Jun 2025
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees
Aleksandar Armacki
Shuhua Yu
Pranay Sharma
Gauri Joshi
Dragana Bajović
D. Jakovetić
S. Kar
115
2
0
17 Oct 2024
Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets
Benjamin Dupuis
Paul Viallard
George Deligiannidis
Umut Simsekli
129
5
0
26 Apr 2024
Emergence of heavy tails in homogenized stochastic gradient descent
Zhe Jiao
Martin Keller-Ressel
51
1
0
02 Feb 2024
Power-law Dynamic arising from machine learning
Wei Chen
Weitao Du
Zhi-Ming Ma
Qi Meng
21
0
0
16 Jun 2023
A Heavy-Tailed Algebra for Probabilistic Programming
Feynman T. Liang
Liam Hodgkinson
Michael W. Mahoney
67
3
0
15 Jun 2023
Type-II Saddles and Probabilistic Stability of Stochastic Gradient Descent
Liu Ziyin
Botao Li
Tomer Galanti
Masakuni Ueda
90
7
0
23 Mar 2023
Distributionally Robust Learning with Weakly Convex Losses: Convergence Rates and Finite-Sample Guarantees
Landi Zhu
Mert Gurbuzbalaban
A. Ruszczynski
74
7
0
16 Jan 2023
On the Overlooked Structure of Stochastic Gradients
Zeke Xie
Qian-Yuan Tang
Mingming Sun
P. Li
92
6
0
05 Dec 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang
Yongyi Mao
111
12
0
19 Nov 2022
Taming Fat-Tailed ("Heavier-Tailed" with Potentially Infinite Variance) Noise in Federated Learning
Haibo Yang
Pei-Yuan Qiu
Jia Liu
FedML
74
12
0
03 Oct 2022
Tailoring to the Tails: Risk Measures for Fine-Grained Tail Sensitivity
Christian Frohlich
Robert C. Williamson
60
5
0
05 Aug 2022
Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility
Hoileong Lee
Fadhel Ayed
Paul Jung
Juho Lee
Hongseok Yang
François Caron
102
10
0
17 May 2022
Heavy-Tail Phenomenon in Decentralized SGD
Mert Gurbuzbalaban
Yuanhan Hu
Umut Simsekli
Kun Yuan
Lingjiong Zhu
98
9
0
13 May 2022
Nonlinear gradient mappings and stochastic optimization: A general framework with applications to heavy-tail noise
D. Jakovetić
Dragana Bajović
Anit Kumar Sahu
S. Kar
Nemanja Milošević
Dusan Stamenkovic
57
14
0
06 Apr 2022
A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
Tae-Eon Ko
Xiantao Li
55
2
0
21 Mar 2022
Anticorrelated Noise Injection for Improved Generalization
Antonio Orvieto
Hans Kersting
F. Proske
Francis R. Bach
Aurelien Lucchi
116
48
0
06 Feb 2022
Impact of classification difficulty on the weight matrices spectra in Deep Learning and application to early-stopping
Xuran Meng
Jianfeng Yao
91
7
0
26 Nov 2021
Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks
Tolga Birdal
Aaron Lou
Leonidas Guibas
Umut Simsekli
73
65
0
25 Nov 2021
A Unified and Refined Convergence Analysis for Non-Convex Decentralized Learning
Sulaiman A. Alghunaim
Kun Yuan
85
63
0
19 Oct 2021
SGD with a Constant Large Learning Rate Can Converge to Local Maxima
Liu Ziyin
Botao Li
James B. Simon
Masakuni Ueda
88
9
0
25 Jul 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
88
15
0
15 Jun 2021
Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms
A. Camuto
George Deligiannidis
Murat A. Erdogdu
Mert Gurbuzbalaban
Umut Simsekli
Lingjiong Zhu
77
29
0
09 Jun 2021
Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks
Melih Barsbey
Romain Chor
Murat A. Erdogdu
Gaël Richard
Umut Simsekli
66
41
0
07 Jun 2021
Characterization of Generalizability of Spike Timing Dependent Plasticity trained Spiking Neural Networks
Biswadeep Chakraborty
Saibal Mukhopadhyay
125
15
0
31 May 2021
A Fully Spiking Hybrid Neural Network for Energy-Efficient Object Detection
Biswadeep Chakraborty
Xueyuan She
Saibal Mukhopadhyay
98
51
0
21 Apr 2021
Hessian Eigenspectra of More Realistic Nonlinear Models
Zhenyu Liao
Michael W. Mahoney
95
31
0
02 Mar 2021
Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance
Hongjian Wang
Mert Gurbuzbalaban
Lingjiong Zhu
Umut Simsekli
Murat A. Erdogdu
83
42
0
20 Feb 2021
Convergence of stochastic gradient descent schemes for Lojasiewicz-landscapes
Steffen Dereich
Sebastian Kassing
108
27
0
16 Feb 2021
Bayesian Neural Network Priors Revisited
Vincent Fortuin
Adrià Garriga-Alonso
Sebastian W. Ober
F. Wenzel
Gunnar Rätsch
Richard Turner
Mark van der Wilk
Laurence Aitchison
BDL
UQCV
133
141
0
12 Feb 2021
SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality
Courtney Paquette
Kiwon Lee
Fabian Pedregosa
Elliot Paquette
59
35
0
08 Feb 2021
Robust, Accurate Stochastic Optimization for Variational Inference
Akash Kumar Dhaka
Alejandro Catalina
Michael Riis Andersen
Måns Magnusson
Jonathan H. Huggins
Aki Vehtari
71
34
0
01 Sep 2020
Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
Umut Simsekli
Ozan Sener
George Deligiannidis
Murat A. Erdogdu
86
56
0
16 Jun 2020
Sharp Concentration Results for Heavy-Tailed Distributions
Milad Bakhshizadeh
A. Maleki
Víctor Pena
73
23
0
30 Mar 2020
Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise
Umut Simsekli
Lingjiong Zhu
Yee Whye Teh
Mert Gurbuzbalaban
82
50
0
13 Feb 2020
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Charles H. Martin
Michael W. Mahoney
AI4CE
134
201
0
02 Oct 2018