High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath
arXiv:2206.04030, 8 June 2022
Papers citing "High-dimensional limit theorems for SGD: Effective dynamics and critical scaling" (45 of 45 papers shown):
- A Theory of Initialisation's Impact on Specialisation. Devon Jarvis, Sebastian Lee, Clémentine Dominé, Andrew M. Saxe, Stefano Sarao Mannelli (04 Mar 2025) [CLL]
- Understanding the Generalization Error of Markov algorithms through Poissonization. Benjamin Dupuis, Maxime Haddouche, George Deligiannidis, Umut Simsekli (11 Feb 2025)
- Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions. Elisabetta Cornacchia, Dan Mikulincer, Elchanan Mossel (10 Feb 2025)
- A theoretical perspective on mode collapse in variational inference. Roman Soletskyi, Marylou Gabrié, Bruno Loureiro (17 Oct 2024) [DRL]
- Shallow diffusion networks provably learn hidden low-dimensional structure. Nicholas M. Boffi, Arthur Jacot, Stephen Tu, Ingvar M. Ziemann (15 Oct 2024) [DiffM]
- A spring-block theory of feature learning in deep neural networks. Chengzhi Shi, Liming Pan, Ivan Dokmanić (28 Jul 2024) [AI4CE]
- Stochastic Differential Equations models for Least-Squares Stochastic Gradient Descent. Adrien Schertzer, Loucas Pillaud-Vivien (02 Jul 2024)
- Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs. Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan (04 Jun 2024)
- Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks. Stefano Sarao Mannelli, Yaraslau Ivashinka, Andrew M. Saxe, Luca Saglietti (03 Jun 2024)
- Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit. Jason D. Lee, Kazusato Oko, Taiji Suzuki, Denny Wu (03 Jun 2024) [MLT]
- The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms. Elizabeth Collins-Woodfin, Inbar Seroussi, Begona García Malaxechebarría, Andrew W. Mackenzie, Elliot Paquette, Courtney Paquette (30 May 2024)
- Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training. Anchit Jain, Rozhin Nobahari, A. Baratin, Stefano Sarao Mannelli (28 May 2024)
- Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes. Zhenfeng Tu, Santiago Aranguri, Arthur Jacot (27 May 2024)
- High dimensional analysis reveals conservative sharpening and a stochastic edge of stability. Atish Agarwala, Jeffrey Pennington (30 Apr 2024)
- Sliding down the stairs: how correlated latent variables accelerate learning with neural networks. Lorenzo Bardone, Sebastian Goldt (12 Apr 2024)
- Faster Convergence for Transformer Fine-tuning with Line Search Methods. Philip Kenneweg, Leonardo Galli, Tristan Kenneweg, Barbara Hammer (27 Mar 2024) [ODL]
- Fundamental limits of Non-Linear Low-Rank Matrix Estimation. Pierre Mergny, Justin Ko, Florent Krzakala, Lenka Zdeborová (07 Mar 2024)
- From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima. Tony Bonnaire, Giulio Biroli, C. Cammarota (04 Mar 2024)
- Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features. Rodrigo Veiga, Anastasia Remizova, Nicolas Macris (12 Feb 2024)
- Asymptotics of feature learning in two-layer networks after one gradient-step. Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro (07 Feb 2024) [MLT]
- The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents. Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala (05 Feb 2024) [MLT]
- Enhancing selectivity using Wasserstein distance based reweighing. Pratik Worah (21 Jan 2024) [OOD]
- Should Under-parameterized Student Networks Copy or Average Teacher Weights? Berfin Simsek, Amire Bendjeddou, W. Gerstner, Johanni Brea (03 Nov 2023)
- High-dimensional SGD aligns with emerging outlier eigenspaces. Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath (04 Oct 2023)
- Symmetric Single Index Learning. Aaron Zweig, Joan Bruna (03 Oct 2023) [MLT]
- Beyond Log-Concavity: Theory and Algorithm for Sum-Log-Concave Optimization. Mastane Achab (26 Sep 2023)
- SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem. Margalit Glasgow (26 Sep 2023) [MLT]
- On the different regimes of Stochastic Gradient Descent. Antonio Sclocchi, M. Wyart (19 Sep 2023)
- Stochastic Gradient Descent outperforms Gradient Descent in recovering a high-dimensional signal in a glassy energy landscape. Persia Jana Kamali, Pierfrancesco Urbani (09 Sep 2023)
- On Single Index Models beyond Gaussian Data. Joan Bruna, Loucas Pillaud-Vivien, Aaron Zweig (28 Jul 2023)
- The Underlying Scaling Laws and Universal Statistical Structure of Complex Datasets. Noam Levi, Yaron Oz (26 Jun 2023)
- A Nested Matrix-Tensor Model for Noisy Multi-view Clustering. M. Seddik, Mastane Achab, Henrique X. Goulart, Merouane Debbah (31 May 2023)
- Bottleneck Structure in Learned Features: Low-Dimension vs Regularity Tradeoff. Arthur Jacot (30 May 2023) [MLT]
- How Two-Layer Neural Networks Learn, One (Giant) Step at a Time. Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan (29 May 2023) [MLT]
- Escaping mediocrity: how two-layer networks learn hard generalized linear models with SGD. Luca Arnaboldi, Florent Krzakala, Bruno Loureiro, Ludovic Stephan (29 May 2023) [MLT]
- Hotelling Deflation on Large Symmetric Spiked Tensors. M. Seddik, J. H. D. M. Goulart, M. Guillaud (20 Apr 2023)
- High-dimensional limit of one-pass SGD on least squares. Elizabeth Collins-Woodfin, Elliot Paquette (13 Apr 2023)
- High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance. Krishnakumar Balasubramanian, Promit Ghosal, Ye He (03 Apr 2023)
- Gradient flow on extensive-rank positive semi-definite matrix denoising. A. Bodin, N. Macris (16 Mar 2023)
- Statistical Inference for Linear Functionals of Online SGD in High-dimensional Linear Regression. Bhavya Agrawalla, Krishnakumar Balasubramanian, Promit Ghosal (20 Feb 2023)
- From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks. Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro (12 Feb 2023) [MLT]
- Learning Single-Index Models with Shallow Neural Networks. A. Bietti, Joan Bruna, Clayton Sanford, M. Song (27 Oct 2022)
- Rigorous dynamical mean field theory for stochastic gradient descent methods. Cédric Gerbelot, Emanuele Troiani, Francesca Mignacco, Florent Krzakala, Lenka Zdeborová (12 Oct 2022)
- Understanding Edge-of-Stability Training Dynamics with a Minimalist Example. Xingyu Zhu, Zixuan Wang, Xiang Wang, Mo Zhou, Rong Ge (07 Oct 2022)
- Understanding Gradient Descent on Edge of Stability in Deep Learning. Sanjeev Arora, Zhiyuan Li, A. Panigrahi (19 May 2022) [MLT]