ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

arXiv:2206.04030
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling

8 June 2022
Gerard Ben Arous
Reza Gheissari
Aukosh Jagannath

Papers citing "High-dimensional limit theorems for SGD: Effective dynamics and critical scaling"

45 / 45 papers shown
A Theory of Initialisation's Impact on Specialisation
Devon Jarvis
Sebastian Lee
Clémentine Dominé
Andrew M. Saxe
Stefano Sarao Mannelli
CLL
67
2
0
04 Mar 2025
Understanding the Generalization Error of Markov algorithms through Poissonization
Benjamin Dupuis
Maxime Haddouche
George Deligiannidis
Umut Simsekli
42
0
0
11 Feb 2025
Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions
Elisabetta Cornacchia
Dan Mikulincer
Elchanan Mossel
54
0
0
10 Feb 2025
A theoretical perspective on mode collapse in variational inference
Roman Soletskyi
Marylou Gabrié
Bruno Loureiro
DRL
24
2
0
17 Oct 2024
Shallow diffusion networks provably learn hidden low-dimensional structure
Nicholas M. Boffi
Arthur Jacot
Stephen Tu
Ingvar M. Ziemann
DiffM
29
1
0
15 Oct 2024
A spring-block theory of feature learning in deep neural networks
Chengzhi Shi
Liming Pan
Ivan Dokmanić
AI4CE
36
1
0
28 Jul 2024
Stochastic Differential Equations models for Least-Squares Stochastic Gradient Descent
Adrien Schertzer
Loucas Pillaud-Vivien
16
0
0
02 Jul 2024
Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Bruno Loureiro
Luca Pesce
Ludovic Stephan
43
1
0
04 Jun 2024
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Stefano Sarao Mannelli
Yaraslau Ivashinka
Andrew M. Saxe
Luca Saglietti
27
2
0
03 Jun 2024
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
Jason D. Lee
Kazusato Oko
Taiji Suzuki
Denny Wu
MLT
87
21
0
03 Jun 2024
The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms
Elizabeth Collins-Woodfin
Inbar Seroussi
Begona García Malaxechebarría
Andrew W. Mackenzie
Elliot Paquette
Courtney Paquette
18
1
0
30 May 2024
Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training
Anchit Jain
Rozhin Nobahari
A. Baratin
Stefano Sarao Mannelli
32
4
0
28 May 2024
Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu
Santiago Aranguri
Arthur Jacot
24
8
0
27 May 2024
High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Atish Agarwala
Jeffrey Pennington
38
3
0
30 Apr 2024
Sliding down the stairs: how correlated latent variables accelerate learning with neural networks
Lorenzo Bardone
Sebastian Goldt
30
7
0
12 Apr 2024
Faster Convergence for Transformer Fine-tuning with Line Search Methods
Philip Kenneweg
Leonardo Galli
Tristan Kenneweg
Barbara Hammer
ODL
22
2
0
27 Mar 2024
Fundamental limits of Non-Linear Low-Rank Matrix Estimation
Pierre Mergny
Justin Ko
Florent Krzakala
Lenka Zdeborová
20
1
0
07 Mar 2024
From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima
Tony Bonnaire
Giulio Biroli
C. Cammarota
32
0
0
04 Mar 2024
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Rodrigo Veiga
Anastasia Remizova
Nicolas Macris
27
0
0
12 Feb 2024
Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui
Luca Pesce
Yatin Dandi
Florent Krzakala
Yue M. Lu
Lenka Zdeborová
Bruno Loureiro
MLT
44
16
0
07 Feb 2024
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi
Emanuele Troiani
Luca Arnaboldi
Luca Pesce
Lenka Zdeborová
Florent Krzakala
MLT
59
25
0
05 Feb 2024
Enhancing selectivity using Wasserstein distance based reweighing
Pratik Worah
OOD
48
0
0
21 Jan 2024
Should Under-parameterized Student Networks Copy or Average Teacher Weights?
Berfin Simsek
Amire Bendjeddou
W. Gerstner
Johanni Brea
25
6
0
03 Nov 2023
High-dimensional SGD aligns with emerging outlier eigenspaces
Gerard Ben Arous
Reza Gheissari
Jiaoyang Huang
Aukosh Jagannath
14
14
0
04 Oct 2023
Symmetric Single Index Learning
Aaron Zweig
Joan Bruna
MLT
23
2
0
03 Oct 2023
Beyond Log-Concavity: Theory and Algorithm for Sum-Log-Concave Optimization
Mastane Achab
11
1
0
26 Sep 2023
SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem
Margalit Glasgow
MLT
69
13
0
26 Sep 2023
On the different regimes of Stochastic Gradient Descent
Antonio Sclocchi
M. Wyart
16
17
0
19 Sep 2023
Stochastic Gradient Descent outperforms Gradient Descent in recovering a high-dimensional signal in a glassy energy landscape
Persia Jana Kamali
Pierfrancesco Urbani
11
6
0
09 Sep 2023
On Single Index Models beyond Gaussian Data
Joan Bruna
Loucas Pillaud-Vivien
Aaron Zweig
8
10
0
28 Jul 2023
The Underlying Scaling Laws and Universal Statistical Structure of Complex Datasets
Noam Levi
Yaron Oz
27
4
0
26 Jun 2023
A Nested Matrix-Tensor Model for Noisy Multi-view Clustering
M. Seddik
Mastane Achab
Henrique X. Goulart
Merouane Debbah
9
1
0
31 May 2023
Bottleneck Structure in Learned Features: Low-Dimension vs Regularity Tradeoff
Arthur Jacot
MLT
16
13
0
30 May 2023
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
Yatin Dandi
Florent Krzakala
Bruno Loureiro
Luca Pesce
Ludovic Stephan
MLT
32
25
0
29 May 2023
Escaping mediocrity: how two-layer networks learn hard generalized linear models with SGD
Luca Arnaboldi
Florent Krzakala
Bruno Loureiro
Ludovic Stephan
MLT
23
3
0
29 May 2023
Hotelling Deflation on Large Symmetric Spiked Tensors
M. Seddik
J. H. D. M. Goulart
M. Guillaud
13
1
0
20 Apr 2023
High-dimensional limit of one-pass SGD on least squares
Elizabeth Collins-Woodfin
Elliot Paquette
16
3
0
13 Apr 2023
High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance
Krishnakumar Balasubramanian
Promit Ghosal
Ye He
23
5
0
03 Apr 2023
Gradient flow on extensive-rank positive semi-definite matrix denoising
A. Bodin
N. Macris
13
3
0
16 Mar 2023
Statistical Inference for Linear Functionals of Online SGD in High-dimensional Linear Regression
Bhavya Agrawalla
Krishnakumar Balasubramanian
Promit Ghosal
23
2
0
20 Feb 2023
From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
Luca Arnaboldi
Ludovic Stephan
Florent Krzakala
Bruno Loureiro
MLT
25
31
0
12 Feb 2023
Learning Single-Index Models with Shallow Neural Networks
A. Bietti
Joan Bruna
Clayton Sanford
M. Song
160
67
0
27 Oct 2022
Rigorous dynamical mean field theory for stochastic gradient descent methods
Cédric Gerbelot
Emanuele Troiani
Francesca Mignacco
Florent Krzakala
Lenka Zdeborová
17
26
0
12 Oct 2022
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
Xingyu Zhu
Zixuan Wang
Xiang Wang
Mo Zhou
Rong Ge
64
35
0
07 Oct 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora
Zhiyuan Li
A. Panigrahi
MLT
75
88
0
19 May 2022