1808.09372
Mean Field Analysis of Neural Networks: A Central Limit Theorem
28 August 2018
Justin A. Sirignano
K. Spiliopoulos
MLT
Papers citing "Mean Field Analysis of Neural Networks: A Central Limit Theorem"
50 / 139 papers shown
Mean-Field Limits for Two-Layer Neural Networks Trained with Consensus-Based Optimization
William De Deyn
Michael Herty
Giovanni Samaey
212
0
0
26 Nov 2025
Low Rank Gradients and Where to Find Them
Rishi Sonthalia
Michael Murray
Guido Montúfar
220
3
0
01 Oct 2025
Gaussian mixture layers for neural networks
Sinho Chewi
Philippe Rigollet
Yuling Yan
MLT
215
0
0
06 Aug 2025
Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks
Luca Arnaboldi
Bruno Loureiro
Ludovic Stephan
Florent Krzakala
Lenka Zdeborová
212
7
0
03 Jun 2025
Scalable Complexity Control Facilitates Reasoning Ability of LLMs
Liangkai Hang
Junjie Yao
Zhiwei Bai
Jiahao Huo
Yang Chen
...
Feiyu Xiong
Y. Zhang
Weinan E
Hongkang Yang
Zhi-hai Xu
LRM
241
3
0
29 May 2025
Non-convex entropic mean-field optimization via Best Response flow
Razvan-Andrei Lascu
Mateusz B. Majka
356
2
0
28 May 2025
Mirror Mean-Field Langevin Dynamics
Anming Gu
Juno Kim
338
2
0
05 May 2025
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Li
Blake Bordelon
Shane Bergsma
Cengiz Pehlevan
Boris Hanin
Joel Hestness
713
38
0
02 May 2025
An overview of condensation phenomenon in deep learning
Zhi-Qin John Xu
Yaoyu Zhang
Zhangchen Zhou
AI4CE
375
12
0
13 Apr 2025
Towards Understanding the Optimization Mechanisms in Deep Learning
Binchuan Qi
Wei Gong
Li Li
432
1
0
29 Mar 2025
Global Convergence and Rich Feature Learning in L-Layer Infinite-Width Neural Networks under μP Parametrization
Zixiang Chen
Greg Yang
Qingyue Zhao
Q. Gu
MLT
307
3
0
12 Mar 2025
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Neural Information Processing Systems (NeurIPS), 2024
Ziang Chen
Rong Ge
MLT
487
1
0
10 Jan 2025
A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Yatin Dandi
Luca Pesce
Hugo Cui
Florent Krzakala
Yue M. Lu
Bruno Loureiro
MLT
391
11
0
24 Oct 2024
Small Contributions, Small Networks: Efficient Neural Network Pruning Based on Relative Importance
Mostafa Hussien
Mahmoud Afifi
K. Nguyen
M. Cheriet
296
3
0
21 Oct 2024
Extended convexity and smoothness and their applications in deep learning
Binchuan Qi
Wei Gong
Li Li
494
0
0
08 Oct 2024
On the Complexity of Learning Sparse Functions with Statistical and Gradient Queries
Nirmit Joshi
Theodor Misiakiewicz
Nathan Srebro
269
11
0
08 Jul 2024
Coding schemes in neural networks learning classification tasks
Alexander van Meegen
H. Sompolinsky
276
19
0
24 Jun 2024
Central Limit Theorem for Bayesian Neural Network trained with Variational Inference
Arnaud Descours
Tom Huix
Arnaud Guillin
Manon Michel
Eric Moulines
Boris Nectoux
257
0
0
10 Jun 2024
Error Bounds of Supervised Classification from Information-Theoretic Perspective
Binchuan Qi
Wei Gong
Li Li
319
0
0
07 Jun 2024
Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Bruno Loureiro
Luca Pesce
Ludovic Stephan
339
2
0
04 Jun 2024
Symmetries in Overparametrized Neural Networks: A Mean-Field View
Javier Maass
Joaquin Fontbona
MLT
FedML
598
4
0
30 May 2024
Improved Particle Approximation Error for Mean Field Neural Networks
Atsushi Nitanda
253
14
0
24 May 2024
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Luca Pesce
Ludovic Stephan
480
29
0
24 May 2024
Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing
Neural Information Processing Systems (NeurIPS), 2024
Zhongwang Zhang
Pengxiao Lin
Zhiwei Wang
Yaoyu Zhang
Z. Xu
742
3
0
08 May 2024
Generalization of Scaled Deep ResNets in the Mean-Field Regime
International Conference on Learning Representations (ICLR), 2024
Yihang Chen
Fanghui Liu
Yiping Lu
Grigorios G. Chrysos
Volkan Cevher
300
2
0
14 Mar 2024
On the dynamics of three-layer neural networks: initial condensation
Zheng-an Chen
Tao Luo
MLT
AI4CE
245
3
0
25 Feb 2024
Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui
Luca Pesce
Yatin Dandi
Florent Krzakala
Yue M. Lu
Lenka Zdeborová
Bruno Loureiro
MLT
352
29
0
07 Feb 2024
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
International Conference on Machine Learning (ICML), 2024
Yatin Dandi
Emanuele Troiani
Luca Arnaboldi
Luca Pesce
Lenka Zdeborová
Florent Krzakala
MLT
384
41
0
05 Feb 2024
Mean-field underdamped Langevin dynamics and its spacetime discretization
Qiang Fu
Ashia Wilson
518
5
0
26 Dec 2023
Weight fluctuations in (deep) linear neural networks and a derivation of the inverse-variance flatness relation
Physical Review Research (Phys. Rev. Res.), 2023
Markus Gross
A. Raulf
Christoph Räth
615
1
0
23 Nov 2023
Minimum norm interpolation by perceptra: Explicit regularization and implicit bias
Neural Information Processing Systems (NeurIPS), 2023
Jiyoung Park
Ian Pelakh
Stephan Wojtowytsch
259
2
0
10 Nov 2023
Neural Tangent Kernels Motivate Graph Neural Networks with Cross-Covariance Graphs
Shervin Khalafi
Saurabh Sihag
Alejandro Ribeiro
275
0
0
16 Oct 2023
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
International Conference on Machine Learning (ICML), 2023
Behrad Moniri
Donghwan Lee
Hamed Hassani
Guang Cheng
MLT
591
37
0
11 Oct 2023
Six Lectures on Linearized Neural Networks
Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.), 2023
Theodor Misiakiewicz
Andrea Montanari
398
18
0
25 Aug 2023
Quantitative CLTs in Deep Neural Networks
Probability theory and related fields (PTRF), 2023
Stefano Favaro
Boris Hanin
Domenico Marinucci
I. Nourdin
G. Peccati
BDL
849
30
0
12 Jul 2023
Fundamental limits of overparametrized shallow neural networks for supervised learning
Francesco Camilli
D. Tieplova
Jean Barbier
269
11
0
11 Jul 2023
Gaussian random field approximation via Stein's method with applications to wide random neural networks
Applied and Computational Harmonic Analysis (ACHA), 2023
Krishnakumar Balasubramanian
L. Goldstein
Nathan Ross
Adil Salim
497
15
0
28 Jun 2023
Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction
Taiji Suzuki
Denny Wu
Atsushi Nitanda
246
21
0
12 Jun 2023
Escaping mediocrity: how two-layer networks learn hard generalized linear models with SGD
Luca Arnaboldi
Florent Krzakala
Bruno Loureiro
Ludovic Stephan
MLT
366
12
0
29 May 2023
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
Yatin Dandi
Florent Krzakala
Bruno Loureiro
Luca Pesce
Ludovic Stephan
MLT
596
58
0
29 May 2023
Understanding the Initial Condensation of Convolutional Neural Networks
CSIAM Transactions on Applied Mathematics (TCAM), 2023
Zhangchen Zhou
Hanxu Zhou
Yuqing Li
Zhi-Qin John Xu
MLT
AI4CE
211
6
0
17 May 2023
Leveraging the two timescale regime to demonstrate convergence of neural networks
Neural Information Processing Systems (NeurIPS), 2023
Pierre Marion
Raphael Berthier
312
13
0
19 Apr 2023
Depth Separation with Multilayer Mean-Field Networks
International Conference on Learning Representations (ICLR), 2023
Y. Ren
Mo Zhou
Rong Ge
OOD
327
5
0
03 Apr 2023
High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance
The Annals of Applied Probability (Ann. Appl. Probab.), 2023
Krishnakumar Balasubramanian
Promit Ghosal
Ye He
417
8
0
03 Apr 2023
Phase Diagram of Initial Condensation for Two-layer Neural Networks
CSIAM Transactions on Applied Mathematics (TCAM), 2023
Zheng Chen
Yuqing Li
Yaoyu Zhang
Zhaoguang Zhou
Z. Xu
MLT
AI4CE
257
14
0
12 Mar 2023
Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems
International Conference on Machine Learning (ICML), 2023
Atsushi Nitanda
Kazusato Oko
Denny Wu
Nobuhito Takenouchi
Taiji Suzuki
317
4
0
06 Mar 2023
Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent
Journal of machine learning research (JMLR), 2023
Benjamin Gess
Sebastian Kassing
Vitalii Konarovskyi
DiffM
368
13
0
14 Feb 2023
From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
Annual Conference Computational Learning Theory (COLT), 2023
Luca Arnaboldi
Ludovic Stephan
Florent Krzakala
Bruno Loureiro
MLT
217
44
0
12 Feb 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
391
15
0
30 Dec 2022
Bayesian Interpolation with Deep Linear Networks
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2022
Boris Hanin
Alexander Zlokapa
513
30
0
29 Dec 2022