Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
Song Mei, Theodor Misiakiewicz, Andrea Montanari
arXiv:1902.06015, 16 February 2019 [MLT]

Papers citing "Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit"

50 / 72 papers shown
Information-theoretic reduction of deep neural networks to linear models in the overparametrized proportional regime
Francesco Camilli, D. Tieplova, Eleonora Bergamin, Jean Barbier
06 May 2025

Ultra-fast feature learning for the training of two-layer neural networks in the two-timescale regime
Raphael Barboni, Gabriel Peyré, François-Xavier Vialard
25 Apr 2025 [MLT]

Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
Blake Bordelon, C. Pehlevan
04 Feb 2025 [AI4CE]

Convergence Analysis of the Wasserstein Proximal Algorithm beyond Geodesic Convexity
Shuailong Zhu, Xiaohui Chen
28 Jan 2025

Geometry and Optimization of Shallow Polynomial Networks
Yossi Arjevani, Joan Bruna, Joe Kileel, Elzbieta Polak, Matthew Trager
10 Jan 2025

Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Ziang Chen, Rong Ge
10 Jan 2025 [MLT]

Non-geodesically-convex optimization in the Wasserstein space
Hoang Phuc Hau Luu, Hanlin Yu, Bernardo Williams, Petrus Mikkola, Marcelo Hartmann, Kai Puolamaki, Arto Klami
08 Jan 2025
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, C. Pehlevan
06 Oct 2024

How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon, Alexander B. Atanasov, C. Pehlevan
26 Sep 2024

Symmetries in Overparametrized Neural Networks: A Mean-Field View
Javier Maass Martínez, Joaquin Fontbona
30 May 2024 [FedML, MLT]

Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon, Hamza Tahir Chaudhry, C. Pehlevan
24 May 2024 [AI4CE]

High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Atish Agarwala, Jeffrey Pennington
30 Apr 2024

Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar, Jarvis D. Haupt
12 Mar 2024 [ODL]

Mean-field underdamped Langevin dynamics and its spacetime discretization
Qiang Fu, Ashia Wilson
26 Dec 2023
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Behrad Moniri, Donghwan Lee, Hamed Hassani, Edgar Dobriban
11 Oct 2023 [MLT]

Gradient-Based Feature Learning under Structured Data
Alireza Mousavi-Hosseini, Denny Wu, Taiji Suzuki, Murat A. Erdogdu
07 Sep 2023 [MLT]

Law of Large Numbers for Bayesian two-layer Neural Network trained with Variational Inference
Arnaud Descours, Tom Huix, Arnaud Guillin, Manon Michel, Eric Moulines, Boris Nectoux
10 Jul 2023 [BDL]

Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou
26 May 2023 [MLT]

Understanding the Initial Condensation of Convolutional Neural Networks
Zhangchen Zhou, Hanxu Zhou, Yuqing Li, Zhi-Qin John Xu
17 May 2023 [MLT, AI4CE]

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Blake Bordelon, C. Pehlevan
06 Apr 2023 [MLT]
Reproducing kernel Hilbert spaces in the mean field limit
Christian Fiedler, Michael Herty, M. Rom, C. Segala, Sebastian Trimpe
28 Feb 2023

Learning time-scales in two-layers neural networks
Raphael Berthier, Andrea Montanari, Kangjie Zhou
28 Feb 2023

From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro
12 Feb 2023 [MLT]

Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang
02 Feb 2023

ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients
Guihong Li, Yuedong Yang, Kartikeya Bhardwaj, R. Marculescu
26 Jan 2023

An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang
30 Dec 2022
Learning threshold neurons via the "edge of stability"
Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang
14 Dec 2022 [MLT]

Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels
Kangyu Weng, Aohua Cheng, Ziyang Zhang, Pei Sun, Yang Tian
04 Dec 2022

Infinite-width limit of deep linear neural networks
Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli
29 Nov 2022

Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani, Anirbit Mukherjee
20 Oct 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher
15 Sep 2022

Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi
30 Jun 2022 [SSL, MLT]
Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu
24 May 2022

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Blake Bordelon, C. Pehlevan
19 May 2022 [MLT]

Mean-Field Nonparametric Estimation of Interacting Particle Systems
Rentian Yao, Xiaohui Chen, Yun Yang
16 May 2022

Trajectory Inference via Mean-field Langevin in Path Space
Lénaïc Chizat, Stephen X. Zhang, Matthieu Heitz, Geoffrey Schiebinger
14 May 2022

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks
R. Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová
01 Feb 2022 [MLT]

Convex Analysis of the Mean Field Langevin Dynamics
Atsushi Nitanda, Denny Wu, Taiji Suzuki
25 Jan 2022 [MLT]
Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime
B. Kerimkulov, J. Leahy, David Siska, Lukasz Szpruch
18 Jan 2022

DNN gradient lossless compression: Can GenNorm be the answer?
Zhongzhu Chen, Eduin E. Hernandez, Yu-Chih Huang, Stefano Rini
15 Nov 2021

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
A. Shevchenko, Vyacheslav Kungurtsev, Marco Mondelli
03 Nov 2021 [MLT]

Subquadratic Overparameterization for Shallow Neural Networks
Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, V. Cevher
02 Nov 2021

Rethinking Neural vs. Matrix-Factorization Collaborative Filtering: the Theoretical Perspectives
Zida Cheng, Chuanwei Ruan, Siheng Chen, Sushant Kumar, Ya-Qin Zhang
23 Oct 2021

Parallel Deep Neural Networks Have Zero Duality Gap
Yifei Wang, Tolga Ergen, Mert Pilanci
13 Oct 2021
AIR-Net: Adaptive and Implicit Regularization Neural Network for Matrix Completion
Zhemin Li, Tao Sun, Hongxia Wang, Bao Wang
12 Oct 2021

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
Dominik Stöger, Mahdi Soltanolkotabi
28 Jun 2021 [ODL]

Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent
Spencer Frei, Quanquan Gu
25 Jun 2021

The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan Bill Li, Mihai Nica, Daniel M. Roy
07 Jun 2021

Global Convergence of Three-layer Neural Networks in the Mean Field Regime
H. Pham, Phan-Minh Nguyen
11 May 2021 [MLT, AI4CE]

The Discovery of Dynamics via Linear Multistep Methods and Deep Learning: Error Estimation
Q. Du, Yiqi Gu, Haizhao Yang, Chao Zhou
21 Mar 2021