Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

18 February 2019 · arXiv:1902.06720
Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington
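For context on the citations below, the paper's central claim is that, in the infinite-width limit, a network trained by gradient descent stays close to its first-order Taylor expansion around the initial parameters, so its training dynamics match those of a model that is linear in the parameters. A sketch of the statement in standard NTK notation (f_0 is the network at initialization, \theta_t the parameters at time t):

    f_t^{\mathrm{lin}}(x) = f_0(x) + \nabla_\theta f_0(x)\,(\theta_t - \theta_0)

Under gradient flow on a squared loss, f^{\mathrm{lin}} evolves according to the empirical neural tangent kernel \hat\Theta_0(x, x') = \nabla_\theta f_0(x)\,\nabla_\theta f_0(x')^\top, which becomes constant during training as the width goes to infinity.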

Papers citing "Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent"

50 / 255 papers shown

L-SWAG: Layer-Sample Wise Activation with Gradients information for Zero-Shot NAS on Vision Transformers
S. Casarin, Sergio Escalera, Oswald Lanz
34 · 0 · 0
12 May 2025

Learning Guarantee of Reward Modeling Using Deep Neural Networks
Yuanhang Luo, Yeheng Ge, Ruijian Han, Guohao Shen
34 · 0 · 0
10 May 2025

Information-theoretic reduction of deep neural networks to linear models in the overparametrized proportional regime
Francesco Camilli, D. Tieplova, Eleonora Bergamin, Jean Barbier
135 · 0 · 0
06 May 2025

Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey, Bin Claire Zhang, Lorenzo Noci, Mufan Li, Blake Bordelon, Shane Bergsma, Cengiz Pehlevan, Boris Hanin, Joel Hestness
44 · 0 · 0
02 May 2025

Neuronal correlations shape the scaling behavior of memory capacity and nonlinear computational capability of recurrent neural networks
Shotaro Takasu, Toshio Aoyagi
34 · 0 · 0
28 Apr 2025

Ultra-fast feature learning for the training of two-layer neural networks in the two-timescale regime
Raphael Barboni, Gabriel Peyré, François-Xavier Vialard
MLT
39 · 0 · 0
25 Apr 2025

On the Cone Effect in the Learning Dynamics
Zhanpeng Zhou, Yongyi Yang, Jie Ren, Mahito Sugiyama, Junchi Yan
53 · 0 · 0
20 Mar 2025

Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model
Moritz A. Zanger, Pascal R. van der Vaart, Wendelin Böhmer, M. Spaan
UQCV · BDL
170 · 0 · 0
14 Mar 2025

PIED: Physics-Informed Experimental Design for Inverse Problems
Apivich Hemachandra, Gregory Kang Ruey Lau, Szu Hui Ng, Bryan Kian Hsiang Low
PINN
48 · 0 · 0
10 Mar 2025

Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M. Saxe
MLT
66 · 1 · 0
08 Mar 2025

Variation Matters: from Mitigating to Embracing Zero-Shot NAS Ranking Function Variation
P. Rumiantsev, Mark Coates
55 · 0 · 0
27 Feb 2025

Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra, Tianyu He, M. Barkeshli
54 · 4 · 0
17 Feb 2025

Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
Blake Bordelon, Cengiz Pehlevan
AI4CE
64 · 1 · 0
04 Feb 2025

Infinite Width Limits of Self Supervised Neural Networks
Maximilian Fleissner, Gautham Govind Anil, D. Ghoshdastidar
SSL
166 · 0 · 0
17 Nov 2024

Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits
H. Bui, Enrique Mallada, Anqi Liu
132 · 0 · 0
08 Nov 2024

Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao, Sidak Pal Singh, Aurelien Lucchi
AI4CE
48 · 0 · 0
04 Nov 2024

DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Wenlong Deng, Yize Zhao, V. Vakilian, Minghui Chen, Xiaoxiao Li, Christos Thrampoulidis
45 · 3 · 0
12 Oct 2024

On the Impacts of the Random Initialization in the Neural Tangent Kernel Theory
Guhan Chen, Yicheng Li, Qian Lin
AAML
38 · 1 · 0
08 Oct 2024

SHAP values via sparse Fourier representation
Ali Gorji, Andisheh Amrollahi, A. Krause
FAtt
38 · 0 · 0
08 Oct 2024

The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, Cengiz Pehlevan
43 · 2 · 0
06 Oct 2024

Peer-to-Peer Learning Dynamics of Wide Neural Networks
Shreyas Chaudhari, Srinivasa Pranav, Emile Anand, José M. F. Moura
39 · 3 · 0
23 Sep 2024

From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe
38 · 3 · 0
22 Sep 2024

Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev, Andrey Grabovoy
54 · 1 · 0
18 Sep 2024

Advancing Hybrid Defense for Byzantine Attacks in Federated Learning
Kai Yue, Richeng Jin, Chau-Wai Wong, H. Dai
AAML
39 · 0 · 0
10 Sep 2024

Variational Search Distributions
Daniel M. Steinberg, Rafael Oliveira, Cheng Soon Ong, Edwin V. Bonilla
33 · 0 · 0
10 Sep 2024

Input Space Mode Connectivity in Deep Neural Networks
Jakub Vrabel, Ori Shem-Ur, Yaron Oz, David Krueger
56 · 1 · 0
09 Sep 2024

Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective
Jingren Liu, Zhong Ji, YunLong Yu, Jiale Cao, Yanwei Pang, Jungong Han, Xuelong Li
CLL
42 · 5 · 0
24 Jul 2024

Simplifying Deep Temporal Difference Learning
Matteo Gallici, Mattie Fellows, Benjamin Ellis, B. Pou, Ivan Masmitja, Jakob Foerster, Mario Martin
OffRL
62 · 15 · 0
05 Jul 2024

Neural Lineage
Runpeng Yu, Xinchao Wang
34 · 4 · 0
17 Jun 2024

Bayesian RG Flow in Neural Network Field Theories
Jessica N. Howard, Marc S. Klinger, Anindita Maiti, A. G. Stapleton
68 · 1 · 0
27 May 2024

Thermodynamic limit in learning period three
Yuichiro Terasaki, Kohei Nakajima
40 · 1 · 0
12 May 2024

Unveiling the optimization process of Physics Informed Neural Networks: How accurate and competitive can PINNs be?
Jorge F. Urbán, P. Stefanou, José A. Pons
PINN
45 · 6 · 0
07 May 2024

High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Atish Agarwala, Jeffrey Pennington
41 · 3 · 0
30 Apr 2024

CAM-Based Methods Can See through Walls
Magamed Taimeskhanov, R. Sicre, Damien Garreau
21 · 1 · 0
02 Apr 2024

TG-NAS: Leveraging Zero-Cost Proxies with Transformer and Graph Convolution Networks for Efficient Neural Architecture Search
Ye Qiao, Haocheng Xu, Sitao Huang
29 · 0 · 0
30 Mar 2024

Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport
Raphael Barboni, Gabriel Peyré, François-Xavier Vialard
37 · 3 · 0
19 Mar 2024

NTK-Guided Few-Shot Class Incremental Learning
Jingren Liu, Zhong Ji, Yanwei Pang, YunLong Yu
CLL
39 · 3 · 0
19 Mar 2024

Active Few-Shot Fine-Tuning
Jonas Hübotter, Bhavya Sukhija, Lenart Treven, Yarden As, Andreas Krause
45 · 1 · 0
13 Feb 2024

Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Ori Shem-Ur, Yaron Oz
19 · 0 · 0
08 Jan 2024

Rethinking Adversarial Training with Neural Tangent Kernel
Guanlin Li, Han Qiu, Shangwei Guo, Jiwei Li, Tianwei Zhang
AAML
22 · 0 · 0
04 Dec 2023

Differentially Private Non-convex Learning for Multi-layer Neural Networks
Hanpu Shen, Cheng-Long Wang, Zihang Xiang, Yiming Ying, Di Wang
49 · 7 · 0
12 Oct 2023

A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Behrad Moniri, Donghwan Lee, Hamed Hassani, Yan Sun
MLT
40 · 19 · 0
11 Oct 2023

DataDAM: Efficient Dataset Distillation with Attention Matching
A. Sajedi, Samir Khaki, Ehsan Amjadian, Lucy Z. Liu, Y. Lawryshyn, Konstantinos N. Plataniotis
DD
46 · 59 · 0
29 Sep 2023

Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics
Yehonatan Avidan, Qianyi Li, H. Sompolinsky
60 · 8 · 0
08 Sep 2023

Modify Training Directions in Function Space to Reduce Generalization Error
Yi Yu, Wenlian Lu, Boyu Chen
27 · 0 · 0
25 Jul 2023

Constructing Extreme Learning Machines with zero Spectral Bias
Kaumudi Joshi, V. Snigdha, A. K. Bhattacharya
26 · 2 · 0
19 Jul 2023

Quantitative CLTs in Deep Neural Networks
Stefano Favaro, Boris Hanin, Domenico Marinucci, I. Nourdin, G. Peccati
BDL
33 · 11 · 0
12 Jul 2023

Training-Free Neural Active Learning with Initialization-Robustness Guarantees
Apivich Hemachandra, Zhongxiang Dai, Jasraj Singh, See-Kiong Ng, K. H. Low
AAML
36 · 6 · 0
07 Jun 2023

On the Weight Dynamics of Deep Normalized Networks
Christian H. X. Ali Mehmeti-Göpel, Michael Wand
38 · 1 · 0
01 Jun 2023

Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension
Moritz Haas, David Holzmüller, U. V. Luxburg, Ingo Steinwart
MLT
35 · 14 · 0
23 May 2023