Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

18 February 2019

Jascha Narain Sohl-Dickstein

Jeffrey Pennington

Papers citing "Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent"

50 / 288 papers shown

Global Convergence of SGD On Two Layer Neural Nets Pulkit Gopalani Anirbit Mukherjee 26 5 0 20 Oct 2022
Data-Efficient Augmentation for Training Neural Networks Tian Yu Liu Baharan Mirzasoleiman 32 7 0 15 Oct 2022
Understanding Impacts of Task Similarity on Backdoor Attack and Detection Di Tang Rui Zhu Xiaofeng Wang Haixu Tang Yi Chen AAML 24 5 0 12 Oct 2022
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? Nikolaos Tsilivis Julia Kempe AAML 47 18 0 11 Oct 2022
Meta-Principled Family of Hyperparameter Scaling Strategies Sho Yaida 58 16 0 10 Oct 2022
Second-order regression models exhibit progressive sharpening to the edge of stability Atish Agarwala Fabian Pedregosa Jeffrey Pennington 35 26 0 10 Oct 2022
Continual task learning in natural and artificial agents Timo Flesch Andrew M. Saxe Christopher Summerfield CLL 43 24 0 10 Oct 2022
Critical Learning Periods for Multisensory Integration in Deep Networks Michael Kleinman Alessandro Achille Stefano Soatto 35 10 0 06 Oct 2022
FedMT: Federated Learning with Mixed-type Labels Qiong Zhang Jing Peng Xin Zhang A. Talhouk Gang Niu Xiaoxiao Li FedML 59 0 0 05 Oct 2022
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel Sungyub Kim Si-hun Park Kyungsu Kim Eunho Yang BDL 32 4 0 30 Sep 2022
Formal Conceptual Views in Neural Networks Johannes Hirth Tom Hanika 20 2 0 27 Sep 2022
A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases James Harrison Luke Metz Jascha Narain Sohl-Dickstein 49 22 0 22 Sep 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$ R. Gentile G. Welper ODL 56 6 0 17 Sep 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries Samuel K. Ainsworth J. Hayase S. Srinivasa MoMe 255 318 0 11 Sep 2022
Generalisation under gradient descent via deterministic PAC-Bayes Eugenio Clerico Tyler Farghly George Deligiannidis Benjamin Guedj Arnaud Doucet 31 4 0 06 Sep 2022
On Kernel Regression with Data-Dependent Kernels James B. Simon BDL 29 3 0 04 Sep 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability Z. Li Zixuan Wang Jian Li 19 44 0 26 Jul 2022
Can we achieve robustness from data alone? Nikolaos Tsilivis Jingtong Su Julia Kempe OOD DD 36 18 0 24 Jul 2022
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks Andrew M. Saxe Shagun Sodhani Sam Lewallen AI4CE 32 34 0 21 Jul 2022
Single Model Uncertainty Estimation via Stochastic Data Centering Jayaraman J. Thiagarajan Rushil Anirudh V. Narayanaswamy P. Bremer UQCV OOD 32 26 0 14 Jul 2022
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity Jianyi Yang Shaolei Ren 32 3 0 02 Jul 2022
Neural Networks can Learn Representations with Gradient Descent Alexandru Damian Jason D. Lee Mahdi Soltanolkotabi SSL MLT 25 114 0 30 Jun 2022
Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks G. Farhani Alexander Kazachek Boyu Wang 27 6 0 29 Jun 2022
Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels Mohamad Amin Mohamadi Wonho Bae Danica J. Sutherland 30 20 0 25 Jun 2022
Fast Finite Width Neural Tangent Kernel Roman Novak Jascha Narain Sohl-Dickstein S. Schoenholz AAML 28 54 0 17 Jun 2022
Large-width asymptotics for ReLU neural networks with $\alpha$-Stable initializations Stefano Favaro S. Fortini Stefano Peluchetti 20 2 0 16 Jun 2022
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling Jiri Hron Roman Novak Jeffrey Pennington Jascha Narain Sohl-Dickstein UQCV BDL 48 6 0 15 Jun 2022
Wavelet Regularization Benefits Adversarial Training Jun Yan Huilin Yin Xiaoyang Deng Zi-qin Zhao Wancheng Ge Hao Zhang Gerhard Rigoll AAML 19 2 0 08 Jun 2022
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials Eshaan Nichani Yunzhi Bai Jason D. Lee 29 10 0 08 Jun 2022
Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping Vikranth Dwaracherla Zheng Wen Ian Osband Xiuyuan Lu S. Asghari Benjamin Van Roy UQCV 29 17 0 08 Jun 2022
Dataset Distillation using Neural Feature Regression Yongchao Zhou E. Nezhadarya Jimmy Ba DD FedML 53 151 0 01 Jun 2022
Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel Ryuichi Kanoh M. Sugiyama 31 2 0 25 May 2022
On the Interpretability of Regularisation for Neural Networks Through Model Gradient Similarity Vincent Szolnoky Viktor Andersson Balázs Kulcsár Rebecka Jörnsten 45 5 0 25 May 2022
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture Libin Zhu Chaoyue Liu M. Belkin GNN AI4CE 23 4 0 24 May 2022
Gaussian Pre-Activations in Neural Networks: Myth or Reality? Pierre Wolinski Julyan Arbel AI4CE 76 8 0 24 May 2022
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks Blake Bordelon Cengiz Pehlevan MLT 40 77 0 19 May 2022
Trading Positional Complexity vs. Deepness in Coordinate Networks Jianqiao Zheng Sameera Ramasinghe Xueqian Li Simon Lucey 31 18 0 18 May 2022
Incorporating Prior Knowledge into Neural Networks through an Implicit Composite Kernel Ziyang Jiang Tongshu Zheng Yiling Liu David Carlson 30 4 0 15 May 2022
Understanding the unstable convergence of gradient descent Kwangjun Ahn J.N. Zhang S. Sra 36 57 0 03 Apr 2022
Analytic theory for the dynamics of wide quantum neural networks Junyu Liu K. Najafi Kunal Sharma F. Tacchino Liang Jiang Antonio Mezzacapo 31 52 0 30 Mar 2022
Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning Haoxiang Wang Yite Wang Ruoyu Sun Bo-wen Li 33 27 0 17 Mar 2022
Generalization Through The Lens Of Leave-One-Out Error Gregor Bachmann Thomas Hofmann Aurelien Lucchi 67 7 0 07 Mar 2022
Uncertainty Estimation for Computed Tomography with a Linearised Deep Image Prior Javier Antorán Riccardo Barbano Johannes Leuschner José Miguel Hernández-Lobato Bangti Jin UQCV 35 10 0 28 Feb 2022
The Spectral Bias of Polynomial Neural Networks Moulik Choraria L. Dadi Grigorios G. Chrysos Julien Mairal V. Cevher 24 18 0 27 Feb 2022
Finding Dynamics Preserving Adversarial Winning Tickets Xupeng Shi Pengfei Zheng Adam Ding Yuan Gao Weizhong Zhang AAML 23 1 0 14 Feb 2022
Demystify Optimization and Generalization of Over-parameterized PAC-Bayesian Learning Wei Huang Chunrui Liu Yilan Chen Tianyu Liu R. Xu BDL MLT 19 2 0 04 Feb 2022
Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions Maksim Velikanov Dmitry Yarotsky 11 6 0 02 Feb 2022
Deep Layer-wise Networks Have Closed-Form Weights Chieh-Tsai Wu A. Masoomi Arthur Gretton Jennifer Dy 29 3 0 01 Feb 2022
Stochastic Neural Networks with Infinite Width are Deterministic Liu Ziyin Hanlin Zhang Xiangming Meng Yuting Lu Eric P. Xing Masakuni Ueda 34 3 0 30 Jan 2022
Interplay between depth of neural networks and locality of target functions Takashi Mori Masakuni Ueda 25 0 0 28 Jan 2022