On Lazy Training in Differentiable Programming
19 December 2018
Lénaïc Chizat, Edouard Oyallon, Francis R. Bach

Papers citing "On Lazy Training in Differentiable Programming"

50 / 227 papers shown
Learning sparse features can lead to overfitting in neural networks
  Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, M. Wyart (24 Jun 2022) [MLT]
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
  Loucas Pillaud-Vivien, J. Reygner, Nicolas Flammarion (20 Jun 2022) [NoLa]
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
  Jiri Hron, Roman Novak, Jeffrey Pennington, Jascha Narain Sohl-Dickstein (15 Jun 2022) [UQCV, BDL]
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
  Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora (14 Jun 2022) [FAtt]
Overcoming the Spectral Bias of Neural Value Approximation
  Ge Yang, Anurag Ajay, Pulkit Agrawal (09 Jun 2022)
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
  Eshaan Nichani, Yunzhi Bai, Jason D. Lee (08 Jun 2022)
Explaining the physics of transfer learning a data-driven subgrid-scale closure to a different turbulent flow
  Adam Subel, Yifei Guan, A. Chattopadhyay, P. Hassanzadeh (07 Jun 2022) [AI4CE]
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
  Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion (02 Jun 2022) [ODL]
Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel
  Ryuichi Kanoh, M. Sugiyama (25 May 2022)
One-Pixel Shortcut: on the Learning Preference of Deep Neural Networks
  Shutong Wu, Sizhe Chen, Cihang Xie, X. Huang (24 May 2022) [AAML]
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
  Libin Zhu, Chaoyue Liu, M. Belkin (24 May 2022) [GNN, AI4CE]
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
  Blake Bordelon, Cengiz Pehlevan (19 May 2022) [MLT]
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
  Itay Safran, Gal Vardi, Jason D. Lee (18 May 2022) [MLT]
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
  Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang (03 May 2022) [MLT]
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
  Chao Ma, D. Kunin, Lei Wu, Lexing Ying (24 Apr 2022)
On Feature Learning in Neural Networks with Global Convergence Guarantees
  Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna (22 Apr 2022) [MLT]
Convergence of gradient descent for deep neural networks
  S. Chatterjee (30 Mar 2022) [ODL]
Random matrix analysis of deep neural network weight matrices
  M. Thamm, Max Staats, B. Rosenow (28 Mar 2022)
On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes
  Elvis Dohmatob, A. Bietti (22 Mar 2022) [AAML]
Robust Training under Label Noise by Over-parameterization
  Sheng Liu, Zhihui Zhu, Qing Qu, Chong You (28 Feb 2022) [NoLa, OOD]
On the Benefits of Large Learning Rates for Kernel Methods
  Gaspard Beugnot, Julien Mairal, Alessandro Rudi (28 Feb 2022)
The Spectral Bias of Polynomial Neural Networks
  Moulik Choraria, L. Dadi, Grigorios G. Chrysos, Julien Mairal, V. Cevher (27 Feb 2022)
A Geometric Understanding of Natural Gradient
  Qinxun Bai, S. Rosenberg, Wei Xu (13 Feb 2022)
Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions
  Maksim Velikanov, Dmitry Yarotsky (02 Feb 2022)
Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks
  R. Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová (01 Feb 2022) [MLT]
Stochastic Neural Networks with Infinite Width are Deterministic
  Liu Ziyin, Hanlin Zhang, Xiangming Meng, Yuting Lu, Eric P. Xing, Masakuni Ueda (30 Jan 2022)
Interplay between depth of neural networks and locality of target functions
  Takashi Mori, Masakuni Ueda (28 Jan 2022)
Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks
  Benjamin Bowman, Guido Montúfar (12 Jan 2022)
Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs
  Inbar Seroussi, Gadi Naveh, Zohar Ringel (31 Dec 2021)
Over-Parametrized Matrix Factorization in the Presence of Spurious Stationary Points
  Armin Eftekhari (25 Dec 2021)
Early Stopping for Deep Image Prior
  Hengkang Wang, Taihui Li, Zhong Zhuang, Tiancong Chen, Hengyue Liang, Ju Sun (11 Dec 2021)
SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning
  Yuege Xie, Bobby Shi, Hayden Schaeffer, Rachel A. Ward (07 Dec 2021)
Learning with convolution and pooling operations in kernel methods
  Theodor Misiakiewicz, Song Mei (16 Nov 2021) [MLT]
On the Equivalence between Neural Network and Support Vector Machine
  Yilan Chen, Wei Huang, Lam M. Nguyen, Tsui-Wei Weng (11 Nov 2021) [AAML]
Understanding Layer-wise Contributions in Deep Neural Networks through Spectral Analysis
  Yatin Dandi, Arthur Jacot (06 Nov 2021) [FAtt]
Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
  A. Shevchenko, Vyacheslav Kungurtsev, Marco Mondelli (03 Nov 2021) [MLT]
Subquadratic Overparameterization for Shallow Neural Networks
  Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, V. Cevher (02 Nov 2021)
Neural Networks as Kernel Learners: The Silent Alignment Effect
  Alexander B. Atanasov, Blake Bordelon, Cengiz Pehlevan (29 Oct 2021) [MLT]
Does the Data Induce Capacity Control in Deep Learning?
  Rubing Yang, Jialin Mao, Pratik Chaudhari (27 Oct 2021)
AIR-Net: Adaptive and Implicit Regularization Neural Network for Matrix Completion
  Zhemin Li, Tao Sun, Hongxia Wang, Bao Wang (12 Oct 2021)
Classification and Adversarial examples in an Overparameterized Linear Model: A Signal Processing Perspective
  Adhyyan Narang, Vidya Muthukumar, A. Sahai (27 Sep 2021) [SILM, AAML]
Fast and Sample-Efficient Interatomic Neural Network Potentials for Molecules and Materials Based on Gaussian Moments
  Viktor Zaverkin, David Holzmüller, Ingo Steinwart, Johannes Kastner (20 Sep 2021)
Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks
  Zhichao Wang, Yizhe Zhu (20 Sep 2021)
A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning
  Yehuda Dar, Vidya Muthukumar, Richard G. Baraniuk (06 Sep 2021)
Dash: Semi-Supervised Learning with Dynamic Thresholding
  Yi Tian Xu, Lei Shang, Jinxing Ye, Qi Qian, Yu-Feng Li, Baigui Sun, Hao Li, R. L. Jin (01 Sep 2021)
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
  Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu (25 Aug 2021) [MLT, AI4CE]
Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
  Arnulf Jentzen, Adrian Riekert (09 Jul 2021)
Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
  Dominik Stöger, Mahdi Soltanolkotabi (28 Jun 2021) [ODL]
Locality defeats the curse of dimensionality in convolutional teacher-student scenarios
  Alessandro Favero, Francesco Cagnetta, M. Wyart (16 Jun 2021)
A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
  Gadi Naveh, Zohar Ringel (08 Jun 2021) [SSL, MLT]