The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization
Ben Adlam, Jeffrey Pennington
arXiv:2008.06786 · 15 August 2020

Papers citing "The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization"

44 / 94 papers shown

Second-order regression models exhibit progressive sharpening to the edge of stability
Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington · 10 Oct 2022

Multiple Descent in the Multiple Random Feature Model
Xuran Meng, Jianfeng Yao, Yuan Cao · 21 Aug 2022

Investigating the Impact of Model Width and Density on Generalization in Presence of Label Noise
Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman · 17 Aug 2022 · NoLa

The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks
Charles Edison Tripp, J. Perr-Sauer, L. Hayne, M. Lunacek, Jamil Gafur · 25 Jul 2022 · AI4CE

A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors
Nikhil Ghosh, M. Belkin · 23 Jul 2022

Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
Lechao Xiao, Jeffrey Pennington · 11 Jul 2022

Limitations of the NTK for Understanding Generalization in Deep Learning
Nikhil Vyas, Yamini Bansal, Preetum Nakkiran · 20 Jun 2022

Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions
Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington · 15 Jun 2022

Regularization-wise double descent: Why it occurs and how to eliminate it
Fatih Yilmaz, Reinhard Heckel · 03 Jun 2022

Trajectory of Mini-Batch Momentum: Batch Size Saturation and Convergence in High Dimensions
Kiwon Lee, Andrew N. Cheng, Courtney Paquette, Elliot Paquette · 02 Jun 2022

Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression
Lechao Xiao, Hong Hu, Theodor Misiakiewicz, Yue M. Lu, Jeffrey Pennington · 30 May 2022

Memorization and Optimization in Deep Neural Networks with Minimum Over-parameterization
Simone Bombari, Mohammad Hossein Amani, Marco Mondelli · 20 May 2022

Sharp Asymptotics of Kernel Ridge Regression Beyond the Linear Regime
Hong Hu, Yue M. Lu · 13 May 2022

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang · 03 May 2022 · MLT

Overparameterized Linear Regression under Adversarial Attacks
Antônio H. Ribeiro, Thomas B. Schön · 13 Apr 2022 · AAML

More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
Alexander Wei, Wei Hu, Jacob Steinhardt · 11 Mar 2022

Contrasting random and learned features in deep Bayesian linear regression
Jacob A. Zavatone-Veth, William L. Tong, Cengiz Pehlevan · 01 Mar 2022 · BDL, MLT

Benign Overfitting in Two-layer Convolutional Neural Networks
Yuan Cao, Zixiang Chen, M. Belkin, Quanquan Gu · 14 Feb 2022 · MLT

Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization
Mariia Seleznova, Gitta Kutyniok · 01 Feb 2022

Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension
Bruno Loureiro, Cédric Gerbelot, Maria Refinetti, G. Sicuro, Florent Krzakala · 31 Jan 2022

A generalization gap estimation for overparameterized models via the Langevin functional variance
Akifumi Okuno, Keisuke Yano · 07 Dec 2021

Understanding Square Loss in Training Overparametrized Neural Network Classifiers
Tianyang Hu, Jun Wang, Wei Cao, Zhenguo Li · 07 Dec 2021 · UQCV, AAML

Model, sample, and epoch-wise descents: exact solution of gradient flow in the random feature model
A. Bodin, N. Macris · 22 Oct 2021

Learning in High Dimension Always Amounts to Extrapolation
Randall Balestriero, J. Pesenti, Yann LeCun · 18 Oct 2021

Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks
Zhichao Wang, Yizhe Zhu · 20 Sep 2021

Dataset Distillation with Infinitely Wide Convolutional Networks
Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee · 27 Jul 2021 · DD

Taxonomizing local versus global structure in neural network loss landscapes
Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney · 23 Jul 2021

Random Neural Networks in the Infinite Width Limit as Gaussian Processes
Boris Hanin · 04 Jul 2021 · BDL

Towards an Understanding of Benign Overfitting in Neural Networks
Zhu Li, Zhi Zhou, Arthur Gretton · 06 Jun 2021 · MLT

Fundamental tradeoffs between memorization and robustness in random features and neural tangent regimes
Elvis Dohmatob · 04 Jun 2021

Universal scaling laws in the gradient descent training of neural networks
Maksim Velikanov, Dmitry Yarotsky · 02 May 2021

Fitting Elephants
P. Mitra · 31 Mar 2021

Double-descent curves in neural networks: a new perspective using Gaussian processes
Ouns El Harzli, Bernardo Cuenca Grau, Guillermo Valle Pérez, A. Louis · 14 Feb 2021

Appearance of Random Matrix Theory in Deep Learning
Nicholas P. Baskerville, Diego Granziol, J. Keating · 12 Feb 2021

Explaining Neural Scaling Laws
Yasaman Bahri, Ethan Dyer, Jared Kaplan, Jaehoon Lee, Utkarsh Sharma · 12 Feb 2021

Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition
Ben Adlam, Jeffrey Pennington · NeurIPS 2020 · 04 Nov 2020 · UD

What causes the test error? Going beyond bias-variance via ANOVA
Licong Lin, Guang Cheng · 11 Oct 2020

On the Universality of the Double Descent Peak in Ridgeless Regression
David Holzmüller · 05 Oct 2020

A Dynamical Central Limit Theorem for Shallow Neural Networks
Zhengdao Chen, Grant M. Rotskoff, Joan Bruna, Eric Vanden-Eijnden · NeurIPS 2020 · 21 Aug 2020

Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks
Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan · 23 Jun 2020

Triple descent and the two kinds of overfitting: Where & why do they appear?
Stéphane d'Ascoli, Levent Sagun, Giulio Biroli · 05 Jun 2020

Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks
Z. Fan, Zhichao Wang · 25 May 2020

A Random Matrix Perspective on Mixtures of Nonlinearities for Deep Learning
Ben Adlam, J. Levinson, Jeffrey Pennington · 02 Dec 2019

Surprises in High-Dimensional Ridgeless Least Squares Interpolation
Trevor Hastie, Andrea Montanari, Saharon Rosset, Robert Tibshirani · 19 Mar 2019