High-dimensional dynamics of generalization error in neural networks

10 October 2017

Papers citing "High-dimensional dynamics of generalization error in neural networks"

50 / 296 papers shown

Improved weight initialization for deep and narrow feedforward neural network Hyunwoo Lee Yunho Kim Seungyeop Yang Hayoung Choi ODL 12 3 0 07 Nov 2023
Changing the Kernel During Training Leads to Double Descent in Kernel Regression Oskar Allerbo 19 0 0 03 Nov 2023
Machine learning refinement of in situ images acquired by low electron dose LC-TEM H. Katsuno Yuki Kimura T. Yamazaki Ichigaku Takigawa 13 0 0 31 Oct 2023
Unraveling the Enigma of Double Descent: An In-depth Analysis through the Lens of Learned Feature Space Yufei Gu Xiaoqing Zheng T. Aste 37 3 0 20 Oct 2023
How connectivity structure shapes rich and lazy learning in neural circuits Yuhan Helena Liu A. Baratin Jonathan H. Cornford Stefan Mihalas E. Shea-Brown Guillaume Lajoie 38 14 0 12 Oct 2023
Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition Zhongtian Chen Edmund Lau Jake Mendel Susan Wei Daniel Murfet 16 13 0 10 Oct 2023
Towards a statistical theory of data selection under weak supervision Germain Kolossov Andrea Montanari Pulkit Tandon 14 14 0 25 Sep 2023
Uncovering mesa-optimization algorithms in Transformers J. Oswald Eyvind Niklasson Maximilian Schlegel Seijin Kobayashi Nicolas Zucchet ... Mark Sandler Blaise Agüera y Arcas Max Vladymyrov Razvan Pascanu João Sacramento 24 53 0 11 Sep 2023
Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics Yehonatan Avidan Qianyi Li H. Sompolinsky 60 8 0 08 Sep 2023
No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets Lorenzo Brigato S. Mougiakakou 27 3 0 04 Sep 2023
Six Lectures on Linearized Neural Networks Theodor Misiakiewicz Andrea Montanari 34 12 0 25 Aug 2023
Learning Compact Neural Networks with Deep Overparameterised Multitask Learning Shengqi Ren Haosen Shi 9 0 0 25 Aug 2023
Don't blame Dataset Shift! Shortcut Learning due to Gradients and Cross Entropy A. Puli Lily H. Zhang Yoav Wald Rajesh Ranganath 13 19 0 24 Aug 2023
On High-Dimensional Asymptotic Properties of Model Averaging Estimators Ryo Ando F. Komaki MoMe 12 6 0 18 Aug 2023
The Interpolating Information Criterion for Overparameterized Models Liam Hodgkinson Christopher van der Heide Roberto Salomone Fred Roosta Michael W. Mahoney 16 7 0 15 Jul 2023
Solving Kernel Ridge Regression with Gradient-Based Optimization Methods Oskar Allerbo 8 1 0 29 Jun 2023
Efficient Online Processing with Deep Neural Networks Lukas Hedegaard 18 0 0 23 Jun 2023
Quantifying lottery tickets under label noise: accuracy, calibration, and complexity V. Arora Daniele Irto Sebastian Goldt G. Sanguinetti 34 2 0 21 Jun 2023
Deterministic equivalent of the Conjugate Kernel matrix associated to Artificial Neural Networks Clément Chouard 20 2 0 09 Jun 2023
Gibbs-Based Information Criteria and the Over-Parameterized Regime Haobo Chen Yuheng Bu Greg Wornell 21 1 0 08 Jun 2023
Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks F. Chen D. Kunin Atsushi Yamamura Surya Ganguli 21 26 0 07 Jun 2023
Extracting Cloud-based Model with Prior Knowledge S. Zhao Kangjie Chen Meng Hao Jian Zhang Guowen Xu Hongwei Li Tianwei Zhang AAML MIACV SILM MLAU SLR 28 5 0 07 Jun 2023
Dropout Drops Double Descent Tianbao Yang J. Suzuki 11 1 0 25 May 2023
Least Squares Regression Can Exhibit Under-Parameterized Double Descent Xinyue Li Rishi Sonthalia 31 3 0 24 May 2023
Understanding the Initial Condensation of Convolutional Neural Networks Zhangchen Zhou Hanxu Zhou Yuqing Li Zhi-Qin John Xu MLT AI4CE 20 5 0 17 May 2023
Do deep neural networks have an inbuilt Occam's razor? Chris Mingard Henry Rees Guillermo Valle Pérez A. Louis UQCV BDL 19 15 0 13 Apr 2023
Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle Rylan Schaeffer Mikail Khona Zachary Robertson Akhilan Boopathy Kateryna Pistunova J. Rocks Ila Rani Fiete Oluwasanmi Koyejo 62 31 0 24 Mar 2023
Online Learning for the Random Feature Model in the Student-Teacher Framework Roman Worschech B. Rosenow 36 0 0 24 Mar 2023
ExplainFix: Explainable Spatially Fixed Deep Networks Alex Gaudio Christos Faloutsos A. Smailagic P. Costa A. Campilho FAtt 19 3 0 18 Mar 2023
Deep Learning Weight Pruning with RMT-SVD: Increasing Accuracy and Reducing Overfitting Yitzchak Shmalo Jonathan Jenkins Oleksii Krupchytskyi 22 3 0 15 Mar 2023
Inversion dynamics of class manifolds in deep learning reveals tradeoffs underlying generalisation Simone Ciceri Lorenzo Cassani Matteo Osella P. Rotondo P. Pizzochero M. Gherardi 26 7 0 09 Mar 2023
Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies Hannah Pinson Joeri Lenaerts V. Ginis 11 3 0 03 Mar 2023
Over-training with Mixup May Hurt Generalization Zixuan Liu Ziqiao Wang Hongyu Guo Yongyi Mao NoLa 21 11 0 02 Mar 2023
On the Generalization of PINNs outside the training domain and the Hyperparameters influencing it Andrea Bonfanti Roberto Santana M. Ellero Babak Gholami AI4CE PINN 35 3 0 15 Feb 2023
Effects of noise on the overparametrization of quantum neural networks Diego García-Martín Martín Larocca M. Cerezo 25 17 0 10 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning Mihaela Rosca Yan Wu Chongli Qin Benoit Dherin 16 6 0 03 Feb 2023
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression Mo Zhou Rong Ge 27 2 0 01 Feb 2023
Deep networks for system identification: a Survey G. Pillonetto Aleksandr Aravkin Daniel Gedon L. Ljung Antônio H. Ribeiro Thomas B. Schon OOD 35 35 0 30 Jan 2023
WLD-Reg: A Data-dependent Within-layer Diversity Regularizer Firas Laakom Jenni Raitoharju Alexandros Iosifidis M. Gabbouj AI4CE 23 7 0 03 Jan 2023
Bayesian Interpolation with Deep Linear Networks Boris Hanin Alexander Zlokapa 34 25 0 29 Dec 2022
Gradient flow in the gaussian covariate model: exact solution of learning curves and multiple descent structures Antoine Bodin N. Macris 26 4 0 13 Dec 2022
A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance Marco Loog T. Viering 21 1 0 25 Nov 2022
Neural networks trained with SGD learn distributions of increasing complexity Maria Refinetti Alessandro Ingrosso Sebastian Goldt UQCV 30 41 0 21 Nov 2022
Understanding the double descent curve in Machine Learning Luis Sa-Couto J. M. Ramos Miguel Almeida Andreas Wichert 14 1 0 18 Nov 2022
Do highly over-parameterized neural networks generalize since bad solutions are rare? Julius Martinetz T. Martinetz 22 1 0 07 Nov 2022
Globally Gated Deep Linear Networks Qianyi Li H. Sompolinsky AI4CE 14 10 0 31 Oct 2022
A Solvable Model of Neural Scaling Laws A. Maloney Daniel A. Roberts J. Sully 31 51 0 30 Oct 2022
Trustworthiness of Laser-Induced Breakdown Spectroscopy Predictions via Simulation-based Synthetic Data Augmentation and Multitask Learning Riccardo Finotello D. L’hermite Celine Quéré Benjamin Rouge M. Tamaazousti J. Sirven 22 1 0 07 Oct 2022
Information FOMO: The unhealthy fear of missing out on information. A method for removing misleading data for healthier models Ethan Pickering T. Sapsis 16 6 0 27 Aug 2022
Investigating the Impact of Model Width and Density on Generalization in Presence of Label Noise Yihao Xue Kyle Whitecross Baharan Mirzasoleiman NoLa 25 1 0 17 Aug 2022