1710.03667
High-dimensional dynamics of generalization error in neural networks
10 October 2017
Madhu S. Advani
Andrew M. Saxe
AI4CE
Papers citing "High-dimensional dynamics of generalization error in neural networks"
50 / 296 papers shown
Title
Improved weight initialization for deep and narrow feedforward neural network
Hyunwoo Lee
Yunho Kim
Seungyeop Yang
Hayoung Choi
ODL
12
3
0
07 Nov 2023
Changing the Kernel During Training Leads to Double Descent in Kernel Regression
Oskar Allerbo
19
0
0
03 Nov 2023
Machine learning refinement of in situ images acquired by low electron dose LC-TEM
H. Katsuno
Yuki Kimura
T. Yamazaki
Ichigaku Takigawa
13
0
0
31 Oct 2023
Unraveling the Enigma of Double Descent: An In-depth Analysis through the Lens of Learned Feature Space
Yufei Gu
Xiaoqing Zheng
T. Aste
37
3
0
20 Oct 2023
How connectivity structure shapes rich and lazy learning in neural circuits
Yuhan Helena Liu
A. Baratin
Jonathan H. Cornford
Stefan Mihalas
E. Shea-Brown
Guillaume Lajoie
38
14
0
12 Oct 2023
Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition
Zhongtian Chen
Edmund Lau
Jake Mendel
Susan Wei
Daniel Murfet
16
13
0
10 Oct 2023
Towards a statistical theory of data selection under weak supervision
Germain Kolossov
Andrea Montanari
Pulkit Tandon
14
14
0
25 Sep 2023
Uncovering mesa-optimization algorithms in Transformers
J. Oswald
Eyvind Niklasson
Maximilian Schlegel
Seijin Kobayashi
Nicolas Zucchet
...
Mark Sandler
Blaise Agüera y Arcas
Max Vladymyrov
Razvan Pascanu
João Sacramento
24
53
0
11 Sep 2023
Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics
Yehonatan Avidan
Qianyi Li
H. Sompolinsky
60
8
0
08 Sep 2023
No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
Lorenzo Brigato
S. Mougiakakou
27
3
0
04 Sep 2023
Six Lectures on Linearized Neural Networks
Theodor Misiakiewicz
Andrea Montanari
34
12
0
25 Aug 2023
Learning Compact Neural Networks with Deep Overparameterised Multitask Learning
Shengqi Ren
Haosen Shi
9
0
0
25 Aug 2023
Don't blame Dataset Shift! Shortcut Learning due to Gradients and Cross Entropy
A. Puli
Lily H. Zhang
Yoav Wald
Rajesh Ranganath
13
19
0
24 Aug 2023
On High-Dimensional Asymptotic Properties of Model Averaging Estimators
Ryo Ando
F. Komaki
MoMe
12
6
0
18 Aug 2023
The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
16
7
0
15 Jul 2023
Solving Kernel Ridge Regression with Gradient-Based Optimization Methods
Oskar Allerbo
8
1
0
29 Jun 2023
Efficient Online Processing with Deep Neural Networks
Lukas Hedegaard
18
0
0
23 Jun 2023
Quantifying lottery tickets under label noise: accuracy, calibration, and complexity
V. Arora
Daniele Irto
Sebastian Goldt
G. Sanguinetti
34
2
0
21 Jun 2023
Deterministic equivalent of the Conjugate Kernel matrix associated to Artificial Neural Networks
Clément Chouard
20
2
0
09 Jun 2023
Gibbs-Based Information Criteria and the Over-Parameterized Regime
Haobo Chen
Yuheng Bu
Greg Wornell
21
1
0
08 Jun 2023
Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks
F. Chen
D. Kunin
Atsushi Yamamura
Surya Ganguli
21
26
0
07 Jun 2023
Extracting Cloud-based Model with Prior Knowledge
S. Zhao
Kangjie Chen
Meng Hao
Jian Zhang
Guowen Xu
Hongwei Li
Tianwei Zhang
AAML
MIACV
SILM
MLAU
SLR
28
5
0
07 Jun 2023
Dropout Drops Double Descent
Tianbao Yang
J. Suzuki
11
1
0
25 May 2023
Least Squares Regression Can Exhibit Under-Parameterized Double Descent
Xinyue Li
Rishi Sonthalia
31
3
0
24 May 2023
Understanding the Initial Condensation of Convolutional Neural Networks
Zhangchen Zhou
Hanxu Zhou
Yuqing Li
Zhi-Qin John Xu
MLT
AI4CE
20
5
0
17 May 2023
Do deep neural networks have an inbuilt Occam's razor?
Chris Mingard
Henry Rees
Guillermo Valle Pérez
A. Louis
UQCV
BDL
19
15
0
13 Apr 2023
Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle
Rylan Schaeffer
Mikail Khona
Zachary Robertson
Akhilan Boopathy
Kateryna Pistunova
J. Rocks
Ila Rani Fiete
Oluwasanmi Koyejo
62
31
0
24 Mar 2023
Online Learning for the Random Feature Model in the Student-Teacher Framework
Roman Worschech
B. Rosenow
36
0
0
24 Mar 2023
ExplainFix: Explainable Spatially Fixed Deep Networks
Alex Gaudio
Christos Faloutsos
A. Smailagic
P. Costa
A. Campilho
FAtt
19
3
0
18 Mar 2023
Deep Learning Weight Pruning with RMT-SVD: Increasing Accuracy and Reducing Overfitting
Yitzchak Shmalo
Jonathan Jenkins
Oleksii Krupchytskyi
22
3
0
15 Mar 2023
Inversion dynamics of class manifolds in deep learning reveals tradeoffs underlying generalisation
Simone Ciceri
Lorenzo Cassani
Matteo Osella
P. Rotondo
P. Pizzochero
M. Gherardi
26
7
0
09 Mar 2023
Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies
Hannah Pinson
Joeri Lenaerts
V. Ginis
11
3
0
03 Mar 2023
Over-training with Mixup May Hurt Generalization
Zixuan Liu
Ziqiao Wang
Hongyu Guo
Yongyi Mao
NoLa
21
11
0
02 Mar 2023
On the Generalization of PINNs outside the training domain and the Hyperparameters influencing it
Andrea Bonfanti
Roberto Santana
M. Ellero
Babak Gholami
AI4CE
PINN
35
3
0
15 Feb 2023
Effects of noise on the overparametrization of quantum neural networks
Diego García-Martín
Martín Larocca
M. Cerezo
25
17
0
10 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
16
6
0
03 Feb 2023
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Mo Zhou
Rong Ge
27
2
0
01 Feb 2023
Deep networks for system identification: a Survey
G. Pillonetto
Aleksandr Aravkin
Daniel Gedon
L. Ljung
Antônio H. Ribeiro
Thomas B. Schön
OOD
35
35
0
30 Jan 2023
WLD-Reg: A Data-dependent Within-layer Diversity Regularizer
Firas Laakom
Jenni Raitoharju
Alexandros Iosifidis
M. Gabbouj
AI4CE
23
7
0
03 Jan 2023
Bayesian Interpolation with Deep Linear Networks
Boris Hanin
Alexander Zlokapa
34
25
0
29 Dec 2022
Gradient flow in the gaussian covariate model: exact solution of learning curves and multiple descent structures
Antoine Bodin
N. Macris
26
4
0
13 Dec 2022
A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance
Marco Loog
T. Viering
21
1
0
25 Nov 2022
Neural networks trained with SGD learn distributions of increasing complexity
Maria Refinetti
Alessandro Ingrosso
Sebastian Goldt
UQCV
30
41
0
21 Nov 2022
Understanding the double descent curve in Machine Learning
Luis Sa-Couto
J. M. Ramos
Miguel Almeida
Andreas Wichert
14
1
0
18 Nov 2022
Do highly over-parameterized neural networks generalize since bad solutions are rare?
Julius Martinetz
T. Martinetz
22
1
0
07 Nov 2022
Globally Gated Deep Linear Networks
Qianyi Li
H. Sompolinsky
AI4CE
14
10
0
31 Oct 2022
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
31
51
0
30 Oct 2022
Trustworthiness of Laser-Induced Breakdown Spectroscopy Predictions via Simulation-based Synthetic Data Augmentation and Multitask Learning
Riccardo Finotello
D. L’hermite
Celine Quéré
Benjamin Rouge
M. Tamaazousti
J. Sirven
22
1
0
07 Oct 2022
Information FOMO: The unhealthy fear of missing out on information. A method for removing misleading data for healthier models
Ethan Pickering
T. Sapsis
16
6
0
27 Aug 2022
Investigating the Impact of Model Width and Density on Generalization in Presence of Label Noise
Yihao Xue
Kyle Whitecross
Baharan Mirzasoleiman
NoLa
25
1
0
17 Aug 2022