High-dimensional dynamics of generalization error in neural networks

10 October 2017
Madhu S. Advani, Andrew M. Saxe
AI4CE

Papers citing "High-dimensional dynamics of generalization error in neural networks"

50 / 296 papers shown

The Double Descent Behavior in Two Layer Neural Network for Binary Classification
Chathurika S Abeykoon, A. Beknazaryan, Hailin Sang
48 · 1 · 0 · 27 Apr 2025

Exact Learning Dynamics of In-Context Learning in Linear Transformers and Its Application to Non-Linear Transformers
Nischal Mainali, Lucas Teixeira
24 · 0 · 0 · 17 Apr 2025

Enhancing Multi-task Learning Capability of Medical Generalist Foundation Model via Image-centric Multi-annotation Data
Xun Zhu, Fanbin Mo, Zheng Zhang, J. Wang, Yiming Shi, Ming Wu, Chuang Zhang, Miao Li, Ji Wu
32 · 0 · 0 · 14 Apr 2025

Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M. Saxe
MLT
64 · 1 · 0 · 08 Mar 2025

On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process
Shun Iwase, Shuya Takahashi, Nakamasa Inoue, Rio Yokota, Ryo Nakamura, Hirokatsu Kataoka
72 · 0 · 0 · 04 Mar 2025

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee
AI4CE
35 · 0 · 0 · 28 Feb 2025

Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks
Rylan Schaeffer, Punit Singh Koura, Binh Tang, R. Subramanian, Aaditya K. Singh, ..., Vedanuj Goswami, Sergey Edunov, Dieuwke Hupkes, Sanmi Koyejo, Sharan Narang
ALM
69 · 0 · 0 · 24 Feb 2025

A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende, Federica Gerace, A. Laio, Sebastian Goldt
71 · 8 · 0 · 17 Feb 2025

Early Stopping Against Label Noise Without Validation Data
Suqin Yuan, Lei Feng, Tongliang Liu
NoLa
96 · 14 · 0 · 11 Feb 2025

Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
Blake Bordelon, C. Pehlevan
AI4CE
59 · 1 · 0 · 04 Feb 2025

A theoretical framework for overfitting in energy-based modeling
Giovanni Catania, A. Decelle, Cyril Furtlehner, Beatriz Seoane
57 · 2 · 0 · 31 Jan 2025

Bilinear Sequence Regression: A Model for Learning from Long Sequences of High-dimensional Tokens
Vittorio Erba, Emanuele Troiani, Luca Biggio, Antoine Maillard, Lenka Zdeborová
18 · 0 · 0 · 24 Oct 2024

Rethinking generalization of classifiers in separable classes scenarios and over-parameterized regimes
Julius Martinetz, C. Linse, Thomas Martinetz
23 · 0 · 0 · 22 Oct 2024

Generalization for Least Squares Regression With Simple Spiked Covariances
Jiping Li, Rishi Sonthalia
23 · 0 · 0 · 17 Oct 2024

Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra
Roman Worschech, B. Rosenow
39 · 0 · 0 · 11 Oct 2024

MLP-KAN: Unifying Deep Representation and Function Learning
Yunhong He, Yifeng Xie, Zhengqing Yuan, Lichao Sun
29 · 1 · 0 · 03 Oct 2024

Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
George Wang, Jesse Hoogland, Stan van Wingerden, Zach Furman, Daniel Murfet
OffRL
26 · 7 · 0 · 03 Oct 2024

Investigating the Impact of Model Complexity in Large Language Models
Jing Luo, Huiyuan Wang, Weiran Huang
34 · 0 · 0 · 01 Oct 2024

Unified Neural Network Scaling Laws and Scale-time Equivalence
Akhilan Boopathy, Ila Fiete
35 · 0 · 0 · 09 Sep 2024

Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning
Nadav Cohen, Noam Razin
31 · 0 · 0 · 25 Aug 2024

Risk and cross validation in ridge regression with correlated samples
Alexander B. Atanasov, Jacob A. Zavatone-Veth, C. Pehlevan
27 · 4 · 0 · 08 Aug 2024

Towards understanding epoch-wise double descent in two-layer linear neural networks
Amanda Olmin, Fredrik Lindsten
MLT
27 · 3 · 0 · 13 Jul 2024

Understanding Visual Feature Reliance through the Lens of Complexity
Thomas Fel, Louis Bethune, Andrew Kyle Lampinen, Thomas Serre, Katherine Hermann
FAtt, CoGe
30 · 6 · 0 · 08 Jul 2024

Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
Amit Peleg, Matthias Hein
26 · 0 · 0 · 04 Jul 2024

How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
Etai Littwin, Omid Saremi, Madhu Advani, Vimal Thilak, Preetum Nakkiran, Chen Huang, Joshua Susskind
37 · 3 · 0 · 03 Jul 2024

Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations
Rylan Schaeffer, Victor Lecomte, Dhruv Pai, Andres Carranza, Berivan Isik, ..., Yann LeCun, SueYeon Chung, Andrey Gromov, Ravid Shwartz-Ziv, Sanmi Koyejo
41 · 5 · 0 · 13 Jun 2024

Precise analysis of ridge interpolators under heavy correlations -- a Random Duality Theory view
Mihailo Stojnic
24 · 1 · 0 · 13 Jun 2024

Ridge interpolators in correlated factor regression models -- exact risk analysis
Mihailo Stojnic
20 · 1 · 0 · 13 Jun 2024

On Regularization via Early Stopping for Least Squares Regression
Rishi Sonthalia, Jackie Lok, E. Rebrova
25 · 2 · 0 · 06 Jun 2024

Disentangling and Mitigating the Impact of Task Similarity for Continual Learning
Naoki Hiratani
CLL
35 · 2 · 0 · 30 May 2024

Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu, Santiago Aranguri, Arthur Jacot
26 · 8 · 0 · 27 May 2024

Cascade of phase transitions in the training of Energy-based models
Dimitrios Bachtis, Giulio Biroli, A. Decelle, Beatriz Seoane
36 · 4 · 0 · 23 May 2024

Deep linear networks for regression are implicitly regularized towards flat minima
Pierre Marion, Lénaic Chizat
ODL
26 · 5 · 0 · 22 May 2024

Class-wise Activation Unravelling the Engima of Deep Double Descent
Yufei Gu
28 · 0 · 0 · 13 May 2024

Thermodynamic limit in learning period three
Yuichiro Terasaki, Kohei Nakajima
38 · 1 · 0 · 12 May 2024

Learned feature representations are biased by complexity, learning order, position, and more
Andrew Kyle Lampinen, Stephanie C. Y. Chan, Katherine Hermann
AI4CE, FaML, SSL, OOD
32 · 6 · 0 · 09 May 2024

Why is SAM Robust to Label Noise?
Christina Baek, Zico Kolter, Aditi Raghunathan
NoLa, AAML
41 · 9 · 0 · 06 May 2024

PNeRV: Enhancing Spatial Consistency via Pyramidal Neural Representation for Videos
Qi Zhao, M. Salman Asif, Zhan Ma
29 · 3 · 0 · 13 Apr 2024

"Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach
Lingyu Gu, Yongqiang Du, Yuan Zhang, Di Xie, Shiliang Pu, Robert C. Qiu, Zhenyu Liao
36 · 6 · 0 · 01 Mar 2024

Low-Rank Learning by Design: the Role of Network Architecture and Activation Linearity in Gradient Rank Collapse
Bradley T. Baker, Ba Pearlmutter, Robyn L. Miller, Vince D. Calhoun, Sergey Plis
AI4CE
11 · 2 · 0 · 09 Feb 2024

A Dynamical Model of Neural Scaling Laws
Blake Bordelon, Alexander B. Atanasov, C. Pehlevan
46 · 36 · 0 · 02 Feb 2024

Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss
Yahong Yang, Juncai He
AI4CE
26 · 7 · 0 · 31 Jan 2024

The twin peaks of learning neural networks
Elizaveta Demyanenko, Christoph Feinauer, Enrico M. Malatesta, Luca Saglietti
10 · 0 · 0 · 23 Jan 2024

Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents
Quentin Delfosse, Sebastian Sztwiertnia, M. Rothermel, Wolfgang Stammer, Kristian Kersting
47 · 18 · 0 · 11 Jan 2024

Learning from higher-order statistics, efficiently: hypothesis tests, random features, and neural networks
Eszter Székely, Lorenzo Bardone, Federica Gerace, Sebastian Goldt
32 · 2 · 0 · 22 Dec 2023

Understanding Unimodal Bias in Multimodal Deep Linear Networks
Yedi Zhang, Peter E. Latham, Andrew Saxe
26 · 5 · 0 · 01 Dec 2023

Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee
AAML
41 · 1 · 0 · 29 Nov 2023

Weight fluctuations in (deep) linear neural networks and a derivation of the inverse-variance flatness relation
Markus Gross, A. Raulf, Christoph Räth
38 · 0 · 0 · 23 Nov 2023

Evolutionary algorithms as an alternative to backpropagation for supervised training of Biophysical Neural Networks and Neural ODEs
James Hazelden, Yuhan Helena Liu, Eli Shlizerman, E. Shea-Brown
34 · 2 · 0 · 17 Nov 2023

Why Do Probabilistic Clinical Models Fail To Transport Between Sites?
Thomas A. Lasko, Eric V. Strobl, William W Stead
OOD
36 · 7 · 0 · 08 Nov 2023