Kernel and Rich Regimes in Overparametrized Models
arXiv:1906.05827, 13 June 2019
Blake E. Woodworth, Suriya Gunasekar, Pedro H. P. Savarese, Edward Moroshko, Itay Golan, Jason D. Lee, Daniel Soudry, Nathan Srebro
Papers citing "Kernel and Rich Regimes in Overparametrized Models" (showing 50 of 85)
Entropic Mirror Descent for Linear Systems: Polyak's Stepsize and Implicit Bias
Yura Malitsky, Alexander Posch (05 May 2025)

Generalization through variance: how noise shapes inductive biases in diffusion models
John J. Vastola (16 Apr 2025) [DiffM]

On the Cone Effect in the Learning Dynamics
Zhanpeng Zhou, Yongyi Yang, Jie Ren, Mahito Sugiyama, Junchi Yan (20 Mar 2025)

MLPs at the EOC: Dynamics of Feature Learning
Dávid Terjék (18 Feb 2025) [MLT]

Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
Chris Kolb, T. Weber, Bernd Bischl, David Rügamer (04 Feb 2025)

Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière, C. Molinari, Lorenzo Rosasco, S. Villa, Cristian Vega (21 Dec 2024)

Fast Training of Sinusoidal Neural Fields via Scaling Initialization
Taesun Yeom, Sangyoon Lee, Jaeho Lee (07 Oct 2024)

The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, C. Pehlevan (06 Oct 2024)

How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon, Alexander B. Atanasov, C. Pehlevan (26 Sep 2024)

From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe (22 Sep 2024)

Mask in the Mirror: Implicit Sparsification
Tom Jacobs, R. Burkholz (19 Aug 2024)

Trimming the Fat: Efficient Compression of 3D Gaussian Splats through Pruning
Muhammad Salman Ali, Maryam Qamar, Sung-Ho Bae, Enzo Tartaglione (26 Jun 2024) [3DGS]

How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD
Pierfrancesco Beneventano, Andrea Pinto, Tomaso A. Poggio (17 Jun 2024) [MLT]

Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee, Qiaobo Li, Yingxue Zhou (11 Jun 2024)

When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl, Kim Stachenfeld (26 May 2024) [NAI, CoGe]

Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar, Jarvis D. Haupt (12 Mar 2024) [ODL]

Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee (29 Nov 2023) [AAML]

Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics
Yehonatan Avidan, Qianyi Li, H. Sompolinsky (08 Sep 2023)

Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
Kaiyue Wen, Zhiyuan Li, Tengyu Ma (20 Jul 2023) [FAtt]

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Y. Carmon (22 May 2023)

Convex optimization over a probability simplex
James Chok, G. Vasil (15 May 2023)

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani, Alexandru Damian, Jason D. Lee (11 May 2023) [MLT]

Robust Implicit Regularization via Weight Normalization
H. Chou, Holger Rauhut, Rachel A. Ward (09 May 2023)

Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Scott Pesme, Nicolas Flammarion (02 Apr 2023)

General Loss Functions Lead to (Approximate) Interpolation in High Dimensions
Kuo-Wei Lai, Vidya Muthukumar (13 Mar 2023)

Phase Diagram of Initial Condensation for Two-layer Neural Networks
Zheng Chen, Yuqing Li, Tao Luo, Zhaoguang Zhou, Z. Xu (12 Mar 2023) [MLT, AI4CE]

Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions
Marwan Omar (14 Feb 2023) [SILM, AAML]

Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang (02 Feb 2023)

Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Mo Zhou, Rong Ge (01 Feb 2023)

Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Jikai Jin, Zhiyuan Li, Kaifeng Lyu, S. Du, Jason D. Lee (27 Jan 2023) [MLT]

Infinite-width limit of deep linear neural networks
Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli (29 Nov 2022)

Characterizing the Spectrum of the NTK via a Power Series Expansion
Michael Murray, Hui Jin, Benjamin Bowman, Guido Montúfar (15 Nov 2022)

Learning Single-Index Models with Shallow Neural Networks
A. Bietti, Joan Bruna, Clayton Sanford, M. Song (27 Oct 2022)

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma (25 Oct 2022) [AI4CE]

Continual task learning in natural and artificial agents
Timo Flesch, Andrew M. Saxe, Christopher Summerfield (10 Oct 2022) [CLL]

Deep Linear Networks can Benignly Overfit when Shallow Ones Do
Niladri S. Chatterji, Philip M. Long (19 Sep 2022)

Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George, Guillaume Lajoie, A. Baratin (19 Sep 2022)

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher (15 Sep 2022)

On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi (26 Aug 2022) [FedML, AI4CE]

Blessing of Nonconvexity in Deep Linear Models: Depth Flattens the Optimization Landscape Around the True Solution
Jianhao Ma, S. Fattahi (15 Jul 2022)

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora (08 Jul 2022)

Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi (30 Jun 2022) [SSL, MLT]

Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, M. Wyart (24 Jun 2022) [MLT]

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out
Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu (22 Jun 2022)

Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
Loucas Pillaud-Vivien, J. Reygner, Nicolas Flammarion (20 Jun 2022) [NoLa]

Reconstructing Training Data from Trained Neural Networks
Niv Haim, Gal Vardi, Gilad Yehudai, Ohad Shamir, Michal Irani (15 Jun 2022)

Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko, Nicolas Flammarion (13 Jun 2022) [AAML]

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yunzhi Bai, Jason D. Lee (08 Jun 2022)

Generalized Federated Learning via Sharpness Aware Minimization
Zhe Qu, Xingyu Li, Rui Duan, Yaojiang Liu, Bo Tang, Zhuo Lu (06 Jun 2022) [FedML]

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion (02 Jun 2022) [ODL]