arXiv: 2409.14623 (v2, latest)
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
International Conference on Learning Representations (ICLR), 2024 · 22 September 2024
Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe
Papers citing "From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks" (showing 50 of 66)
Diagonalizing the Softmax: Hadamard Initialization for Tractable Cross-Entropy Dynamics
Connall Garrod, Jonathan P. Keating, Christos Thrampoulidis · 03 Dec 2025 · 88 views · 0 citations

Data Curation Through the Lens of Spectral Dynamics: Static Limits, Dynamic Acceleration, and Practical Oracles
Yizhou Zhang, Lun Du · 02 Dec 2025 · 112 views · 0 citations
A Generalized Spectral Framework to Explain Neural Scaling and Compression Dynamics
Yizhou Zhang · 11 Nov 2025 · 116 views · 3 citations
You Had One Job: Per-Task Quantization Using LLMs' Hidden Representations
Amit Levi, Raz Lapid, Rom Himelstein, Yaniv Nemcovsky, Ravid Shwartz Ziv, A. Mendelson · 09 Nov 2025 · 117 views · 1 citation · tags: MQ

Provable Scaling Laws of Feature Emergence from Learning Dynamics of Grokking
Yuandong Tian · 25 Sep 2025 · 188 views · 0 citations

Intrinsic training dynamics of deep neural networks
Sibylle Marcotte, Gabriel Peyré, Rémi Gribonval · 10 Aug 2025 · 96 views · 1 citation · tags: AI4CE

Feature learning is decoupled from generalization in high capacity neural networks
Niclas Goring, Charles London, Abdurrahman Hadi Erturk, Chris Mingard, Yoonsoo Nam, Ard A. Louis · 25 Jul 2025 · 252 views · 1 citation · tags: OOD, MLT

Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
D. Kunin, Giovanni Luca Marchetti, F. Chen, Dhruva Karkada, James B. Simon, M. DeWeese, Surya Ganguli, Nina Miolane · 06 Jun 2025 · 401 views · 4 citations

Sign-In to the Lottery: Reparameterizing Sparse Training From Scratch
Advait Gadhikar, Tom Jacobs, Chao Zhou, R. Burkholz · 17 Apr 2025 · 356 views · 1 citation

On the Cone Effect in the Learning Dynamics
Zhanpeng Zhou, Yongyi Yang, Jie Ren, Mahito Sugiyama, Junchi Yan · 20 Mar 2025 · 388 views · 1 citation

A Theory of Initialisation's Impact on Specialisation (ICLR 2025)
Devon Jarvis, Sebastian Lee, Clémentine Dominé, Andrew M. Saxe, Stefano Sarao Mannelli · 04 Mar 2025 · 281 views · 2 citations · tags: CLL

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clémentine Dominé, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee · 28 Feb 2025 · 534 views · 1 citation · tags: AI4CE

Bridging Critical Gaps in Convergent Learning: How Representational Alignment Evolves Across Layers, Training, and Distribution Shifts
Chaitanya Kapoor, Sudhanshu Srivastava, Meenakshi Khosla · 26 Feb 2025 · 366 views · 1 citation

Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
Blake Bordelon, Cengiz Pehlevan · 04 Feb 2025 · 671 views · 5 citations · tags: AI4CE

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning (NeurIPS 2024)
D. Kunin, Allan Raventós, Clémentine Dominé, Feng Chen, David Klindt, Andrew M. Saxe, Surya Ganguli · 10 Jun 2024 · 323 views · 25 citations · tags: MLT

Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu, Santiago Aranguri, Arthur Jacot · 27 May 2024 · 207 views · 14 citations

Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro · 07 Feb 2024 · 286 views · 23 citations · tags: MLT
How connectivity structure shapes rich and lazy learning in neural circuits (ICLR 2023)
Yuhan Helena Liu, A. Baratin, Jonathan H. Cornford, Stefan Mihalas, E. Shea-Brown, Guillaume Lajoie · 12 Oct 2023 · 392 views · 22 citations

Neural Feature Learning in Function Space (JMLR 2023)
Xiangxiang Xu, Lizhong Zheng · 18 Sep 2023 · 270 views · 16 citations

Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows (NeurIPS 2023)
Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré · 30 Jun 2023 · 317 views · 27 citations

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning (ICML 2023)
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin · 07 Jun 2023 · 412 views · 24 citations

The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks (ICLR 2022)
D. Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli · 07 Oct 2022 · 157 views · 22 citations

Relative representations enable zero-shot latent space communication (ICLR 2022)
Luca Moschella, Valentino Maiorca, Marco Fumero, Antonio Norelli, Francesco Locatello, Emanuele Rodolà · 30 Sep 2022 · 314 views · 152 citations

Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation
Sebastian Lee, Stefano Sarao Mannelli, Claudia Clopath, Sebastian Goldt, Andrew M. Saxe · 18 May 2022 · 316 views · 14 citations · tags: CLL

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation (NeurIPS 2022)
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang · 03 May 2022 · 236 views · 116 citations · tags: MLT

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang, J. E. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, J. Pachocki, Weizhu Chen, Jianfeng Gao · 07 Mar 2022 · 333 views · 220 citations

Exact Solutions of a Deep Linear Network (NeurIPS 2022)
Liu Ziyin, Botao Li, Xiangmin Meng · 10 Feb 2022 · 555 views · 23 citations · tags: ODL

Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
Arthur Jacot, François Ged, Berfin Şimşek, Clément Hongler, Franck Gabriel · 30 Jun 2021 · 316 views · 65 citations
LoRA: Low-Rank Adaptation of Large Language Models (ICLR 2021)
J. E. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen · 17 Jun 2021 · 1.5K views · 15,183 citations · tags: OffRL, AI4TS, AI4CE, ALM, AIMat

Probing transfer learning with a model of synthetic correlated datasets
Federica Gerace, Luca Saglietti, Stefano Sarao Mannelli, Andrew M. Saxe, Lenka Zdeborová · 09 Jun 2021 · 173 views · 37 citations · tags: OOD

On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent (ICML 2021)
Shahar Azulay, E. Moroshko, Mor Shpigel Nacson, Blake E. Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry · 19 Feb 2021 · 251 views · 78 citations · tags: AI4CE

Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning (ICLR 2020)
Zhiyuan Li, Yuping Luo, Kaifeng Lyu · 17 Dec 2020 · 213 views · 143 citations

Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
D. Kunin, Javier Sagastuy-Breña, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka · 08 Dec 2020 · 341 views · 89 citations

Feature Learning in Infinite-Width Neural Networks
Greg Yang, J. E. Hu · 30 Nov 2020 · 384 views · 180 citations · tags: MLT

Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel (NeurIPS 2020)
Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M. Roy, Surya Ganguli · 28 Oct 2020 · 293 views · 219 citations
Phase diagram for two-layer ReLU neural networks at infinite-width limit (JMLR 2020)
Tao Luo, Zhi-Qin John Xu, Zheng Ma, Yaoyu Zhang · 15 Jul 2020 · 202 views · 71 citations
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari · 04 Mar 2020 · 487 views · 260 citations · tags: ODL

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss (COLT 2020)
Lénaïc Chizat, Francis R. Bach · 11 Feb 2020 · 594 views · 364 citations · tags: MLT

The Implicit Bias of Depth: How Incremental Learning Drives Generalization (ICLR 2019)
Daniel Gissin, Shai Shalev-Shwartz, Amit Daniely · 26 Sep 2019 · 250 views · 85 citations · tags: AI4CE

Kernel and Rich Regimes in Overparametrized Models (COLT 2019)
Blake E. Woodworth, Suriya Gunasekar, Pedro H. P. Savarese, E. Moroshko, Itay Golan, Jason D. Lee, Daniel Soudry, Nathan Srebro · 13 Jun 2019 · 335 views · 391 citations

Implicit Regularization in Deep Matrix Factorization (NeurIPS 2019)
Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo · 31 May 2019 · 356 views · 559 citations · tags: AI4CE

Similarity of Neural Network Representations Revisited (ICML 2019)
Simon Kornblith, Mohammad Norouzi, Honglak Lee, Geoffrey E. Hinton · 01 May 2019 · 1.2K views · 1,738 citations

Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks (NeurIPS 2019)
Gauthier Gidel, Francis R. Bach, Damien Scieur · 30 Apr 2019 · 185 views · 168 citations · tags: AI4CE

On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang · 26 Apr 2019 · 605 views · 989 citations

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington · 18 Feb 2019 · 582 views · 1,210 citations

Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu · 24 Jan 2019 · 358 views · 101 citations

On Lazy Training in Differentiable Programming
Lénaïc Chizat, Edouard Oyallon, Francis R. Bach · 19 Dec 2018 · 518 views · 905 citations

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang · 12 Nov 2018 · 759 views · 813 citations · tags: MLT

A Convergence Theory for Deep Learning via Over-Parameterization (ICML 2018)
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song · 09 Nov 2018 · 1.4K views · 1,551 citations · tags: AI4CE, ODL

Gradient Descent Finds Global Minima of Deep Neural Networks (ICML 2018)
S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka · 09 Nov 2018 · 939 views · 1,189 citations · tags: ODL