2012.09839
Cited By
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning
17 December 2020
Zhiyuan Li
Yuping Luo
Kaifeng Lyu
Papers citing "Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning"
50 / 105 papers shown
Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
D. Kunin
Giovanni Luca Marchetti
F. Chen
Dhruva Karkada
James B. Simon
M. DeWeese
Surya Ganguli
Nina Miolane
26
0
0
06 Jun 2025
Heavy-Ball Momentum Method in Continuous Time and Discretization Error Analysis
Bochen Lyu
Xiaojing Zhang
Fangyi Zheng
He Wang
Zheng Wang
Zhanxing Zhu
7
0
0
03 Jun 2025
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Bhavya Vasudeva
Jung Whan Lee
Vatsal Sharan
Mahdi Soltanolkotabi
15
0
0
29 May 2025
Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape
Ioannis Bantzis
James B. Simon
Arthur Jacot
ODL
42
0
0
27 May 2025
LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
Nurbek Tastan
Stefanos Laskaridis
Martin Takáč
Karthik Nandakumar
Samuel Horváth
AI4CE
70
0
0
27 May 2025
Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?
Tom Jacobs
Chao Zhou
R. Burkholz
OffRL
AI4CE
74
1
0
17 Apr 2025
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Chenyang Zhang
Peifeng Gao
Difan Zou
Yuan Cao
OOD
MLT
157
0
0
11 Apr 2025
An Overview of Low-Rank Structures in the Training and Adaptation of Large Models
Laura Balzano
Tianjiao Ding
B. Haeffele
Soo Min Kwon
Qing Qu
Peng Wang
Ziyi Wang
Can Yaras
OffRL
AI4CE
88
2
0
25 Mar 2025
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam
Seok Hyeong Lee
Clementine Domine
Yea Chan Park
Charles London
Wonyl Choi
Niclas Goring
Seungjai Lee
AI4CE
204
1
0
28 Feb 2025
Implicit Bias in Matrix Factorization and its Explicit Realization in a New Architecture
Yikun Hou
Suvrit Sra
A. Yurtsever
69
0
0
28 Jan 2025
Weight decay induces low-rank attention layers
Seijin Kobayashi
Yassir Akram
J. Oswald
103
12
0
31 Oct 2024
Bilinear Sequence Regression: A Model for Learning from Long Sequences of High-dimensional Tokens
Vittorio Erba
Emanuele Troiani
Luca Biggio
Antoine Maillard
Lenka Zdeborová
200
2
0
24 Oct 2024
Swing-by Dynamics in Concept Learning and Compositional Generalization
Yongyi Yang
Core Francisco Park
Ekdeep Singh Lubana
Maya Okawa
Wei Hu
Hidenori Tanaka
CoGe
DiffM
57
0
0
10 Oct 2024
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
George Wang
Jesse Hoogland
Stan van Wingerden
Zach Furman
Daniel Murfet
OffRL
86
9
0
03 Oct 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé
Nicolas Anguita
A. Proca
Lukas Braun
D. Kunin
P. Mediano
Andrew M. Saxe
127
6
0
22 Sep 2024
Improving Adaptivity via Over-Parameterization in Sequence Models
Yicheng Li
Qian Lin
74
1
0
02 Sep 2024
Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning
Nadav Cohen
Noam Razin
115
0
0
25 Aug 2024
Approaching Deep Learning through the Spectral Dynamics of Weights
David Yunis
Kumar Kshitij Patel
Samuel Wheeler
Pedro H. P. Savarese
Gal Vardi
Karen Livescu
Michael Maire
Matthew R. Walter
104
3
0
21 Aug 2024
A Generalization Bound for Nearly-Linear Networks
Eugene Golikov
73
0
0
09 Jul 2024
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot
Seok Hoan Choi
Yuxiao Wen
AI4CE
141
2
0
08 Jul 2024
How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
Etai Littwin
Omid Saremi
Madhu Advani
Vimal Thilak
Preetum Nakkiran
Chen Huang
Joshua Susskind
78
5
0
03 Jul 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
127
18
0
10 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras
Peng Wang
Laura Balzano
Qing Qu
AI4CE
71
15
0
06 Jun 2024
Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu
Santiago Aranguri
Arthur Jacot
63
11
0
27 May 2024
Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets
Arthur Jacot
Alexandre Kaiser
72
1
0
27 May 2024
Disentangle Sample Size and Initialization Effect on Perfect Generalization for Single-Neuron Target
Jiajie Zhao
Zhiwei Bai
Yaoyu Zhang
63
0
0
22 May 2024
Deep linear networks for regression are implicitly regularized towards flat minima
Pierre Marion
Lénaic Chizat
ODL
104
6
0
22 May 2024
Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion
Zhiwei Bai
Jiajie Zhao
Yaoyu Zhang
AI4CE
98
0
0
22 May 2024
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
Heejune Sheen
Siyu Chen
Tianhao Wang
Harrison H. Zhou
MLT
87
13
0
13 Mar 2024
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar
Jarvis Haupt
ODL
103
4
0
12 Mar 2024
Transformers Learn Low Sensitivity Functions: Investigations and Implications
Bhavya Vasudeva
Deqing Fu
Tianyi Zhou
Elliott Kau
Youqi Huang
Vatsal Sharan
101
2
0
11 Mar 2024
The Expected Loss of Preconditioned Langevin Dynamics Reveals the Hessian Rank
Amitay Bar
Rotem Mulayoff
T. Michaeli
Ronen Talmon
81
0
0
21 Feb 2024
Average gradient outer product as a mechanism for deep neural collapse
Daniel Beaglehole
Peter Súkeník
Marco Mondelli
Misha Belkin
AI4CE
100
13
0
21 Feb 2024
Which Frequencies do CNNs Need? Emergent Bottleneck Structure in Feature Learning
Yuxiao Wen
Arthur Jacot
134
7
0
12 Feb 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
104
21
0
08 Feb 2024
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
92
38
0
30 Nov 2023
Applying statistical learning theory to deep learning
Cédric Gerbelot
Avetik G. Karagulyan
Stefani Karp
Kavya Ravichandran
Menachem Stern
Nathan Srebro
FedML
49
2
0
26 Nov 2023
Efficient Compression of Overparameterized Deep Models through Low-Dimensional Learning Dynamics
Soo Min Kwon
Zekai Zhang
Dogyoon Song
Laura Balzano
Qing Qu
107
4
0
08 Nov 2023
A Quadratic Synchronization Rule for Distributed Deep Learning
Xinran Gu
Kaifeng Lyu
Sanjeev Arora
Jingzhao Zhang
Longbo Huang
85
1
0
22 Oct 2023
Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition
Zhongtian Chen
Edmund Lau
Jake Mendel
Susan Wei
Daniel Murfet
61
15
0
10 Oct 2023
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong
Lijun Ding
Simon S. Du
104
13
0
03 Oct 2023
Implicit regularization of deep residual networks towards neural ODEs
Pierre Marion
Yu-Han Wu
Michael E. Sander
Gérard Biau
123
17
0
03 Sep 2023
Six Lectures on Linearized Neural Networks
Theodor Misiakiewicz
Andrea Montanari
131
13
0
25 Aug 2023
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min
Enrique Mallada
René Vidal
MLT
92
23
0
24 Jul 2023
Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space
Zhengdao Chen
75
1
0
03 Jul 2023
The Inductive Bias of Flatness Regularization for Deep Matrix Factorization
Khashayar Gatmiry
Zhiyuan Li
Ching-Yao Chuang
Sashank J. Reddi
Tengyu Ma
Stefanie Jegelka
ODL
77
12
0
22 Jun 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
Yuan Cao
Difan Zou
Yuan-Fang Li
Quanquan Gu
MLT
102
5
0
20 Jun 2023
InRank: Incremental Low-Rank Learning
Jiawei Zhao
Yifei Zhang
Beidi Chen
F. Schafer
Anima Anandkumar
61
7
0
20 Jun 2023
Trained Transformers Learn Linear Models In-Context
Ruiqi Zhang
Spencer Frei
Peter L. Bartlett
93
207
0
16 Jun 2023
Transformers learn through gradual rank increase
Enric Boix-Adserà
Etai Littwin
Emmanuel Abbe
Samy Bengio
J. Susskind
93
37
0
12 Jun 2023