Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2304.00488
Cited By
Saddle-to-Saddle Dynamics in Diagonal Linear Networks
2 April 2023
Scott Pesme
Nicolas Flammarion
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Saddle-to-Saddle Dynamics in Diagonal Linear Networks"
25 / 25 papers shown
Title
Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization
Ziqing Xu
Hancheng Min
Lachlan Ewen MacDonald
Jinqi Luo
Salma Tarmoun
Enrique Mallada
René Vidal
AI4CE
51
0
0
10 Mar 2025
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam
Seok Hyeong Lee
Clementine Domine
Yea Chan Park
Charles London
Wonyl Choi
Niclas Goring
Seungjai Lee
AI4CE
33
0
0
28 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
81
1
0
26 Feb 2025
Towards understanding epoch-wise double descent in two-layer linear neural networks
Amanda Olmin
Fredrik Lindsten
MLT
27
3
0
13 Jul 2024
How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
Etai Littwin
Omid Saremi
Madhu Advani
Vimal Thilak
Preetum Nakkiran
Chen Huang
Joshua Susskind
37
3
0
03 Jul 2024
Implicit Bias of Mirror Flow on Separable Data
Scott Pesme
Radu-Alexandru Dragomir
Nicolas Flammarion
34
1
0
18 Jun 2024
Improving Generalization and Convergence by Enhancing Implicit Regularization
Mingze Wang
Haotian He
Jinbo Wang
Zilin Wang
Guanhua Huang
Feiyu Xiong
Zhiyu Li
E. Weinan
Lei Wu
37
6
0
31 May 2024
Synchronization on circles and spheres with nonlinear interactions
Christopher Criscitiello
Quentin Rebjock
Andrew D. McRae
Nicolas Boumal
28
4
0
28 May 2024
Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu
Santiago Aranguri
Arthur Jacot
26
8
0
27 May 2024
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
Heejune Sheen
Siyu Chen
Tianhao Wang
Harrison H. Zhou
MLT
33
10
0
13 Mar 2024
Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks
Akshay Kumar
Jarvis D. Haupt
ODL
30
7
0
14 Feb 2024
When Representations Align: Universality in Representation Learning Dynamics
Loek van Rossem
Andrew M. Saxe
AI4CE
48
4
0
14 Feb 2024
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Rodrigo Veiga
Anastasia Remizova
Nicolas Macris
32
0
0
12 Feb 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu
Berfin Simsek
Francois Ged
ODL
38
0
0
08 Feb 2024
Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth
Kevin Kögler
A. Shevchenko
Hamed Hassani
Marco Mondelli
MLT
25
0
0
07 Feb 2024
Understanding Unimodal Bias in Multimodal Deep Linear Networks
Yedi Zhang
Peter E. Latham
Andrew Saxe
26
5
0
01 Dec 2023
Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn
Junghyun Lee
Bohan Wang
Huishuai Zhang
Chulhee Yun
18
0
0
25 Nov 2023
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Mingze Wang
Zeping Min
Lei Wu
25
3
0
24 Nov 2023
SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem
Margalit Glasgow
MLT
74
13
0
26 Sep 2023
Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks
J. S. Wind
Vegard Antun
A. Hansen
17
4
0
13 Jul 2023
Transformers learn through gradual rank increase
Enric Boix-Adserà
Etai Littwin
Emmanuel Abbe
Samy Bengio
J. Susskind
36
33
0
12 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
D. Chistikov
Matthias Englert
R. Lazic
MLT
32
12
0
10 Jun 2023
Robust Implicit Regularization via Weight Normalization
H. Chou
Holger Rauhut
Rachel A. Ward
28
7
0
09 May 2023
SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe
Enric Boix-Adserà
Theodor Misiakiewicz
FedML
MLT
79
72
0
21 Feb 2023
(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability
Mathieu Even
Scott Pesme
Suriya Gunasekar
Nicolas Flammarion
26
16
0
17 Feb 2023
1