Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Scott Pesme, Nicolas Flammarion
arXiv:2304.00488 · 2 April 2023

Papers citing "Saddle-to-Saddle Dynamics in Diagonal Linear Networks"

25 papers shown.

Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization
Ziqing Xu, Hancheng Min, Lachlan Ewen MacDonald, Jinqi Luo, Salma Tarmoun, Enrique Mallada, René Vidal
AI4CE · 10 Mar 2025

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee
AI4CE · 28 Feb 2025

The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang, Mingze Wang, Zhanpeng Zhou, Junchi Yan, Weinan E, Lei Wu
26 Feb 2025

Towards understanding epoch-wise double descent in two-layer linear neural networks
Amanda Olmin, Fredrik Lindsten
MLT · 13 Jul 2024

How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
Etai Littwin, Omid Saremi, Madhu Advani, Vimal Thilak, Preetum Nakkiran, Chen Huang, Joshua Susskind
03 Jul 2024

Implicit Bias of Mirror Flow on Separable Data
Scott Pesme, Radu-Alexandru Dragomir, Nicolas Flammarion
18 Jun 2024

Improving Generalization and Convergence by Enhancing Implicit Regularization
Mingze Wang, Haotian He, Jinbo Wang, Zilin Wang, Guanhua Huang, Feiyu Xiong, Zhiyu Li, E. Weinan, Lei Wu
31 May 2024

Synchronization on circles and spheres with nonlinear interactions
Christopher Criscitiello, Quentin Rebjock, Andrew D. McRae, Nicolas Boumal
28 May 2024

Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu, Santiago Aranguri, Arthur Jacot
27 May 2024

Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
Heejune Sheen, Siyu Chen, Tianhao Wang, Harrison H. Zhou
MLT · 13 Mar 2024

Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks
Akshay Kumar, Jarvis D. Haupt
ODL · 14 Feb 2024

When Representations Align: Universality in Representation Learning Dynamics
Loek van Rossem, Andrew M. Saxe
AI4CE · 14 Feb 2024

Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Rodrigo Veiga, Anastasia Remizova, Nicolas Macris
12 Feb 2024

Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged
ODL · 08 Feb 2024

Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth
Kevin Kögler, A. Shevchenko, Hamed Hassani, Marco Mondelli
MLT · 07 Feb 2024

Understanding Unimodal Bias in Multimodal Deep Linear Networks
Yedi Zhang, Peter E. Latham, Andrew Saxe
01 Dec 2023

Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn, Junghyun Lee, Bohan Wang, Huishuai Zhang, Chulhee Yun
25 Nov 2023

Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Mingze Wang, Zeping Min, Lei Wu
24 Nov 2023

SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem
Margalit Glasgow
MLT · 26 Sep 2023

Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks
J. S. Wind, Vegard Antun, A. Hansen
13 Jul 2023

Transformers learn through gradual rank increase
Enric Boix-Adserà, Etai Littwin, Emmanuel Abbe, Samy Bengio, J. Susskind
12 Jun 2023

Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
D. Chistikov, Matthias Englert, R. Lazic
MLT · 10 Jun 2023

Robust Implicit Regularization via Weight Normalization
H. Chou, Holger Rauhut, Rachel A. Ward
09 May 2023

SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz
FedML, MLT · 21 Feb 2023

(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability
Mathieu Even, Scott Pesme, Suriya Gunasekar, Nicolas Flammarion
17 Feb 2023