Implicit bias of SGD in $L_{2}$-regularized linear DNNs: One-way jumps from high to low rank



25 May 2023


Papers citing "Implicit bias of SGD in $L_{2}$-regularized linear DNNs: One-way jumps from high to low rank"

6 / 6 papers shown

AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs. Di He, Ajay Jaiswal, Songjun Tu, Li Shen, Ganzhao Yuan, Shiwei Liu, L. Yin. 17 Jun 2025.

Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape. Ioannis Bantzis, James B. Simon, Arthur Jacot. [ODL] 27 May 2025.

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking). Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee. [AI4CE] 28 Feb 2025.

The Persistence of Neural Collapse Despite Low-Rank Bias: An Analytic Perspective Through Unconstrained Features. Connall Garrod, Jonathan P. Keating. 30 Oct 2024.

How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning. Arthur Jacot, Seok Hoan Choi, Yuxiao Wen. [AI4CE] 08 Jul 2024.

Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets. Arthur Jacot, Alexandre Kaiser. 27 May 2024.