ResearchTrend.AI

Depth Dependence of $μ$P Learning Rates in ReLU MLPs

13 May 2023
Samy Jelassi
Boris Hanin
Ziwei Ji
Sashank J. Reddi
Srinadh Bhojanapalli
Sanjiv Kumar

Papers citing "Depth Dependence of $μ$P Learning Rates in ReLU MLPs" (6 papers shown):

  • MLPs at the EOC: Dynamics of Feature Learning. Dávid Terjék (MLT). 18 Feb 2025.
  • Scalable Optimization in the Modular Norm. Tim Large, Yang Liu, Minyoung Huh, Hyojin Bahng, Phillip Isola, Jeremy Bernstein. 23 May 2024.
  • Principled Architecture-aware Scaling of Hyperparameters. Wuyang Chen, Junru Wu, Zhangyang Wang, Boris Hanin (AI4CE). 27 Feb 2024.
  • The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks. Lénaic Chizat, Praneeth Netrapalli. 30 Nov 2023.
  • Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks. Greg Yang, Dingli Yu, Chen Zhu, Soufiane Hayou (MLT). 03 Oct 2023.
  • Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit. Blake Bordelon, Lorenzo Noci, Mufan Bill Li, Boris Hanin, C. Pehlevan. 28 Sep 2023.