2309.16620
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit
28 September 2023
Blake Bordelon
Lorenzo Noci
Mufan Bill Li
Boris Hanin
C. Pehlevan
Papers citing
"Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit"
15 / 15 papers shown
Title
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Bill Li
Blake Bordelon
Shane Bergsma
C. Pehlevan
Boris Hanin
Joel Hestness
37
0
0
02 May 2025
Deep Neural Nets as Hamiltonians
Mike Winer
Boris Hanin
73
0
0
31 Mar 2025
MLPs at the EOC: Dynamics of Feature Learning
Dávid Terjék
MLT
41
0
0
18 Feb 2025
Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
Blake Bordelon
C. Pehlevan
AI4CE
59
1
0
04 Feb 2025
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
36
3
0
10 Jan 2025
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
64
8
0
29 Oct 2024
How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
C. Pehlevan
49
12
0
26 Sep 2024
Understanding and Minimising Outlier Features in Neural Network Training
Bobby He
Lorenzo Noci
Daniele Paliotta
Imanol Schlag
Thomas Hofmann
34
3
0
29 May 2024
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon
Hamza Tahir Chaudhry
C. Pehlevan
AI4CE
42
9
0
24 May 2024
Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Blake Bordelon
C. Pehlevan
MLT
35
29
0
06 Apr 2023
The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks
Blake Bordelon
C. Pehlevan
41
22
0
05 Oct 2022
Scaling Laws For Deep Learning Based Image Reconstruction
Tobit Klug
Reinhard Heckel
57
12
0
27 Sep 2022
Stable ResNet
Soufiane Hayou
Eugenio Clerico
Bo He
George Deligiannidis
Arnaud Doucet
Judith Rousseau
ODL
SSeg
46
51
0
24 Oct 2020
On the distance between two neural networks and the stability of learning
Jeremy Bernstein
Arash Vahdat
Yisong Yue
Ming-Yu Liu
ODL
190
57
0
09 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020