Cited By: Meta-Principled Family of Hyperparameter Scaling Strategies
Sho Yaida
arXiv:2210.04909, 10 October 2022

Papers citing "Meta-Principled Family of Hyperparameter Scaling Strategies" (9 of 9 papers shown):
- Don't be lazy: CompleteP enables compute-efficient deep transformers
  Nolan Dey, Bin Claire Zhang, Lorenzo Noci, Mufan Bill Li, Blake Bordelon, Shane Bergsma, C. Pehlevan, Boris Hanin, Joel Hestness
  02 May 2025

- Deep Neural Nets as Hamiltonians
  Mike Winer, Boris Hanin
  31 Mar 2025

- Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
  Dayal Singh Kalra, Tianyu He, M. Barkeshli
  17 Feb 2025

- Small-scale proxies for large-scale Transformer training instabilities
  Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, A. Alemi, ..., Jascha Narain Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith
  25 Sep 2023

- Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
  Blake Bordelon, C. Pehlevan
  06 Apr 2023

- Effective Theory of Transformers at Initialization
  Emily Dinan, Sho Yaida, Susan Zhang
  04 Apr 2023

- Zero-Shot Text-to-Image Generation
  Aditya A. Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever
  24 Feb 2021

- Scaling Laws for Neural Language Models
  Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
  23 Jan 2020

- Trainability and Accuracy of Neural Networks: An Interacting Particle System Approach
  Grant M. Rotskoff, Eric Vanden-Eijnden
  02 May 2018