2410.02984
Cited By
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
3 October 2024
George Wang
Jesse Hoogland
Stan van Wingerden
Zach Furman
Daniel Murfet
OffRL
Papers citing "Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient"
4 / 4 papers shown
Title
Modes of Sequence Models and Learning Coefficients
Zhongtian Chen
Daniel Murfet
87
1
0
25 Apr 2025
Studying Small Language Models with Susceptibilities
Garrett Baker
George Wang
Jesse Hoogland
Daniel Murfet
AAML
75
1
0
25 Apr 2025
Emergence of Computational Structure in a Neural Network Physics Simulator
Rohan Hitchcock
Gary W. Delaney
J. Manton
Richard Scalzo
Jingge Zhu
29
0
0
16 Apr 2025
Almost Bayesian: The Fractal Dynamics of Stochastic Gradient Descent
Max Hennick
Stijn De Baerdemacker
49
0
0
28 Mar 2025