2310.06301
Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition
10 October 2023
Zhongtian Chen
Edmund Lau
Jake Mendel
Susan Wei
Daniel Murfet
Papers citing "Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition"
12 / 12 papers shown
Title
Modes of Sequence Models and Learning Coefficients
Zhongtian Chen
Daniel Murfet
77
1
0
25 Apr 2025
Emergence of Computational Structure in a Neural Network Physics Simulator
Rohan Hitchcock
Gary W. Delaney
J. Manton
Richard Scalzo
Jingge Zhu
22
0
0
16 Apr 2025
Almost Bayesian: The Fractal Dynamics of Stochastic Gradient Descent
Max Hennick
Stijn De Baerdemacker
36
0
0
28 Mar 2025
Beyond Interpretability: The Gains of Feature Monosemanticity on Model Robustness
Qi Zhang
Yifei Wang
Jingyi Cui
Xiang Pan
Qi Lei
Stefanie Jegelka
Yisen Wang
AAML
29
1
0
27 Oct 2024
The Persian Rug: solving toy models of superposition using large-scale symmetries
Aditya Cowsik
Kfir Dolev
Alex Infanger
19
0
0
15 Oct 2024
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
George Wang
Jesse Hoogland
Stan van Wingerden
Zach Furman
Daniel Murfet
OffRL
15
7
0
03 Oct 2024
Using Degeneracy in the Loss Landscape for Mechanistic Interpretability
Lucius Bushnaq
Jake Mendel
Stefan Heimersheim
Dan Braun
Nicholas Goldowsky-Dill
Kaarel Hänni
Cindy Wu
Marius Hobbhahn
19
7
0
17 May 2024
Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska
E. Gavves
AI4CE
38
111
0
22 Apr 2024
Estimating the Local Learning Coefficient at Scale
Zach Furman
Edmund Lau
17
3
0
06 Feb 2024
SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe
Enric Boix-Adserà
Theodor Misiakiewicz
FedML
MLT
76
72
0
21 Feb 2023
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
240
453
0
24 Sep 2022
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
120
314
0
21 Sep 2022