Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2409.04478
Cited By
Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small
5 September 2024
Maheep Chaudhary
Atticus Geiger
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small"
4 / 4 papers shown
Title
MIB: A Mechanistic Interpretability Benchmark
Aaron Mueller
Atticus Geiger
Sarah Wiegreffe
Dana Arad
Iván Arcuschin
...
Alessandro Stolfo
Martin Tutek
Amir Zur
David Bau
Yonatan Belinkov
41
1
0
17 Apr 2025
Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words
Gouki Minegishi
Hiroki Furuta
Yusuke Iwasawa
Y. Matsuo
49
1
0
09 Jan 2025
Analyzing (In)Abilities of SAEs via Formal Languages
Abhinav Menon
Manish Shrivastava
David M. Krueger
Ekdeep Singh Lubana
42
7
0
15 Oct 2024
Residual Stream Analysis with Multi-Layer SAEs
Tim Lawson
Lucy Farnik
Conor Houghton
Laurence Aitchison
26
3
0
06 Sep 2024
1