2410.11179
Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs
15 October 2024
Kola Ayonrinde, Michael T. Pearce, Lee Sharkey
MILM
Papers citing
"Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs"
2 / 2 papers shown
Train One Sparse Autoencoder Across Multiple Sparsity Budgets to Preserve Interpretability and Accuracy
Nikita Balagansky, Yaroslav Aksenov, Daniil Laptev, Vadim Kurochkin, Gleb Gerasimov, Nikita Koryagin, Daniil Gavrilov
46
0
0
30 May 2025
A Mathematical Philosophy of Explanations in Mechanistic Interpretability -- The Strange Science Part I.i
Kola Ayonrinde, Louis Jaburi
MILM
175
1
0
01 May 2025