2411.00743
Cited By
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
1 November 2024
Aashiq Muhamed
Mona Diab
Virginia Smith
Papers citing "Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models"
1 / 1 papers shown
Title
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
75
18
0
02 Jul 2024