arXiv:2409.13714
TracrBench: Generating Interpretability Testbeds with Large Language Models
7 September 2024
Hannes Thurnherr
Jérémy Scheurer
Papers citing
"TracrBench: Generating Interpretability Testbeds with Large Language Models"
3 / 3 papers shown
Title
Evaluating Explanations: An Explanatory Virtues Framework for Mechanistic Interpretability -- The Strange Science Part I.ii
Kola Ayonrinde
Louis Jaburi
XAI
55
1
0
02 May 2025
Neural Decompiling of Tracr Transformers
Hannes Thurnherr
Kaspar Riesen
ViT
15
1
0
29 Sep 2024
InterpBench: Semi-Synthetic Transformers for Evaluating Mechanistic Interpretability Techniques
Rohan Gupta
Iván Arcuschin
Thomas Kwa
Adrià Garriga-Alonso
34
2
0
19 Jul 2024