TracrBench: Generating Interpretability Testbeds with Large Language Models

7 September 2024
Hannes Thurnherr
Jérémy Scheurer

Papers citing "TracrBench: Generating Interpretability Testbeds with Large Language Models"

3 / 3 papers shown
Evaluating Explanations: An Explanatory Virtues Framework for Mechanistic Interpretability -- The Strange Science Part I.ii
Kola Ayonrinde
Louis Jaburi
XAI
02 May 2025
Neural Decompiling of Tracr Transformers
Hannes Thurnherr
Kaspar Riesen
ViT
29 Sep 2024
InterpBench: Semi-Synthetic Transformers for Evaluating Mechanistic Interpretability Techniques
Rohan Gupta
Iván Arcuschin
Thomas Kwa
Adrià Garriga-Alonso
19 Jul 2024