arXiv:2402.00901
Real Sparks of Artificial Intelligence and the Importance of Inner Interpretability
31 January 2024
Alex Grzankowski
Papers citing "Real Sparks of Artificial Intelligence and the Importance of Inner Interpretability"
2 / 2 papers shown
Title
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt
210  486  0
01 Nov 2022
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
221  402  0
24 Feb 2021