The Mechanics of Conceptual Interpretation in GPT Models: Interpretative Insights

5 August 2024

Papers citing "The Mechanics of Conceptual Interpretation in GPT Models: Interpretative Insights"


Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt
210 486 0
01 Nov 2022

Analyzing Commonsense Emergence in Few-shot Knowledge Models
Jeff Da, Ronan Le Bras, Ximing Lu, Yejin Choi, Antoine Bosselut
AI4MH, KELM
64 40 0
01 Jan 2021