arXiv:2405.17653
InversionView: A General-Purpose Method for Reading Information from Neural Activations
27 May 2024
Xinting Huang
Madhur Panwar
Navin Goyal
Michael Hahn
Papers citing
"InversionView: A General-Purpose Method for Reading Information from Neural Activations"
5 / 5 papers shown
Title
Model Lakes
Koyena Pal
David Bau
Renée J. Miller
63
0
0
24 Feb 2025
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Michael Hanna
Ollie Liu
Alexandre Variengien
LRM
184
116
0
30 Apr 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
189
260
0
28 Apr 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
210
486
0
01 Nov 2022
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
221
402
0
24 Feb 2021