arXiv:2406.04028
Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents
6 June 2024
Yoann Poupart
Papers citing "Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents"
2 / 2 papers shown
Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models
Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar
AI4CE
83
19
0
14 Feb 2024
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Wes Gurnee, Neel Nanda, Matthew Pauly, Katherine Harvey, Dmitrii Troitskii, Dimitris Bertsimas
MILM
153
186
0
02 May 2023