2405.15943
Cited By
Transformers represent belief state geometry in their residual stream
24 May 2024
A. Shai
Sarah E. Marzen
Lucas Teixeira
Alexander Gietelink Oldenziel
P. Riechers
AI4CE
Papers citing
"Transformers represent belief state geometry in their residual stream"
4 / 4 papers shown
Title
Evaluating Explanations: An Explanatory Virtues Framework for Mechanistic Interpretability -- The Strange Science Part I.ii
Kola Ayonrinde
Louis Jaburi
XAI
67
1
0
02 May 2025
Modes of Sequence Models and Learning Coefficients
Zhongtian Chen
Daniel Murfet
77
1
0
25 Apr 2025
ICLR: In-Context Learning of Representations
Core Francisco Park
Andrew Lee
Ekdeep Singh Lubana
Yongyi Yang
Maya Okawa
Kento Nishi
Martin Wattenberg
Hidenori Tanaka
AIFin
111
3
0
29 Dec 2024
Programming Refusal with Conditional Activation Steering
Bruce W. Lee
Inkit Padhi
K. Ramamurthy
Erik Miehling
Pierre L. Dognin
Manish Nagireddy
Amit Dhurandhar
LLMSV
89
13
0
06 Sep 2024