arXiv:2410.04962
Activation Scaling for Steering and Interpreting Language Models
7 October 2024
Niklas Stoehr, Kevin Du, Vésteinn Snæbjarnarson, Robert West, Ryan Cotterell, Aaron Schein
LLMSV
LRM
Papers citing "Activation Scaling for Steering and Interpreting Language Models"
Better Estimation of the KL Divergence Between Language Models
Afra Amini, Tim Vieira, Ryan Cotterell
14 Apr 2025