2502.05209
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
3 February 2025
Zora Che, Stephen Casper, Robert Kirk, Anirudh Satheesh, Stewart Slocum, Lev E McKinney, Rohit Gandikota, Aidan Ewart, Domenic Rosati, Zichu Wu, Zikui Cai, Bilal Chughtai, Y. Gal, Furong Huang, Dylan Hadfield-Menell
MU
AAML
ELM
Papers citing "Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities"
1 / 1 papers shown
Title
Adaptively evaluating models with task elicitation
Davis Brown, Prithvi Balehannina, Helen Jin, Shreya Havaldar, Hamed Hassani, Eric Wong
ALM
ELM
82
0
0
03 Mar 2025