ResearchTrend.AI

Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
arXiv:2502.05209

3 February 2025
Zora Che
Stephen Casper
Robert Kirk
Anirudh Satheesh
Stewart Slocum
Lev E McKinney
Rohit Gandikota
Aidan Ewart
Domenic Rosati
Zichu Wu
Zikui Cai
Bilal Chughtai
Y. Gal
Furong Huang
Dylan Hadfield-Menell
    MU
    AAML
    ELM
Papers citing "Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities"

1 / 1 papers shown
Title: Adaptively evaluating models with task elicitation
Authors: Davis Brown, Prithvi Balehannina, Helen Jin, Shreya Havaldar, Hamed Hassani, Eric Wong
Tags: ALM, ELM
Date: 03 Mar 2025