2508.00943
Cited By
v1
v2 (latest)
LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring
31 July 2025
Chloe Li
Mary Phuong
Noah Y. Siegel
ELM
Papers citing "LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring"
3 / 3 papers shown
Measuring Chain-of-Thought Monitorability Through Faithfulness and Verbosity
Austin Meek
Eitan Sprejer
Iván Arcuschin
A. Brockmeier
Steven Basart
LRM
31 Oct 2025
A Pragmatic Way to Measure Chain-of-Thought Monitorability
Scott Emmons
Roland S. Zimmermann
David Elson
Rohin Shah
LRM
28 Oct 2025
All Code, No Thought: Current Language Models Struggle to Reason in Ciphered Language
Shiyuan Guo
Henry Sleight
Fabien Roger
ELM
LRM
10 Oct 2025