
LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring

31 July 2025
Chloe Li
Mary Phuong
Noah Y. Siegel
    ELM
arXiv: 2508.00943

Papers citing "LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring"

3 / 3 papers shown
Measuring Chain-of-Thought Monitorability Through Faithfulness and Verbosity
Austin Meek
Eitan Sprejer
Iván Arcuschin
A. Brockmeier
Steven Basart
LRM
31 Oct 2025
A Pragmatic Way to Measure Chain-of-Thought Monitorability
Scott Emmons
Roland S. Zimmermann
David Elson
Rohin Shah
LRM
28 Oct 2025
All Code, No Thought: Current Language Models Struggle to Reason in Ciphered Language
Shiyuan Guo
Henry Sleight
Fabien Roger
ELM, LRM
10 Oct 2025