arXiv: 2508.00943

LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring

31 July 2025

Papers citing "LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring"

3 / 3 papers shown

Measuring Chain-of-Thought Monitorability Through Faithfulness and Verbosity

Iván Arcuschin

31 Oct 2025

A Pragmatic Way to Measure Chain-of-Thought Monitorability

Roland S. Zimmermann

28 Oct 2025

All Code, No Thought: Current Language Models Struggle to Reason in Ciphered Language

10 Oct 2025