

Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking

27 September 2025
Jinyi Han
Ying Huang
Ying Liao
Zishang Jiang
Xikun Lu
Haiquan Zhao
X. Wang
Guanghao Zhou
Sihang Jiang
Jiaqing Liang
Weikang Zhou
Zeye Sun
Fei Yu
Yanghua Xiao
Topics: OffRL · LRM
Links: arXiv (abs) · PDF · HTML · GitHub (8★)
Main: 8 pages, 10 figures, 3 tables · Bibliography: 3 pages · Appendix: 10 pages
Abstract

Large Reasoning Models (LRMs) have achieved impressive performance on challenging tasks, yet their deep reasoning often incurs substantial computational costs. Existing reinforcement learning methods for efficient reasoning still struggle to construct short reasoning paths during the rollout stage, which limits effective learning. Inspired by Evidence Accumulation Models, we find that LRMs have accumulated sufficient information early in reasoning, making further reasoning steps redundant. Based on this insight, we propose Just-Enough Thinking (JET), which trains models to proactively terminate unnecessary reasoning. JET performs trajectory truncation during rollout to expose the model to short, distributionally consistent reasoning paths. In addition, it uses a quality-controlled length reward to encourage concise reasoning while maintaining correctness. Extensive experiments demonstrate that JET significantly improves reasoning efficiency without sacrificing accuracy. Notably, DeepSeek-Distill-Qwen-1.5B achieves a 4.6% accuracy gain while reducing output length by 46.3% on the Olympiad benchmark. Our code is available on GitHub.
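A minimal sketch of the two mechanisms the abstract describes, assuming illustrative function names, a placeholder truncation schedule, and an assumed reward shape rather than the authors' released implementation: trajectory truncation exposes the model to shortened versions of its own rollouts, and a quality-controlled length reward pays a brevity bonus only when the answer is correct.

```python
# Illustrative sketch of the two JET ideas from the abstract; not the
# authors' released code. Function names, the keep_ratio schedule, and the
# exact reward shape are assumptions made for demonstration.

from typing import List


def truncate_trajectory(tokens: List[str], keep_ratio: float) -> List[str]:
    """Keep an early prefix of a sampled reasoning trajectory.

    JET exposes the model to shortened versions of its own rollouts so that
    short paths stay distributionally consistent with its generations.
    """
    cut = max(1, int(len(tokens) * keep_ratio))
    return tokens[:cut]


def jet_reward(is_correct: bool, length: int, max_length: int,
               brevity_weight: float = 0.5) -> float:
    """Quality-controlled length reward (assumed form).

    Correct answers earn a base reward plus a bonus that grows as the
    trajectory gets shorter; incorrect answers earn nothing, so the model
    is never paid for being concise but wrong.
    """
    if not is_correct:
        return 0.0
    brevity_bonus = brevity_weight * (1.0 - length / max_length)
    return 1.0 + brevity_bonus


if __name__ == "__main__":
    rollout = ["step"] * 1000              # stand-in for a 1000-token rollout
    short = truncate_trajectory(rollout, keep_ratio=0.4)
    print(len(short))                          # 400
    print(jet_reward(True, len(short), 1000))  # correct and concise: > 1.0
    print(jet_reward(False, len(short), 1000)) # wrong: 0.0, however short
```

The key design choice is gating the brevity bonus on correctness, which keeps the model from trading accuracy for shorter outputs.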
