
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR

28 September 2025
Fanding Huang
Guanbo Huang
Xiao Fan
Yi He
Xiao Liang
Xiao Chen
Qinting Jiang
Faisal Nadeem Khan
Jingyan Jiang
Zhi Wang
    OffRL
ArXiv (abs) · PDF · HTML · HuggingFace (46 upvotes) · GitHub (26★)

Papers citing "Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR"

No papers found