Outcome-based Exploration for LLM Reasoning


8 September 2025
Yuda Song
Julia Kempe
Remi Munos
OffRL, LRM

Papers citing "Outcome-based Exploration for LLM Reasoning"

6 / 6 papers shown
Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards
Haoran He
Yuxiao Ye
Qingpeng Cai
Chen-Hao Hu
Binxing Jiao
Daxin Jiang
Ling Pan
OffRL, LRM
0
0
0
29 Sep 2025
Quantile Advantage Estimation for Entropy-Safe Reasoning
Junkang Wu
Kexin Huang
Jiancan Wu
An Zhang
Xiang Wang
Xiangnan He
0
0
0
26 Sep 2025
RL Squeezes, SFT Expands: A Comparative Study of Reasoning LLMs
Kohsei Matsutani
Shota Takashiro
Gouki Minegishi
Takeshi Kojima
Yusuke Iwasawa
Yutaka Matsuo
OffRL, ReLM, LRM
41
0
0
25 Sep 2025
Soft Tokens, Hard Truths
Natasha Butt
Ariel Kwiatkowski
Ismail Labiad
Julia Kempe
Yann Ollivier
OffRL, CLL, LRM
0
0
0
23 Sep 2025
Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision
Dulhan Jayalath
Shashwat Goel
Thomas Foster
Parag Jain
Suchin Gururangan
Cheng Zhang
Anirudh Goyal
Alan Schelten
OffRL
4
0
0
17 Sep 2025
Diversity-Enhanced Reasoning for Subjective Questions
Yumeng Wang
Zhiyuan Fan
Jiayu Liu
J. Huang
Yi R. Fung
LRM
57
3
0
27 Jul 2025