2509.06941
Outcome-based Exploration for LLM Reasoning
8 September 2025
Yuda Song
Julia Kempe
Remi Munos
OffRL
LRM
Papers citing "Outcome-based Exploration for LLM Reasoning"
Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards
Haoran He
Yuxiao Ye
Qingpeng Cai
Chen-Hao Hu
Binxing Jiao
Daxin Jiang
Ling Pan
OffRL
LRM
0
0
0
29 Sep 2025
Quantile Advantage Estimation for Entropy-Safe Reasoning
Junkang Wu
Kexin Huang
Jiancan Wu
An Zhang
Xiang Wang
Xiangnan He
0
0
0
26 Sep 2025
RL Squeezes, SFT Expands: A Comparative Study of Reasoning LLMs
Kohsei Matsutani
Shota Takashiro
Gouki Minegishi
Takeshi Kojima
Yusuke Iwasawa
Yutaka Matsuo
OffRL
ReLM
LRM
41
0
0
25 Sep 2025
Soft Tokens, Hard Truths
Natasha Butt
Ariel Kwiatkowski
Ismail Labiad
Julia Kempe
Yann Ollivier
OffRL
CLL
LRM
0
0
0
23 Sep 2025
Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision
Dulhan Jayalath
Shashwat Goel
Thomas Foster
Parag Jain
Suchin Gururangan
Cheng Zhang
Anirudh Goyal
Alan Schelten
OffRL
4
0
0
17 Sep 2025
Diversity-Enhanced Reasoning for Subjective Questions
Yumeng Wang
Zhiyuan Fan
Jiayu Liu
J. Huang
Yi R. Fung
LRM
57
3
0
27 Jul 2025