arXiv:2505.15311
Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning
21 May 2025
Yurun Yuan, Fan Chen, Zeyu Jia, Alexander Rakhlin, Tengyang Xie
OffRL
Papers citing "Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning"
1 / 1 papers shown
Title
Bridging Supervised Learning and Reinforcement Learning in Math Reasoning
Huayu Chen, Kaiwen Zheng, Qinsheng Zhang, Ganqu Cui, Yin Cui, Haotian Ye, Tsung-Yi Lin, Ming-Yu Liu, Jun Zhu, Haoxiang Wang
OffRL
LRM
27
0
0
23 May 2025