ResearchTrend.AI
Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning


21 May 2025
Yurun Yuan
Fan Chen
Zeyu Jia
Alexander Rakhlin
Tengyang Xie
OffRL

Papers citing "Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning"

1 / 1 papers shown
Bridging Supervised Learning and Reinforcement Learning in Math Reasoning
Huayu Chen
Kaiwen Zheng
Qinsheng Zhang
Ganqu Cui
Yin Cui
Haotian Ye
Tsung-Yi Lin
Ming-Yu Liu
Jun Zhu
Haoxiang Wang
OffRL
LRM
23 May 2025