ResearchTrend.AI
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

7 April 2025
Yu Yue, Yufeng Yuan, Qiying Yu, Xiaochen Zuo, Ruofei Zhu, W. Xu, Jiaze Chen, C. Wang, Tiantian Fan, Zhengyin Du, Xiangpeng Wei, X. Yu, Gaohong Liu, J. Liu, L. Liu, H. Lin, Zhiqi Lin, Bole Ma, C. Zhang, Mofan Zhang, Wang Zhang, Hang Zhu, Ru Zhang, Xin Liu, Mingxuan Wang, Yonghui Wu, Lin Yan
OffRL, LRM
arXiv: 2504.05118

Papers citing "VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks"

4 / 4 papers shown
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang, Qing Yang, Zhiyuan Zeng, Liliang Ren, L. Liu, ..., Jianfeng Gao, Weizhu Chen, S. Wang, Simon S. Du, Yelong Shen
OffRL, ReLM, LRM
108, 2, 0
29 Apr 2025

Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
Yixuan Even Xu, Yash Savani, Fei Fang, Zico Kolter
OffRL
24, 1, 0
18 Apr 2025

ToolRL: Reward is All Tool Learning Needs
Cheng Qian, Emre Can Acikgoz, Qi He, Hongru Wang, X. Chen, Dilek Hakkani-Tür, Gökhan Tür, Heng Ji
OffRL, LRM
25, 3, 0
16 Apr 2025

A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert, Hardik Bhatnagar, Vishaal Udandarao, Samuel Albanie, Ameya Prabhu, Matthias Bethge
ReLM, ALM, LRM
66, 4, 0
09 Apr 2025