2410.01679
Cited By
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
2 October 2024
Amirhossein Kazemnejad, Milad Aghajohari, Eva Portelance, Alessandro Sordoni, Siva Reddy, Aaron C. Courville, Nicolas Le Roux
OffRL, LRM
Papers citing "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
4 / 4 papers shown
Title
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang, Qing Yang, Zhiyuan Zeng, Liliang Ren, L. Liu, ..., Jianfeng Gao, Weizhu Chen, S. Wang, Simon S. Du, Yelong Shen
OffRL, ReLM, LRM
110, 2, 0
29 Apr 2025

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Taiwei Shi, Yiyang Wu, Linxin Song, Tianyi Zhou, Jieyu Zhao
LRM
76, 1, 0
07 Apr 2025

Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Zhenyu Hou, Xin Lv, Rui Lu, J. Zhang, Y. Li, Zijun Yao, Juanzi Li, J. Tang, Yuxiao Dong
OffRL, LRM, ReLM
49, 20, 0
20 Jan 2025

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux, Arian Hosseini, Rishabh Agarwal, Aaron C. Courville
OffRL
77, 5, 0
23 Oct 2024