arXiv:2410.04612
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
6 October 2024
Zhaolin Gao, Wenhao Zhan, Jonathan D. Chang, Gokul Swamy, Kianté Brantley, Jason D. Lee, Wen Sun
OffRL
Papers citing "Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF"
2 / 2 papers shown
Title
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
Z. Wang, K. Wang, Q. Wang, Pingyue Zhang, Linjie Li, ..., Jiajun Wu, L. Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li
73
1
0
24 Apr 2025
All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning
Gokul Swamy, Sanjiban Choudhury, Wen Sun, Zhiwei Steven Wu, J. Andrew Bagnell
OffRL
34
7
0
03 Mar 2025