arXiv:2508.05613
Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models
7 August 2025
Haitao Hong
Yuchen Yan
Xingyu Wu
Guiyang Hou
Wenqi Zhang
Weiming Lu
Yongliang Shen
Jun Xiao
LRM
HuggingFace (3 upvotes)
Github (23★)
Papers citing
"Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models"
1 / 1 papers shown
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models
Yuchen Yan
Jin Jiang
Zhenbang Ren
Yijun Li
Xudong Cai
...
Mengdi Zhang
Jian Shao
Yongliang Shen
Jun Xiao
Yueting Zhuang
OffRL
ALM
LRM
21 May 2025