Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2502.19328
Cited By
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
26 February 2025
Hao Peng
Y. Qi
Xiaozhi Wang
Zijun Yao
Bin Xu
Lei Hou
Juanzi Li
ALM
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems"
3 / 3 papers shown
Title
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
60
0
0
05 May 2025
A Desideratum for Conversational Agents: Capabilities, Challenges, and Future Directions
Emre Can Acikgoz
Cheng Qian
Hongru Wang
Vardhan Dongre
X. Chen
Heng Ji
Dilek Hakkani-Tür
Gökhan Tür
LM&Ro
ELM
41
1
0
07 Apr 2025
Inference-Time Scaling for Generalist Reward Modeling
Zijun Liu
P. Wang
R. Xu
Shirong Ma
Chong Ruan
Peng Li
Yang Janet Liu
Y. Wu
OffRL
LRM
36
9
0
03 Apr 2025
1