2504.09895
Cited By
Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data
14 April 2025
Shuai Zhao, Linchao Zhu, Yi Yang
Papers citing "Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data"
1 / 1 papers shown
Title
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
60
0
0
05 May 2025
1