2410.03742
Beyond Scalar Reward Model: Learning Generative Judge from Preference Data
1 October 2024
Ziyi Ye, Xiangsheng Li, Qiuchi Li, Qingyao Ai, Yujia Zhou, Wei Shen, Dong Yan, Yiqun Liu
Papers citing "Beyond Scalar Reward Model: Learning Generative Judge from Preference Data"
4 / 4 papers shown
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
60
0
0
05 May 2025
GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets
Mingqian He, Fei Zhao, Chonggang Lu, Z. Liu, Y. Wang, Haofu Qian
OffRL
AI4TS
VLM
64
0
0
28 Apr 2025
IPO: Your Language Model is Secretly a Preference Classifier
Shivank Garg, Ayush Singh, Shweta Singh, Paras Chopra
47
1
0
22 Feb 2025
Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang, Shengyu Zhang, J. Zhang, Runyi Hu, Xiaoya Li, Tianwei Zhang, Jiwei Li, Fei Wu, G. Wang, Eduard H. Hovy
OffRL
114
6
0
05 Dec 2024