2408.13518
Cited By
Selective Preference Optimization via Token-Level Reward Function Estimation
24 August 2024
Kailai Yang
Zhiwei Liu
Qianqian Xie
Jimin Huang
Erxue Min
Sophia Ananiadou
Papers citing "Selective Preference Optimization via Token-Level Reward Function Estimation"
2 / 2 papers shown
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji
Yanqiu Wu
Zhibin Wu
Shoujin Wang
Jian Yang
Mark Dras
Usman Naseem
31
0
0
05 May 2025
TLDR: Token-Level Detective Reward Model for Large Vision Language Models
Deqing Fu
Tong Xiao
Rui Wang
Wang Zhu
Pengchuan Zhang
Guan Pang
Robin Jia
Lawrence Chen
53
5
0
07 Oct 2024