2502.04567
Preference Optimization via Contrastive Divergence: Your Reward Model is Secretly an NLL Estimator
6 February 2025
Zhuotong Chen
Fang Liu
Xuan Zhu
Yanjun Qi
Mohammad Ghavamzadeh
Papers citing
"Preference Optimization via Contrastive Divergence: Your Reward Model is Secretly an NLL Estimator"
No papers found