Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
arXiv:2410.08847 · 11 October 2024
Papers citing "Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization" (9 of 9 papers shown):
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li, Daniel Khashabi
05 May 2025
Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models [MLLM]
Wei Chen, Xin Yan, Bin Wen, Fan Yang, Tingting Gao, Di Zhang, Long Chen
09 Apr 2025
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Zhengping Jiang, Anqi Liu, Benjamin Van Durme
26 Feb 2025
Is Free Self-Alignment Possible? [MoMe]
Dyah Adila, Changho Shin, Yijing Zhang, Frederic Sala
24 Feb 2025
Training a Generally Curious Agent
Fahim Tajwar, Yiding Jiang, Abitha Thankaraj, Sumaita Sadia Rahman, J. Zico Kolter, Jeff Schneider, Ruslan Salakhutdinov
24 Feb 2025
Preference learning made easy: Everything should be understood through win rate
Lily H. Zhang, Rajesh Ranganath
14 Feb 2025
PIPA: Preference Alignment as Prior-Informed Statistical Estimation [OffRL]
Junbo Li, Zhangyang Wang, Qiang Liu
09 Feb 2025
Understanding the Logic of Direct Preference Alignment through Logic
Kyle Richardson, Vivek Srikumar, Ashish Sabharwal
23 Dec 2024
Robust Preference Optimization through Reward Model Distillation
Adam Fisch, Jacob Eisenstein, Vicky Zayats, Alekh Agarwal, Ahmad Beirami, Chirag Nagpal, Peter Shaw, Jonathan Berant
29 May 2024