Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.11839
Cited By
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
17 June 2024
Fei Wang
Wenxuan Zhou
James Y. Huang
Nan Xu
Sheng Zhang
Hoifung Poon
Muhao Chen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"mDPO: Conditional Preference Optimization for Multimodal Large Language Models"
8 / 8 papers shown
Title
Noise Contrastive Alignment of Language Models with Explicit Rewards
Huayu Chen
Guande He
Lifan Yuan
Ganqu Cui
Hang Su
Jun Zhu
23
14
0
08 Feb 2024
KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh
Winnie Xu
Niklas Muennighoff
Dan Jurafsky
Douwe Kiela
139
147
0
02 Feb 2024
Self-Rewarding Language Models
Weizhe Yuan
Richard Yuanzhe Pang
Kyunghyun Cho
Xian Li
Sainbayar Sukhbaatar
Jing Xu
Jason Weston
ReLM
SyDa
ALM
LRM
190
133
0
18 Jan 2024
Silkie: Preference Distillation for Large Visual Language Models
Lei Li
Zhihui Xie
Mukai Li
Shunian Chen
Peiyi Wang
Liang Chen
Yazheng Yang
Benyou Wang
Lingpeng Kong
MLLM
69
32
0
17 Dec 2023
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
M. Steyvers
Yuan Yao
Haoye Zhang
Taiwen He
Yifeng Han
...
Xinyue Hu
Zhiyuan Liu
Hai-Tao Zheng
Maosong Sun
Tat-Seng Chua
MLLM
VLM
109
56
0
01 Dec 2023
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Sicong Leng
Hang Zhang
Guanzheng Chen
Xin Li
Shijian Lu
Chunyan Miao
Li Bing
VLM
MLLM
68
66
0
28 Nov 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
298
8,441
0
04 Mar 2022
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
227
3,029
0
09 Mar 2020
1