mDPO: Conditional Preference Optimization for Multimodal Large Language Models

17 June 2024
Fei Wang
Wenxuan Zhou
James Y. Huang
Nan Xu
Sheng Zhang
Hoifung Poon
Muhao Chen
ArXiv (abs) | PDF | HTML | HuggingFace (40 upvotes) | GitHub

Papers citing "mDPO: Conditional Preference Optimization for Multimodal Large Language Models"

31 / 31 papers shown
Optimizing LVLMs with On-Policy Data for Effective Hallucination Mitigation
Chengzhi Yu
Yifan Xu
Yifan Chen
Wenyi Zhang
MLLM, OffRL
336
1
0
30 Nov 2025
What Color Is It? A Text-Interference Multimodal Hallucination Benchmark
Jinkun Zhao
Lei Huang
Haixin Ge
Wenjun Wu
VLM
286
1
0
17 Nov 2025
MedAlign: A Synergistic Framework of Multimodal Preference Optimization and Federated Meta-Cognitive Reasoning
Siyong Chen
Jinbo Wen
Jiawen Kang
Tenghui Huang
Xumin Huang
...
Hudan Pan
Zishao Zhong
Zhu Han
Shengli Xie
Dong In Kim
207
0
0
24 Oct 2025
RL makes MLLMs see better than SFT
Junha Song
Sangdoo Yun
Dongyoon Han
Jaegul Choo
Byeongho Heo
LRM
244
2
0
18 Oct 2025
COSMO-RL: Towards Trustworthy LMRMs via Joint Safety and Stability
Yizhuo Ding
M. Chen
Qiuhua Liu
Fenghua Weng
Wanying Qu
Yue Yang
Yugang Jiang
Zuxuan Wu
Yanwei Fu
Wenqi Shao
LRM
115
0
0
05 Oct 2025
Harnessing Synthetic Preference Data for Enhancing Temporal Understanding of Video-LLMs
Sameep Vani
Shreyas Jena
Maitreya Patel
Chitta Baral
Somak Aditya
Yezhou Yang
AI4TS, SyDa
182
0
0
04 Oct 2025
Importance Sampling for Multi-Negative Multimodal Direct Preference Optimization
Xintong Li
Chuhan Wang
Junda Wu
Rohan Surana
Tong Yu
Julian McAuley
Jingbo Shang
145
1
0
30 Sep 2025
Mitigating Visual Hallucinations via Semantic Curriculum Preference Optimization in MLLMs
Yuanshuai Li
Yuping Yan
Junfeng Tang
Yunxuan Li
Zeqi Zheng
Yaochu Jin
178
1
0
29 Sep 2025
OmniDPO: A Preference Optimization Framework to Address Omni-Modal Hallucination
Junzhe Chen
Tianshu Zhang
Shiyu Huang
Yuwei Niu
Chao Sun
Rongzhou Zhang
G. Zhou
Lijie Wen
Xuming Hu
MLLM
225
2
0
31 Aug 2025
Directed-Tokens: A Robust Multi-Modality Alignment Approach to Large Language-Vision Models
Thanh-Dat Truong
Huu-Thien Tran
Tran Thai Son
Bhiksha Raj
Khoa Luu
373
2
0
19 Aug 2025
Controlling Multimodal LLMs via Reward-guided Decoding
Oscar Manas
Pierluca D'Oro
Koustuv Sinha
Adriana Romero Soriano
M. Drozdzal
Aishwarya Agrawal
MLLM
196
0
0
15 Aug 2025
TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs
Kejia Zhang
Keda Tao
Zhiming Luo
Chang Liu
Jiasheng Tang
Huan Wang
370
0
0
29 Jul 2025
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
Yana Wei
Liang Zhao
Jianjian Sun
Kangheng Lin
Jisheng Yin
...
Qi Han
Zheng Ge
Xiangyu Zhang
Daxin Jiang
Vishal M. Patel
OffRL, ReLM, LRM, VLM
285
22
0
07 Jul 2025
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs
Yifan Shen
Yuanzhe Liu
Jingyuan Zhu
Xu Cao
Xiaofeng Zhang
Yixiao He
Wenming Ye
James M. Rehg
Ismini Lourentzou
LRM
239
19
0
26 Jun 2025
LEO-VL: Efficient Scene Representation for Scalable 3D Vision-Language Learning
J. Huang
Xiaojian Ma
Xiongkun Linghu
Yue Fan
Junchao He
Wenxin Tan
Qing Li
Song-Chun Zhu
Yixin Chen
CoGe
334
5
0
11 Jun 2025
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning
Yiqing Liang
Jielin Qiu
Wenhao Ding
Zuxin Liu
James Tompkin
Mengdi Xu
Mengzhou Xia
Zhengzhong Tu
Laixi Shi
Jiacheng Zhu
OffRL
529
20
0
30 May 2025
Diagnosing and Mitigating Modality Interference in Multimodal Large Language Models
Rui Cai
Bangzheng Li
Xiaofei Wen
Muhao Chen
Zhe Zhao
293
0
0
26 May 2025
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
Jingjing Jiang
Chongjie Si
Jun Luo
Hanwang Zhang
Chao Ma
939
8
0
23 May 2025
OViP: Online Vision-Language Preference Learning for VLM Hallucination
Shujun Liu
Siyuan Wang
Zejun Li
Jianxiang Wang
Cheng Zeng
Zhongyu Wei
MLLM, VLM
366
0
0
21 May 2025
VideoPASTA: 7K Preference Pairs That Matter for Video-LLM Alignment
Yogesh Kulkarni
Pooyan Fazli
617
5
0
18 Apr 2025
Perception-R1: Pioneering Perception Policy with Reinforcement Learning
En Yu
Kangheng Lin
Liang Zhao
Jisheng Yin
Yana Wei
...
Zheng Ge
Xiangyu Zhang
Daxin Jiang
Jingyu Wang
Wenbing Tao
VLM, OffRL, LRM
431
85
0
10 Apr 2025
PaMi-VDPO: Mitigating Video Hallucinations by Prompt-Aware Multi-Instance Video Preference Learning
Xinpeng Ding
Jianchao Tan
Jianhua Han
Lanqing Hong
Hang Xu
Xuelong Li
MLLM, VLM
1.2K
8
0
08 Apr 2025
Text Speaks Louder than Vision: ASCII Art Reveals Textual Biases in Vision-Language Models
Zhaochen Wang
Yujun Cai
Zi Huang
Bryan Hooi
Yiwei Wang
Ming Yang
CoGe, VLM
436
5
0
02 Apr 2025
Aligning Multimodal LLM with Human Preference: A Survey
Tao Yu
Yujiao Shi
Chaoyou Fu
Junkang Wu
Jinda Lu
...
Qingsong Wen
Zheng Zhang
Yan Huang
Liang Wang
Tieniu Tan
888
15
0
18 Mar 2025
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Computer Vision and Pattern Recognition (CVPR), 2025
Wei Suo
Lijun Zhang
Mengyang Sun
Lin Yuanbo Wu
Peng Wang
Yujiao Shi
MLLM, VLM
338
21
0
01 Mar 2025
MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization
Kangyu Zhu
Peng Xia
Yun Li
Hongtu Zhu
Sheng Wang
Huaxiu Yao
714
22
0
09 Dec 2024
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Songtao Jiang
Yan Zhang
Ruizhe Chen
Yeying Jin
Zuozhu Liu
Qinglin He
Yang Feng
Jian Wu
MoE, MLLM
395
23
0
20 Oct 2024
From Introspection to Best Practices: Principled Analysis of Demonstrations in Multimodal In-Context Learning
Nan Xu
Fei Wang
Sheng Zhang
Hoifung Poon
Muhao Chen
408
10
0
01 Jul 2024
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback
Wenyi Xiao
Ziwei Huang
Yaoyao Yu
Wanggui He
Haoyuan Li
Zhelun Yu
Hao Jiang
Leilei Gan
Linchao Zhu
MLLM
409
69
0
22 Apr 2024
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
Liqiang Jing
Xinya Du
453
33
0
07 Apr 2024
Self-Rewarding Language Models
Weizhe Yuan
Richard Yuanzhe Pang
Kyunghyun Cho
Xian Li
Sainbayar Sukhbaatar
Jing Xu
Jason Weston
ReLM, SyDa, ALM, LRM
988
540
0
18 Jan 2024