Hallucination of Multimodal Large Language Models: A Survey

29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM, LRM
arXiv:2404.18930

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 334 papers shown
Mitigating Hallucinations in Multimodal LLMs via Object-aware Preference Optimization
Alberto Compagnoni
Davide Caffagni
Nicholas Moratelli
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
MLLM
208
1
0
27 Aug 2025
Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMs
Zhikai Ding
Shiyu Ni
Keping Bi
81
1
0
26 Aug 2025
Hierarchical Contextual Grounding LVLM: Enhancing Fine-Grained Visual-Language Understanding with Robust Grounding
Leilei Guo
Antonio Carlos Rivera
Peiyu Tang
Haoxuan Ren
Zheyu Song
171
1
0
23 Aug 2025
Learning to Steer: Input-dependent Steering for Multimodal LLMs
Jayneel Parekh
Pegah Khayatan
Mustafa Shukor
Arnaud Dapogny
A. Newson
Matthieu Cord
LLMSV
382
2
0
18 Aug 2025
Controlling Multimodal LLMs via Reward-guided Decoding
Oscar Mañas
Pierluca D'Oro
Koustuv Sinha
Adriana Romero Soriano
M. Drozdzal
Aishwarya Agrawal
MLLM
137
0
0
15 Aug 2025
Empowering Multimodal LLMs with External Tools: A Comprehensive Survey
Wenbin An
Jiahao Nie
Yaqiang Wu
Feng Tian
Shijian Lu
Q. Zheng
MLLM
182
1
0
14 Aug 2025
Exploring Causal Effect of Social Bias on Faithfulness Hallucinations in Large Language Models
Zhenliang Zhang
Junzhe Zhang
Xinyu Hu
Huixuan Zhang
Xiaojun Wan
HILM
170
0
0
11 Aug 2025
A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models
Wenkai Wang
Hongcan Guo
Zheqi Lv
Shengyu Zhang
93
0
0
05 Aug 2025
CAP-LLM: Context-Augmented Personalized Large Language Models for News Headline Generation
Raymond Wilson
Cole Graham
Chase Carter
Zefeng Yang
Ruiqi Gu
111
0
0
05 Aug 2025
MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions
Yanxu Zhu
Shitong Duan
Xiangxu Zhang
Jitao Sang
Peng Zhang
Tun Lu
Xiao Zhou
Jing Yao
Xiaoyuan Yi
Xing Xie
173
0
0
29 Jul 2025
TARS: MinMax Token-Adaptive Preference Strategy for MLLM Hallucination Reduction
Kejia Zhang
Keda Tao
Zhiming Luo
Chang Liu
Jiasheng Tang
Huan Wang
LRM
286
0
0
29 Jul 2025
Zero-shot Performance of Generative AI in Brazilian Portuguese Medical Exam
C. Truyts
Amanda Gomes Rabelo
Gabriel Mesquita de Souza
Daniel Scaldaferri Lages
Adriano Jose Pereira
Uri Adrian Prync Flato
E. Reis
Joaquim Edson Vieira
Paulo Sergio Panse Silveira
Edson Amaro Junior
LM&MA, ELM
105
0
0
26 Jul 2025
OW-CLIP: Data-Efficient Visual Supervision for Open-World Object Detection via Human-AI Collaboration
Junwen Duan
Wei Xue
Ziyao Kang
Shixia Liu
Jiazhi Xia
VLM
167
0
0
26 Jul 2025
OVFact: Measuring and Improving Open-Vocabulary Factuality for Long Caption Models
Monika Wysoczańska
Shyamal Buch
Anurag Arnab
Cordelia Schmid
HILM
187
0
0
25 Jul 2025
A Survey of Multimodal Hallucination Evaluation and Detection
Zhiyuan Chen
Yuecong Min
Jie M. Zhang
Bei Yan
Jiahao Wang
X. Wang
Shiguang Shan
HILM
359
5
0
25 Jul 2025
Extracting Visual Facts from Intermediate Layers for Mitigating Hallucinations in Multimodal Large Language Models
Haoran Zhou
Zihan Zhang
Hao Chen
156
0
0
21 Jul 2025
Mitigating Object Hallucinations via Sentence-Level Early Intervention
Shangpin Peng
Senqiao Yang
Li Jiang
Zhuotao Tian
MLLM
243
5
0
16 Jul 2025
ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way
Rajarshi Roy
Devleena Das
A. Banerjee
Arjya Bhattacharjee
Kousik Dasgupta
Subarna Tripathi
VLM
245
1
0
11 Jul 2025
Identify, Isolate, and Purge: Mitigating Hallucinations in LVLMs via Self-Evolving Distillation
Wenhao Li
Xiu Su
Jingyi Wu
Feng Yang
Yang-Yang Liu
Yi-Ling Chen
Shan You
Chang Xu
VLM
232
0
0
07 Jul 2025
Loss-Oriented Ranking for Automated Visual Prompting in LVLMs
Yuan Zhang
Chun-Kai Fan
Tao Huang
Ming Lu
Sicheng Yu
Junwen Pan
Kuan Cheng
Qi She
Shanghang Zhang
VLM, LRM
246
2
0
19 Jun 2025
Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning
Ankan Deria
Adinath Madhavrao Dukre
Feilong Tang
Sara Atito
Sudipta Roy
Muhammad Awais
Muhammad Haris Khan
Imran Razzak
VLM
283
0
0
18 Jun 2025
HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language Models
Trishna Chakraborty
Udita Ghosh
Xiaopan Zhang
Fahim Faisal Niloy
Yue Dong
Jiachen Li
Amit K. Roy-Chowdhury
Chengyu Song
LLMAG, HILM, LRM
251
3
0
18 Jun 2025
ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM
Yujun Wang
Aniri
Jinhe Bi
Soeren Pirk
Yunpu Ma
MLLM
363
11
0
17 Jun 2025
From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models
Xinyang Li
Siqi Liu
Bochao Zou
Jiansheng Chen
Huimin Ma
215
2
0
17 Jun 2025
VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?
Jiachen Yu
Yufei Zhan
Ziheng Wu
Yousong Zhu
Jinqiao Wang
Minghui Qiu
VLM, LRM
217
4
0
13 Jun 2025
SECOND: Mitigating Perceptual Hallucination in Vision-Language Models via Selective and Contrastive Decoding
Woohyeon Park
Woojin Kim
Jaeik Kim
Jaeyoung Do
VLM
162
11
0
10 Jun 2025
Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models
Ruiyang Zhang
Hu Zhang
Hao Fei
Zhedong Zheng
UQCV
275
0
0
09 Jun 2025
HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains
Shijie Wang
Yilun Zhang
Zeyu Lai
Dexing Kong
226
0
0
09 Jun 2025
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Tianyi Bai
Yuxuan Fan
Jiantao Qiu
Fupeng Sun
Jiayi Song
Junlin Han
Zichen Liu
Conghui He
Wentao Zhang
Binhang Yuan
MLLM, VLM
277
2
0
08 Jun 2025
Ignoring Directionality Leads to Compromised Graph Neural Network Explanations
Changsheng Sun
Xinke Li
Jin Song Dong
AAML
302
7
0
05 Jun 2025
CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yang Tian
Fan Liu
Jingyuan Zhang
Victoria A. Webster-Wood
Yupeng Hu
Liqiang Nie
VLM
244
7
0
03 Jun 2025
V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving
Xuewen Luo
Fengze Yang
Fan Ding
Xiangbo Gao
Shuo Xing
Yang Zhou
Zhengzhong Tu
Chenxi Liu
LRM
304
13
0
03 Jun 2025
CLAIM: Mitigating Multilingual Object Hallucination in Large Vision-Language Models with Cross-Lingual Attention Intervention
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zekai Ye
Qiming Li
Xiaocheng Feng
L. Qin
Yichong Huang
...
Zhirui Zhang
Yunfei Lu
Duyu Tang
Dandan Tu
Bing Qin
VLM, LRM
150
10
0
03 Jun 2025
Preemptive Hallucination Reduction: An Input-Level Approach for Multimodal Language Model
Nokimul Hasan Arif
Shadman Rabby
Md Hefzul Hossain Papon
Sabbir Ahmed
MLLM, VLM
337
0
0
29 May 2025
MMBoundary: Advancing MLLM Knowledge Boundary Awareness through Reasoning Step Confidence Calibration
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zhitao He
Sandeep Polisetty
Zhiyuan Fan
Yuchen Huang
Shujin Wu
Yi R.
LRM
466
11
0
29 May 2025
mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation
Chan-wei Hu
Yueqi Wang
Shuo Xing
Chia-Ju Chen
Ryan Rossi
Zhengzhong Tu
3DV
359
2
0
29 May 2025
Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning
Chunyi Peng
Zhipeng Xu
Zhenghao Liu
Yishan Li
Shi Yu
...
Zhiyuan Liu
Yu Gu
Minghe Yu
Ge Yu
Maosong Sun
LRM
262
3
0
28 May 2025
RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction
Yuchi Wang
Yishuo Cai
Shuhuai Ren
Sihan Yang
Linli Yao
Yuanxin Liu
Y. Zhang
Pengfei Wan
Xu Sun
VLM
178
1
0
28 May 2025
D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples
Zijing Hu
Tai-wei Chang
Kun Kuang
277
9
0
28 May 2025
Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration
Mehrdad Fazli
Bowen Wei
Ahmet Sari
Ziwei Zhu
VLM
474
3
0
27 May 2025
The Mirage of Multimodality: Where Truth is Tested and Honesty Unravels
Jiaming Ji
Sitong Fang
Wenjing Cao
Jiahao Li
Xuyao Wang
Juntao Dai
Chi-Min Chan
Sirui Han
Wenhan Luo
Yaodong Yang
LRM
204
0
0
26 May 2025
Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models
Xinmiao Hu
C. Wang
Ruihe An
ChenYu Shao
Xiaojun Ye
Sheng Zhou
Liangcheng Li
MLLM, LRM
286
2
0
26 May 2025
ChartLens: Fine-grained Visual Attribution in Charts
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Manan Suri
Puneet Mathur
Nedim Lipka
Franck Dernoncourt
Ryan Rossi
Dinesh Manocha
212
1
0
25 May 2025
CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yongheng Zhang
Xu Liu
Ruoxi Zhou
Qiguang Chen
Hao Fei
Wenpeng Lu
L. Qin
HILM, LRM
223
7
0
25 May 2025
REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing
Weihan Xu
Yimeng Ma
Jingyue Huang
Yang Li
Wenye Ma
Taylor Berg-Kirkpatrick
Julian McAuley
Paul Pu Liang
Hao-Wen Dong
DiffM, VGen
340
1
0
24 May 2025
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
Sicheng Feng
Song Wang
Shuyi Ouyang
Lingdong Kong
Zikai Song
Jianke Zhu
Huan Wang
Xinchao Wang
LRM
381
10
0
24 May 2025
EVADE-Bench: Multimodal Benchmark for Evasive Content Detection in E-Commerce Applications
Ancheng Xu
Zhihao Yang
Junlin Li
Guanghu Yuan
Longze Chen
...
Jiehui Zhou
Zhen Qin
Hengyun Chang
Hamid Alinejad-Rokny
Bo Zheng
AAML
280
1
0
23 May 2025
Analyzing Fine-Grained Alignment and Enhancing Vision Understanding in Multimodal Language Models
Jiachen Jiang
Jinxin Zhou
Bo Peng
Xia Ning
Zhihui Zhu
282
1
0
22 May 2025
MMaDA: Multimodal Large Diffusion Language Models
Ling Yang
Ye Tian
Bowen Li
Xinchen Zhang
Ke Shen
Yunhai Tong
Mengdi Wang
VLM, LRM
503
111
0
21 May 2025
Learning Interpretable Representations Leads to Semantically Faithful EEG-to-Text Generation
Xiaozhao Liu
Dinggang Shen
Xihui Liu
322
1
0
21 May 2025