Hallucination of Multimodal Large Language Models: A Survey

29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM, LRM

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 334 papers shown
Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction
Yangui Fang
Baixu Cheng
Jing Peng
Xu Li
Yu Xi
Chengwei Zhang
Guohui Zhong
320
5
0
24 Dec 2025
Drifting Away from Truth: GenAI-Driven News Diversity Challenges LVLM-Based Misinformation Detection
Fanxiao Li
Jiaying Wu
Tingchao Fu
Yunyun Dong
Bingbing Song
Wei Zhou
232
2
0
24 Dec 2025
Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation
Subin Kim
Sangwoo Mo
Mamshad Nayeem Rizve
Yiran Xu
Difan Liu
Jinwoo Shin
Tobias Hinz
LRM
192
0
0
03 Dec 2025
Debate with Images: Detecting Deceptive Behaviors in Multimodal Large Language Models
Sitong Fang
Shiyi Hou
Kaile Wang
Boyuan Chen
Donghai Hong
Jiayi Zhou
Josef Dai
Yaodong Yang
Jiaming Ji
AAML
187
0
0
29 Nov 2025
TrafficLens: Multi-Camera Traffic Video Analysis Using LLMs
Md. Adnan Arefeen
Biplob K. Debnath
S. Chakradhar
358
0
0
26 Nov 2025
VeriSciQA: An Auto-Verified Dataset for Scientific Visual Question Answering
Yuyi Li
Daoyuan Chen
Zhen Wang
Yutong Lu
Yaliang Li
143
0
0
25 Nov 2025
Beyond Words and Pixels: A Benchmark for Implicit World Knowledge Reasoning in Generative Models
Tianyang Han
Junhao Su
J. Hu
Peizhen Yang
Hengyu Shi
Junfeng Luo
Jialin Gao
EGVM, VGen
481
0
0
23 Nov 2025
ARIAL: An Agentic Framework for Document VQA with Precise Answer Localization
Ahmad Mohammadshirazi
Pinaki Prasad Guha Neogi
Dheeraj Kulshrestha
R. Ramnath
121
0
0
22 Nov 2025
V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models
Yang Luo
Xuanlei Zhao
Baijiong Lin
Lingting Zhu
Liyao Tang
Y. Liu
Ying Chen
Shengju Qian
Xin Wang
Yang You
174
2
0
20 Nov 2025
Dual-LoRA and Quality-Enhanced Pseudo Replay for Multimodal Continual Food Learning
Xinlan Wu
B. Zhu
Feng Han
Pengkun Jiao
Jingjing Chen
CLL
248
0
0
17 Nov 2025
What Color Is It? A Text-Interference Multimodal Hallucination Benchmark
Jinkun Zhao
Lei Huang
Haixin Ge
Wenjun Wu
VLM
243
1
0
17 Nov 2025
Suppressing VLM Hallucinations with Spectral Representation Filtering
Ameen Ali
Tamim Zoabi
Lior Wolf
145
0
0
15 Nov 2025
An Analysis of Architectural Impact on LLM-based Abstract Visual Reasoning: A Systematic Benchmark on RAVEN-FAIR
Sinan Urgun
Seçkin Arı
61
0
0
14 Nov 2025
A Low-Rank Method for Vision Language Model Hallucination Mitigation in Autonomous Driving
Keke Long
Jiacheng Guo
Tianyun Zhang
Hongkai Yu
Xiaopeng Li
93
1
0
09 Nov 2025
Role-SynthCLIP: A Role Play Driven Diverse Synthetic Data Approach
Yuanxiang Huangfu
Chaochao Wang
Weilei Wang
CLIP, VLM
112
0
0
07 Nov 2025
Mitigating Hallucination in Large Language Models (LLMs): An Application-Oriented Survey on RAG, Reasoning, and Agentic Systems
Yihan Li
Xiyuan Fu
Ghanshyam Verma
P. Buitelaar
Mingming Liu
LRM
191
1
0
28 Oct 2025
MAD-Fact: A Multi-Agent Debate Framework for Long-Form Factuality Evaluation in LLMs
Yucheng Ning
Xixun Lin
Fang Fang
Yanan Cao
HILM
313
0
0
27 Oct 2025
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model
Yatai Ji
Teng Wang
Yuying Ge
Zhiheng Liu
Sidi Yang
Y. Shan
Ping Luo
DiffM, VLM
171
1
0
22 Oct 2025
PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning
Fengyuan Sun
Hui Chen
Xinhao Xu
Dandan Zheng
Jingdong Chen
Jun Zhou
Jungong Han
Guiguang Ding
VLM
120
0
0
22 Oct 2025
Beyond Single Models: Mitigating Multimodal Hallucinations via Adaptive Token Ensemble Decoding
Jinlin Li
Y. X. R. Wang
Yifei Yuan
Xiao Zhou
Y. Zhang
Xixian Yong
Yefeng Zheng
X. Wu
MLLM
151
0
0
21 Oct 2025
Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents
Guangfu Guo
Xiaoqian Lu
Yue Feng
LRM
180
1
0
21 Oct 2025
Token-Level Inference-Time Alignment for Vision-Language Models
Kejia Chen
Jiawen Zhang
Jiacong Hu
Kewei Gao
Jian Lou
Zunlei Feng
Mingli Song
MLLM, VLM
277
0
0
20 Oct 2025
Hallucination Benchmark for Speech Foundation Models
Alkis Koudounas
Moreno La Quatra
Manuel Giollo
Sabato Marco Siniscalchi
Elena Baralis
HILM
238
1
0
18 Oct 2025
Spatial Preference Rewarding for MLLMs Spatial Understanding
Han Qiu
Peng Gao
Lewei Lu
Xiaoqin Zhang
Ling Shao
Shijian Lu
LRM
134
0
0
16 Oct 2025
Mitigating Hallucination in Multimodal Reasoning via Functional Attention Control
H. Lu
Bolun Chu
Weiye Fu
Guoshun Nan
Junning Liu
Minghui Pan
Qiankun Li
Yi Yu
Hua Wang
Kun Wang
LRM
135
0
0
11 Oct 2025
Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing
Zhentao Zou
Zhengrong Yue
Kunpeng Du
Binlei Bao
Hanting Li
...
Yue Zhou
Yali Wang
Jie Hu
Xue Jiang
X. Chen
LRM
180
0
0
09 Oct 2025
ChainMPQ: Interleaved Text-Image Reasoning Chains for Mitigating Relation Hallucinations
Yike Wu
Yiwei Wang
Yujun Cai
LRM
119
0
0
07 Oct 2025
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning
M. Luo
Zihui Xue
Alex Dimakis
Kristen Grauman
VGen, LRM
271
4
0
07 Oct 2025
CoDA: Agentic Systems for Collaborative Data Visualization
Zichen Chen
Jiefeng Chen
Sercan O. Arik
Misha Sra
Tomas Pfister
Jinsung Yoon
102
2
0
03 Oct 2025
RefineShot: Rethinking Cinematography Understanding with Foundational Skill Evaluation
Hang Wu
Yujun Cai
Haonan Ge
H. Chen
Ming-Hsuan Yang
Yiwei Wang
CoGe
175
0
0
02 Oct 2025
MedMMV: A Controllable Multimodal Multi-Agent Framework for Reliable and Verifiable Clinical Reasoning
Hongjun Liu
Yinghao Zhu
Y Samuel Wang
Yitao Long
Zeyu Lai
Lequan Yu
Chen Zhao
LRM
168
2
0
29 Sep 2025
DocPruner: A Storage-Efficient Framework for Multi-Vector Visual Document Retrieval via Adaptive Patch-Level Embedding Pruning
Yibo Yan
Guangwei Xu
Xin Zou
Shuliang Liu
James Kwok
Xuming Hu
189
5
0
28 Sep 2025
Exposing Hallucinations To Suppress Them: VLMs Representation Editing With Generative Anchors
Youxu Shi
Suorong Yang
Dong Liu
MLLM, VLM
145
1
0
26 Sep 2025
Customizing Visual Emotion Evaluation for MLLMs: An Open-vocabulary, Multifaceted, and Scalable Approach
Daiqing Wu
Dongbao Yang
Sicheng Zhao
Can Ma
Can Ma
MLLM
152
1
0
26 Sep 2025
From Superficial Outputs to Superficial Learning: Risks of Large Language Models in Education
Iris Delikoura
Yi R. Fung
AI4Ed
403
3
0
26 Sep 2025
Hallucination as an Upper Bound: A New Perspective on Text-to-Image Evaluation
S. Kasaei
M. Rohban
EGVM, VLM, LRM
300
0
0
25 Sep 2025
Are Hallucinations Bad Estimations?
Hude Liu
Jerry Yao-Chieh Hu
Jennifer Yuntong Zhang
Zhao Song
Han Liu
HILM
161
0
0
25 Sep 2025
Revealing Multimodal Causality with Large Language Models
Jin Li
Shoujin Wang
Qi Zhang
Feng Liu
Tongliang Liu
LongBing Cao
Shui Yu
F. Chen
188
0
0
22 Sep 2025
Losing the Plot: How VLM responses degrade on imperfect charts
P. W. Shin
Jack Sampson
Vijaykrishnan Narayanan
Andres Marquez
Mahantesh Halappanavar
102
0
0
22 Sep 2025
WISE: Weak-Supervision-Guided Step-by-Step Explanations for Multimodal LLMs in Image Classification
Yiwen Jiang
Deval Mehta
Siyuan Yan
Yaling Shen
Z. Wang
Zongyuan Ge
LRM
126
1
0
22 Sep 2025
ChartHal: A Fine-grained Framework Evaluating Hallucination of Large Vision Language Models in Chart Understanding
Xingqi Wang
Yiming Cui
Xin Yao
Shijin Wang
Guoping Hu
Xiaoyu Qin
LRM
124
0
0
22 Sep 2025
Beyond Spurious Signals: Debiasing Multimodal Large Language Models via Counterfactual Inference and Adaptive Expert Routing
Zichen Wu
Hsiu-Yuan Huang
Yunfang Wu
MoE
98
1
0
18 Sep 2025
ORCA: Agentic Reasoning For Hallucination and Adversarial Robustness in Vision-Language Models
Chung-En Yu
Hsuan-Chih Chen
Brian Jalaian
Nathaniel D. Bastian
AAML, VLM, LRM
146
0
0
18 Sep 2025
EdiVal-Agent: An Object-Centric Framework for Automated, Fine-Grained Evaluation of Multi-Turn Editing
Tianyu Chen
Yasi Zhang
Zhi Zhang
Peiyu Yu
Shu Wang
...
Jianwen Xie
Oscar Leong
L. xilinx Wang
Ying Nian Wu
Mingyuan Zhou
143
0
0
16 Sep 2025
HARMONIC: A Content-Centric Cognitive Robotic Architecture
Sanjay Oruganti
S. Nirenburg
M. McShane
Jesse English
Michael K. Roberts
Christian Arndt
Carlos Gonzalez
Mingyo Seo
Luis Sentis
77
1
0
16 Sep 2025
FineQuest: Adaptive Knowledge-Assisted Sports Video Understanding via Agent-of-Thoughts Reasoning
Haodong Chen
Haojian Huang
XinXiang Yin
Dian Shao
LRM
175
2
0
15 Sep 2025
Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding
Meng Luo
Shengqiong Wu
Liqiang Jing
Tianjie Ju
Li Zheng
...
Jiebo Luo
William Yang Wang
Hao Fei
Yang Deng
Wynne Hsu
169
1
0
15 Sep 2025
OmniDPO: A Preference Optimization Framework to Address Omni-Modal Hallucination
Junzhe Chen
Tianshu Zhang
Shiyu Huang
Yuwei Niu
Chao Sun
Rongzhou Zhang
G. Zhou
Lijie Wen
Xuming Hu
MLLM
185
0
0
31 Aug 2025
MM-SeR: Multimodal Self-Refinement for Lightweight Image Captioning
Junha Song
Yongsik Jo
So Yeon Min
Quanting Xie
Taehwan Kim
Yonatan Bisk
Jaegul Choo
VLM
212
0
0
29 Aug 2025
GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity
Seongheon Park
Yixuan Li
148
1
0
27 Aug 2025