Hallucination of Multimodal Large Language Models: A Survey

29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM, LRM

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 334 papers shown
OViP: Online Vision-Language Preference Learning for VLM Hallucination
Shujun Liu
Siyuan Wang
Zejun Li
Jianxiang Wang
Cheng Zeng
Zhongyu Wei
MLLM, VLM
317
0
0
21 May 2025
Incentivizing Truthful Language Models via Peer Elicitation Games
Baiting Chen
Tong Zhu
Jiale Han
Lexin Li
Gang Li
Xiaowu Dai
368
1
0
19 May 2025
Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xinlong Chen
Yuanxing Zhang
Sihan Yang
Junfei Wu
Fuzheng Zhang
Tieniu Tan
MLLM
360
1
0
17 May 2025
Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Bidur Khanal
Sandesh Pokhrel
Sanjay Bhandari
Ramesh Rana
Nikesh Shrestha
Ram Bahadur Gurung
Cristian A. Linte
Angus Watson
Yash Raj Shrestha
Binod Bhattarai
VLM
317
5
0
11 May 2025
Perceiving Beyond Language Priors: Enhancing Visual Comprehension and Attention in Multimodal Models
Aarti Ghatkesar
Uddeshya Upadhyay
VLM
390
1
0
08 May 2025
Characterizing the Robustness of Black-Box LLM Planners Under Perturbed Observations with Adaptive Stress Testing
Neeloy Chakraborty
John Pohovey
Melkior Ornik
Katherine Driggs-Campbell
360
0
0
08 May 2025
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li
Xiyang Wu
Guangyao Shi
Yubin Qin
Hongyang Du
Tianyi Zhou
Wanrong Zhu
Dinesh Manocha
Jordan Lee Boyd-Graber
MLLM
649
0
0
02 May 2025
Multimodal Large Language Models for Medicine: A Comprehensive Survey
Jiarui Ye
Hao Tang
LM&MA
501
14
0
29 Apr 2025
HyPerAlign: Interpretable Personalized LLM Alignment via Hypothesis Generation
Cristina Garbacea
Chenhao Tan
444
0
0
29 Apr 2025
TRACE: Textual Relevance Augmentation and Contextual Encoding for Multimodal Hate Detection
Girish A. Koushik
Helen Treharne
Aditya Joshi
VLM
339
2
0
24 Apr 2025
Multimodal Large Language Models for Enhanced Traffic Safety: A Comprehensive Review and Future Trends
M. Tami
Mohammed Elhenawy
Huthaifa I. Ashqar
334
1
0
21 Apr 2025
Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models
Chung-En
Hsuan-Chih Chen
Brian Jalaian
Nathaniel D. Bastian
AAML, VLM
293
1
0
19 Apr 2025
Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training
Xinsong Zhang
Yarong Zeng
Xinting Huang
Hu Hu
Runquan Xie
Han Hu
Zhanhui Kang
MLLM, VLM
515
5
0
17 Apr 2025
AeroLite: Tag-Guided Lightweight Generation of Aerial Image Captions
Xing Zi
Tengjun Ni
Xianjing Fan
Xian Tao
Jun Li
Ali Braytee
Mukesh Prasad
161
0
0
13 Apr 2025
Data Metabolism: An Efficient Data Design Schema For Vision Language Model
Jingyuan Zhang
Hongzhi Zhang
Zhou Haonan
Chenxi Sun
Xingguang Ji
Jiakang Wang
Fanheng Kong
Wenshu Fan
Qi Wang
Fuzheng Zhang
VLM
386
2
0
10 Apr 2025
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Computer Vision and Pattern Recognition (CVPR), 2025
Ruotian Peng
Haiying He
Yake Wei
Yandong Wen
D. Hu
VLM
209
0
0
09 Apr 2025
Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
Wei Chen
Xin Yan
Bin Wen
Fan Yang
Yan Li
Di Zhang
Long Chen
MLLM
459
0
0
09 Apr 2025
Explaining Low Perception Model Competency with High-Competency Counterfactuals
Sara Pohland
Claire Tomlin
DiffM, AAML
293
0
0
07 Apr 2025
TARAC: Mitigating Hallucination in LVLMs via Temporal Attention Real-time Accumulative Connection
C. Xie
Tongxuan Liu
Lei Jiang
Yuting Zeng
Jinpei Guo
Yunheng Shen
Weizhe Huang
Jing Li
Xiaohua Xu
VLM
238
6
0
05 Apr 2025
Towards Trustworthy GUI Agents: A Survey
Yucheng Shi
Wenhao Yu
Wenlin Yao
Wenhu Chen
Ninghao Liu
291
18
0
30 Mar 2025
AutoComPose: Automatic Generation of Pose Transition Descriptions for Composed Pose Retrieval Using Multimodal LLMs
Yi-Ting Shen
Sungmin Eum
Doheon Lee
Rohit Shete
Chiao-Yi Wang
H. Kwon
Shuvra S. Bhattacharyya
370
0
0
28 Mar 2025
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation
Hongcheng Gao
Jiashu Qu
Jingyi Tang
Baolong Bi
Yi Liu
Hongyu Chen
Li Liang
Li Su
Qingming Huang
MLLM, VLM, LRM
438
13
0
25 Mar 2025
Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations
Shuo Li
Jiajun Sun
Guodong Zheng
Xiaoran Fan
Yujiong Shen
...
Wenming Tan
Changzhi Sun
Tao Gui
Qi Zhang
AAML, VLM
388
4
0
19 Mar 2025
Do Multimodal Large Language Models Understand Welding?
Information Fusion (Inf. Fusion), 2025
Grigorii Khvatskii
Yong Suk Lee
Corey Angst
Maria Gibbs
Robert Landers
Nitesh Chawla
AI4CE
229
3
0
18 Mar 2025
Can Large Vision Language Models Read Maps Like a Human?
Shuo Xing
Zezhou Sun
Shuangyu Xie
Kaiyuan Chen
Yanjia Huang
Yuping Wang
Jiachen Li
Dezhen Song
Zhengzhong Tu
391
20
0
18 Mar 2025
Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence
Sophia Hager
David Mueller
Kevin Duh
Nicholas Andrews
464
5
0
18 Mar 2025
RAG-KG-IL: A Multi-Agent Hybrid Framework for Reducing Hallucinations and Enhancing LLM Reasoning through RAG and Incremental Knowledge Graph Learning Integration
Hong Qing Yu
Frank McQuade
282
7
0
14 Mar 2025
Taxonomic Reasoning for Rare Arthropods: Combining Dense Image Captioning and RAG for Interpretable Classification
Nathaniel Lesperance
S. Ratnasingham
Graham W. Taylor
VLM
322
0
0
13 Mar 2025
ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content
Bhavik Chandna
Mariam Aboujenane
Usman Naseem
276
1
0
13 Mar 2025
Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation
Beitao Chen
Xinyu Lyu
Lianli Gao
Jingkuan Song
Mengqi Li
544
3
0
11 Mar 2025
Hallucinatory Image Tokens: A Training-free EAZY Approach on Detecting and Mitigating Object Hallucinations in LVLMs
Liwei Che
Tony Qingze Liu
Jing Jia
Weiyi Qin
Ruixiang Tang
Vladimir Pavlovic
MLLM, VLM
411
2
0
10 Mar 2025
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
International Conference on Learning Representations (ICLR), 2025
Cong Chen
Mingyu Liu
Chenchen Jing
Y. Zhou
Fengyun Rao
Hao Chen
Bo Zhang
Chunhua Shen
MLLM, AAML, VLM
293
25
0
09 Mar 2025
TPC: Cross-Temporal Prediction Connection for Vision-Language Model Hallucination Reduction
Chao Wang
Weiwei Fu
Yang Zhou
MLLM, VLM
348
3
0
06 Mar 2025
MCiteBench: A Multimodal Benchmark for Generating Text with Citations
Caiyu Hu
Yikai Zhang
Tinghui Zhu
Yiwei Ye
Yanghua Xiao
456
0
0
04 Mar 2025
Evaluating and Predicting Distorted Human Body Parts for Generated Images
Lu Ma
Kaibo Cao
Hao Liang
Jiaxin Lin
Zhiyu Li
Yuhong Liu
Jihong Zhang
Wentao Zhang
Tengjiao Wang
MedIm
345
1
0
02 Mar 2025
HalCECE: A Framework for Explainable Hallucination Detection through Conceptual Counterfactuals in Image Captioning
Maria Lymperaiou
Giorgos Filandrianos
Angeliki Dimitriou
Athanasios Voulodimos
Giorgos Stamou
MLLM
210
0
0
01 Mar 2025
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Computer Vision and Pattern Recognition (CVPR), 2025
Wei Suo
Lijun Zhang
Mengyang Sun
Lin Yuanbo Wu
Peng Wang
Yujiao Shi
MLLM, VLM
295
15
0
01 Mar 2025
Towards Statistical Factuality Guarantee for Large Vision-Language Models
Hao Sun
Chao Yan
Nicholas J. Jackson
Wendi Cui
B. Li
Jiaxin Zhang
Sricharan Kumar
347
0
0
27 Feb 2025
FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA
S M Sarwar
463
2
0
25 Feb 2025
Mitigating Hallucinations in Diffusion Models through Adaptive Attention Modulation
Trevine Oorloff
Yaser Yacoob
Abhinav Shrivastava
196
3
0
24 Feb 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Neural Information Processing Systems (NeurIPS), 2024
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
417
14
0
21 Feb 2025
Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Kyungmin Min
Minbeom Kim
Kang-il Lee
Dongryeol Lee
Kyomin Jung
MLLM
465
14
0
20 Feb 2025
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yue Yang
Ajay Patel
Matt Deitke
Tanmay Gupta
Luca Weihs
...
Mark Yatskar
Chris Callison-Burch
Ranjay Krishna
Aniruddha Kembhavi
Christopher Clark
SyDa
567
28
0
20 Feb 2025
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization
Shuo Xing
Peiran Li
Ruizheng Bai
Longji Xu
Chan-wei Hu
Chengxuan Qian
Huaxiu Yao
Zhengzhong Tu
520
20
0
18 Feb 2025
Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent
Junda Wu
Yuxin Xiong
Xintong Li
Yu Xia
Ruoyu Wang
...
Sungchul Kim
Ryan Rossi
Lina Yao
Jingbo Shang
Julian McAuley
CLL, VLM
346
2
0
17 Feb 2025
Valuable Hallucinations: Realizable Non-realistic Propositions
Qiucheng Chen
Bo Wang
LRM
309
2
0
16 Feb 2025
MRAMG-Bench: A Comprehensive Benchmark for Advancing Multimodal Retrieval-Augmented Multimodal Generation
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
Qinhan Yu
Zhiyou Xiao
Binghui Li
Zhengren Wang
Chong Chen
Feiyu Xiong
RALM, VLM
905
1
0
06 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Yibo Yan
Shen Wang
Jiahao Huo
Jingheng Ye
Zhendong Chu
Xuming Hu
Philip S. Yu
Daniel Schwalbe-Koda
B. Selman
Qingsong Wen
LRM
573
27
0
05 Feb 2025
The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering
Zhuowei Li
Haizhou Shi
Yunhe Gao
Di Liu
Zhenting Wang
Yuxiao Chen
Ting Liu
Long Zhao
Hao Wang
Dimitris N. Metaxas
MLLM
259
3
0
05 Feb 2025
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
H. Malik
Fahad Shamshad
Muzammal Naseer
Karthik Nandakumar
Fahad Shahbaz Khan
Salman Khan
AAML, MLLM, VLM
488
8
0
03 Feb 2025