Hallucination of Multimodal Large Language Models: A Survey

29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM, LRM

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 334 papers shown
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
H. Malik
Fahad Shamshad
Muzammal Naseer
Karthik Nandakumar
Fahad Shahbaz Khan
Salman Khan
AAML, MLLM, VLM
485
7
0
03 Feb 2025
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
Xiaokang Chen
Zhiyu Wu
Xingchao Liu
Zizheng Pan
Wen Liu
Zhenda Xie
X. Yu
Chong Ruan
AI4TS
538
469
0
29 Jan 2025
Learning to Summarize from LLM-generated Feedback
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Hwanjun Song
Taewon Yun
Yuho Lee
Jihwan Oh
Gihun Lee
Jason (Jinglun) Cai
Hang Su
367
16
0
28 Jan 2025
Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink
Yining Wang
Mi Zhang
Junjie Sun
Chenyue Wang
Min Yang
Hui Xue
Jialing Tao
Ranjie Duan
Qingbin Liu
246
6
0
28 Jan 2025
EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models
Andrés Villa
Juan Carlos León Alcázar
Motasem Alfarra
Vladimir Araujo
Alvaro Soto
Bernard Ghanem
VLM
107
6
0
06 Jan 2025
A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun
Wenbin An
Feng Tian
Fang Nan
Qidong Liu
Jing Liu
N. Shah
Ping Chen
387
20
0
18 Dec 2024
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Computer Vision and Pattern Recognition (CVPR), 2024
Le Yang
Ziwei Zheng
Boxu Chen
Subrat Kishore Dutta
Chenhao Lin
Chao Shen
VLM
585
22
0
18 Dec 2024
Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jinghan He
Kuan Zhu
Haiyun Guo
Cunchun Li
Zhenglin Hua
Yuheng Jia
Ming Tang
Tat-Seng Chua
Jinqiao Wang
VLM
376
16
0
18 Dec 2024
Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ido Cohen
Daniela Gottesman
Mor Geva
Raja Giryes
VLM
480
5
1
18 Dec 2024
Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning
AAAI Conference on Artificial Intelligence (AAAI), 2024
Shengqiong Wu
Hao Fei
Liangming Pan
William Yang Wang
Shuicheng Yan
Tat-Seng Chua
LRM
403
14
0
15 Dec 2024
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Z. F. Wu
Xiaokang Chen
Zizheng Pan
Xianglong Liu
Wen Liu
...
Xingkai Yu
Haowei Zhang
Bo Pan
Yijiao Wang
Chong Ruan
MLLM, VLM, MoE
467
400
0
13 Dec 2024
A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions
ACM Computing Surveys (ACM CSUR), 2024
Ola Shorinwa
Zhiting Mei
Justin Lidard
Allen Z. Ren
Anirudha Majumdar
HILM, LRM
432
19
0
07 Dec 2024
Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large Vision-Language Model via Causality Analysis
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Po-Hsuan Huang
Jeng-Lin Li
Chin-Po Chen
Ming-Ching Chang
Wei-Chao Chen
LRM
300
4
0
04 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
Shijie Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
426
51
0
03 Dec 2024
FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models
Computer Vision and Pattern Recognition (CVPR), 2024
Alice Heiman
Xiaoman Zhang
E. Chen
Sung Eun Kim
Pranav Rajpurkar
HILM, MedIm
658
5
0
27 Nov 2024
What's in the Image? A Deep-Dive into the Vision of Vision Language Models
Computer Vision and Pattern Recognition (CVPR), 2024
Omri Kaduri
Shai Bagon
Tali Dekel
VLM, CoGe
213
24
0
26 Nov 2024
Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
Lehan He
Zeren Chen
Zhelun Shi
Tianyu Yu
Jing Shao
Lu Sheng
MLLM
606
2
0
26 Nov 2024
VaLiD: Mitigating the Hallucination of Large Vision Language Models by Visual Layer Fusion Contrastive Decoding
Yuan Liu
Yifei Gao
Jitao Sang
MLLM
487
10
0
24 Nov 2024
ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2024
Junzhe Chen
Tianshu Zhang
Shijie Huang
Yuwei Niu
Linfeng Zhang
Lijie Wen
Xuming Hu
MLLM, VLM
1.1K
11
0
22 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM, VGen
445
14
0
22 Nov 2024
Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering
Zeping Yu
Sophia Ananiadou
1.1K
9
0
17 Nov 2024
Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination
Haojie Zheng
Tianyang Xu
Hanchi Sun
Shu Pu
Ruoxi Chen
Lichao Sun
MLLM, LRM
269
26
0
15 Nov 2024
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yuhan Fu
Ruobing Xie
Xingwu Sun
Zhanhui Kang
Xirong Li
MLLM
238
11
0
15 Nov 2024
Seeing Clearly by Layer Two: Enhancing Attention Heads to Alleviate Hallucination in LVLMs
Xiaofeng Zhang
Yihao Quan
Chaochen Gu
Chen Shen
Xiaosong Yuan
Shaotian Yan
Hao Cheng
Kaijie Wu
Jieping Ye
209
37
0
15 Nov 2024
V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yuxi Xie
Guanzhen Li
Xiao Xu
Min-Yen Kan
MLLM, VLM
213
46
0
05 Nov 2024
Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios
Yunkai Dang
Mengxi Gao
Yibo Yan
Xin Zou
Yanggan Gu
...
Jingyu Wang
Peijie Jiang
Aiwei Liu
Jia Liu
Xuming Hu
351
11
0
05 Nov 2024
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
International Conference on Learning Representations (ICLR), 2024
Yangning Li
Hai-Tao Zheng
Xinyu Wang
Yong Jiang
Zhen Zhang
...
Hui Wang
Hai-Tao Zheng
Pengjun Xie
Philip S. Yu
Fei Huang
645
53
0
05 Nov 2024
RadFlag: A Black-Box Hallucination Detection Method for Medical Vision Language Models
Serena Zhang
Siyang Song
Oishi Banerjee
J. N. Acosta
L. John Fahrner
Pranav Rajpurkar
VLM
251
5
0
01 Nov 2024
Driving by the Rules: A Benchmark for Integrating Traffic Sign Regulations into Vectorized HD Map
Computer Vision and Pattern Recognition (CVPR), 2024
Xinyuan Chang
Maixuan Xue
Xinran Liu
Zheng Pan
Xing Wei
614
7
0
31 Oct 2024
MAD-Sherlock: Multi-Agent Debate for Visual Misinformation Detection
Kumud Lakara
Juil Sock
Christian Rupprecht
Juil Sock
Philip Torr
John Collomosse
Christian Schroeder de Witt
276
0
0
26 Oct 2024
Mitigating Object Hallucination via Concentric Causal Attention
Neural Information Processing Systems (NeurIPS), 2024
Yun Xing
Yiheng Li
Ivan Laptev
Shijian Lu
277
40
0
21 Oct 2024
A Survey of Hallucination in Large Visual Language Models
Wei Lan
Wenyi Chen
Qingfeng Chen
Shirui Pan
Huiyu Zhou
Yi-Lun Pan
LRM
315
12
0
20 Oct 2024
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Songtao Jiang
Yan Zhang
Ruizhe Chen
Yeying Jin
Zuozhu Liu
Qinglin He
Yang Feng
Jian Wu
Zuozhu Liu
MoE, MLLM
317
18
0
20 Oct 2024
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
International Conference on Learning Representations (ICLR), 2024
Chenxi Wang
Xiang Chen
Ningyu Zhang
Bozhong Tian
Haoming Xu
Shumin Deng
Ningyu Zhang
MLLM, LRM
788
49
0
15 Oct 2024
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models
Han Qiu
Jiaxing Huang
Peng Gao
Qin Qi
Xiaoqin Zhang
Ling Shao
Shijian Lu
HILM
291
6
0
13 Oct 2024
From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models
Yuying Shang
Xinyi Zeng
Yutao Zhu
Xiao Yang
Zhengwei Fang
Jingyuan Zhang
Jiawei Chen
Zinan Liu
Yu Tian
VLM, MLLM
818
2
0
09 Oct 2024
DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Xuan Gong
Tianshi Ming
Xinpeng Wang
Zhihua Wei
MLLM
399
36
0
06 Oct 2024
ProcBench: Benchmark for Multi-Step Reasoning and Following Procedure
Ippei Fujisawa
Sensho Nobe
Hiroki Seto
Rina Onda
Yoshiaki Uchida
Hiroki Ikoma
Pei-Chun Chien
Ryota Kanai
LRM
231
8
0
04 Oct 2024
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Xin Zou
Yizhou Wang
Yibo Yan
Yuanhuiyi Lyu
Kening Zheng
...
Junkai Chen
Peijie Jiang
Qingbin Liu
Chang Tang
Xuming Hu
461
28
0
04 Oct 2024
From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
International Conference on Learning Representations (ICLR), 2024
Wanpeng Zhang
Zilong Xie
Yicheng Feng
Yijiang Li
Xingrun Xing
Sipeng Zheng
Zongqing Lu
MLLM
355
11
0
03 Oct 2024
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
International Conference on Learning Representations (ICLR), 2024
Nick Jiang
Anish Kachinthaya
Suzie Petryk
Yossi Gandelsman
VLM
412
62
0
03 Oct 2024
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
International Conference on Learning Representations (ICLR), 2024
Kemal Kurniawan
Bernhard Schölkopf
Michael Muehlebach
581
5
0
02 Oct 2024
Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and Reliability
Weitong Zhang
Chengqi Zang
Bernhard Kainz
221
1
0
01 Oct 2024
HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Fan Yuan
Chi Qin
Xiaogang Xu
Piji Li
VLM, MLLM
162
9
0
30 Sep 2024
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
Neural Information Processing Systems (NeurIPS), 2024
Zechen Bai
Tong He
Haiyang Mei
Pichao Wang
Ziteng Gao
Joya Chen
Lei Liu
Zheng Zhang
Mike Zheng Shou
VLM, VOS, MLLM
256
76
0
29 Sep 2024
A Survey on the Honesty of Large Language Models
Siheng Li
Cheng Yang
Taiqiang Wu
Chufan Shi
Yuji Zhang
...
Jie Zhou
Yujiu Yang
Ngai Wong
Xixin Wu
Wai Lam
HILM
297
18
0
27 Sep 2024
A Unified Hallucination Mitigation Framework for Large Vision-Language Models
Yue Chang
Liqiang Jing
Xiaopeng Zhang
Yue Zhang
VLM, MLLM
223
5
0
24 Sep 2024
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
Lin Li
Guikun Chen
Hanrong Shi
Jun Xiao
Long Chen
346
23
0
21 Sep 2024
Towards Child-Inclusive Clinical Video Understanding for Autism Spectrum Disorder
Aditya Kommineni
Digbalay Bose
Tiantian Feng
So Hyun Kim
Helen Tager-Flusberg
Somer Bishop
C. Lord
Sudarsana Reddy Kadiri
Shrikanth Narayanan
157
3
0
20 Sep 2024
FIHA: Autonomous Hallucination Evaluation in Vision-Language Models with Davidson Scene Graphs
Bowen Yan
Zhengsong Zhang
Liqiang Jing
Eftekhar Hossain
Xinya Du
297
6
0
20 Sep 2024