Hallucination of Multimodal Large Language Models: A Survey

29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM, LRM

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 334 papers shown
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
H. Malik
Fahad Shamshad
Muzammal Naseer
Karthik Nandakumar
Fahad Shahbaz Khan
Salman Khan
AAML, MLLM, VLM
485
7
0
03 Feb 2025
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
Xiaokang Chen
Zhiyu Wu
Xingchao Liu
Zizheng Pan
Wen Liu
Zhenda Xie
X. Yu
Chong Ruan
AI4TS
538
469
0
29 Jan 2025
Learning to Summarize from LLM-generated Feedback
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Hwanjun Song
Taewon Yun
Yuho Lee
Jihwan Oh
Gihun Lee
Jason (Jinglun) Cai
Hang Su
367
16
0
28 Jan 2025
Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink
Yining Wang
Mi Zhang
Junjie Sun
Chenyue Wang
Min Yang
Hui Xue
Jialing Tao
Ranjie Duan
Qingbin Liu
246
6
0
28 Jan 2025
EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models
Andrés Villa
Juan Carlos León Alcázar
Motasem Alfarra
Vladimir Araujo
Alvaro Soto
Bernard Ghanem
VLM
107
6
0
06 Jan 2025
A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun
Wenbin An
Feng Tian
Fang Nan
Qidong Liu
Jing Liu
N. Shah
Ping Chen
387
20
0
18 Dec 2024
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Computer Vision and Pattern Recognition (CVPR), 2024
Le Yang
Ziwei Zheng
Boxu Chen
Subrat Kishore Dutta
Chenhao Lin
Chao Shen
VLM
585
22
0
18 Dec 2024
Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jinghan He
Kuan Zhu
Haiyun Guo
Cunchun Li
Zhenglin Hua
Yuheng Jia
Ming Tang
Tat-Seng Chua
Jinqiao Wang
VLM
376
16
0
18 Dec 2024
Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ido Cohen
Daniela Gottesman
Mor Geva
Raja Giryes
VLM
480
5
1
18 Dec 2024
Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning
AAAI Conference on Artificial Intelligence (AAAI), 2024
Shengqiong Wu
Hao Fei
Liangming Pan
William Yang Wang
Shuicheng Yan
Tat-Seng Chua
LRM
403
14
0
15 Dec 2024
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Z. F. Wu
Xiaokang Chen
Zizheng Pan
Xianglong Liu
Wen Liu
...
Xingkai Yu
Haowei Zhang
Bo Pan
Yijiao Wang
Chong Ruan
MLLM, VLM, MoE
467
400
0
13 Dec 2024
A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions
ACM Computing Surveys (ACM CSUR), 2024
Ola Shorinwa
Zhiting Mei
Justin Lidard
Allen Z. Ren
Anirudha Majumdar
HILM, LRM
432
19
0
07 Dec 2024
Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large Vision-Language Model via Causality Analysis
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Po-Hsuan Huang
Jeng-Lin Li
Chin-Po Chen
Ming-Ching Chang
Wei-Chao Chen
LRM
300
4
0
04 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
Shijie Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
426
51
0
03 Dec 2024
FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models
Computer Vision and Pattern Recognition (CVPR), 2024
Alice Heiman
Xiaoman Zhang
E. Chen
Sung Eun Kim
Pranav Rajpurkar
HILM, MedIm
658
5
0
27 Nov 2024
What's in the Image? A Deep-Dive into the Vision of Vision Language Models
Computer Vision and Pattern Recognition (CVPR), 2024
Omri Kaduri
Shai Bagon
Tali Dekel
VLM, CoGe
213
24
0
26 Nov 2024
Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
Lehan He
Zeren Chen
Zhelun Shi
Tianyu Yu
Jing Shao
Lu Sheng
MLLM
606
2
0
26 Nov 2024
VaLiD: Mitigating the Hallucination of Large Vision Language Models by Visual Layer Fusion Contrastive Decoding
Yuan Liu
Yifei Gao
Jitao Sang
MLLM
487
10
0
24 Nov 2024
ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2024
Junzhe Chen
Tianshu Zhang
Shijie Huang
Yuwei Niu
Linfeng Zhang
Lijie Wen
Xuming Hu
MLLM, VLM
1.1K
11
0
22 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM, VGen
445
14
0
22 Nov 2024
Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering
Zeping Yu
Sophia Ananiadou
1.1K
9
0
17 Nov 2024
Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination
Haojie Zheng
Tianyang Xu
Hanchi Sun
Shu Pu
Ruoxi Chen
Lichao Sun
MLLM, LRM
269
26
0
15 Nov 2024
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yuhan Fu
Ruobing Xie
Xingwu Sun
Zhanhui Kang
Xirong Li
MLLM
238
11
0
15 Nov 2024
Seeing Clearly by Layer Two: Enhancing Attention Heads to Alleviate Hallucination in LVLMs
Xiaofeng Zhang
Yihao Quan
Chaochen Gu
Chen Shen
Xiaosong Yuan
Shaotian Yan
Hao Cheng
Kaijie Wu
Jieping Ye
209
37
0
15 Nov 2024
V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yuxi Xie
Guanzhen Li
Xiao Xu
Min-Yen Kan
MLLM, VLM
213
46
0
05 Nov 2024
Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios
Yunkai Dang
Mengxi Gao
Yibo Yan
Xin Zou
Yanggan Gu
...
Jingyu Wang
Peijie Jiang
Aiwei Liu
Jia Liu
Xuming Hu
351
11
0
05 Nov 2024
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
International Conference on Learning Representations (ICLR), 2024
Yangning Li
Hai-Tao Zheng
Xinyu Wang
Yong Jiang
Zhen Zhang
...
Hui Wang
Hai-Tao Zheng
Pengjun Xie
Philip S. Yu
Fei Huang
645
53
0
05 Nov 2024
RadFlag: A Black-Box Hallucination Detection Method for Medical Vision Language Models
Serena Zhang
Siyang Song
Oishi Banerjee
J. N. Acosta
L. John Fahrner
Pranav Rajpurkar
VLM
251
5
0
01 Nov 2024
Driving by the Rules: A Benchmark for Integrating Traffic Sign Regulations into Vectorized HD Map
Computer Vision and Pattern Recognition (CVPR), 2024
Xinyuan Chang
Maixuan Xue
Xinran Liu
Zheng Pan
Xing Wei
614
7
0
31 Oct 2024
MAD-Sherlock: Multi-Agent Debate for Visual Misinformation Detection
Kumud Lakara
Juil Sock
Christian Rupprecht
Juil Sock
Philip Torr
John Collomosse
Christian Schroeder de Witt
276
0
0
26 Oct 2024
Mitigating Object Hallucination via Concentric Causal Attention
Neural Information Processing Systems (NeurIPS), 2024
Yun Xing
Yiheng Li
Ivan Laptev
Shijian Lu
277
40
0
21 Oct 2024
A Survey of Hallucination in Large Visual Language Models
Wei Lan
Wenyi Chen
Qingfeng Chen
Shirui Pan
Huiyu Zhou
Yi-Lun Pan
LRM
315
12
0
20 Oct 2024
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Songtao Jiang
Yan Zhang
Ruizhe Chen
Yeying Jin
Zuozhu Liu
Qinglin He
Yang Feng
Jian Wu
Zuozhu Liu
MoE, MLLM
317
18
0
20 Oct 2024
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
International Conference on Learning Representations (ICLR), 2024
Chenxi Wang
Xiang Chen
Ningyu Zhang
Bozhong Tian
Haoming Xu
Shumin Deng
Ningyu Zhang
MLLM, LRM
788
49
0
15 Oct 2024
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models
Han Qiu
Jiaxing Huang
Peng Gao
Qin Qi
Xiaoqin Zhang
Ling Shao
Shijian Lu
HILM
291
6
0
13 Oct 2024
From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models
Yuying Shang
Xinyi Zeng
Yutao Zhu
Xiao Yang
Zhengwei Fang
Jingyuan Zhang
Jiawei Chen
Zinan Liu
Yu Tian
VLM, MLLM
818
2
0
09 Oct 2024
DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Xuan Gong
Tianshi Ming
Xinpeng Wang
Zhihua Wei
MLLM
399
36
0
06 Oct 2024
ProcBench: Benchmark for Multi-Step Reasoning and Following Procedure
Ippei Fujisawa
Sensho Nobe
Hiroki Seto
Rina Onda
Yoshiaki Uchida
Hiroki Ikoma
Pei-Chun Chien
Ryota Kanai
LRM
231
8
0
04 Oct 2024
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Xin Zou
Yizhou Wang
Yibo Yan
Yuanhuiyi Lyu
Kening Zheng
...
Junkai Chen
Peijie Jiang
Qingbin Liu
Chang Tang
Xuming Hu
461
28
0
04 Oct 2024
From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
International Conference on Learning Representations (ICLR), 2024
Wanpeng Zhang
Zilong Xie
Yicheng Feng
Yijiang Li
Xingrun Xing
Sipeng Zheng
Zongqing Lu
MLLM
355
11
0
03 Oct 2024
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
International Conference on Learning Representations (ICLR), 2024
Nick Jiang
Anish Kachinthaya
Suzie Petryk
Yossi Gandelsman
VLM
412
62
0
03 Oct 2024
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
International Conference on Learning Representations (ICLR), 2024
Kemal Kurniawan
Bernhard Schölkopf
Michael Muehlebach
581
5
0
02 Oct 2024
Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and Reliability
Weitong Zhang
Chengqi Zang
Bernhard Kainz
221
1
0
01 Oct 2024
HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Fan Yuan
Chi Qin
Xiaogang Xu
Piji Li
VLM, MLLM
162
9
0
30 Sep 2024
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
Neural Information Processing Systems (NeurIPS), 2024
Zechen Bai
Tong He
Haiyang Mei
Pichao Wang
Ziteng Gao
Joya Chen
Lei Liu
Zheng Zhang
Mike Zheng Shou
VLM, VOS, MLLM
256
76
0
29 Sep 2024
A Survey on the Honesty of Large Language Models
Siheng Li
Cheng Yang
Taiqiang Wu
Chufan Shi
Yuji Zhang
...
Jie Zhou
Yujiu Yang
Ngai Wong
Xixin Wu
Wai Lam
HILM
297
18
0
27 Sep 2024
A Unified Hallucination Mitigation Framework for Large Vision-Language Models
Yue Chang
Liqiang Jing
Xiaopeng Zhang
Yue Zhang
VLM, MLLM
223
5
0
24 Sep 2024
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
Lin Li
Guikun Chen
Hanrong Shi
Jun Xiao
Long Chen
346
23
0
21 Sep 2024
Towards Child-Inclusive Clinical Video Understanding for Autism Spectrum Disorder
Aditya Kommineni
Digbalay Bose
Tiantian Feng
So Hyun Kim
Helen Tager-Flusberg
Somer Bishop
C. Lord
Sudarsana Reddy Kadiri
Shrikanth Narayanan
157
3
0
20 Sep 2024
FIHA: Autonomous Hallucination Evaluation in Vision-Language Models with Davidson Scene Graphs
Bowen Yan
Zhengsong Zhang
Liqiang Jing
Eftekhar Hossain
Xinya Du
297
6
0
20 Sep 2024