Hallucination of Multimodal Large Language Models: A Survey

29 April 2024

Tianjun Xiao

Zheng Zhang

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 334 papers shown

Mitigating Hallucinations in Multimodal LLMs via Object-aware Preference Optimization

208

27 Aug 2025

Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMs

Zhikai Ding

Shiyu Ni

Keping Bi

26 Aug 2025

Hierarchical Contextual Grounding LVLM: Enhancing Fine-Grained Visual-Language Understanding with Robust Grounding

Leilei Guo

Antonio Carlos Rivera

Peiyu Tang

Haoxuan Ren

Zheyu Song

171

23 Aug 2025

Learning to Steer: Input-dependent Steering for Multimodal LLMs

382

18 Aug 2025

Controlling Multimodal LLMs via Reward-guided Decoding

Oscar Manas

Pierluca D'Oro

Koustuv Sinha

Adriana Romero Soriano

M. Drozdzal

Aishwarya Agrawal

MLLM

137

15 Aug 2025

Empowering Multimodal LLMs with External Tools: A Comprehensive Survey

182

14 Aug 2025

Exploring Causal Effect of Social Bias on Faithfulness Hallucinations in Large Language Models

170

11 Aug 2025

A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models

05 Aug 2025

CAP-LLM: Context-Augmented Personalized Large Language Models for News Headline Generation

111

05 Aug 2025

MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions

173

29 Jul 2025

TARS: MinMax Token-Adaptive Preference Strategy for MLLM Hallucination Reduction

286

29 Jul 2025

Zero-shot Performance of Generative AI in Brazilian Portuguese Medical Exam

C. Truyts

Amanda Gomes Rabelo

Gabriel Mesquita de Souza

Daniel Scaldaferri Lages

Adriano Jose Pereira

Uri Adrian Prync Flato

E. Reis

Joaquim Edson Vieira

Paulo Sergio Panse Silveira

Edson Amaro Junior

LM&MA ELM

105

26 Jul 2025

OW-CLIP: Data-Efficient Visual Supervision for Open-World Object Detection via Human-AI Collaboration

167

26 Jul 2025

OVFact: Measuring and Improving Open-Vocabulary Factuality for Long Caption Models

187

25 Jul 2025

A Survey of Multimodal Hallucination Evaluation and Detection

359

25 Jul 2025

Extracting Visual Facts from Intermediate Layers for Mitigating Hallucinations in Multimodal Large Language Models

Haoran Zhou

Zihan Zhang

Hao Chen

156

21 Jul 2025

Mitigating Object Hallucinations via Sentence-Level Early Intervention

243

16 Jul 2025

ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way

245

11 Jul 2025

Identify, Isolate, and Purge: Mitigating Hallucinations in LVLMs via Self-Evolving Distillation

232

07 Jul 2025

Loss-Oriented Ranking for Automated Visual Prompting in LVLMs

246

19 Jun 2025

Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning

Ankan Deria

Adinath Madhavrao Dukre

283

18 Jun 2025

HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language Models

Amit K. Roy-Chowdhury

Chengyu Song

LLMAG HILM LRM

251

18 Jun 2025

ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM

363

17 Jun 2025

From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models

215

17 Jun 2025

VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?

217

13 Jun 2025

SECOND: Mitigating Perceptual Hallucination in Vision-Language Models via Selective and Contrastive Decoding

162

10 Jun 2025

Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models

275

09 Jun 2025

HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains

226

09 Jun 2025

Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning

277

08 Jun 2025

Ignoring Directionality Leads to Compromised Graph Neural Network Explanations

302

05 Jun 2025

CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Yang Tian

Fan Liu

Jingyuan Zhang

Victoria A. Webster-Wood

Yupeng Hu

Liqiang Nie

VLM

244

03 Jun 2025

V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving

304

03 Jun 2025

CLAIM: Mitigating Multilingual Object Hallucination in Large Vision-Language Models with Cross-Lingual Attention Intervention

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

150

03 Jun 2025

Preemptive Hallucination Reduction: An Input-Level Approach for Multimodal Language Model

Nokimul Hasan Arif

Shadman Rabby

Md Hefzul Hossain Papon

Sabbir Ahmed

MLLM VLM

337

29 May 2025

MMBoundary: Advancing MLLM Knowledge Boundary Awareness through Reasoning Step Confidence Calibration

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

466

29 May 2025

mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation

359

29 May 2025

Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning

...

262

28 May 2025

RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction

178

28 May 2025

D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples

Zijing Hu

Tai-wei Chang

Kun Kuang

277

28 May 2025

Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration

474

27 May 2025

The Mirage of Multimodality: Where Truth is Tested and Honesty Unravels

204

26 May 2025

Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models

286

26 May 2025

ChartLens: Fine-grained Visual Attribution in Charts

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

212

25 May 2025

CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

223

25 May 2025

REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing

Taylor Berg-Kirkpatrick

340

24 May 2025

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps

381

24 May 2025

EVADE-Bench: Multimodal Benchmark for Evasive Content Detection in E-Commerce Applications

...

280

23 May 2025

Analyzing Fine-Grained Alignment and Enhancing Vision Understanding in Multimodal Language Models

282

22 May 2025

MMaDA: Multimodal Large Diffusion Language Models

503

111

21 May 2025

Learning Interpretable Representations Leads to Semantically Faithful EEG-to-Text Generation

Xiaozhao Liu

Dinggang Shen

Xihui Liu

322

21 May 2025