Hallucination of Multimodal Large Language Models: A Survey
arXiv:2404.18930

29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
    VLM, LRM

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 334 papers shown
ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling
Siming Yan
Min Bai
Weifeng Chen
Xiong Zhou
Qixing Huang
Erran L. Li
VLM
361
31
0
09 Feb 2024
The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Tianyang Han
Qing Lian
Boyao Wang
Renjie Pi
Jipeng Zhang
Shizhe Diao
Yong Lin
Tong Zhang
160
12
0
06 Feb 2024
Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study
Qirui Jiao
Daoyuan Chen
Yilun Huang
Yaliang Li
Ying Shen
186
12
0
31 Jan 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
Computer Vision and Pattern Recognition (CVPR), 2024
Shengbang Tong
Zhuang Liu
Yuexiang Zhai
Yi-An Ma
Yann LeCun
Saining Xie
VLM, MLLM
412
568
0
11 Jan 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM, MLLM
641
2,182
0
21 Dec 2023
VCoder: Versatile Vision Encoders for Multimodal Large Language Models
Jitesh Jain
Jianwei Yang
Humphrey Shi
MLLM
203
48
0
21 Dec 2023
Silkie: Preference Distillation for Large Visual Language Models
Lei Li
Zhihui Xie
Mukai Li
Shunian Chen
Peiyi Wang
Liang Chen
Yazheng Yang
Benyou Wang
Lingpeng Kong
MLLM
391
107
0
17 Dec 2023
Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites
Conference on Multimedia Modeling (MMM), 2023
Lei Wang
Jiabang He
Shenshen Li
Ning Liu
Ee-Peng Lim
MLLM
220
64
0
04 Dec 2023
Behind the Magic, MERLIM: Multi-modal Evaluation Benchmark for Large Image-Language Models
Andrés Villa
Juan Carlos León Alcázar
Alvaro Soto
Bernard Ghanem
MLLM, VLM
292
18
0
03 Dec 2023
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
Computer Vision and Pattern Recognition (CVPR), 2023
M. Steyvers
Yuan Yao
Haoye Zhang
Taiwen He
Yifeng Han
...
Xinyue Hu
Zhiyuan Liu
Hai-Tao Zheng
Maosong Sun
Tat-Seng Chua
MLLM, VLM
436
343
0
01 Dec 2023
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Computer Vision and Pattern Recognition (CVPR), 2023
Qidong Huang
Xiao-wen Dong
Pan Zhang
Sijin Yu
Conghui He
Yuan Liu
Dahua Lin
Weiming Zhang
Neng H. Yu
MLLM
472
363
0
29 Nov 2023
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Computer Vision and Pattern Recognition (CVPR), 2023
Sicong Leng
Hang Zhang
Guanzheng Chen
Xin Li
Shijian Lu
Chunyan Miao
Li Bing
VLM, MLLM
314
448
0
28 Nov 2023
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Zhiyuan Zhao
Sijin Yu
Linke Ouyang
Xiao-wen Dong
Yuan Liu
Conghui He
MLLM, VLM
369
188
0
28 Nov 2023
Mitigating Hallucination in Visual Language Models with Visual Supervision
Zhiyang Chen
Yousong Zhu
Yufei Zhan
Zhaowen Li
Honghui Dong
Jinqiao Wang
Ming Tang
VLM, MLLM
244
52
0
27 Nov 2023
Multimodal Large Language Models: A Survey
BigData Congress [Services Society] (BSS), 2023
Jiayang Wu
Wensheng Gan
Zefeng Chen
Shicheng Wan
Philip S. Yu
235
310
0
22 Nov 2023
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data
Computer Vision and Pattern Recognition (CVPR), 2023
Qifan Yu
Juncheng Li
Longhui Wei
Liang Pang
Wentao Ye
Bosheng Qin
Siliang Tang
Qi Tian
Yueting Zhuang
MLLM, VLM
269
122
0
22 Nov 2023
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models
Ziyi Lin
Chris Liu
Renrui Zhang
Shiyang Feng
Longtian Qiu
...
Siyuan Huang
Yichi Zhang
Xuming He
Jiaming Song
Yu Qiao
MLLM, VLM
319
275
0
13 Nov 2023
AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation
Junyang Wang
Yuhang Wang
Guohai Xu
Jing Zhang
Yukai Gu
...
Yuan Liu
Haiyang Xu
Ming Yan
Ji Zhang
Jitao Sang
MLLM, VLM
252
188
0
13 Nov 2023
InfMLLM: A Unified Framework for Visual-Language Tasks
Qiang-feng Zhou
Zhibin Wang
Wei Chu
Yinghui Xu
Hao Li
Yuan Qi
MLLM
144
12
0
12 Nov 2023
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Computer Vision and Pattern Recognition (CVPR), 2023
Qinghao Ye
Haiyang Xu
Jiabo Ye
Mingshi Yan
Anwen Hu
Haowei Liu
Qi Qian
Ji Zhang
Fei Huang
Jingren Zhou
MLLM, VLM
467
600
0
07 Nov 2023
Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges
Chenhang Cui
Yiyang Zhou
Xinyu Yang
Shirley Wu
Linjun Zhang
James Zou
Huaxiu Yao
MLLM
290
122
0
06 Nov 2023
Improved Baselines with Visual Instruction Tuning
Computer Vision and Pattern Recognition (CVPR), 2023
Haotian Liu
Chunyuan Li
Yuheng Li
Yong Jae Lee
VLM, MLLM
611
4,207
0
05 Oct 2023
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
International Conference on Learning Representations (ICLR), 2023
Yiyang Zhou
Chenhang Cui
Jaehong Yoon
Linjun Zhang
Zhun Deng
Chelsea Finn
Mohit Bansal
Huaxiu Yao
MLLM
356
266
0
01 Oct 2023
Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants
Tianyu Yu
Jinyi Hu
Yuan Yao
Haoye Zhang
Yue Zhao
...
Jiao Xue
Dahai Li
Zhiyuan Liu
Hai-Tao Zheng
Maosong Sun
VLM, MLLM
156
23
0
01 Oct 2023
Aligning Large Multimodal Models with Factually Augmented RLHF
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhiqing Sun
Sheng Shen
Shengcao Cao
Haotian Liu
Chunyuan Li
...
Liangyan Gui
Yu-Xiong Wang
Yiming Yang
Kurt Keutzer
Trevor Darrell
VLM
285
593
0
25 Sep 2023
Language Modeling Is Compression
International Conference on Learning Representations (ICLR), 2023
Grégoire Delétang
Anian Ruoss
Paul-Ambroise Duquenne
Elliot Catt
Tim Genewein
...
Wenliang Kevin Li
Matthew Aitchison
Laurent Orseau
Marcus Hutter
J. Veness
AI4CE
426
202
0
19 Sep 2023
Unsupervised Open-Vocabulary Object Localization in Videos
IEEE International Conference on Computer Vision (ICCV), 2023
Ke Fan
Zechen Bai
Tianjun Xiao
Dominik Zietlow
Max Horn
...
Bernt Schiele
Thomas Brox
Zheng Zhang
Yanwei Fu
Tong He
290
13
0
18 Sep 2023
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
International Conference on Learning Representations (ICLR), 2023
Haozhe Zhao
Zefan Cai
Shuzheng Si
Xiaojian Ma
Kaikai An
Liang Chen
Zixuan Liu
Sheng Wang
Wenjuan Han
Baobao Chang
MLLM, VLM
453
184
0
14 Sep 2023
ImageBind-LLM: Multi-modality Instruction Tuning
Jiaming Han
Renrui Zhang
Wenqi Shao
Shiyang Feng
Peng Xu
...
Yafei Wen
Xiaoxin Chen
Xiangyu Yue
Jiaming Song
Yu Qiao
MLLM
282
154
0
07 Sep 2023
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
International Conference on Learning Representations (ICLR), 2023
Yung-Sung Chuang
Yujia Xie
Hongyin Luo
Yoon Kim
James R. Glass
Pengcheng He
HILM
286
284
0
07 Sep 2023
CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning
Hongyu Hu
Jiyuan Zhang
Minyi Zhao
Zhenbang Sun
MLLM
215
79
0
05 Sep 2023
Evaluation and Analysis of Hallucination in Large Vision-Language Models
Junyan Wang
Yi Zhou
Guohai Xu
Pengcheng Shi
Chenlin Zhao
...
Mingshi Yan
Ji Zhang
Jihua Zhu
Jitao Sang
Haoyu Tang
MLLM
271
93
0
29 Aug 2023
VIGC: Visual Instruction Generation and Correction
AAAI Conference on Artificial Intelligence (AAAI), 2023
Sijin Yu
Fan Wu
Xiao Han
Jiahui Peng
Huaping Zhong
...
Xiao-wen Dong
Weijia Li
Wei Li
Yuan Liu
Conghui He
MLLM
339
84
0
24 Aug 2023
Instruction Tuning for Large Language Models: A Survey
Shengyu Zhang
Linfeng Dong
Xiaoya Li
Sen Zhang
Xiaofei Sun
...
Jiwei Li
Runyi Hu
Tianwei Zhang
Leilei Gan
Guoyin Wang
LM&MA
922
765
0
21 Aug 2023
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
AAAI Conference on Artificial Intelligence (AAAI), 2023
Wenbo Hu
Y. Xu
Jian Wang
W. Li
Zhe Chen
Zhuowen Tu
MLLM, VLM
347
189
0
19 Aug 2023
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
International Conference on Learning Representations (ICLR), 2023
Juncheng Li
Kaihang Pan
Zhiqi Ge
Minghe Gao
Wei Ji
Wenqiao Zhang
Tat-Seng Chua
Siliang Tang
Hanwang Zhang
Yueting Zhuang
MLLM
333
89
0
08 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH, ALM
8.3K
15,302
0
18 Jul 2023
MMBench: Is Your Multi-modal Model an All-around Player?
European Conference on Computer Vision (ECCV), 2023
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Yue Liu
Songyang Zhang
...
Yuan Liu
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin
709
1,664
0
12 Jul 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
MLLM, VLM
913
320
0
07 Jul 2023
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Yan Zeng
Hanbo Zhang
Jiani Zheng
Jiangnan Xia
Guoqiang Wei
Yang Wei
Yuchen Zhang
Tao Kong
MLLM
321
88
0
05 Jul 2023
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic
Ke Chen
Zhao Zhang
Weili Zeng
Richong Zhang
Feng Zhu
Rui Zhao
ObjD
456
816
0
27 Jun 2023
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
International Conference on Learning Representations (ICLR), 2023
Fuxiao Liu
Kevin Qinghong Lin
Linjie Li
Jianfeng Wang
Yaser Yacoob
Lijuan Wang
VLM, MLLM
433
406
0
26 Jun 2023
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Chaoyou Fu
Peixian Chen
Chunjiang Ge
Yulei Qin
Mengdan Zhang
...
Xing Sun
Zhenyu Qiu
Rongrong Ji
Caifeng Shan
Ran He
ELM, MLLM
802
1,224
0
23 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
422
882
0
01 Jun 2023
PandaGPT: One Model To Instruction-Follow Them All
Tsinghua Interdisciplinary Workshop on Logic, Language and Meaning (TILLM), 2023
Yixuan Su
Tian Lan
Huayang Li
Jialu Xu
Yan Wang
Deng Cai
MLLM
260
379
0
25 May 2023
LIMA: Less Is More for Alignment
Neural Information Processing Systems (NeurIPS), 2023
Chunting Zhou
Pengfei Liu
Puxin Xu
Srini Iyer
Jiao Sun
...
Susan Zhang
Gargi Ghosh
M. Lewis
Luke Zettlemoyer
Omer Levy
ALM
443
1,138
0
18 May 2023
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models
Neural Information Processing Systems (NeurIPS), 2023
Yuzhen Huang
Yuzhuo Bai
Zhihao Zhu
Junlei Zhang
Jinghan Zhang
...
Yikai Zhang
Jiayi Lei
Yao Fu
Maosong Sun
Junxian He
ELM, LRM
425
741
0
15 May 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLM, VLM
1.4K
2,908
0
11 May 2023
MultiModal-GPT: A Vision and Language Model for Dialogue with Humans
T. Gong
Chengqi Lyu
Shilong Zhang
Yudong Wang
Miao Zheng
Qianmengke Zhao
Kuikun Liu
Wenwei Zhang
Ping Luo
Kai-xiang Chen
MLLM
347
305
0
08 May 2023
Otter: A Multi-Modal Model with In-Context Instruction Tuning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Yue Liu
Yuanhan Zhang
Liangyu Chen
Jinghao Wang
Fanyi Pu
Joshua Adrian Cahyono
Jingkang Yang
Yu Qiao
MLLM
520
620
0
05 May 2023