Hallucination of Multimodal Large Language Models: A Survey

29 April 2024

Tianjun Xiao

Zheng Zhang

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

34 / 334 papers shown

VPGTrans: Transfer Visual Prompt Generator across LLMs

Neural Information Processing Systems (NeurIPS), 2023

Ao Zhang

Hao Fei

Yuan Yao

Wei Ji

Li Li

Zhiyuan Liu

Tat-Seng Chua

MLLM VLM

208

100

02 May 2023

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

...

Conghui He

Yu Qiao

296

711

28 Apr 2023

mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality

Jiabo Ye

...

Ji Zhang

Jingren Zhou

1.1K

1,170

27 Apr 2023

Instruction Tuning with GPT-4

493

752

06 Apr 2023

Segment Anything

IEEE International Conference on Computer Vision (ICCV), 2023

...

Piotr Dollár

968

11,417

05 Apr 2023

Benchmarking Large Language Models for News Summarization

Transactions of the Association for Computational Linguistics (TACL), 2023

Tianyi Zhang

Faisal Ladhak

Esin Durmus

Percy Liang

Kathleen McKeown

Tatsunori B. Hashimoto

ELM

327

676

31 Jan 2023

Reasoning with Language Model Prompting: A Survey

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Ningyu Zhang

Shumin Deng

Chuanqi Tan

Fei Huang

Huajun Chen

ReLM ELM LRM

719

398

19 Dec 2022

CREPE: Can Vision-Language Foundation Models Reason Compositionally?

Computer Vision and Pattern Recognition (CVPR), 2022

376

183

13 Dec 2022

Contrastive Decoding: Open-ended Text Generation as Optimization

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Xiang Lisa Li

Ari Holtzman

Daniel Fried

Percy Liang

Jason Eisner

Tatsunori Hashimoto

Luke Zettlemoyer

M. Lewis

426

517

27 Oct 2022

When and why vision-language models behave like bags-of-words, and what to do about it?

International Conference on Learning Representations (ICLR), 2022

Mert Yuksekgonul

Dan Jurafsky

448

524

04 Oct 2022

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

507

365

04 Oct 2022

Large Language Models are Zero-Shot Reasoners

Neural Information Processing Systems (NeurIPS), 2022

1.4K

6,210

24 May 2022

CoCa: Contrastive Captioners are Image-Text Foundation Models

Mojtaba Seyedhosseini

Yonghui Wu

VLM CLIP OffRL

708

1,608

04 May 2022

Hierarchical Text-Conditional Image Generation with CLIP Latents

1.2K

8,366

13 Apr 2022

PaLM: Scaling Language Modeling with Pathways

Journal of Machine Learning Research (JMLR), 2022

Sharan Narang

...

Kathy Meier-Hellstern

1.2K

7,524

05 Apr 2022

Training language models to follow instructions with human feedback

Neural Information Processing Systems (NeurIPS), 2022

Carroll L. Wainwright

...

2.1K

17,754

04 Mar 2022

LoRA: Low-Rank Adaptation of Large Language Models

International Conference on Learning Representations (ICLR), 2021

OffRL AI4TS AI4CE ALM AIMat

1.6K

15,626

17 Jun 2021

Emerging Properties in Self-Supervised Vision Transformers

IEEE International Conference on Computer Vision (ICCV), 2021

2.0K

7,967

29 Apr 2021

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Zhengxiao Du

Yujie Qian

Xiao Liu

Ming Ding

425

1,812

18 Mar 2021

Learning Transferable Visual Models From Natural Language Supervision

International Conference on Machine Learning (ICML), 2021

...

2.0K

42,087

26 Feb 2021

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

International Conference on Machine Learning (ICML), 2021

1.4K

4,920

11 Feb 2021

Measuring Massive Multitask Language Understanding

International Conference on Learning Representations (ICLR), 2020

2.3K

6,731

07 Sep 2020

Learning to summarize from human feedback

Neural Information Processing Systems (NeurIPS), 2020

876

2,757

02 Sep 2020

Show, Recall, and Tell: Image Captioning with Recall Mechanism

AAAI Conference on Artificial Intelligence (AAAI), 2020

269

15 Jan 2020

BERTScore: Evaluating Text Generation with BERT

2.4K

7,563

21 Apr 2019

Object Hallucination in Image Captioning

447

603

06 Sep 2018

Proximal Policy Optimization Algorithms

1.3K

24,405

20 Jul 2017

Deep reinforcement learning from human preferences

Neural Information Processing Systems (NeurIPS), 2017

1.6K

4,461

12 Jun 2017

An Analysis of Visual Question Answering Algorithms

Kushal Kafle

Christopher Kanan

250

251

28 Mar 2017

Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

Devi Parikh

1.2K

3,844

02 Dec 2016

Semantic Understanding of Scenes through the ADE20K Dataset

International Journal of Computer Vision (IJCV), 2016

Hang Zhao

Sanja Fidler

Antonio Torralba

777

2,176

18 Aug 2016

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

...

Fei-Fei Li

2.0K

6,245

23 Feb 2016

Fast R-CNN

Ross B. Girshick

ObjD

860

27,570

30 Apr 2015

Microsoft COCO: Common Objects in Context

European Conference on Computer Vision (ECCV), 2014

Piotr Dollár

20.2K

49,827

01 May 2014