Advancing Multimodal In-Context Learning in Large Vision-Language Models with Task-aware Demonstrations

5 March 2025

Yanshu Li

Papers citing "Advancing Multimodal In-Context Learning in Large Vision-Language Models with Task-aware Demonstrations"

DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action

151

27 Nov 2025

TACO: Enhancing Multimodal In-context Learning via Task Mapping-Guided Sequence Configuration

566

21 May 2025

Make LVLMs Focus: Context-Aware Attention Modulation for Better Multimodal In-Context Learning

...

488

21 May 2025

Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

Trevor Darrell

294

21 Jun 2024

ICLEval: Evaluating In-Context Learning Ability of Large Language Models

Yankai Lin

Ji-Rong Wen

293

21 Jun 2024

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

...

Dahua Lin

Yu Qiao

Jifeng Dai

Wenhai Wang

MLLM VLM

667

1,136

25 Apr 2024

Visual In-Context Learning for Large Vision-Language Models

276

134

18 Feb 2024

Comparable Demonstrations are Important in In-Context Learning: A Novel Perspective on Demonstration Selection

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

338

12 Dec 2023

How to Configure Good In-Context Sequence for Visual Question Answering

Computer Vision and Pattern Recognition (CVPR), 2023

303

04 Dec 2023

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model

Neural Information Processing Systems (NeurIPS), 2023

Ying Shan

580

07 Nov 2023

An Early Evaluation of GPT-4V(ision)

219

25 Oct 2023

Multimodal Neurons in Pretrained Text-Only Transformers

Antonio Torralba

386

03 Aug 2023

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

...

Pang Wei Koh

462

591

02 Aug 2023

Learning to Retrieve In-Context Examples for Large Language Models

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

Liang Wang

Nan Yang

Furu Wei

RALM

276

14 Jul 2023

Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

386

22 May 2023

What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

243

167

16 May 2023

Larger language models do in-context learning differently

...

553

461

07 Mar 2023

Self-Adaptive In-Context Learning: An Information Compression Perspective for In-Context Example Selection and Ordering

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Zhiyong Wu

Yaoxiang Wang

Jiacheng Ye

Lingpeng Kong

472

199

20 Dec 2022

Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Luke Zettlemoyer

240

19 Dec 2022

What Can Transformers Learn In-Context? A Case Study of Simple Function Classes

Neural Information Processing Systems (NeurIPS), 2022

777

734

01 Aug 2022

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

International Conference on Learning Representations (ICLR), 2022

...

929

1,672

21 May 2022

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Luke Zettlemoyer

680

1,949

25 Feb 2022

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

ACM Computing Surveys (CSUR), 2021

Graham Neubig

913

5,235

28 Jul 2021

Multimodal Few-Shot Learning with Frozen Language Models

Neural Information Processing Systems (NeurIPS), 2021

700

951

25 Jun 2021

Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

1.2K

1,479

18 Apr 2021

The Power of Scale for Parameter-Efficient Prompt Tuning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

1.6K

5,344

18 Apr 2021

What Makes Good In-Context Examples for GPT-3?

Workshop on Knowledge Extraction and Integration for Deep Learning Architectures; Deep Learning Inside Out (DEELIO), 2021

Lawrence Carin

669

1,702

17 Jan 2021

Making Pre-trained Language Models Better Few-shot Learners

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Tianyu Gao

Adam Fisch

Danqi Chen

945

2,248

31 Dec 2020

Language Models are Few-Shot Learners

Neural Information Processing Systems (NeurIPS), 2020

...

2.4K

56,453

28 May 2020

The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes

Douwe Kiela

Amanpreet Singh

514

853

10 May 2020

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

Computer Vision and Pattern Recognition (CVPR), 2019

854

1,493

31 May 2019

VizWiz Grand Challenge: Answering Visual Questions from Blind People

972

1,165

22 Feb 2018

Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

Devi Parikh

1.5K

4,101

02 Dec 2016

CIDEr: Consensus-based Image Description Evaluation

Computer Vision and Pattern Recognition (CVPR), 2014

Ramakrishna Vedantam

C. L. Zitnick

Devi Parikh

947

5,370

20 Nov 2014

Microsoft COCO: Common Objects in Context

European Conference on Computer Vision (ECCV), 2014

Piotr Dollár

27.3K

51,996

01 May 2014