Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

21 June 2024

Trevor Darrell

Papers citing "Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning"

21 / 21 papers shown

Title
Advancing Multimodal In-Context Learning in Large Vision-Language Models with Task-aware Demonstrations Yanshu Li 44 0 0 05 Mar 2025
Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence Granite Vision Team Leonid Karlinsky Assaf Arbelle Abraham Daniels A. Nassar ... Sriram Raghavan T. Syeda-Mahmood Peter W. J. Staar Tal Drory Rogerio Feris VLM AI4TS 102 0 0 14 Feb 2025
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers Chancharik Mitra Brandon Huang Tianning Chai Zhiqiu Lin Assaf Arbelle Rogerio Feris Leonid Karlinsky Trevor Darrell Deva Ramanan Roei Herzig VLM 116 4 0 28 Nov 2024
Teaching VLMs to Localize Specific Objects from In-context Examples Sivan Doveh Nimrod Shabtay Wei Lin Eli Schwartz Hilde Kuehne ... Leonid Karlinsky James Glass Assaf Arbelle S. Ullman Muhammad Jehanzeb Mirza VLM 94 1 0 20 Nov 2024
In-Context Learning Enables Robot Action Prediction in LLMs Yida Yin Zekai Wang Yuvan Sharma Dantong Niu Trevor Darrell Roei Herzig LM&Ro 51 1 0 16 Oct 2024
ELICIT: LLM Augmentation via External In-Context Capability Futing Wang Jianhao Yan Yue Zhang Tao Lin 35 0 0 12 Oct 2024
Finding Visual Task Vectors Alberto Hojel Yutong Bai Trevor Darrell Amir Globerson Amir Bar 58 6 0 08 Apr 2024
TraveLER: A Multi-LMM Agent Framework for Video Question-Answering Chuyi Shang Amos You Sanjay Subramanian Trevor Darrell Roei Herzig LLMAG 44 3 0 01 Apr 2024
Towards Multimodal In-Context Learning for Vision & Language Models Sivan Doveh Shaked Perek M. Jehanzeb Mirza Wei Lin Amit Alfassy Assaf Arbelle S. Ullman Leonid Karlinsky VLM 108 13 0 19 Mar 2024
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration Qinghao Ye Haiyang Xu Jiabo Ye Mingshi Yan Anwen Hu Haowei Liu Qi Qian Ji Zhang Fei Huang Jingren Zhou MLLM VLM 116 367 0 07 Nov 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality Qinghao Ye Haiyang Xu Guohai Xu Jiabo Ye Ming Yan ... Junfeng Tian Qiang Qi Ji Zhang Feiyan Huang Jingren Zhou VLM MLLM 203 883 0 27 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models Junnan Li Dongxu Li Silvio Savarese Steven C. H. Hoi VLM MLLM 244 4,186 0 30 Jan 2023
Learning by Distilling Context Charles Burton Snell Dan Klein Ruiqi Zhong ReLM LRM 151 44 0 30 Sep 2022
Large Language Models are Zero-Shot Reasoners Takeshi Kojima S. Gu Machel Reid Yutaka Matsuo Yusuke Iwasawa ReLM LRM 291 2,712 0 24 May 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners Zhenhailong Wang Manling Li Ruochen Xu Luowei Zhou Jie Lei ... Chenguang Zhu Derek Hoiem Shih-Fu Chang Mohit Bansal Heng Ji MLLM VLM 167 134 0 22 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models Xuezhi Wang Jason W. Wei Dale Schuurmans Quoc Le Ed H. Chi Sharan Narang Aakanksha Chowdhery Denny Zhou ReLM BDL LRM AI4CE 297 3,163 0 21 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Junnan Li Dongxu Li Caiming Xiong S. Hoi MLLM BDL VLM CLIP 382 4,010 0 28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Jason W. Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter F. Xia Ed H. Chi Quoc Le Denny Zhou LM&Ro LRM AI4CE ReLM 315 8,261 0 28 Jan 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization Victor Sanh Albert Webson Colin Raffel Stephen H. Bach Lintang Sutawika ... T. Bers Stella Biderman Leo Gao Thomas Wolf Alexander M. Rush LRM 203 1,651 0 15 Oct 2021
The Power of Scale for Parameter-Efficient Prompt Tuning Brian Lester Rami Al-Rfou Noah Constant VPVLM 278 3,784 0 18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Chao Jia Yinfei Yang Ye Xia Yi-Ting Chen Zarana Parekh Hieu H. Pham Quoc V. Le Yun-hsuan Sung Zhen Li Tom Duerig VLM CLIP 293 3,683 0 11 Feb 2021