2106.13884
Cited By
Multimodal Few-Shot Learning with Frozen Language Models
25 June 2021
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
Papers citing
"Multimodal Few-Shot Learning with Frozen Language Models"
50 / 532 papers shown
Title
Lightweight In-Context Tuning for Multimodal Unified Models
Yixin Chen
Shuai Zhang
Boran Han
Jiaya Jia
24
2
0
08 Oct 2023
From task structures to world models: What do LLMs know?
Ilker Yildirim
L. A. Paul
17
41
0
06 Oct 2023
Demystifying Embedding Spaces using Large Language Models
Guy Tennenholtz
Yinlam Chow
Chih-Wei Hsu
Jihwan Jeong
Lior Shani
Azamat Tulepbergenov
Deepak Ramachandran
Martin Mladenov
Craig Boutilier
23
11
0
06 Oct 2023
Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction
Yiren Jian
Tingkai Liu
Yunzhe Tao
Chunhui Zhang
Soroush Vosoughi
HX Yang
VLM
15
7
0
05 Oct 2023
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
Yi-Lin Sung
Jaehong Yoon
Mohit Bansal
VLM
15
14
0
04 Oct 2023
ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks
Zejun Li
Ye Wang
Mengfei Du
Qingwen Liu
Binhao Wu
...
Zhihao Fan
Jie Fu
Jingjing Chen
Xuanjing Huang
Zhongyu Wei
25
13
0
04 Oct 2023
MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens
Kaizhi Zheng
Xuehai He
Xin Eric Wang
MLLM
17
92
0
03 Oct 2023
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models
Ming Jin
Shiyu Wang
Lintao Ma
Zhixuan Chu
James Y. Zhang
...
Pin-Yu Chen
Yuxuan Liang
Yuan-Fang Li
Shirui Pan
Qingsong Wen
AI4TS
36
352
0
03 Oct 2023
Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning
Mustafa Shukor
Alexandre Ramé
Corentin Dancette
Matthieu Cord
LRM
MLLM
38
20
0
01 Oct 2023
Self-Supervised Open-Ended Classification with Small Visual Language Models
Mohammad Mahdi Derakhshani
Ivona Najdenkoska
Cees G. M. Snoek
M. Worring
Yuki M. Asano
VLM
22
0
0
30 Sep 2023
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Avamarie Brueggeman
Andrea Madotto
Zhaojiang Lin
Tushar Nagarajan
Matt Smith
...
Peyman Heidari
Yue Liu
Kavya Srinet
Babak Damavandi
Anuj Kumar
MLLM
32
93
0
27 Sep 2023
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello
L. Yu
Yixin Nie
Armen Aghajanyan
Barlas Oğuz
13
29
0
27 Sep 2023
Tackling VQA with Pretrained Foundation Models without Further Training
Alvin De Jun Tan
Bingquan Shen
MLLM
26
1
0
27 Sep 2023
VidChapters-7M: Video Chapters at Scale
Antoine Yang
Arsha Nagrani
Ivan Laptev
Josef Sivic
Cordelia Schmid
VGen
13
26
0
25 Sep 2023
Natural Language based Context Modeling and Reasoning for Ubiquitous Computing with Large Language Models: A Tutorial
Haoyi Xiong
Jiang Bian
Sijia Yang
Xiaofei Zhang
Linghe Kong
Daqing Zhang
LRM
LLMAG
33
5
0
24 Sep 2023
ContextRef: Evaluating Referenceless Metrics For Image Description Generation
Elisa Kreiss
E. Zelikman
Christopher Potts
Nick Haber
16
5
0
21 Sep 2023
USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models
Guanlong Zhao
Yongqiang Wang
Jason W. Pelecanos
Yu Zhang
Hank Liao
Yiling Huang
Han Lu
Quan Wang
9
4
0
14 Sep 2023
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
Haozhe Zhao
Zefan Cai
Shuzheng Si
Xiaojian Ma
Kaikai An
Liang Chen
Zixuan Liu
Sheng Wang
Wenjuan Han
Baobao Chang
MLLM
VLM
24
133
0
14 Sep 2023
PRE: Vision-Language Prompt Learning with Reparameterization Encoder
Anh Pham Thi Minh
An Duc Nguyen
Georgios Tzimiropoulos
VPVLM
VLM
19
3
0
14 Sep 2023
Can Whisper perform speech-based in-context learning?
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
19
24
0
13 Sep 2023
Language Models as Black-Box Optimizers for Vision-Language Models
Shihong Liu
Zhiqiu Lin
Samuel Yu
Ryan Lee
Tiffany Ling
Deepak Pathak
Deva Ramanan
VLM
22
28
0
12 Sep 2023
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Zigang Geng
Binxin Yang
Tiankai Hang
Chen Li
Shuyang Gu
...
Jianmin Bao
Zheng-Wei Zhang
Han Hu
Dongdong Chen
Baining Guo
DiffM
VLM
43
93
0
07 Sep 2023
Language Models for Novelty Detection in System Call Traces
Quentin Fournier
Daniel Aloise
Leandro R. Costa
AI4TS
22
4
0
05 Sep 2023
Expanding Frozen Vision-Language Models without Retraining: Towards Improved Robot Perception
Riley Tavassoli
Mani Amani
Reza Akhavian
33
1
0
31 Aug 2023
UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory
Haiwen Diao
Bo Wan
Y. Zhang
Xuecong Jia
Huchuan Lu
Long Chen
VLM
31
18
0
28 Aug 2023
PromptMRG: Diagnosis-Driven Prompts for Medical Report Generation
Haibo Jin
Haoxuan Che
Yi-Mou Lin
Haoxing Chen
MedIm
30
56
0
24 Aug 2023
Multi-event Video-Text Retrieval
Gengyuan Zhang
Jisen Ren
Jindong Gu
Volker Tresp
19
13
0
22 Aug 2023
CiteTracker: Correlating Image and Text for Visual Tracking
Xin Li
Yuqing Huang
Zhenyu He
Yaowei Wang
Huchuan Lu
Ming-Hsuan Yang
24
28
0
22 Aug 2023
ViCo: Engaging Video Comment Generation with Human Preference Rewards
Yuchong Sun
Bei Liu
Xu Chen
Ruihua Song
Jianlong Fu
VGen
20
2
0
22 Aug 2023
Large Language Models for Software Engineering: A Systematic Literature Review
Xinying Hou
Yanjie Zhao
Yue Liu
Zhou Yang
Kailong Wang
Li Li
Xiapu Luo
David Lo
John C. Grundy
Haoyu Wang
25
322
0
21 Aug 2023
Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer
Guangyi Chen
Xiao Liu
Guangrun Wang
Kun Zhang
Philip H.S. Torr
Xiaoping Zhang
Yansong Tang
19
18
0
16 Aug 2023
Link-Context Learning for Multimodal LLMs
Yan Tai
Weichen Fan
Zhao Zhang
Feng Zhu
Rui Zhao
Ziwei Liu
ReLM
LRM
21
17
0
15 Aug 2023
You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content
Xinlei He
Savvas Zannettou
Yun Shen
Yang Zhang
CLL
13
37
0
10 Aug 2023
Fine-Tune Language Models as Multi-Modal Differential Equation Solvers
Liu Yang
Siting Liu
Stanley J. Osher
19
0
0
09 Aug 2023
EventBind: Learning a Unified Representation to Bind Them All for Event-based Open-world Understanding
Jiazhou Zhou
Xueye Zheng
Yuanhuiyi Lyu
Lin Wang
VLM
17
12
0
06 Aug 2023
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Weihao Yu
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
Zicheng Liu
Xinchao Wang
Lijuan Wang
MLLM
43
605
0
04 Aug 2023
Multimodal Neurons in Pretrained Text-Only Transformers
Sarah Schwettmann
Neil Chowdhury
Samuel J. Klein
David Bau
Antonio Torralba
MILM
30
27
0
03 Aug 2023
Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks
Kousik Rajesh
Mrigank Raman
M. A. Karim
Pranit Chawla
VLM
23
2
0
31 Jul 2023
Towards Generalist Biomedical AI
Tao Tu
Shekoofeh Azizi
Danny Driess
M. Schaekermann
Mohamed Amin
...
Yossi Matias
K. Singhal
Peter R. Florence
Alan Karthikesalingam
Vivek Natarajan
LM&MA
MedIm
AI4MH
35
241
0
26 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
F. Khan
VLM
24
118
0
25 Jul 2023
Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection
Yichao Cao
Qingfei Tang
Fengyuan Yang
Xiu Su
Shan You
Xiaobo Lu
Chang Xu
24
16
0
25 Jul 2023
SINC: Self-Supervised In-Context Learning for Vision-Language Tasks
Yi-Syuan Chen
Yun-Zhu Song
Cheng Yu Yeo
Bei Liu
Jianlong Fu
Hong-Han Shuai
VLM
LRM
24
4
0
15 Jul 2023
Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
Yiren Jian
Chongyang Gao
Soroush Vosoughi
VLM
MLLM
27
25
0
13 Jul 2023
mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs
Gregor Geigle
Abhay Jain
Radu Timofte
Goran Glavaš
VLM
MLLM
18
29
0
13 Jul 2023
Self-Adaptive Sampling for Efficient Video Question-Answering on Image-Text Models
Wei Han
Hui Chen
MingSung Kan
Soujanya Poria
24
1
0
09 Jul 2023
Zero-Shot Dense Video Captioning by Jointly Optimizing Text and Moment
Yongrae Jo
Seongyun Lee
Aiden Seung Joon Lee
Hyunji Lee
Hanseok Oh
Minjoon Seo
16
2
0
05 Jul 2023
Human Inspired Progressive Alignment and Comparative Learning for Grounded Word Acquisition
Yuwei Bao
B. Lattimer
J. Chai
CLL
32
1
0
05 Jul 2023
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Bang-ju Yang
Fenglin Liu
Zheng Li
Qingyu Yin
Chenyu You
Bing Yin
Yuexian Zou
VLM
26
5
0
05 Jul 2023
On Conditional and Compositional Language Model Differentiable Prompting
Jonathan Pilault
Can Liu
Mohit Bansal
Markus Dreyer
22
1
0
04 Jul 2023
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
Lijun Yu
Yong Cheng
Zhiruo Wang
Vivek Kumar
Wolfgang Macherey
...
Yonatan Bisk
Ming Yang
Kevin Patrick Murphy
Alexander G. Hauptmann
Lu Jiang
MLLM
20
49
0
30 Jun 2023