ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2106.13884
  4. Cited By
Multimodal Few-Shot Learning with Frozen Language Models

Multimodal Few-Shot Learning with Frozen Language Models

25 June 2021
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
    MLLM
ArXivPDFHTML

Papers citing "Multimodal Few-Shot Learning with Frozen Language Models"

50 / 532 papers shown
Title
No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code
  Intelligence
No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code Intelligence
Chaozheng Wang
Yuanhang Yang
Cuiyun Gao
Yun Peng
Hongyu Zhang
Michael R. Lyu
AAML
52
134
0
24 Jul 2022
Learning to translate by learning to communicate
Learning to translate by learning to communicate
C.M. Downey
Xuhui Zhou
Leo Z. Liu
Shane Steinert-Threlkeld
26
5
0
14 Jul 2022
Towards Multimodal Vision-Language Models Generating Non-Generic Text
Towards Multimodal Vision-Language Models Generating Non-Generic Text
Wes Robbins
Zanyar Zohourianshahzadi
Jugal Kalita
14
1
0
09 Jul 2022
Vision-and-Language Pretraining
Vision-and-Language Pretraining
Thong Nguyen
Cong-Duy Nguyen
Xiaobao Wu
See-Kiong Ng
A. Luu
VLM
CLIP
19
2
0
05 Jul 2022
e-CLIP: Large-Scale Vision-Language Representation Learning in
  E-commerce
e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce
Wonyoung Shin
Jonghun Park
Taekang Woo
Yongwoo Cho
Kwangjin Oh
Hwanjun Song
VLM
21
16
0
01 Jul 2022
Know your audience: specializing grounded language models with listener
  subtraction
Know your audience: specializing grounded language models with listener subtraction
Aaditya K. Singh
David Ding
Andrew M. Saxe
Felix Hill
Andrew Kyle Lampinen
25
2
0
16 Jun 2022
Zero-Shot Video Question Answering via Frozen Bidirectional Language
  Models
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
34
226
0
16 Jun 2022
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning
  Tasks
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
Tuan Dinh
Yuchen Zeng
Ruisu Zhang
Ziqian Lin
Michael Gira
Shashank Rajput
Jy-yong Sohn
Dimitris Papailiopoulos
Kangwook Lee
LMTD
32
125
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
41
525
0
13 Jun 2022
Language Models are General-Purpose Interfaces
Language Models are General-Purpose Interfaces
Y. Hao
Haoyu Song
Li Dong
Shaohan Huang
Zewen Chi
Wenhui Wang
Shuming Ma
Furu Wei
MLLM
19
95
0
13 Jun 2022
Singular Value Fine-tuning: Few-shot Segmentation requires
  Few-parameters Fine-tuning
Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning
Yanpeng Sun
Qiang Chen
Xiangyu He
Jian Wang
Haocheng Feng
Junyu Han
Errui Ding
Jian Cheng
Zechao Li
Jingdong Wang
23
56
0
13 Jun 2022
Building a Personalized Dialogue System with Prompt-Tuning
Building a Personalized Dialogue System with Prompt-Tuning
Tomohito Kasahara
Daisuke Kawahara
N. Tung
Sheng Li
K. Shinzato
Toshinori Sato
17
19
0
11 Jun 2022
Instance-wise Prompt Tuning for Pretrained Language Models
Instance-wise Prompt Tuning for Pretrained Language Models
Yuezihan Jiang
Hao-Yu Yang
Junyang Lin
Hanyu Zhao
An Yang
Chang Zhou
Hongxia Yang
Zhi-Xin Yang
Bin Cui
VLM
36
6
0
04 Jun 2022
ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts
ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts
Bingqian Lin
Yi Zhu
Zicong Chen
Xiwen Liang
Jian-zhuo Liu
Xiaodan Liang
LM&Ro
20
51
0
31 May 2022
Multimodal Knowledge Alignment with Reinforcement Learning
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu
Jiwan Chung
Heeseung Yun
Jack Hessel
J. Park
...
Prithviraj Ammanabrolu
Rowan Zellers
Ronan Le Bras
Gunhee Kim
Yejin Choi
VLM
115
36
0
25 May 2022
History Compression via Language Models in Reinforcement Learning
History Compression via Language Models in Reinforcement Learning
Fabian Paischer
Thomas Adler
Vihang Patil
Angela Bitto-Nemling
Markus Holzleitner
Sebastian Lehner
Hamid Eghbalzadeh
Sepp Hochreiter
OffRL
AI4TS
11
42
0
24 May 2022
Reassessing Evaluation Practices in Visual Question Answering: A Case
  Study on Out-of-Distribution Generalization
Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization
Aishwarya Agrawal
Ivana Kajić
Emanuele Bugliarello
Elnaz Davoodi
Anita Gergely
Phil Blunsom
Aida Nematzadeh
OOD
38
17
0
24 May 2022
Using Natural Language and Program Abstractions to Instill Human
  Inductive Biases in Machines
Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines
Sreejan Kumar
Carlos G. Correa
Ishita Dasgupta
Raja Marjieh
Michael Y. Hu
Robert D. Hawkins
Nathaniel D. Daw
Jonathan D. Cohen
Karthik Narasimhan
Thomas L. Griffiths
AI4CE
14
30
0
23 May 2022
PEVL: Position-enhanced Pre-training and Prompt Tuning for
  Vision-language Models
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models
Yuan Yao
Qi-An Chen
Ao Zhang
Wei Ji
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
VLM
MLLM
21
38
0
23 May 2022
Language Models with Image Descriptors are Strong Few-Shot
  Video-Language Learners
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Zhenhailong Wang
Manling Li
Ruochen Xu
Luowei Zhou
Jie Lei
...
Chenguang Zhu
Derek Hoiem
Shih-Fu Chang
Mohit Bansal
Heng Ji
MLLM
VLM
170
137
0
22 May 2022
PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot
  Learners
PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot Learners
Canyu Chen
Kai Shu
VLM
26
8
0
18 May 2022
What GPT Knows About Who is Who
What GPT Knows About Who is Who
Xiaohan Yang
Eduardo Peynetti
Vasco Meerman
Christy Tanner
122
13
0
16 May 2022
A Comprehensive Survey of Few-shot Learning: Evolution, Applications,
  Challenges, and Opportunities
A Comprehensive Survey of Few-shot Learning: Evolution, Applications, Challenges, and Opportunities
Yisheng Song
Ting-Yuan Wang
S. Mondal
J. P. Sahoo
SLR
36
342
0
13 May 2022
A Generalist Agent
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
49
783
0
12 May 2022
Dynamic Prefix-Tuning for Generative Template-based Event Extraction
Dynamic Prefix-Tuning for Generative Template-based Event Extraction
Xiao Liu
Heyan Huang
Ge Shi
Bo Wang
31
99
0
12 May 2022
Clinical Prompt Learning with Frozen Language Models
Clinical Prompt Learning with Frozen Language Models
Niall Taylor
Yi Zhang
Dan W Joyce
A. Nevado-Holgado
Andrey Kormilitzin
VLM
LM&MA
16
31
0
11 May 2022
Declaration-based Prompt Tuning for Visual Question Answering
Declaration-based Prompt Tuning for Visual Question Answering
Yuhang Liu
Wei Wei
Daowan Peng
Feida Zhu
MLLM
VLM
11
19
0
05 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
57
1,255
0
04 May 2022
Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering
Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering
A. Piergiovanni
Wei Li
Weicheng Kuo
M. Saffar
Fred Bertsch
A. Angelova
17
16
0
02 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
46
3,326
0
29 Apr 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural
  Language Guidance
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
57
367
0
18 Apr 2022
Imagination-Augmented Natural Language Understanding
Imagination-Augmented Natural Language Understanding
Yujie Lu
Wanrong Zhu
X. Wang
M. Eckstein
William Yang Wang
15
24
0
18 Apr 2022
Multi-Modal Few-Shot Object Detection with Meta-Learning-Based
  Cross-Modal Prompting
Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting
G. Han
Long Chen
Jiawei Ma
Shiyuan Huang
Ramalingam Chellappa
Shih-Fu Chang
VLM
19
20
0
16 Apr 2022
Image Captioning In the Transformer Age
Image Captioning In the Transformer Age
Yangliu Xu
Li Li
Haiyang Xu
Songfang Huang
Fei Huang
Jianfei Cai
ViT
14
5
0
15 Apr 2022
"This is my unicorn, Fluffy": Personalizing frozen vision-language
  representations
"This is my unicorn, Fluffy": Personalizing frozen vision-language representations
Niv Cohen
Rinon Gal
E. Meirom
Gal Chechik
Y. Atzmon
VLM
MLLM
48
83
0
04 Apr 2022
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Andy Zeng
Maria Attarian
Brian Ichter
K. Choromanski
Adrian S. Wong
...
Michael S. Ryoo
Vikas Sindhwani
Johnny Lee
Vincent Vanhoucke
Peter R. Florence
ReLM
LRM
13
569
0
01 Apr 2022
Exploring Visual Prompts for Adapting Large-Scale Models
Exploring Visual Prompts for Adapting Large-Scale Models
Hyojin Bahng
Ali Jahanian
S. Sankaranarayanan
Phillip Isola
VLM
VPVLM
LRM
25
255
0
31 Mar 2022
CREATE: A Benchmark for Chinese Short Video Retrieval and Title
  Generation
CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation
Ziqi Zhang
Yuxin Chen
Zongyang Ma
Zhongang Qi
Chunfen Yuan
Bing Li
Ying Shan
Weiming Hu
VGen
21
8
0
31 Mar 2022
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen
  Language Models
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models
Heting Gao
Junrui Ni
Kaizhi Qian
Yang Zhang
Shiyu Chang
M. Hasegawa-Johnson
VLM
14
31
0
29 Mar 2022
Single-Stream Multi-Level Alignment for Vision-Language Pretraining
Single-Stream Multi-Level Alignment for Vision-Language Pretraining
Zaid Khan
B. Vijaykumar
Xiang Yu
S. Schulter
Manmohan Chandraker
Y. Fu
CLIP
VLM
20
16
0
27 Mar 2022
On Robust Prefix-Tuning for Text Classification
On Robust Prefix-Tuning for Text Classification
Zonghan Yang
Yang Liu
VLM
11
20
0
19 Mar 2022
Integrating Language Guidance into Vision-based Deep Metric Learning
Integrating Language Guidance into Vision-based Deep Metric Learning
Karsten Roth
Oriol Vinyals
Zeynep Akata
VLM
14
29
0
16 Mar 2022
Modular and Parameter-Efficient Multimodal Fusion with Prompting
Modular and Parameter-Efficient Multimodal Fusion with Prompting
Sheng Liang
Mengjie Zhao
Hinrich Schütze
23
42
0
15 Mar 2022
CLIP Models are Few-shot Learners: Empirical Studies on VQA and Visual
  Entailment
CLIP Models are Few-shot Learners: Empirical Studies on VQA and Visual Entailment
Haoyu Song
Li Dong
Weinan Zhang
Ting Liu
Furu Wei
VLM
CLIP
25
136
0
14 Mar 2022
HIE-SQL: History Information Enhanced Network for Context-Dependent
  Text-to-SQL Semantic Parsing
HIE-SQL: History Information Enhanced Network for Context-Dependent Text-to-SQL Semantic Parsing
Yanzhao Zheng
Haibin Wang
B. Dong
Xingjun Wang
Changshan Li
8
32
0
14 Mar 2022
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge
  Distillation
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
Wenliang Dai
Lu Hou
Lifeng Shang
Xin Jiang
Qun Liu
Pascale Fung
VLM
22
90
0
12 Mar 2022
REX: Reasoning-aware and Grounded Explanation
REX: Reasoning-aware and Grounded Explanation
Shi Chen
Qi Zhao
20
18
0
11 Mar 2022
HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both
  Language and Vision-and-Language Tasks
HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks
Zhengkun Zhang
Wenya Guo
Xiaojun Meng
Yasheng Wang
Yadao Wang
Xin Jiang
Qun Liu
Zhenglu Yang
26
15
0
08 Mar 2022
VLP: A Survey on Vision-Language Pre-training
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
212
0
18 Feb 2022
Personalized Prompt Learning for Explainable Recommendation
Personalized Prompt Learning for Explainable Recommendation
Lei Li
Yongfeng Zhang
Li Chen
LRM
17
164
0
15 Feb 2022
Previous
123...10119
Next