ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2401.01736
  4. Cited By
Few-shot Adaptation of Multi-modal Foundation Models: A Survey

Few-shot Adaptation of Multi-modal Foundation Models: A Survey

3 January 2024
Fan Liu
Tianshu Zhang
Wenwen Dai
Wenwen Cai
Wenwen Cai Xiaocong Zhou
Delong Chen
    VLM
    OffRL
ArXivPDFHTML

Papers citing "Few-shot Adaptation of Multi-modal Foundation Models: A Survey"

18 / 18 papers shown
Title
Mitigating Cache Noise in Test-Time Adaptation for Large Vision-Language Models
Mitigating Cache Noise in Test-Time Adaptation for Large Vision-Language Models
Haotian Zhai
Xinyu Chen
Can Zhang
Tianming Sha
Ruirui Li
BDL
VLM
45
0
0
24 Mar 2025
Visual-Language Prompt Tuning with Knowledge-guided Context Optimization
Visual-Language Prompt Tuning with Knowledge-guided Context Optimization
Hantao Yao
Rui Zhang
Changsheng Xu
VLM
VPVLM
122
193
0
23 Mar 2023
Gradient-Regulated Meta-Prompt Learning for Generalizable
  Vision-Language Models
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models
Juncheng Li
Minghe Gao
Longhui Wei
Siliang Tang
Wenqiao Zhang
Meng Li
Wei Ji
Qi Tian
Tat-Seng Chua
Yueting Zhuang
VLM
VPVLM
22
18
0
12 Mar 2023
MaPLe: Multi-modal Prompt Learning
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
178
521
0
06 Oct 2022
CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention
CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention
Ziyu Guo
Renrui Zhang
Longtian Qiu
Xianzheng Ma
Xupeng Miao
Xuming He
Bin Cui
VLM
AAML
52
108
0
28 Sep 2022
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language
  Models
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Manli Shu
Weili Nie
De-An Huang
Zhiding Yu
Tom Goldstein
Anima Anandkumar
Chaowei Xiao
VLM
VPVLM
152
278
0
15 Sep 2022
What does a platypus look like? Generating customized prompts for
  zero-shot image classification
What does a platypus look like? Generating customized prompts for zero-shot image classification
Sarah M Pratt
Ian Covert
Rosanne Liu
Ali Farhadi
VLM
116
211
0
07 Sep 2022
Prompt-aligned Gradient for Prompt Tuning
Prompt-aligned Gradient for Prompt Tuning
Beier Zhu
Yulei Niu
Yucheng Han
Yuehua Wu
Hanwang Zhang
VLM
167
263
0
30 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
7,337
0
11 Nov 2021
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language
  Modeling
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
170
281
0
06 Nov 2021
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
Andreas Fürst
Elisabeth Rumetshofer
Johannes Lehner
Viet-Hung Tran
Fei Tang
...
David P. Kreil
Michael K Kopp
G. Klambauer
Angela Bitto-Nemling
Sepp Hochreiter
VLM
CLIP
179
101
0
21 Oct 2021
EfficientCLIP: Efficient Cross-Modal Pre-training by Ensemble Confident
  Learning and Language Modeling
EfficientCLIP: Efficient Cross-Modal Pre-training by Ensemble Confident Learning and Language Modeling
Jue Wang
Haofan Wang
Jincan Deng
Weijia Wu
Debing Zhang
VLM
CLIP
48
18
0
10 Sep 2021
Learning to Prompt for Vision-Language Models
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
319
2,108
0
02 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
275
3,784
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
2,875
0
11 Feb 2021
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense
  Generation
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
Yiran Xing
Z. Shi
Zhao Meng
Gerhard Lakemeyer
Yunpu Ma
Roger Wattenhofer
VLM
57
40
0
02 Jan 2021
K-BERT: Enabling Language Representation with Knowledge Graph
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
215
773
0
17 Sep 2019
1