Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

3 March 2023
Renrui Zhang
Xiangfei Hu
Bohao Li
Siyuan Huang
Hanqiu Deng
Hongsheng Li
Yu Qiao
Peng Gao
    VLM
    MLLM

Papers citing "Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners"

50 / 145 papers shown
Title
On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning?
Maxime Zanella
Ismail Ben Ayed
VLM
MLLM
35
22
0
03 May 2024
Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification
Siqi Yin
Lifan Jiang
19
0
0
03 May 2024
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Yifei Ming
Yixuan Li
VLM
23
7
0
02 May 2024
Large Language Model Informed Patent Image Retrieval
Hao-Cheng Lo
Jung-Mei Chu
Jieh Hsiang
Chun-Chieh Cho
VLM
15
2
0
30 Apr 2024
Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection
Jiaqi Zhu
Shaofeng Cai
Fang Deng
Junran Wu
47
15
0
15 Apr 2024
AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning
Yuwei Tang
Zhenyi Lin
Qilong Wang
Pengfei Zhu
Qinghua Hu
26
11
0
13 Apr 2024
PM2: A New Prompting Multi-modal Model Paradigm for Few-shot Medical Image Classification
Zhenwei Wang
Qiule Sun
Bingbing Zhang
Pengfei Wang
Jianxin Zhang
Qiang Zhang
VLM
38
1
0
13 Apr 2024
Label Propagation for Zero-shot Classification with Vision-Language Models
Vladan Stojnić
Yannis Kalantidis
Giorgos Tolias
VLM
25
8
0
05 Apr 2024
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation
Xiangyang Zhu
Renrui Zhang
Bowei He
Ziyu Guo
Jiaming Liu
Han Xiao
Chaoyou Fu
Hao Dong
Peng Gao
3DPC
31
8
0
05 Apr 2024
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao
Ameya Prabhu
Adhiraj Ghosh
Yash Sharma
Philip H. S. Torr
Adel Bibi
Samuel Albanie
Matthias Bethge
VLM
118
43
0
04 Apr 2024
Bayesian Exploration of Pre-trained Models for Low-shot Image Classification
Yibo Miao
Yu Lei
Feng Zhou
Zhijie Deng
VLM
UQCV
BDL
38
1
0
30 Mar 2024
Efficient Test-Time Adaptation of Vision-Language Models
Adilbek Karmanov
Dayan Guan
Shijian Lu
Abdulmotaleb El Saddik
Eric P. Xing
TTA
VLM
14
37
0
27 Mar 2024
Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
Yabin Zhang
Wen-Qing Zhu
Hui Tang
Zhiyuan Ma
Kaiyang Zhou
Lei Zhang
VLM
29
21
0
26 Mar 2024
Efficient Data Access Paths for Mixed Vector-Relational Search
Viktor Sanca
Anastasia Ailamaki
16
0
0
23 Mar 2024
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang
Dongzhi Jiang
Yichi Zhang
Haokun Lin
Ziyu Guo
...
Aojun Zhou
Pan Lu
Kai-Wei Chang
Peng Gao
Hongsheng Li
27
165
0
21 Mar 2024
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric
Haokun Lin
Haoli Bai
Zhili Liu
Lu Hou
Muyi Sun
Linqi Song
Ying Wei
Zhenan Sun
CLIP
VLM
50
13
0
12 Mar 2024
RESTORE: Towards Feature Shift for Vision-Language Prompt Learning
Yuncheng Yang
Chuyan Zhang
Zuopeng Yang
Yuting Gao
Yulei Qin
Ke Li
Xing Sun
Jie-jin Yang
Yun Gu
VLM
VPVLM
44
0
0
10 Mar 2024
Test-time Distribution Learning Adapter for Cross-modal Visual Reasoning
Yi Zhang
Ce Zhang
VLM
28
1
0
10 Mar 2024
Few-shot Learner Parameterization by Diffusion Time-steps
Zhongqi Yue
Pan Zhou
Richang Hong
Hanwang Zhang
Qianru Sun
23
11
0
05 Mar 2024
Percept, Chat, and then Adapt: Multimodal Knowledge Transfer of Foundation Models for Open-World Video Recognition
Boyu Chen
Siran Chen
Kunchang Li
Qinglin Xu
Yu Qiao
Yali Wang
28
3
0
29 Feb 2024
Parameter-efficient Prompt Learning for 3D Point Cloud Understanding
Hongyu Sun
Yongcai Wang
Wang Chen
Haoran Deng
Deying Li
VPVLM
39
5
0
24 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
123
106
0
08 Feb 2024
LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors
Sheng Jin
Xue-Qiu Jiang
Jiaxing Huang
Lewei Lu
Shijian Lu
VLM
ObjD
16
21
0
07 Feb 2024
Image-Caption Encoding for Improving Zero-Shot Generalization
Eric Yang Yu
Christopher Liao
Sathvik Ravi
Theodoros Tsiligkaridis
Brian Kulis
OODD
VLM
11
0
0
05 Feb 2024
Toward Robust Multimodal Learning using Multimodal Foundational Models
Xianbing Zhao
Soujanya Poria
Xuejiao Li
Yixin Chen
Buzhou Tang
VLM
32
2
0
20 Jan 2024
CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs
Daoan Zhang
Junming Yang
Hanjia Lyu
Zijian Jin
Yuan Yao
Mingkai Chen
Jiebo Luo
21
33
0
05 Jan 2024
A Survey on Open-Set Image Recognition
Jiaying Sun
Qiulei Dong
BDL
ObjD
30
3
0
25 Dec 2023
FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection
Dongmei Zhang
Chang Li
Ray Zhang
Shenghao Xie
Wei Xue
Xiaodong Xie
Shanghang Zhang
VLM
17
13
0
22 Dec 2023
LaViP: Language-Grounded Visual Prompts
Nilakshan Kunananthaseelan
Jing Zhang
Mehrtash Harandi
VLM
8
0
0
18 Dec 2023
Gradient-based Parameter Selection for Efficient Fine-Tuning
Zhi Zhang
Qizhe Zhang
Zijun Gao
Renrui Zhang
Ekaterina Shutova
Shiji Zhou
Shanghang Zhang
15
15
0
15 Dec 2023
Compound Text-Guided Prompt Tuning via Image-Adaptive Cues
Hao Tan
Jun Li
Yi Zhou
Jun Wan
Zhen Lei
Xiangyu Zhang
VLM
30
4
0
11 Dec 2023
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models
Yubin Wang
Xinyang Jiang
De Cheng
Dongsheng Li
Cairong Zhao
VLM
33
15
0
11 Dec 2023
SYNC-CLIP: Synthetic Data Make CLIP Generalize Better in Data-Limited Scenarios
Mushui Liu
Weijie He
Ziqian Lu
Yunlong Yu
VLM
16
1
0
06 Dec 2023
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
B. Ke
Anton Obukhov
Shengyu Huang
Nando Metzger
Rodrigo Caye Daudt
Konrad Schindler
VLM
MDE
21
141
0
04 Dec 2023
MV-CLIP: Multi-View CLIP for Zero-shot 3D Shape Recognition
Dan Song
Xinwei Fu
Weizhi Nie
Wenhui Li
Lanjun Wang
You Yang
Anan Liu
VLM
19
6
0
30 Nov 2023
GELDA: A generative language annotation framework to reveal visual biases in datasets
Krish Kabra
Kathleen M. Lewis
Guha Balakrishnan
VLM
11
1
0
29 Nov 2023
Inferring Latent Class Statistics from Text for Robust Visual Few-Shot Learning
Yassir Bendou
Vincent Gripon
Bastien Pasdeloup
G. Lioi
Lukas Mauch
Fabien Cardinaux
G. B. Hacene
13
0
0
24 Nov 2023
A Survey on Multimodal Large Language Models for Autonomous Driving
Can Cui
Yunsheng Ma
Xu Cao
Wenqian Ye
Yang Zhou
...
Xinrui Yan
Shuqi Mei
Jianguo Cao
Ziran Wang
Chao Zheng
36
248
0
21 Nov 2023
From Categories to Classifier: Name-Only Continual Learning by Exploring the Web
Ameya Prabhu
Hasan Hammoud
Ser-Nam Lim
Bernard Ghanem
Philip H. S. Torr
Adel Bibi
CLL
116
9
0
19 Nov 2023
Domain Aligned CLIP for Few-shot Classification
Muhammad Waleed Gondal
Jochen Gast
Inigo Alonso Ruiz
Richard Droste
Tommaso Macri
Suren Kumar
Luitpold Staudigl
VLM
11
11
0
15 Nov 2023
Language Semantic Graph Guided Data-Efficient Learning
Wenxuan Ma
Shuang Li
Lincan Cai
Jingxuan Kang
19
4
0
15 Nov 2023
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models
Ziyi Lin
Chris Liu
Renrui Zhang
Peng Gao
Longtian Qiu
...
Siyuan Huang
Yichi Zhang
Xuming He
Hongsheng Li
Yu Qiao
MLLM
VLM
11
206
0
13 Nov 2023
Follow-Up Differential Descriptions: Language Models Resolve Ambiguities for Image Classification
Reza Esfandiarpoor
Stephen H. Bach
VLM
16
13
0
10 Nov 2023
Meta-Adapter: An Online Few-shot Learner for Vision-Language Model
Cheng Cheng
Lin Song
Ruoyi Xue
Hang Wang
Hongbin Sun
Yixiao Ge
Ying Shan
VLM
ObjD
27
18
0
07 Nov 2023
Woodpecker: Hallucination Correction for Multimodal Large Language Models
Shukang Yin
Chaoyou Fu
Sirui Zhao
Tong Bill Xu
Hao Wang
Dianbo Sui
Yunhang Shen
Ke Li
Xingguo Sun
Enhong Chen
VLM
MLLM
30
112
0
24 Oct 2023
Large Language Models can Share Images, Too!
Young-Jun Lee
Dokyong Lee
Joo Won Sung
Jonghwan Hyeon
Ho-Jin Choi
MLLM
24
2
0
23 Oct 2023
Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters
Gyuseong Lee
Wooseok Jang
Jin Hyeon Kim
Jaewoo Jung
Seungryong Kim
MoE
OOD
17
2
0
17 Oct 2023
GPT-Prompt Controlled Diffusion for Weakly-Supervised Semantic Segmentation
Wangyu Wu
Tianhong Dai
Xiaowei Huang
Fei Ma
Jimin Xiao
DiffM
13
1
0
15 Oct 2023
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
Yiwen Tang
Ivan Tang
Ray Gu
Dong Wang
Eric Zhang
Bin Zhao
Xuelong Li
3DPC
24
19
0
04 Oct 2023
Noise-Tolerant Unsupervised Adapter for Vision-Language Models
Eman Ali
Dayan Guan
Muhammad Haris Khan
Abdulmotaleb El Saddik
VLM
11
0
0
26 Sep 2023