ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2110.04544
  4. Cited By
CLIP-Adapter: Better Vision-Language Models with Feature Adapters

CLIP-Adapter: Better Vision-Language Models with Feature Adapters

9 October 2021
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
    VLM
    CLIP
ArXivPDFHTML

Papers citing "CLIP-Adapter: Better Vision-Language Models with Feature Adapters"

50 / 635 papers shown
Title
Defense-Prefix for Preventing Typographic Attacks on CLIP
Defense-Prefix for Preventing Typographic Attacks on CLIP
Hiroki Azuma
Yusuke Matsui
VLM
AAML
11
16
0
10 Apr 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim
Muzammal Naseer
Salman Khan
F. Khan
M. Shah
VLM
VPVLM
19
73
0
06 Apr 2023
Object-centric Inference for Language Conditioned Placement: A
  Foundation Model based Approach
Object-centric Inference for Language Conditioned Placement: A Foundation Model based Approach
Zhi-Wei Xu
Kechun Xu
Yue Wang
R. Xiong
OCL
8
4
0
06 Apr 2023
Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior
  Refinement
Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement
Xiang-yu Zhu
Renrui Zhang
Bowei He
A-Long Zhou
Dong Wang
Bingyan Zhao
Peng Gao
VLM
27
76
0
03 Apr 2023
Vision-Language Models for Vision Tasks: A Survey
Vision-Language Models for Vision Tasks: A Survey
Jingyi Zhang
Jiaxing Huang
Sheng Jin
Shijian Lu
VLM
34
451
0
03 Apr 2023
AutoAD: Movie Description in Context
AutoAD: Movie Description in Context
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
14
34
0
29 Mar 2023
ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with
  GPT and Prototype Guidance
ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance
Zoey Guo
Yiwen Tang
Renrui Zhang
Dong Wang
Zhigang Wang
Bin Zhao
Xuelong Li
23
53
0
29 Mar 2023
Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation
Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation
Julio Silva-Rodríguez
Jose Dolz
Ismail Ben Ayed
51
12
0
29 Mar 2023
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init
  Attention
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
Renrui Zhang
Jiaming Han
Chris Liu
Peng Gao
Aojun Zhou
Xiangfei Hu
Shilin Yan
Pan Lu
Hongsheng Li
Yu Qiao
MLLM
23
736
0
28 Mar 2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection with
  Vision-Language Models
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Sha Ning
Longtian Qiu
Yongfei Liu
Xuming He
VLM
16
41
0
28 Mar 2023
Revisiting Multimodal Representation in Contrastive Learning: From Patch
  and Token Embeddings to Finite Discrete Tokens
Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens
Yuxiao Chen
Jianbo Yuan
Yu Tian
Shijie Geng
Xinyu Li
Ding Zhou
Dimitris N. Metaxas
Hongxia Yang
12
30
0
27 Mar 2023
BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
Changdae Oh
Hyeji Hwang
Hee-young Lee
Yongtaek Lim
Geunyoung Jung
Jiyoung Jung
Hosik Choi
Kyungwoo Song
VLM
VPVLM
75
54
0
26 Mar 2023
VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic
  Scene Graph Prediction in Point Cloud
VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
Ziqin Wang
Bowen Cheng
Lichen Zhao
Dong Xu
Yang Tang
Lu Sheng
3DPC
16
27
0
25 Mar 2023
Prompt Tuning based Adapter for Vision-Language Model Adaption
Prompt Tuning based Adapter for Vision-Language Model Adaption
Jingchen Sun
Jiayu Qin
Zihao Lin
Changyou Chen
VPVLM
MLLM
VLM
26
5
0
24 Mar 2023
CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained
  or Not
CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not
Aneeshan Sain
A. Bhunia
Pinaki Nath Chowdhury
Subhadeep Koley
Tao Xiang
Yi-Zhe Song
VLM
26
73
0
23 Mar 2023
Visual-Language Prompt Tuning with Knowledge-guided Context Optimization
Visual-Language Prompt Tuning with Knowledge-guided Context Optimization
Hantao Yao
Rui Zhang
Changsheng Xu
VLM
VPVLM
122
193
0
23 Mar 2023
CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting
  and Anchor Pre-Matching
CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching
Xiaoshi Wu
Feng Zhu
Rui Zhao
Hongsheng Li
VLM
15
117
0
23 Mar 2023
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation
Seokju Cho
Heeseong Shin
Sung‐Jin Hong
Anurag Arnab
Paul Hongsuck Seo
Seung Wook Kim
VLM
19
103
0
21 Mar 2023
Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models
Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models
Xinyang Liu
Dongsheng Wang
Bowei Fang
Miaoge Li
Zhibin Duan
Yishi Xu
Bo Chen
Mingyuan Zhou
VLM
VPVLM
13
5
0
16 Mar 2023
Parameter is Not All You Need: Starting from Non-Parametric Networks for
  3D Point Cloud Analysis
Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis
Renrui Zhang
Liuhui Wang
Ziyu Guo
Yali Wang
Peng Gao
Hongsheng Li
Jianbo Shi
3DPC
12
50
0
14 Mar 2023
ViM: Vision Middleware for Unified Downstream Transferring
ViM: Vision Middleware for Unified Downstream Transferring
Yutong Feng
Biao Gong
Jianwen Jiang
Yiliang Lv
Yujun Shen
Deli Zhao
Jingren Zhou
6
1
0
13 Mar 2023
Preventing Zero-Shot Transfer Degradation in Continual Learning of
  Vision-Language Models
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
Zangwei Zheng
Mingyu Ma
Kai Wang
Ziheng Qin
Xiangyu Yue
Yang You
CLL
VLM
91
67
0
12 Mar 2023
Understanding and Constructing Latent Modality Structures in Multi-modal
  Representation Learning
Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning
Qian Jiang
Changyou Chen
Han Zhao
Liqun Chen
Q. Ping
S. D. Tran
Yi Xu
Belinda Zeng
Trishul M. Chilimbi
41
36
0
10 Mar 2023
Iterative Few-shot Semantic Segmentation from Image Label Text
Iterative Few-shot Semantic Segmentation from Image Label Text
Haohan Wang
L. Liu
Wuhao Zhang
Jiangning Zhang
Zhenye Gan
Yabiao Wang
Chengjie Wang
Haoqian Wang
VLM
6
16
0
10 Mar 2023
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set
  Object Detection
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu
Zhaoyang Zeng
Tianhe Ren
Feng Li
Hao Zhang
...
Chun-yue Li
Jianwei Yang
Hang Su
Jun Zhu
Lei Zhang
ObjD
14
1,797
0
09 Mar 2023
HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware
  Attention
HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention
Shijie Geng
Jianbo Yuan
Yu Tian
Yuxiao Chen
Yongfeng Zhang
CLIP
VLM
41
44
0
06 Mar 2023
CLIP-guided Prototype Modulating for Few-shot Action Recognition
CLIP-guided Prototype Modulating for Few-shot Action Recognition
Xiang Wang
Shiwei Zhang
Jun Cen
Changxin Gao
Yingya Zhang
Deli Zhao
Nong Sang
VLM
6
52
0
06 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
153
213
0
03 Mar 2023
Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong
  Few-shot Learners
Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Renrui Zhang
Xiangfei Hu
Bohao Li
Siyuan Huang
Hanqiu Deng
Hongsheng Li
Yu Qiao
Peng Gao
VLM
MLLM
13
167
0
03 Mar 2023
Visual Exemplar Driven Task-Prompting for Unified Perception in
  Autonomous Driving
Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
Xiwen Liang
Minzhe Niu
Jianhua Han
Hang Xu
Chunjing Xu
Xiaodan Liang
VLM
13
13
0
03 Mar 2023
Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis
Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis
Renrui Zhang
Liuhui Wang
Ziyu Guo
Jianbo Shi
3DPC
32
10
0
01 Mar 2023
Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud
  Pre-training
Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training
Ziyu Guo
Renrui Zhang
Longtian Qiu
Xianzhi Li
Pheng-Ann Heng
3DPC
18
52
0
27 Feb 2023
LMSeg: Language-guided Multi-dataset Segmentation
LMSeg: Language-guided Multi-dataset Segmentation
Qiang-feng Zhou
Yuang Liu
Chaohui Yu
Jingliang Li
Zhibin Wang
Fan Wang
VLM
8
18
0
27 Feb 2023
Modular Deep Learning
Modular Deep Learning
Jonas Pfeiffer
Sebastian Ruder
Ivan Vulić
E. Ponti
MoMe
OOD
16
73
0
22 Feb 2023
StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot
  Learning
StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning
Yu Fu
Yu Xie
Yanwei Fu
Yugang Jiang
18
31
0
18 Feb 2023
StyLIP: Multi-Scale Style-Conditioned Prompt Learning for CLIP-based
  Domain Generalization
StyLIP: Multi-Scale Style-Conditioned Prompt Learning for CLIP-based Domain Generalization
Shirsha Bose
Ankit Jha
Enrico Fini
Mainak Singha
Elisa Ricci
Biplab Banerjee
VLM
18
22
0
18 Feb 2023
PerAda: Parameter-Efficient Federated Learning Personalization with
  Generalization Guarantees
PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees
Chulin Xie
De-An Huang
Wen-Hsuan Chu
Daguang Xu
Chaowei Xiao
Bo-wen Li
Anima Anandkumar
FedML
6
10
0
13 Feb 2023
Distinguishability Calibration to In-Context Learning
Distinguishability Calibration to In-Context Learning
Hongjing Li
Hanqi Yan
Yanran Li
Li Qian
Yulan He
Lin Gui
19
2
0
13 Feb 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language
  Navigation
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe
LM&Ro
29
3
0
13 Feb 2023
Adding Conditional Control to Text-to-Image Diffusion Models
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
14
3,843
1
10 Feb 2023
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
Zachary Novack
Julian McAuley
Zachary Chase Lipton
Saurabh Garg
VLM
13
78
0
06 Feb 2023
Multi-View Masked World Models for Visual Robotic Manipulation
Multi-View Masked World Models for Visual Robotic Manipulation
Younggyo Seo
Junsup Kim
Stephen James
Kimin Lee
Jinwoo Shin
Pieter Abbeel
VGen
10
54
0
05 Feb 2023
CLIPood: Generalizing CLIP to Out-of-Distributions
CLIPood: Generalizing CLIP to Out-of-Distributions
Yang Shu
Xingzhuo Guo
Jialong Wu
Ximei Wang
Jianmin Wang
Mingsheng Long
OODD
VLM
39
74
0
02 Feb 2023
Debiasing Vision-Language Models via Biased Prompts
Debiasing Vision-Language Models via Biased Prompts
Ching-Yao Chuang
Varun Jampani
Yuanzhen Li
Antonio Torralba
Stefanie Jegelka
VLM
12
95
0
31 Jan 2023
Debiased Fine-Tuning for Vision-language Models by Prompt Regularization
Debiased Fine-Tuning for Vision-language Models by Prompt Regularization
Beier Zhu
Yulei Niu
Saeil Lee
Minhoe Hur
Hanwang Zhang
VLM
VPVLM
19
22
0
29 Jan 2023
ZegOT: Zero-shot Segmentation Through Optimal Transport of Text Prompts
ZegOT: Zero-shot Segmentation Through Optimal Transport of Text Prompts
Kwanyoung Kim
Y. Oh
Jong Chul Ye
VLM
OT
CLIP
17
18
0
28 Jan 2023
Projected Subnetworks Scale Adaptation
Projected Subnetworks Scale Adaptation
Siddhartha Datta
N. Shadbolt
VLM
CLL
8
0
0
27 Jan 2023
Joint Representation Learning for Text and 3D Point Cloud
Joint Representation Learning for Text and 3D Point Cloud
Rui Huang
Xuran Pan
Henry Zheng
Haojun Jiang
Zhifeng Xie
S. Song
Gao Huang
11
19
0
18 Jan 2023
Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with
  Multimodal Models
Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models
Zhiqiu Lin
Samuel Yu
Zhiyi Kuang
Deepak Pathak
Deva Ramana
VLM
13
88
0
16 Jan 2023
Logically at Factify 2: A Multi-Modal Fact Checking System Based on
  Evidence Retrieval techniques and Transformer Encoder Architecture
Logically at Factify 2: A Multi-Modal Fact Checking System Based on Evidence Retrieval techniques and Transformer Encoder Architecture
P. Verschuuren
Jie Gao
A. V. Eeden
Stylianos Oikonomou
Anil Bandhakavi
8
2
0
09 Jan 2023
Previous
123...10111213
Next