2211.09623
Cross-Modal Adapter for Vision-Language Retrieval
Pattern Recognition (Pattern Recogn.), 2022
17 November 2022
Haojun Jiang
Jianke Zhang
Rui Huang
Chunjiang Ge
Zanlin Ni
Jiwen Lu
Gao Huang
Github (55★)
Papers citing
"Cross-Modal Adapter for Vision-Language Retrieval"
31 / 31 papers shown
Repeating Words for Video-Language Retrieval with Coarse-to-Fine Objectives
Haoyu Zhao
Jiaxi Gu
Shicong Wang
Xing Zhang
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
198
0
0
20 Aug 2025
pFedMMA: Personalized Federated Fine-Tuning with Multi-Modal Adapter for Vision-Language Models
Sajjad Ghiasvand
Mahnoosh Alizadeh
Ramtin Pedarsani
VLM
382
1
0
07 Jul 2025
Representation Discrepancy Bridging Method for Remote Sensing Image-Text Retrieval
Hailong Ning
Siying Wang
Tao Lei
Xiaopeng Cao
Huanmin Dou
Bin Zhao
Asoke K. Nandi
Petia Radeva
198
3
0
22 May 2025
UP-Person: Unified Parameter-Efficient Transfer Learning for Text-based Person Retrieval
Yating Liu
Yaowei Li
Xiangyuan Lan
Wenming Yang
Zimo Liu
Q. Liao
313
4
0
14 Apr 2025
A Resource-Efficient Training Framework for Remote Sensing Text-Image Retrieval
Weihang Zhang
Jihao Li
Shuoke Li
Ziqing Niu
Jialiang Chen
Wenkai Zhang
VLM
295
1
0
18 Jan 2025
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Computer Vision and Pattern Recognition (CVPR), 2024
Claudia Cuttano
Gabriele Trivigno
Gabriele Rosi
Carlo Masone
Giuseppe Averta
VOS
581
38
0
26 Nov 2024
SPECTRUM: Semantic Processing and Emotion-informed video-Captioning Through Retrieval and Understanding Modalities
Ehsan Faghihi
Mohammedreza Zarenejad
Ali-Asghar Beheshti Shirazi
299
2
0
04 Nov 2024
Beyond Coarse-Grained Matching in Video-Text Retrieval
Asian Conference on Computer Vision (ACCV), 2024
Aozhu Chen
Hazel Doughty
Xirong Li
Cees G. M. Snoek
330
2
0
16 Oct 2024
Deep Transfer Learning: Model Framework and Error Analysis
Yuling Jiao
Huazhen Lin
Yuchen Luo
Jerry Zhijian Yang
518
2
0
12 Oct 2024
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Ting Liu
Zunnan Xu
Yue Hu
Liangtao Shi
Zhiqiang Wang
Quanjun Yin
667
8
0
20 Sep 2024
OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities
Hanane Azzag
M. Lebbah
ObjD
383
3
0
17 Sep 2024
Selective Vision-Language Subspace Projection for Few-shot CLIP
Xingyu Zhu
Beier Zhu
Yi Tan
Shuo Wang
Yanbin Hao
Haiqi Zhang
VLM
265
23
0
24 Jul 2024
Structure-aware World Model for Probe Guidance via Large-scale Self-supervised Pre-train
Haojun Jiang
Meng Li
Zhenguo Sun
Ning Jia
Yu Sun
Shaqi Luo
Shiji Song
Gao Huang
334
6
0
28 Jun 2024
Cardiac Copilot: Automatic Probe Guidance for Echocardiography with World Model
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024
Haojun Jiang
Zhenguo Sun
Ning Jia
Meng Li
Yu Sun
Shaqi Luo
Shiji Song
Gao Huang
235
17
0
19 Jun 2024
Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Thong Nguyen
Yi Bin
Junbin Xiao
Leigang Qu
Yicong Li
Jay Zhangjie Wu
Cong-Duy Nguyen
See-Kiong Ng
Luu Anh Tuan
VLM
648
39
1
09 Jun 2024
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter
Meng Cao
Haoran Tang
Jinfa Huang
Peng Jin
Can Zhang
Ruyang Liu
Long Chen
Xiaodan Liang
Li-ming Yuan
Ge Li
340
23
0
29 May 2024
CLIP model is an Efficient Online Lifelong Learner
Leyuan Wang
Liuyu Xiang
Yujie Wei
Yunlong Wang
Zhaofeng He
VLM
CLL
294
4
0
24 May 2024
DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
IEEE International Conference on Multimedia and Expo (ICME), 2024
Ting Liu
Xuyang Liu
Siteng Huang
Honggang Chen
Quanjun Yin
Long Qin
Donglin Wang
Yue Hu
321
13
0
10 May 2024
Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment
Tengjun Huang
439
11
0
28 Apr 2024
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
Xiangpeng Yang
Linchao Zhu
Xiaohan Wang
Yi Yang
VLM
347
52
0
19 Jan 2024
FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos
S. DarshanSingh
Zeeshan Khan
Makarand Tapaswi
VLM
CLIP
257
6
0
15 Jan 2024
Few-shot Adaptation of Multi-modal Foundation Models: A Survey
Artificial Intelligence Review (Artif Intell Rev), 2024
Fan Liu
Tianshu Zhang
Wenwen Dai
Wenwen Cai
Xiaocong Zhou
Delong Chen
VLM
OffRL
377
58
0
03 Jan 2024
READ-PVLA: Recurrent Adapter with Partial Video-Language Alignment for Parameter-Efficient Transfer Learning in Low-Resource Video-Language Modeling
AAAI Conference on Artificial Intelligence (AAAI), 2023
Thong Nguyen
Xiaobao Wu
Xinshuai Dong
Khoi M. Le
Zhiyuan Hu
Cong-Duy Nguyen
See-Kiong Ng
Anh Tuan Luu
247
2
0
12 Dec 2023
RGNet: A Unified Clip Retrieval and Grounding Network for Long Videos
Tanveer Hannan
Md. Mohaiminul Islam
Thomas Seidl
Gedas Bertasius
574
13
0
11 Dec 2023
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
Huanjin Yao
Wenhao Wu
Zhiheng Li
VLM
358
16
0
27 Nov 2023
Unified Coarse-to-Fine Alignment for Video-Text Retrieval
IEEE International Conference on Computer Vision (ICCV), 2023
Ziyang Wang
Yi-Lin Sung
Feng Cheng
Gedas Bertasius
Joey Tianyi Zhou
470
89
0
18 Sep 2023
Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Yuan Yuan
Yangfan Zhan
Zhitong Xiong
VLM
290
70
0
24 Aug 2023
Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval
IEEE International Conference on Computer Vision (ICCV), 2023
Chaorui Deng
Qi Chen
Pengda Qin
Dave Zhenyu Chen
Qi Wu
VLM
CLIP
294
49
0
15 Aug 2023
TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter
Binjie Zhang
Yixiao Ge
Xuyuan Xu
Ying Shan
Mike Zheng Shou
239
9
0
22 Jun 2023
Visual Tuning
ACM Computing Surveys (ACM Comput. Surv.), 2023
Bruce X. B. Yu
Jianlong Chang
Haixin Wang
Lin Liu
Shijie Wang
...
Lingxi Xie
Haojie Li
Zhouchen Lin
Qi Tian
Chang Wen Chen
VLM
537
64
0
10 May 2023
Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Siteng Huang
Biao Gong
Yutong Feng
Min Zhang
Yiliang Lv
Xuetao Zhang
CoGe
226
40
0
27 Mar 2023