ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2112.06825
  4. Cited By
VL-Adapter: Parameter-Efficient Transfer Learning for
  Vision-and-Language Tasks

VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks

13 December 2021
Yi-Lin Sung
Jaemin Cho
Mohit Bansal
    VLM
    VPVLM
ArXivPDFHTML

Papers citing "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks"

50 / 51 papers shown
Title
Saliency-Motion Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation
Saliency-Motion Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation
Xiangyu Zheng
Wanyun Li
Songcheng He
Xiaoqiang Li
Xiaoqiang Li
We Zhang
VOS
25
0
0
08 Apr 2025
Vision-Language Models for Edge Networks: A Comprehensive Survey
Vision-Language Models for Edge Networks: A Comprehensive Survey
Ahmed Sharshar
Latif U. Khan
Waseem Ullah
Mohsen Guizani
VLM
62
3
0
11 Feb 2025
DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations
DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations
Krishna Sri Ipsit Mantri
Carola-Bibiane Schönlieb
Bruno Ribeiro
Chaim Baskin
Moshe Eliasof
36
0
0
09 Feb 2025
Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learning
Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learning
Hanwen Zhong
Jiaxin Chen
Yutong Zhang
Di Huang
Yunhong Wang
MoE
42
0
0
12 Jan 2025
Collective Model Intelligence Requires Compatible Specialization
Collective Model Intelligence Requires Compatible Specialization
Jyothish Pari
Samy Jelassi
Pulkit Agrawal
MoMe
30
1
0
04 Nov 2024
Situational Scene Graph for Structured Human-centric Situation Understanding
Situational Scene Graph for Structured Human-centric Situation Understanding
Chinthani Sugandhika
Chen Li
Deepu Rajan
Basura Fernando
60
1
0
30 Oct 2024
TransAgent: Transfer Vision-Language Foundation Models with
  Heterogeneous Agent Collaboration
TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration
Yiwei Guo
Shaobin Zhuang
Kunchang Li
Yu Qiao
Yali Wang
VLM
CLIP
21
0
0
16 Oct 2024
DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object
  Detection
DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection
H. Li
Rui Zhang
Hantao Yao
X. Zhang
Yifan Hao
Xinkai Song
Xiaqing Li
Yongwei Zhao
Ling Li
Yunji Chen
ObjD
VLM
29
3
0
11 Oct 2024
CROME: Cross-Modal Adapters for Efficient Multimodal LLM
CROME: Cross-Modal Adapters for Efficient Multimodal LLM
Sayna Ebrahimi
Sercan Ö. Arik
Tejas Nama
Tomas Pfister
37
1
0
13 Aug 2024
Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos
Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos
Dhruv Verma
Debaditya Roy
Basura Fernando
27
1
0
30 Jul 2024
Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation
Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation
Tz-Ying Wu
Kyle Min
Subarna Tripathi
Nuno Vasconcelos
EgoV
53
0
0
28 Jul 2024
LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
Zhengbo Wang
Jian Liang
Ran He
Zilei Wang
Tieniu Tan
50
15
0
25 Jul 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao
Jie Ou
Lei Wang
Fanhua Shang
Jaji Wu
MQ
37
0
0
22 Jul 2024
$\textit{Trans-LoRA}$: towards data-free Transferable Parameter
  Efficient Finetuning
Trans-LoRA\textit{Trans-LoRA}Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
Runqian Wang
Soumya Ghosh
David D. Cox
Diego Antognini
Aude Oliva
Rogerio Feris
Leonid Karlinsky
30
1
0
27 May 2024
Improving Continuous Sign Language Recognition with Adapted Image Models
Improving Continuous Sign Language Recognition with Adapted Image Models
Lianyu Hu
Tongkai Shi
Liqing Gao
Zekang Liu
Wei Feng
VLM
18
5
0
12 Apr 2024
Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic
  Segmentation
Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation
Sina Hajimiri
Ismail Ben Ayed
Jose Dolz
VLM
31
22
0
12 Apr 2024
MmAP : Multi-modal Alignment Prompt for Cross-domain Multi-task Learning
MmAP : Multi-modal Alignment Prompt for Cross-domain Multi-task Learning
Yi Xin
Junlong Du
Qiang Wang
Ke Yan
Shouhong Ding
VLM
32
45
0
14 Dec 2023
End-to-End Temporal Action Detection with 1B Parameters Across 1000
  Frames
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu
Chen-Da Liu-Zhang
Chen Zhao
Bernard Ghanem
24
25
0
28 Nov 2023
Decomposed Prompt Tuning via Low-Rank Reparameterization
Decomposed Prompt Tuning via Low-Rank Reparameterization
Yao Xiao
Lu Xu
Jiaxi Li
Wei Lu
Xiaoli Li
VLM
13
6
0
16 Oct 2023
Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
Bipin Rajendran
Bashir M. Al-Hashimi
MLLM
VLM
26
2
0
27 Sep 2023
BLIP-Adapter: Parameter-Efficient Transfer Learning for Mobile
  Screenshot Captioning
BLIP-Adapter: Parameter-Efficient Transfer Learning for Mobile Screenshot Captioning
Ching-Yu Chiang
I-Hua Chang
Shih-Wei Liao
38
1
0
26 Sep 2023
Test-Time Training for Speech
Test-Time Training for Speech
Sri Harsha Dumpala
Chandramouli Shama Sastry
Sageev Oore
25
1
0
19 Sep 2023
Visual Prompt Flexible-Modal Face Anti-Spoofing
Visual Prompt Flexible-Modal Face Anti-Spoofing
Zitong Yu
Rizhao Cai
Yawen Cui
Ajian Liu
Changsheng Chen
30
6
0
26 Jul 2023
Transfer Learning for Portfolio Optimization
Transfer Learning for Portfolio Optimization
Haoyang Cao
Haotian Gu
Xin Guo
M. Rosenbaum
8
0
0
25 Jul 2023
Large AI Model-Based Semantic Communications
Large AI Model-Based Semantic Communications
Feibo Jiang
Yubo Peng
Li Dong
Kezhi Wang
Kun Yang
Cunhua Pan
Xiaohu You
25
47
0
07 Jul 2023
Multimodal Prompt Learning for Product Title Generation with Extremely
  Limited Labels
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Bang-ju Yang
Fenglin Liu
Zheng Li
Qingyu Yin
Chenyu You
Bing Yin
Yuexian Zou
VLM
26
5
0
05 Jul 2023
Do We Really Need a Large Number of Visual Prompts?
Do We Really Need a Large Number of Visual Prompts?
Youngeun Kim
Yuhang Li
Abhishek Moitra
Ruokai Yin
Priyadarshini Panda
VLM
VPVLM
34
5
0
26 May 2023
Self-Chained Image-Language Model for Video Localization and Question
  Answering
Self-Chained Image-Language Model for Video Localization and Question Answering
Shoubin Yu
Jaemin Cho
Prateek Yadav
Mohit Bansal
31
129
0
11 May 2023
I2I: Initializing Adapters with Improvised Knowledge
I2I: Initializing Adapters with Improvised Knowledge
Tejas Srinivasan
Furong Jia
Mohammad Rostami
Jesse Thomason
CLL
24
6
0
04 Apr 2023
Procedure-Aware Pretraining for Instructional Video Understanding
Procedure-Aware Pretraining for Instructional Video Understanding
Honglu Zhou
Roberto Martín-Martín
Mubbasir Kapadia
Silvio Savarese
Juan Carlos Niebles
23
38
0
31 Mar 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot
  AV-ASR
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
27
15
0
29 Mar 2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection with
  Vision-Language Models
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Sha Ning
Longtian Qiu
Yongfei Liu
Xuming He
VLM
16
41
0
28 Mar 2023
Feasibility and Transferability of Transfer Learning: A Mathematical
  Framework
Feasibility and Transferability of Transfer Learning: A Mathematical Framework
Haoyang Cao
Haotian Gu
Xin Guo
M. Rosenbaum
15
2
0
27 Jan 2023
Hierarchical3D Adapters for Long Video-to-text Summarization
Hierarchical3D Adapters for Long Video-to-text Summarization
Pinelopi Papalampidi
Mirella Lapata
VGen
27
12
0
10 Oct 2022
Towards Parameter-Efficient Integration of Pre-Trained Language Models
  In Temporal Video Grounding
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding
Erica K. Shimomoto
Edison Marrese-Taylor
Hiroya Takamura
Ichiro Kobayashi
Hideki Nakayama
Yusuke Miyao
8
7
0
26 Sep 2022
Learning More May Not Be Better: Knowledge Transferability in Vision and
  Language Tasks
Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks
Tianwei Chen
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
Hajime Nagahara
VLM
22
0
0
23 Aug 2022
Multimodal Learning with Transformers: A Survey
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
41
518
0
13 Jun 2022
Parameter-efficient Model Adaptation for Vision Transformers
Parameter-efficient Model Adaptation for Vision Transformers
Xuehai He
Chunyuan Li
Pengchuan Zhang
Jianwei Yang
X. Wang
20
80
0
29 Mar 2022
Hyperdecoders: Instance-specific decoders for multi-task NLP
Hyperdecoders: Instance-specific decoders for multi-task NLP
Hamish Ivison
Matthew E. Peters
AI4CE
19
20
0
15 Mar 2022
HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both
  Language and Vision-and-Language Tasks
HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks
Zhengkun Zhang
Wenya Guo
Xiaojun Meng
Yasheng Wang
Yadao Wang
Xin Jiang
Qun Liu
Zhenglu Yang
26
15
0
08 Mar 2022
Video Transformers: A Survey
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
101
0
16 Jan 2022
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language
  Modeling
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
184
384
0
06 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
54
974
0
09 Oct 2021
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
169
401
0
10 Sep 2021
Learning to Prompt for Vision-Language Models
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,249
0
02 Sep 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
185
403
0
13 Jul 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021
Beyond Fully-Connected Layers with Quaternions: Parameterization of
  Hypercomplex Multiplications with $1/n$ Parameters
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n1/n1/n Parameters
Aston Zhang
Yi Tay
Shuai Zhang
Alvin Chan
A. Luu
S. Hui
Jie Fu
MQ
166
83
0
17 Feb 2021
High-Performance Large-Scale Image Recognition Without Normalization
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
220
510
0
11 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
249
518
0
04 Feb 2021
12
Next