2305.06061
Cited By
Visual Tuning
10 May 2023
Bruce X. B. Yu
Jianlong Chang
Haixin Wang
Lin Liu
Shijie Wang
Zhiyu Wang
Junfan Lin
Lingxi Xie
Haojie Li
Zhouchen Lin
Qi Tian
Chang Wen Chen
VLM
Papers citing "Visual Tuning"
33 / 33 papers shown
Title
Evaluating Vision Language Model Adaptations for Radiology Report Generation in Low-Resource Languages
Marco Salmè
R. Sicilia
Paolo Soda
V. Guarrasi
30
0
0
02 May 2025
Agent AI: Surveying the Horizons of Multimodal Interaction
Zane Durante
Qiuyuan Huang
Naoki Wake
Ran Gong
J. Park
...
Yejin Choi
Katsushi Ikeuchi
Hoi Vo
Fei-Fei Li
Jianfeng Gao
LM&Ro
92
28
0
07 Jan 2024
Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation
Tianxiang Hao
Hui Chen
Yuchen Guo
Guiguang Ding
34
9
0
30 Apr 2023
Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training
Junfan Lin
Jianlong Chang
Lingbo Liu
Guanbin Li
Liang Lin
Qi Tian
Chang Wen Chen
VGen
27
26
0
28 Oct 2022
Hierarchical3D Adapters for Long Video-to-text Summarization
Pinelopi Papalampidi
Mirella Lapata
VGen
19
7
0
10 Oct 2022
Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks
Yen-Cheng Liu
Chih-Yao Ma
Junjiao Tian
Zijian He
Z. Kira
111
47
0
07 Oct 2022
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
178
521
0
06 Oct 2022
LPT: Long-tailed Prompt Tuning for Image Classification
Bowen Dong
Pan Zhou
Shuicheng Yan
W. Zuo
VPVLM
VLM
33
52
0
03 Oct 2022
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM
VLM
42
80
0
03 Oct 2022
Towards a Unified View on Visual Parameter-Efficient Transfer Learning
Bruce X. B. Yu
Jianlong Chang
Lin Liu
Qi Tian
Chang Wen Chen
VPVLM
VLM
59
33
0
03 Oct 2022
Multi-Prompt Alignment for Multi-Source Unsupervised Domain Adaptation
Haoran Chen
Xintong Han
Zuxuan Wu
Yu-Gang Jiang
71
25
0
30 Sep 2022
Collaboration of Pre-trained Models Makes Better Few-shot Learner
Renrui Zhang
Bohao Li
Wei Zhang
Hao Dong
Hongsheng Li
Peng Gao
Yu Qiao
VLM
40
7
0
25 Sep 2022
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Manli Shu
Weili Nie
De-An Huang
Zhiding Yu
Tom Goldstein
Anima Anandkumar
Chaowei Xiao
VLM
VPVLM
152
278
0
15 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Bin Cui
Ming-Hsuan Yang
DiffM
MedIm
208
1,277
0
02 Sep 2022
Continual Learning with Transformers for Image Classification
B. Ermiş
Giovanni Zappella
Martin Wistuba
Aditya Rawal
Cédric Archambeau
CLL
28
21
0
28 Jun 2022
Prompt-aligned Gradient for Prompt Tuning
Beier Zhu
Yulei Niu
Yucheng Han
Yuehua Wu
Hanwang Zhang
VLM
167
263
0
30 May 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
138
631
0
26 May 2022
PointCLIP: Point Cloud Understanding by CLIP
Renrui Zhang
Ziyu Guo
Wei Zhang
Kunchang Li
Xupeng Miao
Bin Cui
Yu Qiao
Peng Gao
Hongsheng Li
VLM
3DPC
158
428
0
04 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
7,337
0
11 Nov 2021
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
170
281
0
06 Nov 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
319
2,108
0
02 Sep 2021
CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Chunjing Xu
Yunhe Wang
Chang Xu
ViT
320
614
0
13 Jul 2021
Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges
B. Bischl
Martin Binder
Michel Lang
Tobias Pielok
Jakob Richter
...
Theresa Ullmann
Marc Becker
A. Boulesteix
Difan Deng
Marius Lindauer
69
268
0
13 Jul 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
203
698
0
28 Apr 2021
Distilling Knowledge via Knowledge Review
Pengguang Chen
Shu-Lin Liu
Hengshuang Zhao
Jiaya Jia
144
308
0
19 Apr 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
276
1,490
0
27 Feb 2021
Localization Distillation for Dense Object Detection
Zhaohui Zheng
Rongguang Ye
Ping Wang
Dongwei Ren
W. Zuo
Qibin Hou
Ming-Ming Cheng
ObjD
76
111
0
24 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters
Aston Zhang
Yi Tay
Shuai Zhang
Alvin Chan
A. Luu
S. Hui
Jie Fu
MQ
163
83
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
2,875
0
11 Feb 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
216
2,404
0
04 Jan 2021
Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting
Lingbo Liu
Jiaqi Chen
Hefeng Wu
Guanbin Li
Chenglong Li
Liang Lin
43
93
0
08 Dec 2020
Meta Pseudo Labels
Hieu H. Pham
Zihang Dai
Qizhe Xie
Minh-Thang Luong
Quoc V. Le
VLM
245
648
0
23 Mar 2020