Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.11545
Cited By
Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
21 July 2023
Zunnan Xu
Zhihong Chen
Yong Zhang
Yibing Song
Xiang Wan
Guanbin Li
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation"
17 / 17 papers shown
Title
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
Jun Zhou
J. Li
Zunnan Xu
Hanhui Li
Yiji Cheng
Fa-Ting Hong
Qin Lin
Qinglin Lu
Xiaodan Liang
DiffM
65
1
0
25 Mar 2025
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
Ting Liu
Zunnan Xu
Yue Hu
Liangtao Shi
Zhiqiang Wang
Quanjun Yin
57
2
0
03 Jan 2025
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano
Gabriele Trivigno
Gabriele Rosi
Carlo Masone
Giuseppe Averta
VOS
101
2
0
26 Nov 2024
A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping
Houjian Yu
Mingen Li
Alireza Rezazadeh
Yang Yang
Changhyun Choi
40
1
0
28 Sep 2024
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
Yi-Chia Chen
Wei-Hua Li
Cheng Sun
Yu-Chiang Frank Wang
Chu-Song Chen
VLM
30
10
0
01 Sep 2024
DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
Ting Liu
Xuyang Liu
Siteng Huang
Honggang Chen
Quanjun Yin
Long Qin
Donglin Wang
Yue Hu
30
5
0
10 May 2024
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models
Zunnan Xu
Yukang Lin
Haonan Han
Sicheng Yang
Ronghui Li
Yachao Zhang
Xiu Li
Mamba
46
25
0
14 Mar 2024
RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner
Ying-Dong Zang
Chenglong Fu
Runlong Cao
Didi Zhu
Min Zhang
Wenjun Hu
Lanyun Zhu
Tianrun Chen
21
6
0
08 Feb 2024
Synchronizing Vision and Language: Bidirectional Token-Masking AutoEncoder for Referring Image Segmentation
Minhyeok Lee
Dogyoon Lee
Jungho Lee
Suhwan Cho
Heeseung Choi
Ig-Jae Kim
Sangyoun Lee
23
0
0
29 Nov 2023
Strategic Preys Make Acute Predators: Enhancing Camouflaged Object Detectors by Generating Camouflaged Objects
Chunming He
Kai Li
Yachao Zhang
Yulun Zhang
Z. Guo
Xiu Li
Martin Danelljan
F. I. F. Richard Yu
AAML
25
44
0
06 Aug 2023
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
Seoyeon Kim
Minguk Kang
Dongwon Kim
Jaesik Park
Suha Kwak
VLM
12
10
0
14 Jun 2023
Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge
Zhihong Chen
Guanbin Li
Xiang Wan
119
65
0
15 Sep 2022
Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training
Zhihong Chen
Yu Du
Jinpeng Hu
Yang Liu
Guanbin Li
Xiang Wan
Tsung-Hui Chang
79
107
0
15 Sep 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
141
631
0
26 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
388
4,010
0
28 Jan 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,249
0
02 Sep 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
1