2505.19242
Cited By
Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model
25 May 2025
Alaa Dalaq
Muzammil Behzad
VLM
Papers citing
"Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model"
50 / 54 papers shown
Title
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
65
6
0
10 Oct 2024
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
Shengbang Tong
Ellis L Brown
Penghao Wu
Sanghyun Woo
Manoj Middepogu
...
Xichen Pan
Austin Wang
Rob Fergus
Yann LeCun
Saining Xie
3DV
MLLM
76
321
0
24 Jun 2024
Fuse & Calibrate: A bi-directional Vision-Language Guided Framework for Referring Image Segmentation
Yichen Yan
Xingjian He
Sihan Chen
Shichen Lu
Jing Liu
41
1
0
18 May 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang
Ziqiao Ma
Xiaofeng Gao
Suhaila Shakiah
Qiaozi Gao
Joyce Chai
MLLM
VLM
83
42
0
26 Feb 2024
Voila-A: Aligning Vision-Language Models with User's Gaze Attention
Kun Yan
Lei Ji
Zeyu Wang
Yuntao Wang
Nan Duan
Shuai Ma
81
10
0
22 Dec 2023
RISAM: Referring Image Segmentation via Mutual-Aware Attention Features
Mengxi Zhang
Yiming Liu
Xiangjun Yin
Huanjing Yue
Jingyu Yang
52
1
0
27 Nov 2023
MMSFormer: Multimodal Transformer for Material and Semantic Segmentation
Md Kaykobad Reza
Ashley Prater-Bennette
M. Salman Asif
99
13
0
07 Sep 2023
Contrastive Grouping with Transformer for Referring Image Segmentation
Jiajin Tang
Ge Zheng
Cheng Shi
Sibei Yang
ViT
79
39
0
02 Sep 2023
Referring Image Segmentation Using Text Supervision
Fang Liu
Yuhao Liu
Yuqiu Kong
Ke Xu
Lulu Zhang
Baocai Yin
Gerhard Hancke
Rynson W. H. Lau
47
29
0
28 Aug 2023
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
Jinze Bai
Shuai Bai
Shusheng Yang
Shijie Wang
Sinan Tan
Peng Wang
Junyang Lin
Chang Zhou
Jingren Zhou
MLLM
VLM
ObjD
65
851
0
24 Aug 2023
Relational Contrastive Learning for Scene Text Recognition
Jinglei Zhang
Tiancheng Lin
Yi Xu
Kaibo Chen
Rui Zhang
41
10
0
01 Aug 2023
GRES: Generalized Referring Expression Segmentation
Chang Liu
Henghui Ding
Xudong Jiang
60
150
0
01 Jun 2023
MMNet: Multi-Mask Network for Referring Image Segmentation
Yimin Yan
Xingjian He
Wenxuan Wan
Qingbin Liu
EgoV
36
2
0
24 May 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLM
VLM
62
1,977
0
11 May 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
329
4,506
0
17 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
379
4,406
0
30 Jan 2023
Generalized Decoding for Pixel, Image, and Language
Xueyan Zou
Zi-Yi Dou
Jianwei Yang
Zhe Gan
Linjie Li
...
Lu Yuan
Nanyun Peng
Lijuan Wang
Yong Jae Lee
Jianfeng Gao
VLM
MLLM
ObjD
46
247
0
21 Dec 2022
Beyond Instance Discrimination: Relation-aware Contrastive Self-supervised Learning
Yifei Zhang
Chang Liu
Yu Zhou
Weiping Wang
Qixiang Ye
Xiangyang Ji
SSL
ISeg
BDL
35
7
0
02 Nov 2022
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks
Junke Wang
Dongdong Chen
Zuxuan Wu
Chong Luo
Luowei Zhou
Yucheng Zhao
Yujia Xie
Ce Liu
Yu-Gang Jiang
Lu Yuan
MLLM
VLM
65
150
0
15 Sep 2022
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
Wenhui Wang
Hangbo Bao
Li Dong
Johan Bjorck
Zhiliang Peng
...
Kriti Aggarwal
O. Mohammed
Saksham Singhal
Subhojit Som
Furu Wei
MLLM
VLM
ViT
110
636
0
22 Aug 2022
Fast-MoCo: Boost Momentum-based Contrastive Learning with Combinatorial Patches
Yuanzheng Ci
Chen Lin
Lei Bai
Wanli Ouyang
SSL
45
26
0
17 Jul 2022
Selective-Supervised Contrastive Learning with Noisy Labels
Shikun Li
Xiaobo Xia
Shiming Ge
Tongliang Liu
NoLa
51
174
0
08 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven C. H. Hoi
MLLM
BDL
VLM
CLIP
436
4,283
0
28 Jan 2022
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
Zhao Yang
Jiaqi Wang
Yansong Tang
Kai Chen
Hengshuang Zhao
Philip Torr
188
317
0
04 Dec 2021
CRIS: CLIP-Driven Referring Image Segmentation
Zhaoqing Wang
Yu Lu
Qiang Li
Xunqiang Tao
Yan Guo
Ming Gong
Tongliang Liu
VLM
93
367
0
30 Nov 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
60
114
0
23 Nov 2021
MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation
Zizhang Li
Mengmeng Wang
Jianbiao Mei
Yong Liu
27
19
0
21 Nov 2021
Vision-Language Transformer and Query Generation for Referring Segmentation
Henghui Ding
Chang Liu
Suchen Wang
Xudong Jiang
64
258
0
12 Aug 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
147
872
0
26 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
661
28,659
0
26 Feb 2021
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Wonjae Kim
Bokyung Son
Ildoo Kim
VLM
CLIP
94
1,722
0
05 Feb 2021
Dense Contrastive Learning for Self-Supervised Visual Pre-Training
Xinlong Wang
Rufeng Zhang
Chunhua Shen
Tao Kong
Lei Li
SSL
52
675
0
18 Nov 2020
Unsupervised Learning of Dense Visual Representations
Pedro H. O. Pinheiro
Amjad Almahairi
Ryan Y. Benmalek
Florian Golemo
Aaron Courville
SSL
MDE
67
189
0
11 Nov 2020
PhraseCut: Language-based Image Segmentation in the Wild
Chenyun Wu
Zhe Lin
Scott D. Cohen
Trung Bui
Subhransu Maji
VLM
32
113
0
03 Aug 2020
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
ObjD
222
288
0
19 Mar 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
186
18,523
0
13 Feb 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
99
11,959
0
13 Nov 2019
Cross-Modal Self-Attention Network for Referring Image Segmentation
Linwei Ye
Mrigank Rochan
Zhi Liu
Yang Wang
EgoV
18
472
0
09 Apr 2019
Dynamic Multimodal Instance Segmentation guided by natural language queries
Edgar Margffoy-Tuay
Juan C. Pérez
Emilio Botero
Pablo Arbelaez
37
172
0
06 Jul 2018
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
42
144
0
11 Jun 2018
Context Encoding for Semantic Segmentation
Hang Zhang
Kristin J. Dana
Jianping Shi
Zhongyue Zhang
Xiaogang Wang
A. Tyagi
Amit Agrawal
SSeg
77
1,245
0
23 Mar 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
85
822
0
24 Jan 2018
Grounding Spatio-Semantic Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
ObjD
39
21
0
18 Jul 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
298
27,018
0
20 Mar 2017
Deformable Convolutional Networks
Jifeng Dai
Haozhi Qi
Yuwen Xiong
Yi Li
Guodong Zhang
Han Hu
Yichen Wei
184
5,291
0
17 Mar 2017
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
95
1,250
0
31 Jul 2016
V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation
Fausto Milletari
Nassir Navab
Seyed-Ahmad Ahmadi
190
8,615
0
15 Jun 2016
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Patrick Murphy
Alan Yuille
SSeg
166
18,136
0
02 Jun 2016
Fully Convolutional Networks for Semantic Segmentation
Evan Shelhamer
Jonathan Long
Trevor Darrell
VOS
SSeg
286
37,704
0
20 May 2016
Segmentation from Natural Language Expressions
Ronghang Hu
Marcus Rohrbach
Trevor Darrell
VLM
EgoV
58
430
0
20 Mar 2016