CRIS: CLIP-Driven Referring Image Segmentation

30 November 2021
Zhaoqing Wang
Yu Lu
Qiang Li
Xunqiang Tao
Yan Guo
Ming Gong
Tongliang Liu
    VLM

Papers citing "CRIS: CLIP-Driven Referring Image Segmentation"

50 / 288 papers shown
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis
AAAI Conference on Artificial Intelligence (AAAI), 2025
Yun Wang
Jingchen Ni
Yong-Jin Liu
Chun Yuan
Yansong Tang
292
15
0
02 Mar 2025
Grad-ECLIP: Gradient-based Visual and Textual Explanations for CLIP
Chenyang Zhao
Kun Wang
J. H. Hsiao
Antoni B. Chan
CLIP
267
7
0
26 Feb 2025
A Survey on Foundation-Model-Based Industrial Defect Detection
Tianle Yang
Luyao Chang
Jiadong Yan
Jiajian Li
Zhi Wang
Ke Zhang
AI4CE
515
6
0
26 Feb 2025
Pixel-Level Reasoning Segmentation via Multi-turn Conversations
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Dexian Cai
Xiaocui Yang
Yongkang Liu
Daling Wang
Shi Feng
Yifei Zhang
Soujanya Poria
LRM
335
3
0
13 Feb 2025
SIREN: Semantic, Initialization-Free Registration of Multi-Robot Gaussian Splatting Maps
Ola Shorinwa
Jiankai Sun
Mac Schwager
Anirudha Majumdar
3DGS
381
9
0
10 Feb 2025
Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation
Lin Chen
Qi Yang
Kun Ding
Tianying Wang
Gang Shen
Fei Li
Qiyuan Cao
Shiming Xiang
VLM
207
2
0
29 Jan 2025
Dynamic Scene Understanding from Vision-Language Representations
Shahaf Pruss
Morris Alper
Hadar Averbuch-Elor
OCL
1.1K
1
0
20 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
IEEE Reviews in Biomedical Engineering (RBME), 2024
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE, LM&MA, VLM
767
71
0
17 Jan 2025
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
Computer Vision and Pattern Recognition (CVPR), 2025
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Zhiyong Yang
Pingping Zhang
Huchuan Lu
245
17
0
15 Jan 2025
Continual Test-Time Adaptation for Single Image Defocus Deblurring via Causal Siamese Networks
International Journal of Computer Vision (IJCV), 2025
Shuang Cui
Yi Li
Jiangmeng Li
Xiongxin Tang
Fuchun Sun
Jianwei Niu
Hui Xiong
292
1
0
15 Jan 2025
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints
AAAI Conference on Artificial Intelligence (AAAI), 2025
Ming Dai
Jian Li
Jiedong Zhuang
Xian Zhang
Wankou Yang
ObjD
369
13
0
12 Jan 2025
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension
AAAI Conference on Artificial Intelligence (AAAI), 2025
Yaxian Wang
Henghui Ding
Shuting He
Xudong Jiang
Bifan Wei
Jun Liu
ObjD
261
8
0
03 Jan 2025
Towards Visual Grounding: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
964
31
0
28 Dec 2024
Cross-Modal Few-Shot Learning with Second-Order Neural Ordinary Differential Equations
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yi Zhang
Chun-Wun Cheng
Junyi He
Zhihai He
Carola-Bibiane Schonlieb
Yuyan Chen
Angelica I Aviles-Rivero
AI4TS
321
0
0
20 Dec 2024
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models
Cong Wei
Yujie Zhong
Haoxian Tan
Yingsen Zeng
Yong Liu
Zheng Zhao
Yujiu Yang
MLLM, VLM, VOS
287
11
0
18 Dec 2024
Unlocking Visual Secrets: Inverting Features with Diffusion Priors for Image Reconstruction
Sai Qian Zhang
Ziyun Li
Chuan Guo
Saeed Mahloujifar
Deeksha Dangwal
Edward Suh
B. D. Salvo
Chiao Liu
DiffM
310
2
0
11 Dec 2024
HyperSeg: Towards Universal Visual Segmentation with Large Language Model
Cong Wei
Yujie Zhong
Haoxian Tan
Yong Liu
Zheng Zhao
Jie Hu
Yujiu Yang
VOS, MLLM, VLM, LRM
274
18
0
26 Nov 2024
LaVin-DiT: Large Vision Diffusion Transformer
Computer Vision and Pattern Recognition (CVPR), 2024
Zhaoqing Wang
Xiaobo Xia
Runnan Chen
Dongdong Yu
Changhu Wang
Mingming Gong
Tongliang Liu
553
19
0
18 Nov 2024
Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
European Conference on Computer Vision (ECCV), 2024
Seongsu Ha
Chaeyun Kim
Donghwa Kim
Junho Lee
Sangho Lee
Joonseok Lee
258
6
0
03 Nov 2024
On Occlusions in Video Action Detection: Benchmark Datasets And Training Recipes
Neural Information Processing Systems (NeurIPS), 2024
Rajat Modi
Vibhav Vineet
Yogesh S Rawat
329
3
0
25 Oct 2024
CLIP-VAD: Exploiting Vision-Language Models for Voice Activity Detection
Andrea Appiani
Cigdem Beyan
CLIP, VLM
297
2
0
18 Oct 2024
LESS: Label-Efficient and Single-Stage Referring 3D Segmentation
Neural Information Processing Systems (NeurIPS), 2024
Xuexun Liu
Xiaoxu Xu
Jinlong Li
Qiudan Zhang
Xu Wang
Andrii Zadaianchuk
Lin Ma
356
3
0
17 Oct 2024
A Survey of Low-shot Vision-Language Model Adaptation via Representer Theorem
Kun Ding
Ying Wang
Gaofeng Meng
Shiming Xiang
VLM
285
0
0
15 Oct 2024
Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation
Kun Ding
Qiang Yu
Haojian Zhang
Gaofeng Meng
Shiming Xiang
VLM
198
2
0
11 Oct 2024
Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
398
20
0
11 Oct 2024
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
Neural Information Processing Systems (NeurIPS), 2024
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
434
22
0
10 Oct 2024
TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models
Asian Conference on Computer Vision (ACCV), 2024
Rabin Adhikari
Safal Thapaliya
Manish Dhakal
Bishesh Khanal
MLLM, VLM
282
2
0
07 Oct 2024
Segment as You Wish -- Free-Form Language-Based Segmentation for Medical Images
Longchao Da
Rui Wang
Xiaojian Xu
Parminder Bhatia
Taha A. Kass-Hout
Hua Wei
Cao Xiao
MedIm, VLM
289
2
0
02 Oct 2024
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
Neural Information Processing Systems (NeurIPS), 2024
Zechen Bai
Tong He
Haiyang Mei
Pichao Wang
Ziteng Gao
Joya Chen
Lei Liu
Zheng Zhang
Mike Zheng Shou
VLM, VOS, MLLM
251
74
0
29 Sep 2024
Fully Aligned Network for Referring Image Segmentation
Visual Communications and Image Processing (VCIP), 2024
Yong-Jin Liu
Ruihao Xu
Yansong Tang
242
0
0
29 Sep 2024
A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping
IEEE International Conference on Robotics and Automation (ICRA), 2024
Houjian Yu
Mingen Li
Alireza Rezazadeh
Yang Yang
Changhyun Choi
541
6
0
28 Sep 2024
PTQ4RIS: Post-Training Quantization for Referring Image Segmentation
IEEE International Conference on Robotics and Automation (ICRA), 2024
Xiaoyan Jiang
Hang Yang
Kaiying Zhu
Xihe Qiu
Shibo Zhao
Sifan Zhou
MQ
156
2
0
25 Sep 2024
DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation
European Conference on Computer Vision (ECCV), 2024
Soojin Jang
Jungmin Yun
Junehyoung Kwon
Eunju Lee
Youngbin Kim
325
8
0
24 Sep 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
International Conference on Learning Representations (ICLR), 2024
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM, DiffM
565
25
0
23 Sep 2024
Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model
Li Zhou
Xu Yuan
Zenghui Sun
Zikun Zhou
Jingsong Lan
VLM, MLLM
861
7
0
20 Sep 2024
DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks
Amin Karimi Monsefi
Kishore Prakash Sailaja
Ali Alilooee
Ser-Nam Lim
R. Ramnath
VLM
376
16
0
10 Sep 2024
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
European Conference on Computer Vision (ECCV), 2024
Yi-Chia Chen
Wei-Hua Li
Cheng Sun
Yu-Chiang Frank Wang
Chu-Song Chen
VLM
237
56
0
01 Sep 2024
Depth-Weighted Detection of Behaviours of Risk in People with Dementia using Cameras
Pratik K. Mishra
Irene Ballester
Andrea Iaboni
Bing Ye
Kristine Newman
Alex Mihailidis
Shehroz S. Khan
245
2
0
28 Aug 2024
Perceive-IR: Learning to Perceive Degradation Better for All-in-One Image Restoration
IEEE Transactions on Image Processing (TIP), 2024
Xu Zhang
Jiaqi Ma
Guoli Wang
Qian Zhang
Huan Zhang
Lefei Zhang
VLM
552
37
0
28 Aug 2024
Image Segmentation in Foundation Model Era: A Survey
Tianfei Zhou
Fei Zhang
Boyu Chang
Wenguan Wang
Ye Yuan
E. Konukoglu
Daniel Cremers
VLM
427
29
0
23 Aug 2024
Cross-aware Early Fusion with Stage-divided Vision and Language Transformer Encoders for Referring Image Segmentation
IEEE Transactions on Multimedia (IEEE TMM), 2024
Yubin Cho
Hyunwoo Yu
Suk-Ju Kang
282
34
0
14 Aug 2024
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
European Conference on Computer Vision (ECCV), 2024
Dahyun Kang
Minsu Cho
ObjD, VLM
385
24
0
09 Aug 2024
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling
European Conference on Computer Vision (ECCV), 2024
William Y. Zhu
Keren Ye
Junjie Ke
Jiahui Yu
Leonidas Guibas
P. Milanfar
Feng Yang
341
2
0
07 Aug 2024
UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Zhaowei Li
Wei Wang
Yiqing Cai
Xu Qi
Pengyu Wang
Dong Zhang
Hang Song
Botian Jiang
Zhida Huang
Tao Wang
AIFin, LRM
216
9
0
05 Aug 2024
An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
European Conference on Computer Vision (ECCV), 2024
Wei Chen
Mahdieh Hatamian
Yu Wu
238
16
0
02 Aug 2024
Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey
Atsuyuki Miyai
Jingkang Yang
Jingyang Zhang
Yifei Ming
Sisir Dhakal
...
Yixuan Li
Hai "Helen" Li
Ziwei Liu
Toshihiko Yamasaki
Kiyoharu Aizawa
367
29
0
31 Jul 2024
Diffusion Feedback Helps CLIP See Better
International Conference on Learning Representations (ICLR), 2024
Wenxuan Wang
Quan-Sen Sun
Fan Zhang
Yepeng Tang
Jing Liu
Xinlong Wang
VLM
331
39
0
29 Jul 2024
RefMask3D: Language-Guided Transformer for 3D Referring Segmentation
Shuting He
Henghui Ding
256
21
0
25 Jul 2024
VISA: Reasoning Video Object Segmentation via Large Language Models
Cilin Yan
Haochen Wang
Shilin Yan
Xiaolong Jiang
Yao Hu
Guoliang Kang
Weidi Xie
E. Gavves
LRM, VLM, VOS
237
92
0
16 Jul 2024
Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation
Seonghoon Yu
Paul Hongsuck Seo
Jeany Son
DiffM
413
12
0
10 Jul 2024