
CRIS: CLIP-Driven Referring Image Segmentation

30 November 2021

Papers citing "CRIS: CLIP-Driven Referring Image Segmentation"

50 / 288 papers shown

CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification

IEEE Transactions on Multimedia (IEEE TMM), 2024

Xiaoyan Yu

Neng Dong

Liehuang Zhu

Hao Peng

Dapeng Tao

299

11 Jan 2024

FMGS: Foundation Model Embedded 3D Gaussian Splatting for Holistic 3D Scene Understanding

International Journal of Computer Vision (IJCV), 2024

355

03 Jan 2024

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

Huchuan Lu

Ping Luo

273

25 Dec 2023

FoodLMM: A Versatile Food Assistant using Large Multi-modal Model

Chong-Wah Ngo

264

22 Dec 2023

Weakly Supervised Semantic Segmentation for Driving Scenes

554

21 Dec 2023

Spectral Prompt Tuning: Unveiling Unseen Classes for Zero-Shot Semantic Segmentation

247

20 Dec 2023

Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model

Ser-Nam Lim

386

19 Dec 2023

Mask Grounding for Referring Image Segmentation

Gao Huang

382

19 Dec 2023

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

Jiayi Ji

411

19 Dec 2023

GSVA: Generalized Segmentation via Multimodal Large Language Models

Computer Vision and Pattern Recognition (CVPR), 2023

Gao Huang

596

125

15 Dec 2023

EVP: Enhanced Visual Perception using Inverse Multi-Attentive Feature Refinement and Regularized Image-Text Alignment

242

13 Dec 2023

See, Say, and Segment: Teaching LMMs to Overcome False Premises

Computer Vision and Pattern Recognition (CVPR), 2023

311

13 Dec 2023

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Computer Vision and Pattern Recognition (CVPR), 2023

Yisi Zhang

Jing Liu

266

13 Dec 2023

CLIP in Medical Imaging: A Comprehensive Survey

Medical Image Analysis (MIA), 2023

577

12 Dec 2023

Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey

308

05 Dec 2023

Universal Segmentation at Arbitrary Granularity with Language Instruction

Computer Vision and Pattern Recognition (CVPR), 2023

Yong Liu

Yujiu Yang

301

04 Dec 2023

PixelLM: Pixel Reasoning with Large Multimodal Model

Computer Vision and Pattern Recognition (CVPR), 2023

369

188

04 Dec 2023

Towards Generalizable Referring Image Segmentation via Target Prompt and Visual Coherence

International Conference on Information Photonics (ICIP), 2023

Qingjie Liu

Yunhong Wang

198

01 Dec 2023

Synchronizing Vision and Language: Bidirectional Token-Masking AutoEncoder for Referring Image Segmentation

175

29 Nov 2023

Explaining CLIP's performance disparities on data from blind/low vision users

Computer Vision and Pattern Recognition (CVPR), 2023

327

29 Nov 2023

RISAM: Referring Image Segmentation via Mutual-Aware Attention Features

442

27 Nov 2023

Align before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition

Computer Vision and Pattern Recognition (CVPR), 2023

279

27 Nov 2023

Spatially Covariant Image Registration with Text Prompts

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023

Xiang Chen

Min Liu

Rongguang Wang

Renjiu Hu

Dongdong Liu

Gaolei Li

Hang Zhang

MedIm

259

27 Nov 2023

End-to-End Breast Cancer Radiotherapy Planning via LMMs with Consistency Embedding

418

27 Nov 2023

Soulstyler: Using Large Language Model to Guide Image Style Transfer for Target Object

Xiang Li

179

22 Nov 2023

VGSG: Vision-Guided Semantic-Group Network for Text-based Person Search

IEEE Transactions on Image Processing (IEEE TIP), 2023

Henghui Ding

283

13 Nov 2023

Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter

Conference on Robot Learning (CoRL), 2023

240

09 Nov 2023

NExT-Chat: An LMM for Chat, Detection and Segmentation

Ao Zhang

Yuan Yao

Wei Ji

Zhiyuan Liu

Tat-Seng Chua

MLLM VLM

355

08 Nov 2023

GLaMM: Pixel Grounding Large Multimodal Model

Computer Vision and Pattern Recognition (CVPR), 2023

H. Rasheed

Muhammad Maaz

Sahal Shaji Mullappilly

Abdelrahman M. Shaker

Salman Khan

Hisham Cholakkal

Rao M. Anwer

Eric Xing

Ming-Hsuan Yang

Fahad S. Khan

MLLM VLM

434

396

06 Nov 2023

Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection

IEEE Transactions on Image Processing (IEEE TIP), 2023

Tao He

Lianli Gao

Jingkuan Song

Yuan-Fang Li

ViT

250

03 Nov 2023

Enriching Phrases with Coupled Pixel and Object Contexts for Panoptic Narrative Grounding

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Junshi Huang

300

02 Nov 2023

Towards Omni-supervised Referring Expression Segmentation

IEEE International Conference on Multimedia and Expo (ICME), 2023

315

01 Nov 2023

CHAIN: Exploring Global-Local Spatio-Temporal Information for Improved Self-Supervised Video Hashing

ACM Multimedia (ACM MM), 2023

Jingkuan Song

139

29 Oct 2023

Text Augmented Spatial-aware Zero-shot Referring Image Segmentation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Yuchen Suo

Linchao Zhu

Yi Yang

289

27 Oct 2023

Open-NeRF: Towards Open Vocabulary NeRF Decomposition

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

Hao Zhang

Fang Li

Narendra Ahuja

177

25 Oct 2023

FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning

Neural Information Processing Systems (NeurIPS), 2023

Bochao Zou

308

23 Oct 2023

Segment, Select, Correct: A Framework for Weakly-Supervised Referring Segmentation

304

20 Oct 2023

Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Zuxuan Wu

243

08 Oct 2023

MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP

165

24 Sep 2023

Synthetic Boost: Leveraging Synthetic Data for Enhanced Vision-Language Segmentation in Echocardiography

211

22 Sep 2023

CLIPUNetr: Assisting Human-robot Interface for Uncalibrated Visual Servoing Control with CLIP-driven Referring Expression Segmentation

IEEE International Conference on Robotics and Automation (ICRA), 2023

Chen Jiang

Yuchen Yang

Martin Jägersand

186

17 Sep 2023

Multi-dimensional Fusion and Consistency for Semi-supervised Medical Image Segmentation

Conference on Multimedia Modeling (MMM), 2023

Yixing Lu

Zhaoxin Fan

Min Xu

173

12 Sep 2023

Temporal Collection and Distribution for Referring Video Object Segmentation

IEEE International Conference on Computer Vision (ICCV), 2023

185

07 Sep 2023

CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection

IEEE International Conference on Computer Vision (ICCV), 2023

Jingyi Yu

215

03 Sep 2023

Contrastive Grouping with Transformer for Referring Image Segmentation

Computer Vision and Pattern Recognition (CVPR), 2023

314

02 Sep 2023

Shatter and Gather: Learning Referring Image Segmentation with Text Supervision

IEEE International Conference on Computer Vision (ICCV), 2023

275

29 Aug 2023

Referring Image Segmentation Using Text Supervision

IEEE International Conference on Computer Vision (ICCV), 2023

Fang Liu

250

28 Aug 2023

Beyond One-to-One: Rethinking the Referring Image Segmentation

IEEE International Conference on Computer Vision (ICCV), 2023

Jungong Han

Ping Luo

3DV

228

26 Aug 2023

CgT-GAN: CLIP-guided Text GAN for Image Captioning

ACM Multimedia (ACM MM), 2023

219

23 Aug 2023

Blending-NeRF: Text-Driven Localized Editing in Neural Radiance Fields

IEEE International Conference on Computer Vision (ICCV), 2023

246

23 Aug 2023