Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models

28 November 2023

Siddhesh Khandelwal

Boyang Albert Li

Papers citing "Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models"

Title | Authors | Tags | Counts | Date
MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking | Chang Nie, Yiqing Xu, Guangming Wang, Zhe Liu, Yanzi Miao, Hesheng Wang | VLM | 33 0 0 | 09 Apr 2025
The Power of One: A Single Example is All it Takes for Segmentation in VLMs | Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal, James J. Little | MLLM, VLM | 61 0 0 | 13 Mar 2025
PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models? | Mennatullah Siam | VLM | 71 1 0 | 06 Feb 2025
Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues | X. Feng, D. Zhang, Shuyan Hu, X. Li, M. Wu, Jie Zhang, Xiaojing Chen, K. Huang | | 33 0 0 | 27 Dec 2024
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation | Chanyoung Kim, Dayun Ju, Woojung Han, Ming-Hsuan Yang, Seong Jae Hwang | VLM, VOS | 63 0 0 | 26 Nov 2024
Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation | Sina Hajimiri, Ismail Ben Ayed, Jose Dolz | VLM | 23 22 0 | 12 Apr 2024
A Simple Baseline for Knowledge-Based Visual Question Answering | Alexandros Xenos, Themos Stafylakis, Ioannis Patras, Georgios Tzimiropoulos | | 56 7 0 | 20 Oct 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Junnan Li, Dongxu Li, Silvio Savarese, Steven C. H. Hoi | VLM, MLLM | 244 4,186 0 | 30 Jan 2023
GroupViT: Semantic Segmentation Emerges from Text Supervision | Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, X. Wang | ViT, VLM | 173 494 0 | 22 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi | MLLM, BDL, VLM, CLIP | 380 4,010 0 | 28 Jan 2022
Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast | Ye Du, Zehua Fu, Qingjie Liu, Yunhong Wang | | 63 128 0 | 14 Oct 2021
Localizing Objects with Self-Supervised Transformers and no Labels | Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, P. Pérez, Renaud Marlet, Jean Ponce | ViT | 159 195 0 | 29 Sep 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu H. Pham, Quoc V. Le, Yun-hsuan Sung, Zhen Li, Tom Duerig | VLM, CLIP | 293 2,875 0 | 11 Feb 2021