Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.08181
Cited By
Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation
12 April 2024
Sina Hajimiri
Ismail Ben Ayed
Jose Dolz
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation"
24 / 24 papers shown
Title
Register and CLS tokens yield a decoupling of local and global features in large ViTs
Alexander Lappe
M. Giese
14
0
0
09 May 2025
Show or Tell? A Benchmark To Evaluate Visual and Textual Prompts in Semantic Segmentation
Gabriele Rosi
Fabio Cermelli
VLM
25
0
0
06 May 2025
FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation
Yasser Benigmim
Mohammad Fahes
Tuan-Hung Vu
Andrei Bursuc
Raoul de Charette
VLM
30
0
0
14 Apr 2025
RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration
Omar Alama
A. Bhattacharya
Haoyang He
Seungchan Kim
Yuheng Qiu
Wenshan Wang
Cherie Ho
Nikhil Varma Keetha
Sebastian A. Scherer
26
0
0
09 Apr 2025
Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation
Ting Liu
Siyuan Li
36
0
0
01 Apr 2025
Show or Tell? Effectively prompting Vision-Language Models for semantic segmentation
Niccolo Avogaro
Thomas Frick
Mattia Rigotti
A. Bartezzaghi
Filip Janicki
C. Malossi
Konrad Schindler
Roy Assaf
MLLM
VLM
53
0
0
25 Mar 2025
Towards Training-free Anomaly Detection with Vision and Language Foundation Models
Jinjin Zhang
Guodong Wang
Yizhou Jin
Di Huang
42
1
0
24 Mar 2025
The Power of One: A Single Example is All it Takes for Segmentation in VLMs
Mir Rayat Imtiaz Hossain
Mennatullah Siam
Leonid Sigal
James J. Little
MLLM
VLM
63
0
0
13 Mar 2025
PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models?
Mennatullah Siam
VLM
76
1
0
06 Feb 2025
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection
Xiangyu Gao
Yu Dai
Benliu Qiu
Hongliang Li
Heqian Qiu
Hongliang Li
ObjD
VLM
56
0
0
28 Jan 2025
TeD-Loc: Text Distillation for Weakly Supervised Object Localization
Shakeeb Murtaza
Soufiane Belharbi
M. Pedersoli
Eric Granger
WSOL
VLM
84
1
0
22 Jan 2025
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
Luca Barsellotti
Lorenzo Bianchi
Nicola Messina
F. Carrara
Marcella Cornia
Lorenzo Baraldi
Fabrizio Falchi
Rita Cucchiara
VLM
62
2
0
28 Nov 2024
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
Chanyoung Kim
Dayun Ju
Woojung Han
Ming-Hsuan Yang
Seong Jae Hwang
VLM
VOS
66
0
0
26 Nov 2024
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
Yuhang Yang
Jinhong Deng
Wen Li
Lixin Duan
VLM
68
0
0
24 Nov 2024
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
Sule Bai
Yong-Jin Liu
Yifei Han
Haoji Zhang
Yansong Tang
VLM
72
3
0
24 Nov 2024
ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements
M. Arda Aydın
Efe Mert Çırpar
Elvin Abdinli
Gözde B. Ünal
Y. Sahin
VLM
59
0
0
18 Nov 2024
Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation
Yuheng Shi
Minjing Dong
Chang Xu
VLM
24
1
0
14 Nov 2024
Multilingual Vision-Language Pre-training for the Remote Sensing Domain
João Daniel Silva
João Magalhães
D. Tuia
Bruno Martins
CLIP
VLM
25
1
0
30 Oct 2024
Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers
Andrew F. Luo
Jacob Yeung
Rushikesh Zawar
Shaurya Dewan
Margaret M. Henderson
Leila Wehbe
Michael J. Tarr
21
3
0
07 Oct 2024
Image Segmentation in Foundation Model Era: A Survey
Tianfei Zhou
Fei Zhang
Boyu Chang
Wenguan Wang
Ye Yuan
E. Konukoglu
Daniel Cremers
VLM
38
4
0
23 Aug 2024
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
198
318
0
08 Mar 2023
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
X. Wang
ViT
VLM
175
494
0
22 Feb 2022
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
172
281
0
06 Nov 2021
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
243
1,817
0
18 Aug 2016
1