2411.19331
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
28 November 2024
Luca Barsellotti
Lorenzo Bianchi
Nicola Messina
F. Carrara
Marcella Cornia
Lorenzo Baraldi
Fabrizio Falchi
Rita Cucchiara
VLM
Papers citing
"Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation"
11 / 11 papers shown
One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework
Lorenzo Bianchi
Giacomo Pacini
F. Carrara
Nicola Messina
Giuseppe Amato
Fabrizio Falchi
3DV
VLM
229
1
0
30 Mar 2026
Easy3D-Labels: Supervising Semantic Occupancy Estimation with 3D Pseudo-Labels for Automotive Perception
Seamie Hayes
Ganesh Sistu
Ciarán Eising
3DPC
322
3
0
27 Mar 2026
ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding
Lingjun Zhao
Yandong Luo
James Hay
Lu Gan
3DGS
199
1
0
03 Dec 2025
KM-ViPE: Online Tightly Coupled Vision-Language-Geometry Fusion for Open-Vocabulary Semantic SLAM
Zaid Nasser
Mikhail Iumanov
Tianhao Li
Maxim Popov
Jaafar Mahmoud
Malik Mohrat
Ilya Obrubov
Ekaterina Derevyanka
Ivan Sosin
Sergey Kolyubin
168
0
0
01 Dec 2025
RADSeg: Unleashing Parameter and Compute Efficient Zero-Shot Open-Vocabulary Segmentation Using Agglomerative Models
Omar Alama
Darshil Jariwala
A. Bhattacharya
Seungchan Kim
Wenshan Wang
Sebastian A. Scherer
ObjD
VLM
243
1
0
24 Nov 2025
SuperQuadricOcc: Real-Time Self-Supervised Semantic Occupancy Estimation with Superquadric Volume Rendering
Seamie Hayes
Reenu Mohandas
Tim Brophy
Alexandre Boulch
Ganesh Sistu
Ciarán Eising
3DPC
383
1
0
21 Nov 2025
FarSLIP: Discovering Effective CLIP Adaptation for Fine-Grained Remote Sensing Understanding
Z. Li
W. Yu
Dilxat Muhtar
X. Zhang
Pengfeng Xiao
Pedram Ghamisi
Xiao Xiang Zhu
CLIP
VLM
251
1
0
18 Nov 2025
Talk2SAM: Text-Guided Semantic Enhancement for Complex-Shaped Object Segmentation
Luka Vetoshkin
Dmitry Yudin
230
0
0
03 Jun 2025
LEG-SLAM: Real-Time Language-Enhanced Gaussian Splatting for SLAM
Roman Titkov
Egor Zubkov
Dmitry A. Yudin
Jaafar Mahmoud
Malik Mohrat
Gennady Sidorov
3DGS
267
2
0
03 Jun 2025
Resource-Efficient Affordance Grounding with Complementary Depth and Semantic Prompts
Yizhou Huang
Fan Yang
Guoliang Zhu
Gen Li
Hao-miao Shi
Yukun Zuo
Wenrui Chen
Hui Yuan
Kailun Yang
482
0
0
04 Mar 2025
GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding
Computer Vision and Pattern Recognition (CVPR), 2024
Haoyi Jiang
Liu Liu
Tianheng Cheng
Xinjie Wang
Tianwei Lin
Zhizhong Su
Wen Liu
Xinyu Wang
3DGS
ViT
547
43
0
17 Dec 2024