Scaling Open-Vocabulary Image Segmentation with Image-Level Labels

22 December 2021
Golnaz Ghiasi
Xiuye Gu
Yin Cui
Tsung-Yi Lin
    VLM

Papers citing "Scaling Open-Vocabulary Image Segmentation with Image-Level Labels"

50 / 292 papers shown
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Yunheng Li
Zhongyu Li
Quansheng Zeng
Qibin Hou
Ming-Ming Cheng
VLM
30
8
0
02 Jun 2024
Open-Vocabulary SAM3D: Understand Any 3D Scene
Hanchen Tai
Qingdong He
Jiangning Zhang
Yijie Qian
Zhenyu Zhang
Xiaobin Hu
Yabiao Wang
Yong Liu
VLM
36
0
0
24 May 2024
Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports
Guangyu Guo
Jiawen Yao
Yingda Xia
Tony C. W. Mok
Zhilin Zheng
Junwei Han
Le Lu
Dingwen Zhang
Jian Zhou
Ling Zhang
32
1
0
23 May 2024
Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation
Dingwen Zhang
Hao Li
Diqi He
Nian Liu
Lechao Cheng
Jingdong Wang
Junwei Han
VLM
30
0
0
22 May 2024
Unifying 3D Vision-Language Understanding via Promptable Queries
Ziyu Zhu
Zhuofan Zhang
Xiaojian Ma
Xuesong Niu
Yixin Chen
Baoxiong Jia
Zhidong Deng
Siyuan Huang
Qing Li
40
21
0
19 May 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
29
12
0
16 May 2024
Probing Multimodal LLMs as World Models for Driving
Shiva Sreeram
T. Wang
Alaa Maalouf
Guy Rosman
S. Karaman
Daniela Rus
25
7
0
09 May 2024
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies
Lingdong Kong
You-Chen Liu
Lai Xing Ng
Benoit R. Cottereau
Wei Tsang Ooi
VLM
29
12
0
08 May 2024
LocInv: Localization-aware Inversion for Text-Guided Image Editing
Chuanming Tang
Kai Wang
Fei Yang
J. Weijer
DiffM
31
3
0
02 May 2024
Vocabulary-free Image Classification and Semantic Segmentation
Alessandro Conti
Enrico Fini
Massimiliano Mancini
Paolo Rota
Yiming Wang
Elisa Ricci
VLM
38
2
0
16 Apr 2024
Unifying Global and Local Scene Entities Modelling for Precise Action Spotting
Kim Hoang Tran
Phuc Vuong Do
Ngoc Quoc Ly
Ngan Le
34
4
0
15 Apr 2024
kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies
Zhongrui Gui
Shuyang Sun
Runjia Li
Jianhao Yuan
Zhaochong An
Karsten Roth
Ameya Prabhu
Philip H. S. Torr
VLM
CLL
24
6
0
15 Apr 2024
COCONut: Modernizing COCO Segmentation
XueQing Deng
Qihang Yu
Peng Wang
Xiaohui Shen
Liang-Chieh Chen
37
16
0
12 Apr 2024
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
Agneet Chatterjee
Tejas Gokhale
Chitta Baral
Yezhou Yang
VLM
25
2
0
12 Apr 2024
LaSagnA: Language-based Segmentation Assistant for Complex Queries
Cong Wei
Haoxian Tan
Yujie Zhong
Yujiu Yang
Lin Ma
36
14
0
12 Apr 2024
Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation
Sina Hajimiri
Ismail Ben Ayed
Jose Dolz
VLM
31
22
0
12 Apr 2024
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Jingxuan Xu
Wuyang Chen
Yao-Min Zhao
Yunchao Wei
VLM
31
2
0
11 Apr 2024
Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation
Luca Barsellotti
Roberto Amoroso
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
DiffM
37
13
0
09 Apr 2024
GHOST: Grounded Human Motion Generation with Open Vocabulary Scene-and-Text Contexts
Zoltán Á. Milacski
Koichiro Niinuma
Ryosuke Kawamura
Fernando de la Torre
László A. Jeni
27
1
0
08 Apr 2024
Mixed-Query Transformer: A Unified Image Segmentation Architecture
Pei Wang
Zhaowei Cai
Hao-Yu Yang
Ashwin Swaminathan
R. Manmatha
Stefano Soatto
70
2
0
06 Apr 2024
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation
Ji-Jia Wu
Andy Chia-Hao Chang
Chieh-Yu Chuang
Chun-Pei Chen
Yu-Lun Liu
Min-Hung Chen
Hou-Ning Hu
Yung-Yu Chuang
Yen-Yu Lin
VLM
38
9
0
05 Apr 2024
OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views
Francis Engelmann
Fabian Manhardt
Michael Niemeyer
Keisuke Tateno
Marc Pollefeys
Federico Tombari
VLM
65
32
1
04 Apr 2024
Unsegment Anything by Simulating Deformation
Jiahao Lu
Xingyi Yang
Xinchao Wang
29
4
0
03 Apr 2024
Segment Any 3D Object with Language
Seungjun Lee
Yuyang Zhao
Gim Hee Lee
31
1
0
02 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Jieneng Chen
Qihang Yu
Xiaohui Shen
Alan L. Yuille
Liang-Chieh Chen
3DV
VLM
28
24
0
02 Apr 2024
T-VSL: Text-Guided Visual Sound Source Localization in Mixtures
Tanvir Mahmud
Yapeng Tian
Diana Marculescu
42
8
0
02 Apr 2024
OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation
Xiongwei Wu
Sicheng Yu
Ee-Peng Lim
Chong-Wah Ngo
VLM
30
2
0
01 Apr 2024
GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields
Yunsong Wang
Hanlin Chen
Gim Hee Lee
32
5
0
01 Apr 2024
Training-Free Semantic Segmentation via LLM-Supervision
Wenfang Sun
Yingjun Du
Gaowen Liu
Ramana Rao Kompella
Cees G. M. Snoek
VLM
35
2
0
31 Mar 2024
SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs
Yang Miao
Francis Engelmann
Olga Vysotska
Federico Tombari
Marc Pollefeys
Daniel Barath
3DPC
35
6
0
30 Mar 2024
Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting
Jun Guo
Xiaojian Ma
Yue Fan
Huaping Liu
Qing Li
3DGS
36
26
0
22 Mar 2024
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Qing Jiang
Feng Li
Zhaoyang Zeng
Tianhe Ren
Shilong Liu
Lei Zhang
VLM
27
36
0
21 Mar 2024
Better Call SAL: Towards Learning to Segment Anything in Lidar
Aljoša Ošep
Tim Meinhardt
Francesco Ferroni
Neehar Peri
Deva Ramanan
Laura Leal-Taixé
VLM
25
15
0
19 Mar 2024
OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation
Haochen Jiang
Yueming Xu
Yihan Zeng
Hang Xu
Wei Zhang
Jianfeng Feng
Li Zhang
27
1
0
18 Mar 2024
Lifelong LERF: Local 3D Semantic Inventory Monitoring Using FogROS2
Adam Rashid
C. Kim
J. Kerr
Letian Fu
Kush Hari
...
Michael Wang
Christian Juette
Nan Tian
Liu Ren
Kenneth Y. Goldberg
30
6
0
15 Mar 2024
PosSAM: Panoptic Open-vocabulary Segment Anything
VS Vibashan
Shubhankar Borse
Hyojin Park
Debasmit Das
Vishal M. Patel
Munawar Hayat
Fatih Porikli
VLM
MLLM
31
6
0
14 Mar 2024
Renovating Names in Open-Vocabulary Segmentation Benchmarks
Haiwen Huang
Songyou Peng
Dan Zhang
Andreas Geiger
VLM
29
3
0
14 Mar 2024
Annotation Free Semantic Segmentation with Vision Foundation Models
Soroush Seifi
Daniel Olmeda Reino
Fabien Despinoy
Rahaf Aljundi
VLM
24
1
0
14 Mar 2024
Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation
Zicheng Zhang
Tong Zhang
Yi Zhu
Jian-zhuo Liu
Xiaodan Liang
QiXiang Ye
Wei Ke
VLM
44
2
0
13 Mar 2024
PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models
Qingdong He
Jinlong Peng
Zhengkai Jiang
Xiaobin Hu
Jiangning Zhang
Qiang Nie
Yabiao Wang
Chengjie Wang
3DPC
VLM
41
5
0
11 Mar 2024
Reframe Anything: LLM Agent for Open World Video Reframing
Jiawang Cao
Yongliang Wu
Weiheng Chi
Wenbo Zhu
Ziyue Su
Jay Wu
26
3
0
10 Mar 2024
Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities
Kaiwen Cai
Zhekai Duan
Gaowen Liu
Charles Fleming
Chris Xiaoxuan Lu
VLM
28
3
0
07 Mar 2024
Generalizable Semantic Vision Query Generation for Zero-shot Panoptic and Semantic Segmentation
Jialei Chen
Daisuke Deguchi
Chenkai Zhang
Hiroshi Murase
VLM
33
1
0
21 Feb 2024
Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships
Sebastian Koch
Narunas Vaskevicius
Mirco Colosi
Pedro Hermosilla
Timo Ropinski
3DPC
28
25
0
19 Feb 2024
HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections
Chen Dudai
Morris Alper
Hana Bezalel
Rana Hanocka
Itai Lang
Hadar Averbuch-Elor
23
2
0
14 Feb 2024
Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision
Zhaoqing Wang
Xiaobo Xia
Ziye Chen
Xiao He
Yandong Guo
Mingming Gong
Tongliang Liu
VLM
14
11
0
14 Feb 2024
V-IRL: Grounding Virtual Intelligence in Real Life
Jihan Yang
Runyu Ding
Ellis L Brown
Xiaojuan Qi
Saining Xie
LM&Ro
48
19
0
05 Feb 2024
CLIP Can Understand Depth
Dunam Kim
Seokju Lee
VLM
MDE
36
2
0
05 Feb 2024
Exploring Simple Open-Vocabulary Semantic Segmentation
Zihang Lai
VLM
14
0
0
22 Jan 2024
Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation
Ci-Siang Lin
Chien-Yi Wang
Yu-Chiang Frank Wang
Min-Hung Chen
VLM
21
0
0
22 Jan 2024