ResearchTrend.AI
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP

9 October 2022
Feng Liang
Bichen Wu
Xiaoliang Dai
Kunpeng Li
Yinan Zhao
Hang Zhang
Peizhao Zhang
Peter Vajda
Diana Marculescu
    CLIP
    VLM

Papers citing "Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP"

50 / 331 papers shown
Title
LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation
Huadong Tang
Youpeng Zhao
Y. Huang
Min Xu
Jun Wang
Qiang Wu
MLLM
VLM
78
0
0
30 Nov 2024
ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model
Kunyang Han
Yibo Hu
Mengxue Qu
Hailin Shi
Yao Zhao
Y. X. Wei
MLLM
VLM
3DV
77
1
0
29 Nov 2024
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
Luca Barsellotti
Lorenzo Bianchi
Nicola Messina
F. Carrara
Marcella Cornia
Lorenzo Baraldi
Fabrizio Falchi
Rita Cucchiara
VLM
64
2
0
28 Nov 2024
CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections
Mohamed Fazli Mohamed Imam
Rufael Fedaku Marew
Jameel Hassan
M. Fiaz
Alham Fikri Aji
Hisham Cholakkal
VLM
76
0
0
28 Nov 2024
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
Chanyoung Kim
Dayun Ju
Woojung Han
Ming-Hsuan Yang
Seong Jae Hwang
VLM
VOS
68
0
0
26 Nov 2024
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
Yuhang Yang
Jinhong Deng
Wen Li
Lixin Duan
VLM
68
0
0
24 Nov 2024
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
Sule Bai
Yong-Jin Liu
Yifei Han
Haoji Zhang
Yansong Tang
VLM
72
3
0
24 Nov 2024
ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements
M. Arda Aydın
Efe Mert Çırpar
Elvin Abdinli
Gözde B. Ünal
Y. Sahin
VLM
59
0
0
18 Nov 2024
Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
Wentao Bao
K. Li
Yuxiao Chen
Deep Patel
Martin Renqiang Min
Yu Kong
VLM
ObjD
32
2
0
17 Nov 2024
CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation
Dengke Zhang
Fagui Liu
Quan Tang
VLM
40
1
0
15 Nov 2024
Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation
Yuheng Shi
Minjing Dong
Chang Xu
VLM
27
1
0
14 Nov 2024
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
Shehan Munasinghe
Hanan Gani
Wenqi Zhu
Jiale Cao
Eric P. Xing
F. Khan
Salman Khan
MLLM
VGen
VLM
42
6
0
07 Nov 2024
Multiple Information Prompt Learning for Cloth-Changing Person Re-Identification
Shengxun Wei
Zan Gao
Yibo Zhao
Weili Guan
Shengyong Chen
41
1
0
01 Nov 2024
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
Xize Cheng
Siqi Zheng
Zehan Wang
Minghui Fang
Ziang Zhang
...
Z. Ma
Shengpeng Ji
Jialong Zuo
Tao Jin
Zhou Zhao
17
1
0
28 Oct 2024
Scene Graph Generation with Role-Playing Large Language Models
Guikun Chen
Jin Li
Wenguan Wang
VLM
40
5
0
20 Oct 2024
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding
Guangda Ji
Silvan Weder
Francis Engelmann
Marc Pollefeys
Hermann Blum
3DV
44
3
0
17 Oct 2024
Sensitivity of Generative VLMs to Semantically and Lexically Altered Prompts
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
VLM
16
2
0
16 Oct 2024
SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture
Jiayi Han
Liang Du
Hongwei Du
Xiangguo Zhou
Yiwen Wu
Weibo Zheng
Donghong Han
CLL
MoMe
MoE
33
2
0
10 Oct 2024
Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments
Meng Yu
Luojie Yang
Xunjie He
Yi Yang
Yufeng Yue
VLM
20
0
0
09 Oct 2024
Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality
Youngtaek Oh
Jae-Won Cho
Dong-Jin Kim
In So Kweon
Junmo Kim
VLM
CoGe
CLIP
14
4
0
07 Oct 2024
VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models
Harshit
Tolga Tasdizen
CoGe
VLM
17
1
0
06 Oct 2024
Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning
Minheng Ni
Yutao Fan
Lei Zhang
Wangmeng Zuo
LRM
AI4CE
21
6
0
04 Oct 2024
Understanding and Mitigating Miscalibration in Prompt Tuning for Vision-Language Models
Shuoyuan Wang
Yixuan Li
Hongxin Wei
VLM
36
2
0
03 Oct 2024
Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels
Heeseong Shin
Chaehyun Kim
Sunghwan Hong
Seokju Cho
Anurag Arnab
Paul Hongsuck Seo
Seungryong Kim
VLM
27
1
0
30 Sep 2024
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
Zechen Bai
Tong He
Haiyang Mei
Pichao Wang
Ziteng Gao
Joya Chen
Lei Liu
Zheng Zhang
Mike Zheng Shou
VLM
VOS
MLLM
32
17
0
29 Sep 2024
Search3D: Hierarchical Open-Vocabulary 3D Segmentation
Ayca Takmaz
Alexandros Delitzas
R. Sumner
Francis Engelmann
Johanna Wald
Federico Tombari
48
10
0
27 Sep 2024
DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation
Soojin Jang
Jungmin Yun
Junehyoung Kwon
Eunju Lee
Youngbin Kim
38
3
0
24 Sep 2024
Tag Map: A Text-Based Map for Spatial Reasoning and Navigation with Large Language Models
Mike Zhang
Kaixian Qu
Vaishakh Patil
César Cadena
Marco Hutter
LM&Ro
3DV
25
3
0
23 Sep 2024
From Experts to the Public: Governing Multimodal Language Models in Politically Sensitive Video Analysis
Tanusree Sharma
Yujin Potter
Zachary Kilhoffer
Yun Huang
Dawn Song
Yang Wang
51
3
0
15 Sep 2024
Generalization Boosted Adapter for Open-Vocabulary Segmentation
Wenhao Xu
Changwei Wang
Xuxiang Feng
Rongtao Xu
Longzhao Huang
Zherui Zhang
Li Guo
Shibiao Xu
VLM
28
1
0
13 Sep 2024
Open-Vocabulary Remote Sensing Image Semantic Segmentation
Qinglong Cao
Yuntian Chen
Chao Ma
Xiaokang Yang
17
3
0
12 Sep 2024
Revisiting Prompt Pretraining of Vision-Language Models
Zhenyuan Chen
Lingfeng Yang
Shuo Chen
Zhaowei Chen
Jiajun Liang
Xiang Li
MLLM
VPVLM
VLM
33
1
0
10 Sep 2024
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation
Xi Chen
Haosen Yang
Sheng Jin
Xiatian Zhu
H. Yao
VLM
29
3
0
05 Sep 2024
iSeg: An Iterative Refinement-based Framework for Training-free Segmentation
Lin Sun
Jiale Cao
J. Xie
F. Khan
Yanwei Pang
DiffM
30
1
0
05 Sep 2024
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
Yi-Chia Chen
Wei-Hua Li
Cheng Sun
Yu-Chiang Frank Wang
Chu-Song Chen
VLM
30
10
0
01 Sep 2024
EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution
F. Argenziano
Michele Brienza
Vincenzo Suriani
Daniele Nardi
D. Bloisi
LM&Ro
41
0
0
30 Aug 2024
MROVSeg: Breaking the Resolution Curse of Vision-Language Models in Open-Vocabulary Semantic Segmentation
Yuanbing Zhu
Bingke Zhu
Zhen Chen
Huan Xu
Ming Tang
Jinqiao Wang
VLM
24
0
0
27 Aug 2024
Image Segmentation in Foundation Model Era: A Survey
Tianfei Zhou
Fei Zhang
Boyu Chang
Wenguan Wang
Ye Yuan
E. Konukoglu
Daniel Cremers
VLM
38
4
0
23 Aug 2024
Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant
Guofeng Mei
Luigi Riz
Yiming Wang
Fabio Poiesi
ISeg
VLM
52
3
0
20 Aug 2024
OVOSE: Open-Vocabulary Semantic Segmentation in Event-Based Cameras
Muhammad Rameez Ur Rahman
Jhony H. Giraldo
Indro Spinelli
Stéphane Lathuilière
Fabio Galasso
VLM
21
0
0
18 Aug 2024
Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation
Tri Ton
Ji Woo Hong
Soohwan Eom
Jun Yeop Shim
Junyeong Kim
Chang D. Yoo
3DPC
ISeg
30
2
0
16 Aug 2024
Visual Agents as Fast and Slow Thinkers
Guangyan Sun
Mingyu Jin
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Dongfang Liu
LLMAG
LRM
68
11
0
16 Aug 2024
Towards Flexible Visual Relationship Segmentation
Fangrui Zhu
Jianwei Yang
Huaizu Jiang
VOS
26
1
0
15 Aug 2024
ReCLIP++: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation
Jingyun Wang
Guoliang Kang
VLM
SSL
29
7
0
13 Aug 2024
Efficient and Versatile Robust Fine-Tuning of Zero-shot Models
Sungyeon Kim
Boseung Jeong
Donghyun Kim
Suha Kwak
VLM
20
2
0
11 Aug 2024
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
Dahyun Kang
Minsu Cho
ObjD
VLM
21
9
0
09 Aug 2024
Visual Grounding for Object-Level Generalization in Reinforcement Learning
Haobin Jiang
Zongqing Lu
LM&Ro
22
2
0
04 Aug 2024
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval
Gangyan Zeng
Yuan Zhang
Jin Wei
Dongbao Yang
Peng Zhang
Yiwen Gao
Xugong Qin
Yu Zhou
VLM
CLIP
13
0
0
01 Aug 2024
Diffusion Feedback Helps CLIP See Better
Wenxuan Wang
Quan-Sen Sun
Fan Zhang
Yepeng Tang
Jing Liu
Xinlong Wang
VLM
38
14
0
29 Jul 2024
Every Part Matters: Integrity Verification of Scientific Figures Based on Multimodal Large Language Models
Xiang Shi
Jiawei Liu
Yinpeng Liu
Qikai Cheng
Wei Lu
39
0
0
26 Jul 2024