Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model

Computer Vision and Pattern Recognition (CVPR), 2022
28 March 2022
Yu Du
Fangyun Wei
Zihe Zhang
Miaojing Shi
Yue Gao
Guoqi Li
VPVLM, VLM
arXiv:2203.14940 (abs) | PDF | HTML | GitHub (181★)

Papers citing "Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model"

50 / 278 papers shown
Fine-Grained Visual Prompting
Neural Information Processing Systems (NeurIPS), 2023
Lingfeng Yang
Yueze Wang
Xiang Li
Xinlong Wang
Jian Yang
ObjD, VLM
197
97
0
07 Jun 2023
LoCoOp: Few-Shot Out-of-Distribution Detection via Prompt Learning
Neural Information Processing Systems (NeurIPS), 2023
Atsuyuki Miyai
Qing Yu
Go Irie
Kiyoharu Aizawa
OODD
395
112
0
02 Jun 2023
LOWA: Localize Objects in the Wild with Attributes
Xiaoyuan Guo
Kezhen Chen
Jinmeng Rao
Yawen Zhang
Baochen Sun
Jie Yang
ObjD
145
2
0
31 May 2023
Multi-modal Queried Object Detection in the Wild
Neural Information Processing Systems (NeurIPS), 2023
Yifan Xu
Mengdan Zhang
Chaoyou Fu
Peixian Chen
Xiaoshan Yang
Ke Li
Changsheng Xu
ObjD, VLM
320
46
0
30 May 2023
Contextual Object Detection with Multimodal Large Language Models
International Journal of Computer Vision (IJCV), 2023
Yuhang Zang
Wei Li
Jun Han
Kaiyang Zhou
Chen Change Loy
ObjD, VLM, MLLM
277
135
0
29 May 2023
Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning
European Conference on Computer Vision (ECCV), 2023
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LRM, LM&Ro
284
5
0
26 May 2023
Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment
Computer Vision and Pattern Recognition (CVPR), 2023
Runqi Wang
Hao Zheng
Xiaoyue Duan
Jianzhuang Liu
Yuning Lu
Tian Wang
Songcen Xu
Baochang Zhang
VLM
174
14
0
19 May 2023
Going Denser with Open-Vocabulary Part Segmentation
IEEE International Conference on Computer Vision (ICCV), 2023
Pei Sun
Shoufa Chen
Chenchen Zhu
Fanyi Xiao
Ping Luo
Saining Xie
Zhicheng Yan
ObjD, VLM
202
70
0
18 May 2023
Mobile User Interface Element Detection Via Adaptively Prompt Tuning
Computer Vision and Pattern Recognition (CVPR), 2023
Zhangxuan Gu
Zhuoer Xu
Haoxing Chen
Jun Lan
Changhua Meng
Weiqiang Wang
142
8
0
16 May 2023
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD, ViT, VLM
380
107
0
11 May 2023
Vision-Language Models in Remote Sensing: Current Progress and Future Trends
IEEE Geoscience and Remote Sensing Magazine (GRSM), 2023
Xiang Li
Congcong Wen
Yuan Hu
Zhenghang Yuan
Xiao Xiang Zhu
VLM
310
151
0
09 May 2023
Hypernymization of named entity-rich captions for grounding-based multi-modal pretraining
International Conference on Multimedia Retrieval (ICMR), 2023
Giacomo Nebbia
Adriana Kovashka
150
0
0
25 Apr 2023
OVTrack: Open-Vocabulary Multiple Object Tracking
Computer Vision and Pattern Recognition (CVPR), 2023
Siyuan Li
Tobias Fischer
Lei Ke
Henghui Ding
Martin Danelljan
Feng Yu
DiffM
247
59
0
17 Apr 2023
Progressive Visual Prompt Learning with Contrastive Feature Re-formation
International Journal of Computer Vision (IJCV), 2023
C. Xu
Yuhan Zhu
Haocheng Shen
Fengyuan Shi
Boheng Chen
Yixuan Liao
Xiaoxin Chen
Limin Wang
VLM
259
45
0
17 Apr 2023
TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation
Jingyao Li
Pengguang Chen
Shengju Qian
Jiaya Jia
VLM
131
15
0
15 Apr 2023
CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks
Pattern Recognition (Pattern Recogn.), 2023
Yi Li
Hualiang Wang
Yiqun Duan
Xuelong Li
VLM, MedIm, AAML
95
69
0
12 Apr 2023
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
Neural Information Processing Systems (NeurIPS), 2023
Shuhuai Ren
Aston Zhang
Yi Zhu
Shuai Zhang
Shuai Zheng
Mu Li
Alexander J. Smola
Xu Sun
VPVLM, VLM
209
40
0
10 Apr 2023
Defense-Prefix for Preventing Typographic Attacks on CLIP
Hiroki Azuma
Yusuke Matsui
VLM, AAML
238
25
0
10 Apr 2023
V3Det: Vast Vocabulary Visual Detection Dataset
IEEE International Conference on Computer Vision (ICCV), 2023
Yuan Liu
Pan Zhang
Tao Chu
Yuhang Cao
Yujie Zhou
Tong Wu
Sijin Yu
Conghui He
Dahua Lin
VLM, ObjD
269
76
0
07 Apr 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Computer Vision and Pattern Recognition (CVPR), 2023
Syed Talal Wasim
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
M. Shah
VLM, VPVLM
203
108
0
06 Apr 2023
Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Jiayi Guo
Chaofei Wang
You Wu
Eric Zhang
Kai Wang
Xingqian Xu
Qing Xiao
Humphrey Shi
Gao Huang
DiffM, VLM
225
36
0
06 Apr 2023
Learning to Name Classes for Vision and Language Models
Computer Vision and Pattern Recognition (CVPR), 2023
Sarah Parisot
Yongxin Yang
Jingyu Sun
VLM
187
15
0
04 Apr 2023
Towards Open-Vocabulary Video Instance Segmentation
IEEE International Conference on Computer Vision (ICCV), 2023
Haochen Wang
Cilin Yan
Shuailong Wang
Xiaolong Jiang
XU Tang
Yao Hu
Weidi Xie
E. Gavves
VOS, VLM
213
46
0
04 Apr 2023
RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
Computer Vision and Pattern Recognition (CVPR), 2023
Jihan Yang
Runyu Ding
Weipeng Deng
Zhe Wang
Xiaojuan Qi
272
98
0
03 Apr 2023
Vision-Language Models for Vision Tasks: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jingyi Zhang
Jiaxing Huang
Sheng Jin
Shijian Lu
VLM
475
967
0
03 Apr 2023
Zero-shot Referring Image Segmentation with Global-Local Context Features
Computer Vision and Pattern Recognition (CVPR), 2023
S. Yu
Paul Hongsuck Seo
Jeany Son
276
76
0
31 Mar 2023
Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations
Computer Vision and Pattern Recognition (CVPR), 2023
VS Vibashan
Ning Yu
Chen Xing
Can Qin
M. Gao
Juan Carlos Niebles
Vishal M. Patel
Ran Xu
VLM, ISeg
230
19
0
29 Mar 2023
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks
Weicheng Kuo
A. Piergiovanni
Dahun Kim
Xiyang Luo
Benjamin Caine
...
Luowei Zhou
Andrew M. Dai
Zhifeng Chen
Claire Cui
A. Angelova
MLLM, VLM
341
30
0
29 Mar 2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2023
Sha Ning
Longtian Qiu
Yongfei Liu
Xuming He
VLM
357
70
0
28 Mar 2023
POAR: Towards Open Vocabulary Pedestrian Attribute Recognition
ACM Multimedia (ACM MM), 2023
Yue Zhang
Suchen Wang
Shichao Kan
Zhenyu Weng
Yigang Cen
Yap-Peng Tan
ViT
137
10
0
26 Mar 2023
Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection
Hwanjun Song
Jihwan Bang
VLM, ObjD
236
21
0
25 Mar 2023
Three ways to improve feature alignment for open vocabulary detection
Relja Arandjelović
A. Andonian
A. Mensch
Olivier J. Hénaff
Jean-Baptiste Alayrac
Andrew Zisserman
VLM, ObjD
231
19
0
23 Mar 2023
CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Yiting Cheng
Fangyun Wei
Jianmin Bao
Dong Chen
Wenqian Zhang
SLR
178
39
0
22 Mar 2023
Natural Language-Assisted Sign Language Recognition
Computer Vision and Pattern Recognition (CVPR), 2023
Ronglai Zuo
Fangyun Wei
Brian Mak
SLR
208
77
0
21 Mar 2023
Detecting Everything in the Open World: Towards Universal Object Detection
Computer Vision and Pattern Recognition (CVPR), 2023
Zhenyu Wang
Yali Li
Xi Chen
Ser-Nam Lim
Antonio Torralba
Hengshuang Zhao
Shengjin Wang
ObjD, VLM
192
104
0
21 Mar 2023
Decomposed Prototype Learning for Few-Shot Scene Graph Generation
Xingchen Li
Long Chen
Guikun Chen
Yinfu Feng
Yi Yang
Jun Xiao
160
7
0
20 Mar 2023
Investigating the Role of Attribute Context in Vision-Language Models for Object Recognition and Detection
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Kyle Buettner
Adriana Kovashka
177
0
0
17 Mar 2023
VEIL: Vetting Extracted Image Labels from In-the-Wild Captions for Weakly-Supervised Object Detection
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Arushi Rai
Adriana Kovashka
256
0
0
16 Mar 2023
GridCLIP: One-Stage Object Detection by Grid-Level CLIP Representation Learning
Jiaying Lin
S. Gong
VLM, CLIP, ObjD
183
27
0
16 Mar 2023
Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Xinyang Liu
Dongsheng Wang
Bowei Fang
Miaoge Li
Zhibin Duan
Yishi Xu
Bo Chen
Mingyuan Zhou
VLM, VPVLM
290
7
0
16 Mar 2023
SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency
AAAI Conference on Artificial Intelligence (AAAI), 2023
Cong Wang
Jin-shan Pan
Wanyu Lin
Jiangxin Dong
Xiaomei Wu
VLM, MDE
266
51
0
13 Mar 2023
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
Computer Vision and Pattern Recognition (CVPR), 2023
Luting Wang
Yi Liu
Penghui Du
Zihan Ding
Yue Liao
Qiaosong Qi
Biaolong Chen
Si Liu
ObjD, VLM
226
87
0
10 Mar 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2023
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
824
400
0
08 Mar 2023
CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
Junbo Zhang
Runpei Dong
Kaisheng Ma
CLIP, VLM
227
107
0
08 Mar 2023
CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
Computer Vision and Pattern Recognition (CVPR), 2023
Yanxin Long
Youpeng Wen
Jianhua Han
Hang Xu
Pengzhen Ren
Wei Zhang
Sheng Zhao
Xiaodan Liang
ObjD, VLM
173
44
0
04 Mar 2023
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning
International Conference on Learning Representations (ICLR), 2023
Bo Wan
Yongfei Liu
Desen Zhou
Tinne Tuytelaars
Xuming He
111
16
0
02 Mar 2023
Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Renrui Zhang
Liuhui Wang
Ziyu Guo
Jianbo Shi
3DPC
253
11
0
01 Mar 2023
Aligning Bag of Regions for Open-Vocabulary Object Detection
Computer Vision and Pattern Recognition (CVPR), 2023
Size Wu
Wenwei Zhang
Sheng Jin
Wentao Liu
Chen Change Loy
VLM, ObjD
169
148
0
27 Feb 2023
Frustratingly Simple but Effective Zero-shot Detection and Segmentation: Analysis and a Strong Baseline
Siddhesh Khandelwal
Anirudth Nambirajan
Behjat Siddiquie
J. Eledath
Leonid Sigal
VLM
230
6
0
14 Feb 2023
Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection
International Conference on Learning Representations (ICLR), 2023
Kaifeng Gao
Long Chen
Hanwang Zhang
Jun Xiao
Qianru Sun
VLM, VPVLM
161
34
0
01 Feb 2023