Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model

Computer Vision and Pattern Recognition (CVPR), 2022
28 March 2022
Yu Du
Fangyun Wei
Zihe Zhang
Miaojing Shi
Yue Gao
Guoqi Li
    VPVLM, VLM
ArXiv (abs) | PDF | HTML | GitHub (181★)

Papers citing "Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model"

50 / 278 papers shown
Title
Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection
Yang Cao
Yihan Zeng
Hang Xu
Dan Xu
3DPC, ObjD
276
14
0
02 Jun 2024
Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection
Jiaming Li
Jiacheng Zhang
Jichang Li
Ge Li
Si Liu
Liang Lin
Guanbin Li
ObjD, VLM
310
27
0
01 Jun 2024
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection
Fangyi Chen
Han Zhang
Zhantao Yang
Hao Chen
Kai Hu
Marios Savvides
ObjD, VLM
194
7
0
30 May 2024
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
Junjie Wang
Bin Chen
Bin Kang
Yulin Li
Yichi Chen
Weizhi Xian
Huifeng Chang
VLM, ObjD
205
15
0
28 May 2024
Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View
Jin Wang
Shichao Dong
Yapeng Zhu
Kelu Yao
Weidong Zhao
Chao Li
Ping Luo
CoGe, LRM
233
5
0
27 May 2024
Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities
IEEE Communications Surveys and Tutorials (COMST), 2024
Hao Zhou
Chengming Hu
Ye Yuan
Yufei Cui
Yili Jin
...
Di Wu
Xue Liu
Charlie Zhang
Xianbin Wang
Jiangchuan Liu
244
164
0
17 May 2024
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
Computer Vision and Pattern Recognition (CVPR), 2024
Mingxuan Liu
Tyler L. Hayes
Elisa Ricci
G. Csurka
Riccardo Volpi
ObjD
249
9
0
16 May 2024
Open-Vocabulary Object Detection via Neighboring Region Attention Alignment
Engineering Applications of Artificial Intelligence (EAAI), 2024
Sunyuan Qiang
Xianfei Li
Yanyan Liang
Wenlong Liao
Tao He
Pai Peng
ObjD
185
0
0
14 May 2024
Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
Qiyuan Dai
Sibei Yang
181
24
0
18 Apr 2024
Progressive Multi-modal Conditional Prompt Tuning
Xiaoyu Qiu
Hao Feng
Yuechen Wang
Wen-gang Zhou
Houqiang Li
VLM
227
5
0
18 Apr 2024
Single-temporal Supervised Remote Change Detection for Domain Generalization
Qiangang Du
Jinlong Peng
Xu Chen
Qingdong He
Liren He
Qiang Nie
Wenbing Zhu
Mingmin Chi
Yabiao Wang
Chengjie Wang
248
1
0
17 Apr 2024
Zero-shot detection of buildings in mobile LiDAR using Language Vision Model
June Moh Goo
Zichao Zeng
Jan Boehm
251
3
0
15 Apr 2024
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
Lewei Yao
Renjie Pi
Jianhua Han
Xiaodan Liang
Hang Xu
Wei Zhang
Zhenguo Li
Dan Xu
VLM, ObjD
240
43
0
14 Apr 2024
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation
Yanhao Zheng
Kai Liu
ObjD
180
3
0
12 Apr 2024
Deep Learning-Based Out-of-distribution Source Code Data Identification: How Far Have We Gone?
Van Nguyen
Xingliang Yuan
Tingmin Wu
Surya Nepal
M. Grobler
Carsten Rudolph
209
2
0
09 Apr 2024
Retrieval-Augmented Open-Vocabulary Object Detection
Jooyeon Kim
Eulrang Cho
Sehyung Kim
Hyunwoo J. Kim
VLM, ObjD
224
20
0
08 Apr 2024
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation
Ji-Jia Wu
Andy Chia-Hao Chang
Chieh-Yu Chuang
Chun-Pei Chen
Yu-Lun Liu
Min-Hung Chen
Hou-Ning Hu
Yung-Yu Chuang
Yen-Yu Lin
VLM
321
18
0
05 Apr 2024
Is CLIP the main roadblock for fine-grained open-world perception?
International Conference on Content-Based Multimedia Indexing (CBMI), 2024
Lorenzo Bianchi
F. Carrara
Nicola Messina
Fabrizio Falchi
VLM
193
10
0
04 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Computer Vision and Pattern Recognition (CVPR), 2024
Jieneng Chen
Qihang Yu
Xiaohui Shen
Yaoyao Liu
Liang-Chieh Chen
3DV, VLM
385
48
0
02 Apr 2024
T-VSL: Text-Guided Visual Sound Source Localization in Mixtures
Computer Vision and Pattern Recognition (CVPR), 2024
Tanvir Mahmud
Yapeng Tian
Diana Marculescu
166
21
0
02 Apr 2024
Weakly-supervised Audio Separation via Bi-modal Semantic Similarity
International Conference on Learning Representations (ICLR), 2024
Tanvir Mahmud
Saeed Amizadeh
K. Koishida
Diana Marculescu
AI4TS
209
4
0
02 Apr 2024
OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation
Xiongwei Wu
Sicheng Yu
Ee-Peng Lim
Chong-Wah Ngo
VLM
181
5
0
01 Apr 2024
Open-Vocabulary Object Detectors: Robustness Challenges under Distribution Shifts
Prakash Chandra Chhipa
Kanjar De
Meenakshi Subhash Chippa
Rajkumar Saini
Marcus Liwicki
ObjD, VLM
229
4
0
01 Apr 2024
Prompt Learning for Oriented Power Transmission Tower Detection in High-Resolution SAR Images
Tianyang Li
Chao Wang
Hong Zhang
84
0
0
01 Apr 2024
Lipsum-FT: Robust Fine-Tuning of Zero-Shot Models Using Random Text Guidance
G. Nam
Byeongho Heo
Juho Lee
VLM
173
12
0
01 Apr 2024
Prompt Learning via Meta-Regularization
Jinyoung Park
Juyeon Ko
Hyunwoo J. Kim
VLM, VPVLM
280
38
0
01 Apr 2024
Training-Free Semantic Segmentation via LLM-Supervision
Wenfang Sun
Yingjun Du
Gaowen Liu
Ramana Rao Kompella
Cees G. M. Snoek
VLM
267
11
0
31 Mar 2024
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
Zhenyu Wang
Yali Li
Taichi Liu
Hengshuang Zhao
Shengjin Wang
3DPC, ObjD
261
15
0
28 Mar 2024
Open-Set Recognition in the Age of Vision-Language Models
Dimity Miller
Niko Sünderhauf
Alex Kenna
Keita Mason
VLM
222
10
0
25 Mar 2024
Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval
Computer Vision and Pattern Recognition (CVPR), 2024
Yuchen Suo
Fan Ma
Linchao Zhu
Yi Yang
229
40
0
24 Mar 2024
FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs
International Conference on Learning Representations (ICLR), 2024
Sepehr Dehdashtian
Lan Wang
Vishnu Boddeti
VLM
251
28
0
22 Mar 2024
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation
Wenqi Zhu
Jiale Cao
Jin Xie
Shuangming Yang
Yanwei Pang
VLM, CLIP
258
10
0
19 Mar 2024
OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation
Junhao Cai
Yisheng He
Weihao Yuan
Siyu Zhu
Zilong Dong
Liefeng Bo
Qifeng Chen
DiffM
261
10
0
19 Mar 2024
Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization
British Machine Vision Conference (BMVC), 2024
Zhao Wang
Aoxue Li
Fengwei Zhou
Zhenguo Li
Qi Dou
ObjD, VLM
180
4
0
14 Mar 2024
Towards Zero-shot Human-Object Interaction Detection via Vision-Language Integration
Weiying Xue
Nan Zhuang
Qiwei Xiong
Yuxiao Wang
Zhenao Wei
Xiaofen Xing
Xiangmin Xu
VLM
294
6
0
12 Mar 2024
Test-time Distribution Learning Adapter for Cross-modal Visual Reasoning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Yi Zhang
Ce Zhang
VLM
201
3
0
10 Mar 2024
Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery
Xavier Bou
Gabriele Facciolo
R. G. V. Gioi
Jean-Michel Morel
T. Ehret
ObjD
249
8
0
08 Mar 2024
Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities
European Conference on Computer Vision (ECCV), 2024
Kaiwen Cai
Zhekai Duan
Gaowen Liu
Charles Fleming
Chris Xiaoxuan Lu
VLM
201
8
0
07 Mar 2024
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Hoang Phan
Andrew Gordon Wilson
Qi Lei
280
10
0
05 Mar 2024
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset
Chengjian Feng
Yujie Zhong
Zequn Jie
Weidi Xie
Lin Ma
ObjD
287
33
0
08 Feb 2024
LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors
Sheng Jin
Xue-Qiu Jiang
Jiaxing Huang
Lewei Lu
Shijian Lu
VLM, ObjD
152
38
0
07 Feb 2024
YOLO-World: Real-Time Open-Vocabulary Object Detection
Tianheng Cheng
Lin Song
Yixiao Ge
Wenyu Liu
Xinggang Wang
Ying Shan
VLM, ObjD
361
605
0
30 Jan 2024
Towards Lifelong Scene Graph Generation with Knowledge-ware In-context Prompt Learning
Tao He
Tongtong Wu
Dongyang Zhang
Guiduo Duan
Ke Qin
Yuan-Fang Li
CLL
269
1
0
26 Jan 2024
Learning to Prompt with Text Only Supervision for Vision-Language Models
Muhammad Uzair Khattak
Muhammad Ferjad Naeem
Muzammal Naseer
Luc Van Gool
F. Tombari
VLM, VPVLM
238
38
0
04 Jan 2024
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation
Zihao Xiao
Longlong Jing
Shangxuan Wu
Alex Zihao Zhu
Jingwei Ji
...
Thomas Funkhouser
Weicheng Kuo
A. Angelova
Yin Zhou
Shiwei Sheng
VLM
421
11
0
04 Jan 2024
Query-Based Knowledge Sharing for Open-Vocabulary Multi-Label Classification
Xueling Zhu
Jian Liu
Dongqi Tang
Jiawei Ge
Weijia Liu
Bo Liu
Jiuxin Cao
VLM
161
1
0
02 Jan 2024
Leveraging Open-Vocabulary Diffusion to Camouflaged Instance Segmentation
Tuan-Anh Vu
Duc Thanh Nguyen
Qing Guo
Binh-Son Hua
N. Chung
Ivor W. Tsang
Sai-Kit Yeung
DiffM
190
5
0
29 Dec 2023
Revisiting Few-Shot Object Detection with Vision-Language Models
Anish Madan
Neehar Peri
Shu Kong
Deva Ramanan
VLM
345
28
0
22 Dec 2023
CLIM: Contrastive Language-Image Mosaic for Region Representation
Size Wu
Wenwei Zhang
Lumin Xu
Sheng Jin
Wentao Liu
Chen Change Loy
ObjD, VLM
174
24
0
18 Dec 2023
Simple Image-level Classification Improves Open-vocabulary Object Detection
AAAI Conference on Artificial Intelligence (AAAI), 2023
Ru Fang
Guansong Pang
Xiaolong Bai
ObjD, VLM
262
20
0
16 Dec 2023