ResearchTrend.AI
Open-Vocabulary Object Detection Using Captions


20 November 2020
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
    VLM
    ObjD

Papers citing "Open-Vocabulary Object Detection Using Captions"

50 / 317 papers shown
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie
Bin Wang
Fanjing Kong
Jincheng Li
Dawei Liang
Gengshen Zhang
Dawei Leng
Yuhui Yin
CLIP
VLM
42
0
0
08 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Y. Chen
Zhuotao Tian
VLM
38
0
0
07 May 2025
Generalized Visual Relation Detection with Diffusion Models
Kaifeng Gao
Siqi Chen
Hanwang Zhang
Jun Xiao
Yueting Zhuang
Qianru Sun
30
0
0
16 Apr 2025
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
Yifei Dong
Fengyi Wu
Sanjian Zhang
Guangyu Chen
Yuzhi Hu
...
Jingdong Sun
Siyu Huang
Feng Liu
Qi Dai
Zhi-Qi Cheng
39
0
0
16 Apr 2025
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding
Tao Zhang
X. Li
Zilong Huang
Y. Li
Weixian Lei
XueQing Deng
Shihao Chen
S. Ji
Jiashi Feng
MLLM
LRM
56
1
0
14 Apr 2025
Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection
Jiancheng Pan
Yanxing Liu
Xiao He
Long Peng
Jiahao Li
Yuze Sun
Xiaomeng Huang
33
0
0
06 Apr 2025
Post-processing for Fair Regression via Explainable SVD
Zhiqun Zuo
Ding Zhu
Mohammad Mahdi Khalili
95
0
0
04 Apr 2025
Refining CLIP's Spatial Awareness: A Visual-Centric Perspective
Congpei Qiu
Yanhao Wu
Wei Ke
Xiuxiu Bai
Tong Zhang
VLM
44
0
0
03 Apr 2025
Zero-Shot 4D Lidar Panoptic Segmentation
Yushan Zhang
Aljosa Osep
Laura Leal-Taixé
Tim Meinhardt
3DPC
42
1
0
01 Apr 2025
GLRD: Global-Local Collaborative Reason and Debate with PSL for 3D Open-Vocabulary Detection
Xingyu Peng
Si Liu
Chen Gao
Yan Bai
Beipeng Mu
Xiaofei Wang
Huaxia Xia
62
0
0
26 Mar 2025
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
Zhichao Sun
Huazhang Hu
Yidong Ma
Gang Liu
Nemo Chen
Xu Tang
Yao Hu
Yongchao Xu
ObjD
47
0
0
24 Mar 2025
Anomize: Better Open Vocabulary Video Anomaly Detection
Fei Li
Wenxuan Liu
J. Chen
Ruixu Zhang
Y. Wang
X. Zhong
Zheng Wang
48
0
0
23 Mar 2025
Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification
Dongseob Kim
Hyunjung Shim
VLM
44
0
0
21 Mar 2025
Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark
Ying Liu
Yijing Hua
Haojiang Chai
Yanbo Wang
TengQi Ye
ObjD
54
0
0
19 Mar 2025
LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation
Yang Zhou
Shiyu Zhao
Y. Chen
Z. Wang
Dimitris N. Metaxas
ObjD
56
0
0
18 Mar 2025
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Chuhan Zhang
Chaoyang Zhu
Pingcheng Dong
Long Chen
Dong Zhang
ObjD
VLM
108
0
0
14 Mar 2025
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection
Shenghao Fu
Junkai Yan
Q. Yang
Xihan Wei
Xiaohua Xie
Wei-Shi Zheng
ObjD
VLM
40
0
0
13 Mar 2025
Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection
Zihao Zhang
Aming Wu
Yahong Han
ObjD
48
0
0
13 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
101
0
0
11 Mar 2025
Attention to Trajectory: Trajectory-Aware Open-Vocabulary Tracking
Yunhao Li
Yifan Jiao
Dan Meng
Heng Fan
L. Zhang
58
0
0
11 Mar 2025
YOLOE: Real-Time Seeing Anything
Ao Wang
Lihao Liu
Hui Chen
Zijia Lin
J. Han
Guiguang Ding
VLM
ObjD
72
1
0
10 Mar 2025
OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection
Adrian Chow
Evelien Riddell
Yimu Wang
Sean Sedwards
Krzysztof Czarnecki
3DPC
46
0
0
09 Mar 2025
From Dataset to Real-world: General 3D Object Detection via Generalized Cross-domain Few-shot Learning
Shuangzhi Li
Junlong Shen
Lei Ma
Xingyu Li
3DPC
48
0
0
08 Mar 2025
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images
Ziyue Huang
Yongchao Feng
Shuai Yang
Z. Liu
Qingjie Liu
Y. Wang
ObjD
108
0
0
08 Mar 2025
RTGen: Real-Time Generative Detection Transformer
Chi Ruan
ObjD
VLM
42
0
0
28 Feb 2025
Grad-ECLIP: Gradient-based Visual and Textual Explanations for CLIP
Chenyang Zhao
Kun Wang
J. H. Hsiao
Antoni B. Chan
CLIP
66
0
0
26 Feb 2025
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection
Xiangyu Gao
Yu Dai
Benliu Qiu
Hongliang Li
Heqian Qiu
Hongliang Li
ObjD
VLM
94
0
0
28 Jan 2025
Enhancing Novel Object Detection via Cooperative Foundational Models
Rohit K Bharadwaj
Muzammal Naseer
Salman Khan
F. Khan
ObjD
VLM
121
1
0
17 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
46
3
0
31 Dec 2024
V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations
Jin-Cheng Jhang
Tao Tu
Fu-En Wang
Ke Zhang
Min Sun
Cheng-Hao Kuo
71
2
0
16 Dec 2024
CATALOG: A Camera Trap Language-guided Contrastive Learning Model
Julian D. Santamaria
Claudia Isaza
Jhony H. Giraldo
76
0
0
14 Dec 2024
Open-Vocabulary High-Resolution 3D (OVHR3D) Data Segmentation and Annotation Framework
Jiuyi Xu
Meida Chen
Andrew Feng
Yangming Shi
Zifan Yu
57
0
0
09 Dec 2024
Towards Real-Time Open-Vocabulary Video Instance Segmentation
Bin Yan
Martin Sundermeyer
D. Tan
Huchuan Lu
F. Tombari
VLM
VOS
89
1
0
05 Dec 2024
OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Zhongyu Xia
Jishuo Li
Zhiwei Lin
Xinhao Wang
Y. Wang
Ming-Hsuan Yang
VLM
61
2
0
26 Nov 2024
Open Vocabulary Monocular 3D Object Detection
Jin Yao
Hao Gu
Xuweiyi Chen
Jiayun Wang
Zezhou Cheng
ObjD
VLM
71
3
0
25 Nov 2024
Language Driven Occupancy Prediction
Zhu Yu
Bowen Pang
Lizhe Liu
Runmin Zhang
Qihao Peng
Maochun Luo
Sheng Yang
Mingxia Chen
Si-Yuan Cao
Hui-Liang Shen
81
2
0
25 Nov 2024
Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
Wentao Bao
K. Li
Yuxiao Chen
Deep Patel
Martin Renqiang Min
Yu Kong
VLM
ObjD
32
2
0
17 Nov 2024
Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos
Sagnik Majumder
Tushar Nagarajan
Ziad Al-Halah
Reina Pradhan
Kristen Grauman
26
0
0
13 Nov 2024
Learning from Feedback: Semantic Enhancement for Object SLAM Using Foundation Models
Jungseok Hong
Ran Choi
John Leonard
VLM
37
0
0
11 Nov 2024
ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving
Tao Ma
Hongbin Zhou
Qiusheng Huang
Xuemeng Yang
Jianfei Guo
Bo Zhang
Min Dou
Yu Qiao
Botian Shi
Hongsheng Li
21
1
0
08 Nov 2024
Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation
Yan Li
Weiwei Guo
Xue Yang
Ning Liao
Shaofeng Zhang
Yi Yu
Wenxian Yu
Junchi Yan
ObjD
38
0
0
04 Nov 2024
ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images
Timing Yang
Yuanliang Ju
Li Yi
3DPC
32
3
0
31 Oct 2024
Unsupervised Object Discovery: A Comprehensive Survey and Unified Taxonomy
José-Fabian Villa-Vásquez
M. Pedersoli
34
1
0
30 Oct 2024
EMMA: End-to-End Multimodal Model for Autonomous Driving
Jyh-Jing Hwang
Runsheng Xu
Hubert Lin
Wei-Chih Hung
Jingwei Ji
...
Benjamin Sapp
Yin Zhou
James Guo
Dragomir Anguelov
Mingxing Tan
VLM
LM&Ro
41
28
0
30 Oct 2024
Open-Vocabulary Object Detection via Language Hierarchy
Jiaxing Huang
Jingyi Zhang
Kai Jiang
Shijian Lu
ObjD
VLM
26
1
0
27 Oct 2024
OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking
Haiji Liang
Ruize Han
VLM
21
1
0
23 Oct 2024
Scene Graph Generation with Role-Playing Large Language Models
Guikun Chen
Jin Li
Wenguan Wang
VLM
40
5
0
20 Oct 2024
Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability
Yusuke Hosoya
Masanori Suganuma
Takayuki Okatani
ObjD
16
0
0
20 Oct 2024
MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs
Yunqiu Xu
Linchao Zhu
Yi Yang
23
3
0
16 Oct 2024
Open World Object Detection: A Survey
Yiming Li
Yi Wang
Wenqian Wang
Dan Lin
Bingbing Li
Kim-Hui Yap
ObjD
30
0
0
15 Oct 2024