1902.09630
Cited By
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
25 February 2019
S. Hamid Rezatofighi
Deyuan Li
JunYoung Gwak
Amir Sadeghian
Ian Reid
Silvio Savarese
Papers citing
"Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression"
50 / 1,203 papers shown
Swap Path Network for Robust Person Search Pre-training
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Lucas Jaffe
A. Zakhor
3DPC
204
1
0
06 Dec 2024
CopyrightShield: Enhancing Diffusion Model Security against Copyright Infringement Attacks
Zhixiang Guo
Yaning Tan
Aishan Liu
Dacheng Tao
AAML
469
4
0
02 Dec 2024
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
Dhiman Paul
Md Rizwan Parvez
Nabeel Mohammed
Shafin Rahman
VGen
278
4
0
02 Dec 2024
DLaVA: Document Language and Vision Assistant for Answer Localization with Enhanced Interpretability and Trustworthiness
Ahmad Mohammadshirazi
Pinaki Prasad Guha Neogi
Ser-Nam Lim
R. Ramnath
434
6
0
29 Nov 2024
Improving Accuracy and Generalization for Efficient Visual Tracking
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Ram J. Zaveri
Shivang Patel
Yu Gu
Gianfranco Doretto
VLM
468
1
0
28 Nov 2024
HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning
International Journal of Computer Vision (IJCV), 2024
Zengxi Zhang
Zhiying Jiang
Long Ma
Jinyuan Liu
Xin-Yue Fan
Risheng Liu
377
10
0
27 Nov 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
555
22
0
27 Nov 2024
Leverage Task Context for Object Affordance Ranking
Haojie Huang
Hongchen Luo
Wei-dong Zhai
Yang Cao
Zheng-jun Zha
302
0
0
25 Nov 2024
Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2024
Ronghuan Wu
Wanchao Su
Jing Liao
DiffM
321
16
0
25 Nov 2024
Corner2Net: Detecting Objects as Cascade Corners
European Conference on Artificial Intelligence (ECAI), 2024
Chenglong Liu
Jintao Liu
Haorao Wei
Jinze Yang
Liangyu Xu
Yuchen Guo
Lu Fang
207
0
0
24 Nov 2024
MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Chunhui Zhang
Li Liu
Hao Wen
Xi Zhou
Yijiao Wang
Mamba
462
8
0
24 Nov 2024
SEMPose: A Single End-to-end Network for Multi-object Pose Estimation
Xin Liu
Hao Wang
Shibei Xue
Dezong Zhao
3DH
3DPC
266
2
0
21 Nov 2024
3D Reconstruction by Looking: Instantaneous Blind Spot Detector for Indoor SLAM through Mixed Reality
Hanbeom Chang
Jongseong Brad Choi
C. Yeum
252
1
0
19 Nov 2024
WoodYOLO: A Novel Object Detector for Wood Species Detection in Microscopic Images
Lars Nieradzik
Henrike Stephani
Jördis Sieburg-Rockel
Stephanie Helmling
Andrea Olbrich
Stephanie Wrage
J. Keuper
259
1
0
18 Nov 2024
Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Wentao Bao
Keqin Li
Yuxiao Chen
Deep Patel
Martin Renqiang Min
Yu Kong
VLM
ObjD
285
7
0
17 Nov 2024
RETR: Multi-View Radar Detection Transformer for Indoor Perception
Neural Information Processing Systems (NeurIPS), 2024
Ryoma Yataka
Adriano Cardace
Peng Wang
P. Boufounos
R. Takahashi
365
11
0
15 Nov 2024
Grounded Video Caption Generation
Evangelos Kazakos
Cordelia Schmid
Josef Sivic
273
0
0
12 Nov 2024
AutoGameUI: Constructing High-Fidelity Game UIs via Multimodal Learning and Interactive Web-Based Tool
Zhongliang Tang
Mengchen Tan
Fei Xia
Qingrong Cheng
Hao Jiang
Yujiao Shi
177
0
0
06 Nov 2024
SIRA: Scalable Inter-frame Relation and Association for Radar Perception
Computer Vision and Pattern Recognition (CVPR), 2024
Ryoma Yataka
Peng Wang
P. Boufounos
R. Takahashi
262
10
0
04 Nov 2024
Polar R-CNN: End-to-End Lane Detection with Fewer Anchors
Shengqi Wang
Junmin Liu
Xiangyong Cao
Zengjie Song
Kai Sun
350
5
0
03 Nov 2024
Is Multiple Object Tracking a Matter of Specialization?
Neural Information Processing Systems (NeurIPS), 2024
Gianluca Mancusi
Mattia Bernardi
Aniello Panariello
Angelo Porrello
Rita Cucchiara
Simone Calderara
MoMe
331
4
0
01 Nov 2024
LAM-YOLO: Drones-based Small Object Detection on Lighting-Occlusion Attention Mechanism YOLO
Computer Vision and Image Understanding (CVIU), 2024
Yuchen Zheng
Yuxin Jing
Jufeng Zhao
Guangmang Cui
ObjD
319
9
0
01 Nov 2024
GigaCheck: Detecting LLM-generated Content
Irina Tolstykh
Aleksandra Tsybina
Sergey Yakubson
Aleksandr Gordeev
Vladimir Dokholyan
Maksim Kuprashevich
DeLMO
306
4
0
31 Oct 2024
Phrase Decoupling Cross-Modal Hierarchical Matching and Progressive Position Correction for Visual Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2024
Minghong Xie
Ming Wang
Huafeng Li
Yafei Zhang
Dapeng Tao
Z. Yu
ObjD
183
6
0
31 Oct 2024
Unbiased Regression Loss for DETRs
Edric
Ueta Daisuke
Kurokawa Yukimasa
Karlekar Jayashree
Sugiri Pranata
150
0
0
30 Oct 2024
PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Ming Kang
F. F. Ting
Raphaël C.-W. Phan
C. Ting
ViT
MedIm
361
5
0
29 Oct 2024
Referring Human Pose and Mask Estimation in the Wild
Neural Information Processing Systems (NeurIPS), 2024
Bo Miao
Mingtao Feng
Zijie Wu
Mohammed Bennamoun
Yongsheng Gao
Lin Wang
224
7
0
27 Oct 2024
AlphaChimp: Tracking and Behavior Recognition of Chimpanzees
Xiaoxuan Ma
Yutang Lin
Yuan Xu
Stephan P. Kaufhold
Jack Terwilliger
Andres Meza
Yixin Zhu
Federico Rossano
Yizhou Wang
455
4
0
22 Oct 2024
Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Ryan Li
Yanzhe Zhang
Diyi Yang
3DV
183
16
0
21 Oct 2024
A Paradigm Shift in Mouza Map Vectorization: A Human-Machine Collaboration Approach
Mahir Shahriar Dhrubo
Samira Akter
Anwarul Bashir Shuaib
Md Toki Tahmid
Zahid Hasan
A. B. M. Alim Al Islam
221
0
0
21 Oct 2024
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
International Conference on Learning Representations (ICLR), 2024
Jiayi Liu
Denys Iliash
Angel X. Chang
Manolis Savva
Ali Mahdavi-Amiri
507
34
0
21 Oct 2024
ContextDet: Temporal Action Detection with Adaptive Context Aggregation
Ning Wang
Yun Xiao
Xiaopeng Peng
Xiaojun Chang
Xuanhong Wang
Dingyi Fang
363
4
0
20 Oct 2024
Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding
ACM Multimedia (MM), 2024
Jongbhin Woo
H. Ryu
Youngjoon Jang
Jae-Won Cho
Joon Son Chung
214
4
0
17 Oct 2024
VividMed: Vision Language Model with Versatile Visual Grounding for Medicine
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Lingxiao Luo
Bingda Tang
Xuanzhong Chen
Rong Han
Ting Chen
VLM
262
14
0
16 Oct 2024
Multiview Scene Graph
Neural Information Processing Systems (NeurIPS), 2024
Juexiao Zhang
Gao Zhu
Sihang Li
Xinhao Liu
Haorui Song
Xinran Tang
Chen Feng
3DV
377
7
0
15 Oct 2024
Point Cloud Mixture-of-Domain-Experts Model for 3D Self-supervised Learning
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Yaohua Zha
Tao Dai
Yanzi Wang
Hang Guo
Bin Chen
Zhihao Ouyang
Chunlin Fan
3DPC
393
1
0
13 Oct 2024
Token Pruning using a Lightweight Background Aware Vision Transformer
Sudhakar Sah
Ravish Kumar
Honnesh Rohmetra
Ehsan Saboori
ViT
276
2
0
12 Oct 2024
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
Neural Information Processing Systems (NeurIPS), 2024
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
440
22
0
10 Oct 2024
Saliency-Guided DETR for Moment Retrieval and Highlight Detection
Aleksandr Gordeev
Vladimir Dokholyan
Irina Tolstykh
Maksim Kuprashevich
159
16
0
02 Oct 2024
KPCA-CAM: Visual Explainability of Deep Computer Vision Models using Kernel PCA
IEEE International Workshop on Multimedia Signal Processing (MMSP), 2024
Sachin Karmani
Thanushon Sivakaran
Gaurav Prasad
Mehmet Ali
Wenbo Yang
Sheyang Tang
FAtt
257
7
0
30 Sep 2024
Improving Visual Object Tracking through Visual Prompting
IEEE Transactions on Multimedia (IEEE TMM), 2024
Shih-Fang Chen
Jun-Cheng Chen
I-Hong Jhuo
Yen-Yu Lin
VLM
318
5
0
27 Sep 2024
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
Neural Information Processing Systems (NeurIPS), 2024
Ming Dai
Lingfeng Yang
Yihao Xu
Zhenhua Feng
Wankou Yang
ObjD
452
39
0
26 Sep 2024
MorphoSeg: An Uncertainty-Aware Deep Learning Method for Biomedical Segmentation of Complex Cellular Morphologies
Tianhao Zhang
Heather J. McCourty
Berardo M. Sanchez-Tafolla
Anton Nikolaev
Lyudmila Mihaylova
200
1
0
25 Sep 2024
Language-based Audio Moment Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Hokuto Munakata
Taichi Nishimura
Shota Nakada
Tatsuya Komatsu
523
3
0
24 Sep 2024
Provably Efficient Exploration in Inverse Constrained Reinforcement Learning
Bo Yue
Jian Li
Guiliang Liu
396
3
0
24 Sep 2024
OW-Rep: Open World Object Detection with Instance Representation Learning
Sunoh Lee
Minsik Jeon
Jihong Min
Junwon Seo
ObjD
1.2K
1
0
24 Sep 2024
MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving
Xiyang Wang
Shouzheng Qi
Jieyou Zhao
Hangning Zhou
Siyu Zhang
...
Kai Tu
Songlin Guo
Jianbo Zhao
Jian Li
Mu Yang
VOT
346
19
0
23 Sep 2024
Region Prompt Tuning: Fine-grained Scene Text Detection Utilizing Region Text Prompt
Xingtao Lin
Heqian Qiu
Lanxiao Wang
Ruihang Wang
Linfeng Xu
Hongliang Li
VLM
153
1
0
20 Sep 2024
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Yongqi Wang
Xinxiao Wu
Shuo Yang
Jiebo Luo
988
2
0
19 Sep 2024
Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown
IEEE Transactions on Image Processing (TIP), 2024
Zimeng Fang
Chao Liang
Xue Zhou
Shuyuan Zhu
Xi Li
277
3
0
14 Sep 2024