
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression

25 February 2019

Silvio Savarese

Papers citing "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression"

50 / 1,203 papers shown

Swap Path Network for Robust Person Search Pre-training

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

Lucas Jaffe

A. Zakhor

3DPC

204

06 Dec 2024

CopyrightShield: Enhancing Diffusion Model Security against Copyright Infringement Attacks

469

02 Dec 2024

VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval

278

02 Dec 2024

DLaVA: Document Language and Vision Assistant for Answer Localization with Enhanced Interpretability and Trustworthiness

Ahmad Mohammadshirazi

Pinaki Prasad Guha Neogi

Ser-Nam Lim

R. Ramnath

434

29 Nov 2024

Improving Accuracy and Generalization for Efficient Visual Tracking

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

468

28 Nov 2024

HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning

International Journal of Computer Vision (IJCV), 2024

377

27 Nov 2024

ChatRex: Taming Multimodal LLM for Joint Perception and Understanding

555

27 Nov 2024

Leverage Task Context for Object Affordance Ranking

302

25 Nov 2024

Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2024

321

25 Nov 2024

Corner2Net: Detecting Objects as Cascade Corners

European Conference on Artificial Intelligence (ECAI), 2024

207

24 Nov 2024

MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

462

24 Nov 2024

SEMPose: A Single End-to-end Network for Multi-object Pose Estimation

266

21 Nov 2024

3D Reconstruction by Looking: Instantaneous Blind Spot Detector for Indoor SLAM through Mixed Reality

Hanbeom Chang

Jongseong Brad Choi

C. Yeum

252

19 Nov 2024

WoodYOLO: A Novel Object Detector for Wood Species Detection in Microscopic Images

Lars Nieradzik

Henrike Stephani

Jördis Sieburg-Rockel

259

18 Nov 2024

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

285

17 Nov 2024

RETR: Multi-View Radar Detection Transformer for Indoor Perception

Neural Information Processing Systems (NeurIPS), 2024

365

15 Nov 2024

Grounded Video Caption Generation

Evangelos Kazakos

Cordelia Schmid

Josef Sivic

273

12 Nov 2024

AutoGameUI: Constructing High-Fidelity Game UIs via Multimodal Learning and Interactive Web-Based Tool

177

06 Nov 2024

SIRA: Scalable Inter-frame Relation and Association for Radar Perception

Computer Vision and Pattern Recognition (CVPR), 2024

262

04 Nov 2024

Polar R-CNN: End-to-End Lane Detection with Fewer Anchors

350

03 Nov 2024

Is Multiple Object Tracking a Matter of Specialization?

Neural Information Processing Systems (NeurIPS), 2024

331

01 Nov 2024

LAM-YOLO: Drones-based Small Object Detection on Lighting-Occlusion Attention Mechanism YOLO

Computer Vision and Image Understanding (CVIU), 2024

319

01 Nov 2024

GigaCheck: Detecting LLM-generated Content

306

31 Oct 2024

Phrase Decoupling Cross-Modal Hierarchical Matching and Progressive Position Correction for Visual Grounding

IEEE Transactions on Multimedia (IEEE TMM), 2024

Huafeng Li

183

31 Oct 2024

Unbiased Regression Loss for DETRs

150

30 Oct 2024

PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

361

29 Oct 2024

Referring Human Pose and Mask Estimation in the Wild

Neural Information Processing Systems (NeurIPS), 2024

224

27 Oct 2024

AlphaChimp: Tracking and Behavior Recognition of Chimpanzees

455

22 Oct 2024

Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

183

21 Oct 2024

A Paradigm Shift in Mouza Map Vectorization: A Human-Machine Collaboration Approach

Mahir Shahriar Dhrubo

Samira Akter

Anwarul Bashir Shuaib

Md Toki Tahmid

Zahid Hasan

A. B. M. Alim Al Islam

221

21 Oct 2024

SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects

International Conference on Learning Representations (ICLR), 2024

507

21 Oct 2024

ContextDet: Temporal Action Detection with Adaptive Context Aggregation

363

20 Oct 2024

Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding

ACM Multimedia (MM), 2024

214

17 Oct 2024

VividMed: Vision Language Model with Versatile Visual Grounding for Medicine

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Ting Chen

262

16 Oct 2024

Multiview Scene Graph

Neural Information Processing Systems (NeurIPS), 2024

377

15 Oct 2024

Point Cloud Mixture-of-Domain-Experts Model for 3D Self-supervised Learning

International Joint Conference on Artificial Intelligence (IJCAI), 2024

393

13 Oct 2024

Token Pruning using a Lightweight Background Aware Vision Transformer

276

12 Oct 2024

OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling

Neural Information Processing Systems (NeurIPS), 2024

Fang Peng

440

10 Oct 2024

Saliency-Guided DETR for Moment Retrieval and Highlight Detection

159

02 Oct 2024

KPCA-CAM: Visual Explainability of Deep Computer Vision Models using Kernel PCA

IEEE International Workshop on Multimedia Signal Processing (MMSP), 2024

257

30 Sep 2024

Improving Visual Object Tracking through Visual Prompting

IEEE Transactions on Multimedia (IEEE TMM), 2024

Yen-Yu Lin

318

27 Sep 2024

SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion

Neural Information Processing Systems (NeurIPS), 2024

Wankou Yang

452

26 Sep 2024

MorphoSeg: An Uncertainty-Aware Deep Learning Method for Biomedical Segmentation of Complex Cellular Morphologies

Tianhao Zhang

Heather J. McCourty

Berardo M. Sanchez-Tafolla

Anton Nikolaev

Lyudmila Mihaylova

200

25 Sep 2024

Language-based Audio Moment Retrieval

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

523

24 Sep 2024

Provably Efficient Exploration in Inverse Constrained Reinforcement Learning

Bo Yue

Jian Li

Guiliang Liu

396

24 Sep 2024

OW-Rep: Open World Object Detection with Instance Representation Learning

1.2K

24 Sep 2024

MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving

Xiyang Wang

...

346

23 Sep 2024

Region Prompt Tuning: Fine-grained Scene Text Detection Utilizing Region Text Prompt

Hongliang Li

153

20 Sep 2024

End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

Yongqi Wang

Xinxiao Wu

Shuo Yang

Jiebo Luo

988

19 Sep 2024

Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown

IEEE Transactions on Image Processing (TIP), 2024

277

14 Sep 2024