Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression

25 February 2019

Silvio Savarese

Papers citing "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression"

50 / 1,203 papers shown

AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models

240

10 Jun 2025

Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations

198

09 Jun 2025

MiMo-VL Technical Report

...

259

04 Jun 2025

HiLO: High-Level Object Fusion for Autonomous Driving using Transformers

250

03 Jun 2025

Conformal Object Detection by Sequential Risk Control

Léo Andéol

Luca Mossina

Adrien Mazoyer

Sébastien Gerchinovitz

429

29 May 2025

CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

189

28 May 2025

Can NeRFs See without Cameras?

216

28 May 2025

MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding

...

254

27 May 2025

Fully Spiking Neural Networks for Unified Frame-Event Object Tracking

188

27 May 2025

Open-Det: An Efficient Learning Framework for Open-Ended Detection

202

27 May 2025

MSA at SemEval-2025 Task 3: High Quality Weak Labeling and LLM Ensemble Verification for Multilingual Hallucination Detection

128

27 May 2025

CSTrack: Enhancing RGB-X Tracking via Compact Spatiotemporal Features

203

26 May 2025

From Data to Modeling: Fully Open-vocabulary Scene Graph Generation

183

26 May 2025

MLLMs are Deeply Affected by Modality Bias

...

332

24 May 2025

The Coherence Trap: When MLLM-Crafted Narratives Exploit Manipulated Visual Contexts

Zhedong Zheng

436

23 May 2025

Efficient Motion Prompt Learning for Robust Visual Tracking

181

22 May 2025

Diff-MM: Exploring Pre-trained Text-to-Image Generation Model for Unified Multi-modal Object Tracking

Shiyu Xuan

Zechao Li

Jinhui Tang

331

19 May 2025

LiDAR MOT-DETR: A LiDAR-based Two-Stage Transformer for 3D Multiple Object Tracking

648

19 May 2025

Content Generation Models in Computational Pathology: A Comprehensive Survey on Methods, Applications, and Challenges

469

16 May 2025

Using Cross-Domain Detection Loss to Infer Multi-Scale Information for Improved Tiny Head Tracking

IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2025

174

14 May 2025

Visually Interpretable Subtask Reasoning for Visual Question Answering

253

12 May 2025

DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer

Computer Vision and Pattern Recognition (CVPR), 2025

371

09 May 2025

CGTrack: Cascade Gating Network with Hierarchical Feature Aggregation for UAV Tracking

IEEE International Conference on Robotics and Automation (ICRA), 2025

220

09 May 2025

RGBX-DiffusionDet: A Framework for Multi-Modal RGB-X Object Detection Using DiffusionDet

Pattern Recognition (Pattern Recogn.), 2025

Eliraz Orfaig

Inna Stainvas

Igal Bilik

325

05 May 2025

Efficient Vision-based Vehicle Speed Estimation

Journal of Real-Time Image Processing (JRIP), 2025

Andrej Macko

Lukás Gajdosech

Viktor Kocur

906

02 May 2025

Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging

Elena Mulero Ayllón

Massimiliano Mantegna

264

02 May 2025

Foundation Model-Driven Framework for Human-Object Interaction Prediction with Segmentation Mask Integration

274

28 Apr 2025

Improving Open-World Object Localization by Discovering Background

Ashish Singh

Michael Jeffrey Jones

307

24 Apr 2025

Marginalized Generalized IoU (MGIoU): A Unified Objective Function for Optimizing Any Convex Parametric Shapes

332

23 Apr 2025

Progressive Language-guided Visual Learning for Multi-Task Visual Grounding

368

22 Apr 2025

EIoU-EMC: A Novel Loss for Domain-specific Nested Entity Recognition

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025

205

19 Apr 2025

FocusTrack: A Self-Adaptive Local Sampling Algorithm for Efficient Anti-UAV Tracking

IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2025

Ying Wang

Tingfa Xu

Jianan Li

324

18 Apr 2025

Weak Cube R-CNN: Weakly Supervised 3D Detection using only 2D Bounding Boxes

Scandinavian Conference on Image Analysis (SCIA), 2025

Andreas Lau Hansen

Lukas Wanzeck

Dim P. Papadopoulos

207

17 Apr 2025

EarthGPT-X: A Spatial MLLM for Multi-level Multi-Source Remote Sensing Imagery Understanding with Visual Prompting

IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2025

409

17 Apr 2025

Image Editing with Diffusion Models: A Survey

324

17 Apr 2025

Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking

Computer Vision and Pattern Recognition (CVPR), 2025

257

12 Apr 2025

Light-YOLOv8-Flame: A Lightweight High-Performance Flame Detection Algorithm

410

11 Apr 2025

Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding

...

264

10 Apr 2025

End-to-End Facial Expression Detection in Long Videos

152

10 Apr 2025

Few-Shot Adaptation of Grounding DINO for Agricultural Domain

315

09 Apr 2025

Are We Done with Object-Centric Learning?

Alexander Rubinstein

Christian Schroeder de Witt

Matthias Bethge

Seong Joon Oh

OCL

2.1K

09 Apr 2025

PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario

Sriram Mandalika

Lalitha V

Athira Nambiar

228

08 Apr 2025

Mathematical Modeling of Option Pricing with an Extended Black-Scholes Framework

Nikhil Shivakumar Nayak

418

04 Apr 2025

COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking

Information Fusion (Inf. Fusion), 2025

300

02 Apr 2025

InteractionMap: Improving Online Vectorized HDMap Construction with Interaction

Computer Vision and Pattern Recognition (CVPR), 2025

Kuang Wu

Chuan Yang

Zhanbin Li

287

27 Mar 2025

BOOTPLACE: Bootstrapped Object Placement with Detection Transformers

Computer Vision and Pattern Recognition (CVPR), 2025

285

27 Mar 2025

HierRelTriple: Guiding Indoor Layout Generation with Hierarchical Relationship Triplet Losses

305

26 Mar 2025

RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation

336

24 Mar 2025

CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection

469

24 Mar 2025

SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking

Computer Vision and Pattern Recognition (CVPR), 2025

382

24 Mar 2025