Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression

25 February 2019

Silvio Savarese

Papers citing "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression"

50 / 1,203 papers shown

Reinforcement Learning for Large Model: A Survey

323

24 Dec 2025

CauSight: Learning to Supersense for Visual Causal Discovery

Yize Zhang

150

01 Dec 2025

InstanceV: Instance-Level Video Generation

126

28 Nov 2025

DocVAL: Validated Chain-of-Thought Distillation for Grounded Document VQA

Ahmad Mohammadshirazi

Pinaki Prasad Guha Neogi

Dheeraj Kulshrestha

R. Ramnath

VGen

147

27 Nov 2025

Ar2Can: An Architect and an Artist Leveraging a Canvas for Multi-Human Generation

27 Nov 2025

Context-Aware Token Pruning and Discriminative Selective Attention for Transformer Tracking

117

25 Nov 2025

StableTrack: Stabilizing Multi-Object Tracking on Low-Frequency Detections

421

25 Nov 2025

SFHand: A Streaming Framework for Language-guided 3D Hand Forecasting and Embodied Manipulation

106

22 Nov 2025

MGA-VQA: Secure and Interpretable Graph-Augmented Visual Question Answering with Memory-Guided Protection Against Unauthorized Knowledge Use

Ahmad Mohammadshirazi

Pinaki Prasad Guha Neogi

Dheeraj Kulshrestha

R. Ramnath

107

22 Nov 2025

REXO: Indoor Multi-View Radar Object Detection via 3D Bounding Box Diffusion

132

21 Nov 2025

Real-Time 3D Object Detection with Inference-Aligned Learning

244

20 Nov 2025

Backdoor Attacks on Open Vocabulary Object Detectors via Multi-Modal Prompt Tuning

Ankita Raj

Chetan Arora

ObjD AAML VLM

300

16 Nov 2025

Scale-Aware Relay and Scale-Adaptive Loss for Tiny Object Detection in Aerial Images

233

13 Nov 2025

Sim4Seg: Boosting Multimodal Multi-disease Medical Diagnosis Segmentation with Region-Aware Vision-Language Similarity Masks

Lingran Song

Yucheng Zhou

Jianbing Shen

120

10 Nov 2025

SportR: A Benchmark for Multimodal Large Language Model Reasoning in Sports

...

441

09 Nov 2025

Interaction-Centric Knowledge Infusion and Transfer for Open-Vocabulary Scene Graph Generation

161

08 Nov 2025

Temporal Zoom Networks: Distance Regression and Continuous Depth for Efficient Action Localization

Ibne Farabi Shihab

Sanjeda Akter

Anuj Sharma

202

06 Nov 2025

Part-Aware Bottom-Up Group Reasoning for Fine-Grained Social Interaction Detection

Dongkeun Kim

Minsu Cho

Suha Kwak

136

05 Nov 2025

RIS-Assisted 3D Spherical Splatting for Object Composition Visualization using Detection Transformers

Anastasios T. Sotiropoulos

04 Nov 2025

Unsupervised Learning for Industrial Defect Detection: A Case Study on Shearographic Data

102

04 Nov 2025

3EED: Ground Everything Everywhere in 3D

139

03 Nov 2025

Gaussian Combined Distance: A Generic Metric for Object Detection

IEEE Geoscience and Remote Sensing Letters (GRSL), 2025

168

31 Oct 2025

PT-DETR: Small Target Detection Based on Partially-Aware Detail Focus

Bingcong Huo

Zhiming Wang

30 Oct 2025

MELDAE: A Framework for Micro-Expression Spotting, Detection, and Automatic Evaluation in In-the-Wild Conversational Scenes

26 Oct 2025

Enrich and Detect: Video Temporal Grounding with Multimodal LLMs

Triantafyllos Afouras

203

19 Oct 2025

Enhancing Rotated Object Detection via Anisotropic Gaussian Bounding Box and Bhattacharyya Distance

151

18 Oct 2025

Beat Tracking as Object Detection

Jaehoon Ahn

Moon-Ryul Jung

ObjD

248

16 Oct 2025

Structured Universal Adversarial Attacks on Object Detection for Video Sequences

111

16 Oct 2025

MATRIX: Mask Track Alignment for Interaction-aware Video Generation

106

08 Oct 2025

VLA-R1: Enhancing Reasoning in Vision-Language-Action Models

158

02 Oct 2025

Forestpest-YOLO: A High-Performance Detection Framework for Small Forestry Pests

166

01 Oct 2025

Hierarchy-Aware Neural Subgraph Matching with Enhanced Similarity Measure

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2025

137

01 Oct 2025

Contrastive Diffusion Guidance for Spatial Inverse Problems

30 Sep 2025

Anchor-free Cross-view Object Geo-localization with Gaussian Position Encoding and Cross-view Association

Xingtao Ling

Chenlin Fu

Yingying Zhu

102

30 Sep 2025

Talk in Pieces, See in Whole: Disentangling and Hierarchical Aggregating Representations for Language-based Object Detection

161

29 Sep 2025

Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm

217

29 Sep 2025

FM-SIREN & FM-FINER: Nyquist-Informed Frequency Multiplier for Implicit Neural Representation with Periodic Activation

154

27 Sep 2025

Real-Time Object Detection Meets DINOv3

406

25 Sep 2025

Fine-Tuning LLMs to Analyze Multiple Dimensions of Code Review: A Maximum Entropy Regulated Long Chain-of-Thought Approach

...

141

25 Sep 2025

Robust RGB-T Tracking via Learnable Visual Fourier Prompt Fine-tuning and Modality Fusion Prompt Generation

161

24 Sep 2025

LayoutAgent: A Vision-Language Agent Guided Compositional Diffusion for Spatial Layout Planning

181

24 Sep 2025

SynapFlow: A Modular Framework Towards Large-Scale Analysis of Dendritic Spines

113

23 Sep 2025

Surgical-MambaLLM: Mamba2-enhanced Multimodal Large Language Model for VQLA in Robotic Surgery

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025

139

20 Sep 2025

Speech-to-See: End-to-End Speech-Driven Open-Set Object Detection

20 Sep 2025

Robust Object Detection for Autonomous Driving via Curriculum-Guided Group Relative Policy Optimization

Xu Jia

142

19 Sep 2025

T-SiamTPN: Temporal Siamese Transformer Pyramid Networks for Robust and Efficient UAV Tracking

Hojat Ardi

Amir Jahanshahi

Ali Diba

157

16 Sep 2025

Beyond Artificial Misalignment: Detecting and Grounding Semantic-Coordinated Multimodal Manipulations

160

16 Sep 2025

Towards Understanding Visual Grounding in Visual Language Models

Georgios Pantazopoulos

Eda B. Özyiğit

ObjD

324

12 Sep 2025

Hyperspectral Mamba for Hyperspectral Object Tracking

145

10 Sep 2025

Point Linguist Model: Segment Any Object via Bridged Large 3D-Language Model

Zhuoxu Huang

Mingqi Gao

Jungong Han

147

09 Sep 2025