Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression

25 February 2019

Silvio Savarese

Papers citing "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression"

50 / 1,203 papers shown

Reinforcement Learning for Large Model: A Survey

308

24 Dec 2025

CauSight: Learning to Supersense for Visual Causal Discovery

Yize Zhang

141

01 Dec 2025

InstanceV: Instance-Level Video Generation

120

28 Nov 2025

DocVAL: Validated Chain-of-Thought Distillation for Grounded Document VQA

Ahmad Mohammadshirazi

Pinaki Prasad Guha Neogi

Dheeraj Kulshrestha

R. Ramnath

VGen

130

27 Nov 2025

Ar2Can: An Architect and an Artist Leveraging a Canvas for Multi-Human Generation

27 Nov 2025

Context-Aware Token Pruning and Discriminative Selective Attention for Transformer Tracking

25 Nov 2025

StableTrack: Stabilizing Multi-Object Tracking on Low-Frequency Detections

403

25 Nov 2025

SFHand: A Streaming Framework for Language-guided 3D Hand Forecasting and Embodied Manipulation

22 Nov 2025

MGA-VQA: Secure and Interpretable Graph-Augmented Visual Question Answering with Memory-Guided Protection Against Unauthorized Knowledge Use

Ahmad Mohammadshirazi

Pinaki Prasad Guha Neogi

Dheeraj Kulshrestha

R. Ramnath

104

22 Nov 2025

REXO: Indoor Multi-View Radar Object Detection via 3D Bounding Box Diffusion

113

21 Nov 2025

Real-Time 3D Object Detection with Inference-Aligned Learning

232

20 Nov 2025

Backdoor Attacks on Open Vocabulary Object Detectors via Multi-Modal Prompt Tuning

Ankita Raj

Chetan Arora

ObjD AAML VLM

253

16 Nov 2025

Scale-Aware Relay and Scale-Adaptive Loss for Tiny Object Detection in Aerial Images

222

13 Nov 2025

Sim4Seg: Boosting Multimodal Multi-disease Medical Diagnosis Segmentation with Region-Aware Vision-Language Similarity Masks

Lingran Song

Yucheng Zhou

Jianbing Shen

107

10 Nov 2025

SportR: A Benchmark for Multimodal Large Language Model Reasoning in Sports

...

423

09 Nov 2025

Interaction-Centric Knowledge Infusion and Transfer for Open-Vocabulary Scene Graph Generation

148

08 Nov 2025

Temporal Zoom Networks: Distance Regression and Continuous Depth for Efficient Action Localization

Ibne Farabi Shihab

Sanjeda Akter

Anuj Sharma

192

06 Nov 2025

Part-Aware Bottom-Up Group Reasoning for Fine-Grained Social Interaction Detection

Dongkeun Kim

Minsu Cho

Suha Kwak

05 Nov 2025

RIS-Assisted 3D Spherical Splatting for Object Composition Visualization using Detection Transformers

Anastasios T. Sotiropoulos

04 Nov 2025

Unsupervised Learning for Industrial Defect Detection: A Case Study on Shearographic Data

04 Nov 2025

3EED: Ground Everything Everywhere in 3D

116

03 Nov 2025

Gaussian Combined Distance: A Generic Metric for Object Detection

IEEE Geoscience and Remote Sensing Letters (GRSL), 2025

162

31 Oct 2025

PT-DETR: Small Target Detection Based on Partially-Aware Detail Focus

Bingcong Huo

Zhiming Wang

30 Oct 2025

MELDAE: A Framework for Micro-Expression Spotting, Detection, and Automatic Evaluation in In-the-Wild Conversational Scenes

26 Oct 2025

Enrich and Detect: Video Temporal Grounding with Multimodal LLMs

Triantafyllos Afouras

180

19 Oct 2025

Enhancing Rotated Object Detection via Anisotropic Gaussian Bounding Box and Bhattacharyya Distance

145

18 Oct 2025

Beat Tracking as Object Detection

Jaehoon Ahn

Moon-Ryul Jung

ObjD

239

16 Oct 2025

Structured Universal Adversarial Attacks on Object Detection for Video Sequences

100

16 Oct 2025

MATRIX: Mask Track Alignment for Interaction-aware Video Generation

106

08 Oct 2025

VLA-R1: Enhancing Reasoning in Vision-Language-Action Models

147

02 Oct 2025

Forestpest-YOLO: A High-Performance Detection Framework for Small Forestry Pests

152

01 Oct 2025

Hierarchy-Aware Neural Subgraph Matching with Enhanced Similarity Measure

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2025

125

01 Oct 2025

Contrastive Diffusion Guidance for Spatial Inverse Problems

30 Sep 2025

Anchor-free Cross-view Object Geo-localization with Gaussian Position Encoding and Cross-view Association

Xingtao Ling

Chenlin Fu

Yingying Zhu

30 Sep 2025

Talk in Pieces, See in Whole: Disentangling and Hierarchical Aggregating Representations for Language-based Object Detection

156

29 Sep 2025

Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm

186

29 Sep 2025

FM-SIREN & FM-FINER: Nyquist-Informed Frequency Multiplier for Implicit Neural Representation with Periodic Activation

140

27 Sep 2025

Real-Time Object Detection Meets DINOv3

364

25 Sep 2025

Fine-Tuning LLMs to Analyze Multiple Dimensions of Code Review: A Maximum Entropy Regulated Long Chain-of-Thought Approach

...

136

25 Sep 2025

Robust RGB-T Tracking via Learnable Visual Fourier Prompt Fine-tuning and Modality Fusion Prompt Generation

157

24 Sep 2025

LayoutAgent: A Vision-Language Agent Guided Compositional Diffusion for Spatial Layout Planning

170

24 Sep 2025

SynapFlow: A Modular Framework Towards Large-Scale Analysis of Dendritic Spines

105

23 Sep 2025

Surgical-MambaLLM: Mamba2-enhanced Multimodal Large Language Model for VQLA in Robotic Surgery

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025

108

20 Sep 2025

Speech-to-See: End-to-End Speech-Driven Open-Set Object Detection

20 Sep 2025

Robust Object Detection for Autonomous Driving via Curriculum-Guided Group Relative Policy Optimization

Xu Jia

129

19 Sep 2025

T-SiamTPN: Temporal Siamese Transformer Pyramid Networks for Robust and Efficient UAV Tracking

Hojat Ardi

Amir Jahanshahi

Ali Diba

153

16 Sep 2025

Beyond Artificial Misalignment: Detecting and Grounding Semantic-Coordinated Multimodal Manipulations

142

16 Sep 2025

Towards Understanding Visual Grounding in Visual Language Models

Georgios Pantazopoulos

Eda B. Özyiğit

ObjD

300

12 Sep 2025

Hyperspectral Mamba for Hyperspectral Object Tracking

130

10 Sep 2025

Point Linguist Model: Segment Any Object via Bridged Large 3D-Language Model

Zhuoxu Huang

Mingqi Gao

Jungong Han

136

09 Sep 2025