Detecting Twenty-thousand Classes using Image-level Supervision

European Conference on Computer Vision (ECCV), 2022

7 January 2022

Papers citing "Detecting Twenty-thousand Classes using Image-level Supervision"

50 / 520 papers shown

HELIOS: Hierarchical Exploration for Language-Grounded Interaction in Open Scenes

Kostas Daniilidis

Bernadette Bucher

LM&Ro

167

30 Mar 2026

One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework

232

30 Mar 2026

FALCON: Actively Decoupled Visuomotor Policies for Loco-Manipulation with Foundation-Model-Based Coordination

205

04 Dec 2025

OpenBox: Annotate Any Bounding Boxes in 3D

153

01 Dec 2025

VaMP: Variational Multi-Modal Prompt Learning for Vision-Language Models

Silin Cheng

Kai Han

MLLM VPVLM VLM

349

27 Nov 2025

OVOD-Agent: A Markov-Bandit Framework for Proactive Visual Reasoning and Self-Evolving Detection

301

26 Nov 2025

LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight

179

25 Nov 2025

MedROV: Towards Real-Time Open-Vocabulary Detection Across Diverse Medical Imaging Modalities

380

25 Nov 2025

State and Scene Enhanced Prototypes for Weakly Supervised Open-Vocabulary Object Detection

Jiaying Zhou

Qingchao Chen

161

22 Nov 2025

ShelfOcc: Native 3D Supervision beyond LiDAR for Vision-Based Occupancy Estimation

139

19 Nov 2025

GazeVLM: A Vision-Language Model for Multi-Task Gaze Understanding

166

09 Nov 2025

IDALC: A Semi-Supervised Framework for Intent Detection and Active Learning based Correction

IEEE Transactions on Artificial Intelligence (IEEE TAI), 2025

184

08 Nov 2025

HyPerNav: Hybrid Perception for Object-Oriented Navigation in Unknown Environment

Zecheng Yin

H. Vicky Zhao

Zhen Li

189

27 Oct 2025

Towards 3D Objectness Learning in an Open World

217

20 Oct 2025

Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions

214

18 Oct 2025

CoT-PL: Chain-of-Thought Pseudo-Labeling for Open-Vocabulary Object Detection

452

16 Oct 2025

Language as a Label: Zero-Shot Multimodal Classification of Everyday Postures under Data Scarcity

MingZe Tang

Jubal Chandy Jacob

VLM

157

15 Oct 2025

Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding

292

10 Oct 2025

Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications

IEEE Access (IEEE Access), 2025

423

08 Oct 2025

Ultralytics YOLO Evolution: An Overview of YOLO26, YOLO11, YOLOv8 and YOLOv5 Object Detectors for Computer Vision and Pattern Recognition

Ranjan Sapkota

Manoj Karkee

ObjD MU

337

06 Oct 2025

General and Efficient Visual Goal-Conditioned Reinforcement Learning using Object-Agnostic Masks

181

06 Oct 2025

Cross-View Open-Vocabulary Object Detection in Aerial Imagery

279

04 Oct 2025

VLOD-TTA: Test-Time Adaptation of Vision-Language Object Detectors

167

01 Oct 2025

C3-OWD: A Curriculum Cross-modal Contrastive Learning Framework for Open-World Detection

...

345

27 Sep 2025

Spatial Reasoning in Foundation Models: Benchmarking Object-Centric Spatial Understanding

...

142

26 Sep 2025

LABELING COPILOT: A Deep Research Agent for Automated Data Curation in Computer Vision

Debargha Ganguly

Sumit Kumar

Ishwar B Balappanawar

Weicong Chen

Shashank Kambhatla

Srinivasan Iyengar

Shivkumar Kalyanaraman

Ponnurangam Kumaraguru

Vipin Chaudhary

VLM

247

26 Sep 2025

MVP: Motion Vector Propagation for Zero-Shot Video Object Detection

156

22 Sep 2025

Sparse Multiview Open-Vocabulary 3D Detection

Olivier Moliner

Viktor Larsson

Kalle Åström

152

19 Sep 2025

Pre-Manipulation Alignment Prediction with Parallel Deep State-Space and Transformer Models

Motonari Kambara

Komei Sugiura

278

17 Sep 2025

Toward Ownership Understanding of Objects: Active Question Generation with Large Language Model and Probabilistic Generative Model

159

16 Sep 2025

GBPP: Grasp-Aware Base Placement Prediction for Robots via Two-Stage Learning

188

15 Sep 2025

Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation

...

241

09 Sep 2025

Harnessing Object Grounding for Time-Sensitive Video Understanding

Tz-Ying Wu

S. N. Sridhar

Subarna Tripathi

254

08 Sep 2025

Visibility-Aware Language Aggregation for Open-Vocabulary Segmentation in 3D Gaussian Splatting

181

05 Sep 2025

OVGrasp: Open-Vocabulary Grasping Assistance via Multimodal Intent Detection

Chen Hu

Shan Luo

Letizia Gionfrida

129

04 Sep 2025

InstaDA: Augmenting Instance Segmentation Data with Dual-Agent System

177

03 Sep 2025

GENNAV: Polygon Mask Generation for Generalized Referring Navigable Regions

178

28 Aug 2025

Context-Aware Risk Estimation in Home Environments: A Probabilistic Framework for Service Robots

192

27 Aug 2025

OpenM3D: Open Vocabulary Multi-view Indoor 3D Object Detection without Human Annotations

155

27 Aug 2025

Robust and Label-Efficient Deep Waste Detection

178

26 Aug 2025

Take That for Me: Multimodal Exophora Resolution with Interactive Questioning for Ambiguous Out-of-View Instructions

139

22 Aug 2025

Generalized Decoupled Learning for Enhancing Open-Vocabulary Dense Perception

176

15 Aug 2025

Composed Object Retrieval: Object-level Retrieval via Composed Expressions

269

06 Aug 2025

Weakly-Supervised Image Forgery Localization via Vision-Language Collaborative Reasoning Framework

421

02 Aug 2025

ODOV: Towards Open-Domain Open-Vocabulary Object Detection

265

02 Aug 2025

3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection

317

31 Jul 2025

Details Matter for Indoor Open-vocabulary 3D Instance Segmentation

...

302

30 Jul 2025

When Engineering Outruns Intelligence: Rethinking Instruction-Guided Navigation

Matin Aghaei

Lingfeng Zhang

Mohammad Ali Alomrani

Mahdi Biparva

Yingxue Zhang

LRM

195

26 Jul 2025

Open World Object Detection: A Survey

481

01 Jul 2025

ControlVLA: Few-shot Object-centric Adaptation for Pre-trained Vision-Language-Action Models

...

257

19 Jun 2025