LVIS: A Dataset for Large Vocabulary Instance Segmentation

Computer Vision and Pattern Recognition (CVPR), 2019

8 August 2019

Piotr Dollár

Papers citing "LVIS: A Dataset for Large Vocabulary Instance Segmentation"

50 / 1,058 papers shown

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

Zhe Chen

...

402

21 Oct 2024

Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability

281

20 Oct 2024

LocateBench: Evaluating the Locating Ability of Vision Language Models

243

17 Oct 2024

Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation

IEEE Robotics and Automation Letters (RA-L), 2024

Anthony Opipari

Aravindhan K. Krishnan

Odest Chadwicke Jenkins

VOS

255

16 Oct 2024

LocoMotion: Learning Motion-Focused Video-Language Representations

Asian Conference on Computer Vision (ACCV), 2024

Hazel Doughty

Fida Mohammad Thoker

Cees G. M. Snoek

373

15 Oct 2024

OVS Meets Continual Learning: Towards Sustainable Open-Vocabulary Segmentation

723

15 Oct 2024

Fractal Calibration for long-tailed object detection

Computer Vision and Pattern Recognition (CVPR), 2024

Konstantinos Panagiotis Alexandridis

1.0K

15 Oct 2024

AutoTurb: Using Large Language Models for Automatic Algebraic Model Discovery of Turbulence Closure

Yu Zhang

Kefeng Zheng

Fei Liu

Qingfu Zhang

Zhenkun Wang

250

14 Oct 2024

big.LITTLE Vision Transformer for Efficient Visual Recognition

Yulong Wang

Jifeng Dai

262

14 Oct 2024

Locality Alignment Improves Vision-Language Models

International Conference on Learning Representations (ICLR), 2024

592

14 Oct 2024

Boosting Open-Vocabulary Object Detection by Handling Background Samples

194

11 Oct 2024

DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention

Asian Conference on Computer Vision (ACCV), 2024

221

11 Oct 2024

Interactive4D: Interactive 4D LiDAR Segmentation

IEEE International Conference on Robotics and Automation (ICRA), 2024

305

10 Oct 2024

Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts

Neural Information Processing Systems (NeurIPS), 2024

Zhiwei Lin

Yongtao Wang

Zhi Tang

ObjD VLM

207

08 Oct 2024

A Simple Image Segmentation Framework via In-Context Examples

Neural Information Processing Systems (NeurIPS), 2024

Yang Liu

Chenchen Jing

Hengtao Li

Huanyi Zheng

Hao Chen

Xinlong Wang

Chunhua Shen

178

07 Oct 2024

On Efficient Variants of Segment Anything Model: A Survey

International Journal of Computer Vision (IJCV), 2024

530

07 Oct 2024

AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation

International Conference on Learning Representations (ICLR), 2024

Dieter Fox

Ajay Mandlekar

Yijie Guo

VLM LRM

268

01 Oct 2024

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Haotian Zhang

Mingfei Gao

...

Zirui Wang

Yinfei Yang

303

30 Sep 2024

ProMerge: Prompt and Merge for Unsupervised Instance Segmentation

European Conference on Computer Vision (ECCV), 2024

Dylan Li

Gyungin Shin

218

27 Sep 2024

A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation

Neural Information Processing Systems (NeurIPS), 2024

292

27 Sep 2024

You Only Speak Once to See

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Wenhao Yang

Jianguo Wei

Wenhuan Lu

Lei Li

VOS

228

27 Sep 2024

Visual Concept Networks: A Graph-Based Approach to Detecting Anomalous Data in Deep Neural Networks

International Conferences on Pattern Recognition and Artificial Intelligence (ICCPRAI), 2024

Debargha Ganguly

Debayan Gupta

Vipin Chaudhary

GNN

181

26 Sep 2024

Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval

Computer Vision and Pattern Recognition (CVPR), 2024

Revanth Gangi Reddy

Heng Ji

ObjD VLM

184

26 Sep 2024

DARE: Diverse Visual Question Answering with Robustness Evaluation

Transactions of the Association for Computational Linguistics (TACL), 2024

346

26 Sep 2024

Episodic Memory Verbalization using Hierarchical Representations of Life-Long Robot Experience

427

26 Sep 2024

SSE: Multimodal Semantic Data Selection and Enrichment for Industrial-scale Data Assimilation

Knowledge Discovery and Data Mining (KDD), 2024

Maying Shen

Nadine Chang

Sifei Liu

Jose M. Alvarez

247

20 Sep 2024

GraspSAM: When Segment Anything Model Meets Grasp Detection

IEEE International Conference on Robotics and Automation (ICRA), 2024

Sangjun Noh

Jongwon Kim

Dongwoo Nam

Seunghyeok Back

Raeyoung Kang

Kyoobin Lee

VLM

354

19 Sep 2024

LPT++: Efficient Training on Mixture of Long-tailed Experts

Bowen Dong

Pan Zhou

W. Zuo

VLM

232

17 Sep 2024

SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking

European Conference on Computer Vision (ECCV), 2024

Yung-Hsu Yang

Luc Van Gool

215

17 Sep 2024

Label Convergence: Defining an Upper Performance Bound in Object Recognition through Contradictory Annotations

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

David Tschirschwitz

Volker Rodehorst

289

14 Sep 2024

Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown

IEEE Transactions on Image Processing (TIP), 2024

277

14 Sep 2024

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Haoxuan Wang

Qu He

Jinlong Peng

Hao Yang

Mingmin Chi

Yabiao Wang

Mamba

278

13 Sep 2024

GroundingBooth: Grounding Text-to-Image Customization

Nathan Jacobs

434

13 Sep 2024

From COCO to COCO-FP: A Deep Dive into Background False Positives for COCO Detectors

Cheng Li

247

12 Sep 2024

Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation

Neural Information Processing Systems (NeurIPS), 2024

382

07 Sep 2024

FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation

Xiatian Zhu

244

05 Sep 2024

Semantically Controllable Augmentations for Generalizable Robot Learning

339

02 Sep 2024

Anno-incomplete Multi-dataset Detection

Yiran Xu

Haoxiang Zhong

Kai Wu

Jialin Li

Yong Liu

Chengjie Wang

Shu-Tao Xia

Hongen Liao

ObjD

163

29 Aug 2024

More Pictures Say More: Visual Intersection Network for Open Set Object Detection

169

26 Aug 2024

A Survey of Embodied Learning for Object-Centric Robotic Manipulation

Machine Intelligence Research (MIR), 2024

Lap-Pui Chau

243

21 Aug 2024

Reflex-Based Open-Vocabulary Navigation without Prior Knowledge Using Omnidirectional Camera and Multiple Vision-Language Models

Kei Okada

152

21 Aug 2024

SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model

Computer Vision and Pattern Recognition (CVPR), 2024

Chongkai Yu

Anqi Li

Xiaochao Qu

Chengjing Wu

Luoqi Liu

Xiaolin Hu

VLM

279

21 Aug 2024

OE3DIS: Open-Ended 3D Point Cloud Instance Segmentation

286

21 Aug 2024

OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding

424

20 Aug 2024

Image-Based Leopard Seal Recognition: Approaches and Challenges in Current Automated Systems

138

14 Aug 2024

ClickAttention: Click Region Similarity Guided Interactive Segmentation

Neural Networks (NN), 2024

Yongquan Chen

Rui Huang

254

12 Aug 2024

In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation

European Conference on Computer Vision (ECCV), 2024

Dahyun Kang

Minsu Cho

ObjD VLM

390

09 Aug 2024

Cross-Domain Semantic Segmentation on Inconsistent Taxonomy using VLMs

European Conference on Computer Vision (ECCV), 2024

Jeongkee Lim

Yusung Kim

307

05 Aug 2024

SAM 2: Segment Anything in Images and Videos

International Conference on Learning Representations (ICLR), 2024

...

Piotr Dollár

Christoph Feichtenhofer

VLM MLLM

502

2,234

01 Aug 2024

A Systematic Review on Long-Tailed Learning

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024

406

01 Aug 2024