1908.03195
LVIS: A Dataset for Large Vocabulary Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2019
8 August 2019
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg
VLM
Papers citing
"LVIS: A Dataset for Large Vocabulary Instance Segmentation"
50 / 1,053 papers shown
LTDA-Drive: LLMs-guided Generative Models based Long-tail Data Augmentation for Autonomous Driving
Mahmut Yurt
Xin Ye
Yunsheng Ma
Jingru Luo
Abhirup Mallik
John Pauly
Burhaneddin Yaman
Liu Ren
171
2
0
21 May 2025
Unlocking the Power of SAM 2 for Few-Shot Segmentation
Qianxiong Xu
Lanyun Zhu
Xuanyi Liu
Guosheng Lin
Cheng Long
Ziyue Li
Rui Zhao
VLM
209
2
0
20 May 2025
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
Yang Liu
Ming Ma
Xiaomin Yu
Pengxiang Ding
Han Zhao
Mingyang Sun
Siteng Huang
Xuetao Zhang
LRM
458
19
0
18 May 2025
AoP-SAM: Automation of Prompts for Efficient Segmentation
AAAI Conference on Artificial Intelligence (AAAI), 2025
Yi Chen
Mu-Young Son
Chuanbo Hua
Joo-Young Kim
VLM
253
4
0
17 May 2025
VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
Yuqi Liu
Tianyuan Qu
Zhisheng Zhong
Bohao Peng
Shu Liu
Bei Yu
Jiaya Jia
VLM
LRM
397
5
0
17 May 2025
Pseudo-Label Quality Decoupling and Correction for Semi-Supervised Instance Segmentation
Jianghang Lin
Yilin Lu
Chunjiang Ge
Chaoyang Zhu
Shengchuan Zhang
Liujuan Cao
Rongrong Ji
ISeg
388
0
0
16 May 2025
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie
Bin Wang
Fanjing Kong
Jincheng Li
Dawei Liang
Gengshen Zhang
Dawei Leng
Yuhui Yin
CLIP
VLM
485
29
0
08 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Computer Vision and Pattern Recognition (CVPR), 2025
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
269
5
0
07 May 2025
Object-Shot Enhanced Grounding Network for Egocentric Video
Computer Vision and Pattern Recognition (CVPR), 2025
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
191
7
0
07 May 2025
T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection
Manikanta Varaganti
Amulya Vankayalapati
Nour Awad
Gregory R. Dion
Laura J. Brattain
DiffM
MedIm
214
1
0
29 Apr 2025
Revisiting Data Auditing in Large Vision-Language Models
Hongyu Zhu
Sichu Liang
Wenjie Wang
Boheng Li
Tongxin Yuan
Fangqi Li
Shilin Wang
Zhuosheng Zhang
VLM
977
2
0
25 Apr 2025
Improving Open-World Object Localization by Discovering Background
Ashish Singh
Michael Jeffrey Jones
Kuan-Chuan Peng
A. Cherian
Moitreya Chatterjee
Erik Learned-Miller
ObjD
OCL
VLM
255
0
0
24 Apr 2025
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs
Zehao Wang
Senthil Purushwalkam
Caiming Xiong
Siyang Song
Chenhui Xu
Ran Xu
334
5
0
23 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
568
92
0
17 Apr 2025
LIFT+: Lightweight Fine-Tuning for Long-Tail Learning
Jiang-Xin Shi
Tong Wei
Yu-Feng Li
147
2
0
17 Apr 2025
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation
Yongchao Feng
Yajie Liu
Shuai Yang
Wenrui Cai
Jing Zhang
...
Jiahui Lv
Ziqiang Liu
Tengyuan Shi
Qingjie Liu
Longji Xu
MLLM
VLM
262
8
0
13 Apr 2025
Digital Twin Catalog: A Large-Scale Photorealistic 3D Object Digital Twin Dataset
Computer Vision and Pattern Recognition (CVPR), 2025
Zhao Dong
Ka Chen
Zhaoyang Lv
Hong-Xing Yu
Yunzhi Zhang
...
Xiaqing Pan
Mingfei Yan
Jiajun Wu
Carl Ren
Richard Newcombe
307
14
0
11 Apr 2025
Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Ruoyu Chen
Hua Zhang
Jingzhi Li
Li Liu
Zhen Huang
Simeng Qin
239
2
0
09 Apr 2025
AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes
Remote Sensing (RS), 2025
Zhenteng Li
Sheng Lian
Dengfeng Pan
Yijiao Wang
Wei Liu
268
5
0
08 Apr 2025
Studying Image Diffusion Features for Zero-Shot Video Object Segmentation
Thanos Delatolas
Vicky S. Kalogeiton
Dim P. Papadopoulos
DiffM
VOS
301
3
0
07 Apr 2025
Post-processing for Fair Regression via Explainable SVD
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Zhiqun Zuo
Ding Zhu
Mohammad Mahdi Khalili
830
0
0
04 Apr 2025
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2025
Chang-Bin Zhang
Jinhong Ni
Yujie Zhong
Kai Han
3DV
VLM
398
2
0
02 Apr 2025
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
Computer Vision and Pattern Recognition (CVPR), 2025
Fida Mohammad Thoker
Letian Jiang
Chen Zhao
Bernard Ghanem
302
3
0
01 Apr 2025
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1
Yi Chen
Yuying Ge
Rui Wang
Yixiao Ge
Lu Qiu
Mingyu Ding
Xihui Liu
ReLM
VLM
OffRL
LRM
211
13
0
31 Mar 2025
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Fan Qi
Yu Duan
Changsheng Xu
DiffM
231
0
0
27 Mar 2025
Foveated Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2025
Hongyi Zeng
Wenxuan Liu
Tianhua Xia
Jintai Chen
Ziyun Li
Sai Qian Zhang
ISeg
295
1
0
27 Mar 2025
Exponentially Weighted Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection Model Training in Unmanned Aerial Vehicles Surveillance Scenarios
Taufiq Ahmed
Abhishek Kumar
Constantino Álvarez Casado
Anlan Zhang
Tuomo Hänninen
Lauri Lovén
Miguel Bordallo López
Sasu Tarkoma
199
0
0
27 Mar 2025
Show or Tell? Effectively prompting Vision-Language Models for semantic segmentation
Niccolo Avogaro
Thomas Frick
Mattia Rigotti
Andrea Bartezzaghi
Filip M. Janicki
Cristiano Malossi
Konrad Schindler
Roy Assaf
MLLM
VLM
224
2
0
25 Mar 2025
The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs
Jonathan Sauder
Viktor Domazetoski
G. Banc-Prandi
Gabriela Perna
Anders Meibom
D. Tuia
234
5
0
25 Mar 2025
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
Zhichao Sun
Huazhang Hu
Yidong Ma
Gang Liu
Nemo Chen
Xu Tang
Feng-Long Xie
Yongchao Xu
ObjD
378
0
0
24 Mar 2025
Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook
Xu Zheng
Ziqiao Weng
Yuanhuiyi Lyu
Lutao Jiang
Haiwei Xue
Bin Ren
Danda Pani Paudel
Andrii Zadaianchuk
Luc Van Gool
Xuming Hu
3DV
340
23
0
23 Mar 2025
RefCut: Interactive Segmentation with Reference Guidance
Zheng Lin
Nan Zhou
Chen-Xi Du
Deng-Ping Fan
Shi-Min Hu
315
0
0
22 Mar 2025
An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection
Louis Y. Kim
Michelle Karker
Victoria Valledor
Seiyoung C. Lee
Karl F. Brzoska
Margaret Duff
Anthony Palladino
VLM
ObjD
188
1
0
21 Mar 2025
M2N2V2: Multi-Modal Unsupervised and Training-free Interactive Segmentation
Markus Karmann
Peng-Tao Jiang
Bo Li
O. Urfalioglu
188
0
0
20 Mar 2025
Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark
IEEE International Conference on Robotics and Automation (ICRA), 2025
Ying Liu
Yijing Hua
Haojiang Chai
Yanbo Wang
TengQi Ye
ObjD
252
1
0
19 Mar 2025
LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation
Yang Zhou
Shiyu Zhao
Yuxiao Chen
Zhenting Wang
Can Jin
Dimitris N. Metaxas
ObjD
514
5
0
18 Mar 2025
SAM2 for Image and Video Segmentation: A Comprehensive Survey
Zhang Jiaxing
Tang Hao
VLM
274
12
0
17 Mar 2025
Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
Computer Vision and Pattern Recognition (CVPR), 2025
Henghui Du
Guangyao Li
Chang Zhou
Chunjie Zhang
Alan Zhao
D. Hu
208
10
0
17 Mar 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Computer Vision and Pattern Recognition (CVPR), 2025
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
493
0
0
16 Mar 2025
A Survey on Self-supervised Contrastive Learning for Multimodal Text-Image Analysis
Asifullah Khan
Laiba Asmatullah
Anza Malik
Shahzaib Khan
Hamna Asif
SSL
VLM
471
5
0
14 Mar 2025
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
International Conference on Learning Representations (ICLR), 2025
Chuhan Zhang
Chaoyang Zhu
Pingcheng Dong
Long Chen
Dong Zhang
ObjD
VLM
970
4
0
14 Mar 2025
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection
Shenghao Fu
Junkai Yan
Q. Yang
Xihan Wei
Xiaohua Xie
Wei-Shi Zheng
ObjD
VLM
207
3
0
13 Mar 2025
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
International Conference on Learning Representations (ICLR), 2025
Jinyang Li
En Yu
Sijia Chen
Wenbing Tao
319
6
0
13 Mar 2025
Attention to Trajectory: Trajectory-Aware Open-Vocabulary Tracking
Yunhao Li
Yifan Jiao
Dan Meng
Heng Fan
L. Zhang
220
0
0
11 Mar 2025
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
Computer Vision and Pattern Recognition (CVPR), 2025
Huanyi Zheng
Yuzhuo Tian
Hao Chen
Chunluan Zhou
Qingpei Guo
Yongxu Liu
M. Yang
Chunhua Shen
MLLM
VLM
225
9
0
11 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
843
12
0
11 Mar 2025
TRACE: Your Diffusion Model is Secretly an Instance Edge Detector
Sanghyun Jo
Ziseok Lee
Wooyeol Lee
Jonghyun Choi
Jaesik Park
Kyungsu Kim
414
2
0
11 Mar 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Computer Vision and Pattern Recognition (CVPR), 2025
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
LM&Ro
420
3
0
10 Mar 2025
YOLOE: Real-Time Seeing Anything
Ao Wang
Lihao Liu
Hui Chen
Zijia Lin
Jiawei Han
Guiguang Ding
VLM
ObjD
482
31
0
10 Mar 2025
Segment Anything, Even Occluded
Computer Vision and Pattern Recognition (CVPR), 2025
Wei-En Tai
Yu-Lin Shih
Cheng Sun
Y. Wang
Hwann-Tzong Chen
VLM
273
1
0
08 Mar 2025