1908.03195
LVIS: A Dataset for Large Vocabulary Instance Segmentation
8 August 2019
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg
VLM
Papers citing
"LVIS: A Dataset for Large Vocabulary Instance Segmentation"
50 / 285 papers shown
Title
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie
Bin Wang
Fanjing Kong
Jincheng Li
Dawei Liang
Gengshen Zhang
Dawei Leng
Yuhui Yin
CLIP
VLM
46
0
0
08 May 2025
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
41
0
0
07 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Y. Chen
Zhuotao Tian
VLM
38
0
0
07 May 2025
T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection
Manikanta Varaganti
Amulya Vankayalapati
Nour Awad
Gregory R. Dion
Laura J. Brattain
DiffM
MedIm
64
0
0
29 Apr 2025
Revisiting Data Auditing in Large Vision-Language Models
Hongyu Zhu
Sichu Liang
W. Wang
Boheng Li
Tongxin Yuan
Fangqi Li
Shilin Wang
Zhuosheng Zhang
VLM
170
0
0
25 Apr 2025
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs
Z. Wang
Senthil Purushwalkam
Caiming Xiong
S.
Heng Ji
R. Xu
38
0
0
23 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
103
0
0
17 Apr 2025
AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes
Zhenteng Li
Sheng Lian
Dengfeng Pan
Y. Wang
Wei Liu
56
0
0
08 Apr 2025
Post-processing for Fair Regression via Explainable SVD
Zhiqun Zuo
Ding Zhu
Mohammad Mahdi Khalili
149
0
0
04 Apr 2025
The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs
Jonathan Sauder
Viktor Domazetoski
G. Banc-Prandi
Gabriela Perna
Anders Meibom
D. Tuia
50
0
0
25 Mar 2025
Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark
Ying Liu
Yijing Hua
Haojiang Chai
Yanbo Wang
TengQi Ye
ObjD
56
0
0
19 Mar 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
77
0
0
16 Mar 2025
A Survey on Self-supervised Contrastive Learning for Multimodal Text-Image Analysis
Asifullah Khan
Laiba Asmatullah
Anza Malik
Shahzaib Khan
Hamna Asif
SSL
VLM
76
0
0
14 Mar 2025
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Chuhan Zhang
Chaoyang Zhu
Pingcheng Dong
Long Chen
Dong Zhang
ObjD
VLM
161
0
0
14 Mar 2025
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
Jinyang Li
En Yu
Sijia Chen
Wenbing Tao
54
1
0
13 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
148
0
0
11 Mar 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
LM&Ro
46
0
0
10 Mar 2025
Enhancing Collective Intelligence in Large Language Models Through Emotional Integration
Likith Kadiyala
Ramteja Sajja
Y. Sermet
Ibrahim Demir
143
0
0
05 Mar 2025
Generalized Class Discovery in Instance Segmentation
Cuong Manh Hoang
Yeejin Lee
Byeongkeun Kang
ISeg
89
0
0
12 Feb 2025
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection
Xiangyu Gao
Yu Dai
Benliu Qiu
Hongliang Li
Heqian Qiu
ObjD
VLM
139
0
0
28 Jan 2025
Enhancing Novel Object Detection via Cooperative Foundational Models
Rohit K Bharadwaj
Muzammal Naseer
Salman Khan
F. Khan
ObjD
VLM
157
1
0
17 Jan 2025
Are Open-Vocabulary Models Ready for Detection of MEP Elements on Construction Sites
Abdalwhab Abdalwhab
A. Imran
Sina Heydarian
I. Iordanova
David St-Onge
49
0
0
16 Jan 2025
Guided SAM: Label-Efficient Part Segmentation
S.B. van Rooij
G.J. Burghouts
VLM
43
0
0
13 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
99
48
0
03 Jan 2025
Towards Automatic Evaluation for Image Transcreation
Simran Khanuja
Vivek Iyer
Claire He
Graham Neubig
ViT
82
1
0
18 Dec 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
109
6
0
27 Nov 2024
Interpreting Object-level Foundation Models via Visual Precision Search
Ruoyu Chen
Siyuan Liang
Jingzhi Li
Shiming Liu
Maosen Li
Zheng Huang
Hua Zhang
Xiaochun Cao
FAtt
82
4
0
25 Nov 2024
AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation
Datao Tang
Xiangyong Cao
Xuan Wu
Jialin Li
Jing Yao
Xueru Bai
Deyu Meng
Yin Li
DiffM
80
6
0
23 Nov 2024
BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events
Yijin Li
Yichen Shen
Zhaoyang Huang
Shuo Chen
Weikang Bian
...
Keqiang Sun
Hujun Bao
Zhaopeng Cui
Guofeng Zhang
Hongsheng Li
3DPC
50
5
0
27 Oct 2024
GiVE: Guiding Visual Encoder to Perceive Overlooked Information
Junjie Li
Jianghong Ma
Xiaofeng Zhang
Yuhang Li
Jianyang Shi
33
0
0
26 Oct 2024
Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation
Anthony Opipari
Aravindhan K. Krishnan
Shreekant Gayaka
Min Sun
Cheng-Hao Kuo
Arnie Sen
Odest Chadwicke Jenkins
VOS
44
0
0
16 Oct 2024
Fractal Calibration for long-tailed object detection
Konstantinos Panagiotis Alexandridis
Ismail Elezi
Jiankang Deng
Anh H. Nguyen
Shan Luo
117
0
0
15 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
67
3
0
14 Oct 2024
Interactive4D: Interactive 4D LiDAR Segmentation
Ilya Fradlin
Idil Esen Zulfikar
Kadir Yilmaz
Theodora Kontogianni
Bastian Leibe
47
1
0
10 Oct 2024
ProMerge: Prompt and Merge for Unsupervised Instance Segmentation
Dylan Li
Gyungin Shin
30
3
0
27 Sep 2024
Label Convergence: Defining an Upper Performance Bound in Object Recognition through Contradictory Annotations
David Tschirschwitz
Volker Rodehorst
26
1
0
14 Sep 2024
GroundingBooth: Grounding Text-to-Image Customization
Zhexiao Xiong
Wei Xiong
Jing Shi
He Zhang
Yizhi Song
Nathan Jacobs
DiffM
59
6
0
13 Sep 2024
Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation
Jiaxin Cheng
Zixu Zhao
Tong He
Tianjun Xiao
Yicong Zhou
Zheng Zhang
DiffM
39
0
0
07 Sep 2024
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation
Xi Chen
Haosen Yang
Sheng Jin
Xiatian Zhu
H. Yao
VLM
29
3
0
05 Sep 2024
Open-Ended 3D Point Cloud Instance Segmentation
Phuc D. A. Nguyen
Minh Luu
Anh Tran
Cuong Pham
Khoi Nguyen
3DPC
48
1
0
21 Aug 2024
OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding
Youjun Zhao
Jiaying Lin
Shuquan Ye
Qianshi Pang
Rynson W. H. Lau
61
1
0
20 Aug 2024
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
Ming-Kuan Wu
Xinyue Cai
Jiayi Ji
Jiale Li
Oucheng Huang
Gen Luo
Hao Fei
Xiaoshuai Sun
Rongrong Ji
MLLM
45
7
0
31 Jul 2024
BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments
Yu-Yun Tseng
Tanusree Sharma
Lotus Zhang
Abigale Stangl
Leah Findlater
Yang Wang
Danna Gurari
66
0
0
25 Jul 2024
Learning Visual Grounding from Generative Vision and Language Model
Shijie Wang
Dahun Kim
A. Taalimi
Chen Sun
Weicheng Kuo
ObjD
34
5
0
18 Jul 2024
Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge
Sriram Yenamandra
Arun Ramachandran
Mukul Khanna
Karmesh Yadav
Jay Vakil
...
Z. Kira
Dhruv Batra
Roozbeh Mottaghi
Yonatan Bisk
Chris Paxton
LM&Ro
54
6
0
09 Jul 2024
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Yicheng Chen
Xiangtai Li
Yining Li
Yanhong Zeng
Jianzong Wu
Xiangyu Zhao
Kai Chen
VLM
DiffM
56
3
0
28 Jun 2024
TraceNet: Segment one thing efficiently
Mingyuan Wu
Zichuan Liu
Haozhen Zheng
Hongpeng Guo
Bo Chen
Xin Lu
Klara Nahrstedt
33
0
0
21 Jun 2024
DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection
Jia Syuen Lim
Zhuoxiao Chen
Mahsa Baktashmotlagh
Zhi Chen
Xin Yu
Zi Huang
Yadan Luo
VLM
ObjD
80
1
0
21 Jun 2024
Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP
Shuyang Lin
Tong Jia
Hao Wang
Bowen Ma
Mingyuan Li
Dongyue Chen
VLM
ObjD
38
0
0
16 Jun 2024
Matching Anything by Segmenting Anything
Siyuan Li
Lei Ke
Martin Danelljan
Luigi Piccinelli
Mattia Segu
Luc Van Gool
Fisher Yu
VOS
37
22
0
06 Jun 2024