LVIS: A Dataset for Large Vocabulary Instance Segmentation

Computer Vision and Pattern Recognition (CVPR), 2019
8 August 2019
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg, VLM

Papers citing "LVIS: A Dataset for Large Vocabulary Instance Segmentation"

50 / 1,056 papers shown
Title
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
Ming-Kuan Wu
Xinyue Cai
Jiayi Ji
Jiale Li
Oucheng Huang
Gen Luo
Hao Fei
Xiaoshuai Sun
Rongrong Ji
MLLM
315
29
0
31 Jul 2024
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection
Kuo Wang
Lechao Cheng
Weikai Chen
Pingping Zhang
Liang Lin
Fan Zhou
Guanbin Li
VLM, ObjD
198
8
0
31 Jul 2024
Add-SD: Rational Generation without Manual Reference
Lingfeng Yang
Xinyu Zhang
Xiang Li
Jinwen Chen
Kun Yao
Qiang Chen
Errui Ding
Ling-Ling Liu
Jingdong Wang
Jian Yang
153
1
0
30 Jul 2024
Text2LiDAR: Text-guided LiDAR Point Cloud Generation via Equirectangular Transformer
European Conference on Computer Vision (ECCV), 2024
Yang Wu
Kaihua Zhang
Jianjun Qian
Jin Xie
Zhiqiang Wang
DiffM
347
22
0
29 Jul 2024
Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning
Jia-Hao Xiao
Ming-Kun Xie
Heng-Bo Fan
Gang Niu
Masashi Sugiyama
Sheng-Jun Huang
226
0
0
26 Jul 2024
BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments
Yu-Yun Tseng
Tanusree Sharma
Lotus Zhang
Abigale Stangl
Leah Findlater
Yang Wang
Danna Gurari
421
3
0
25 Jul 2024
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
Lirui Zhao
Tianshuo Yang
Wenqi Shao
Yuxin Zhang
Yu Qiao
Ping Luo
Kaipeng Zhang
Rongrong Ji
DiffM
193
6
0
24 Jul 2024
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects
Junyi Li
Junfeng Wu
Weizhi Zhao
Song Bai
Xiang Bai
200
13
0
23 Jul 2024
Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection
Kwanyong Park
Kuniaki Saito
Donghyun Kim
VLM, CoGe
209
5
0
21 Jul 2024
Bucketed Ranking-based Losses for Efficient Training of Object Detectors
Feyza Yavuz
Baris Can Cam
Adnan Harun Dogan
Kemal Oksuz
Emre Akbas
Sinan Kalkan
255
5
0
19 Jul 2024
OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking
Zekun Qian
Ruize Han
Wei Feng
Junhui Hou
Linqi Song
Song Wang
261
1
0
19 Jul 2024
Learning Visual Grounding from Generative Vision and Language Model
Shijie Wang
Dahun Kim
A. Taalimi
Chen Sun
Weicheng Kuo
ObjD
249
17
0
18 Jul 2024
OVGNet: A Unified Visual-Linguistic Framework for Open-Vocabulary Robotic Grasping
Meng Li
Qi Zhao
Shuchang Lyu
Chunlei Wang
Yujing Ma
Guangliang Cheng
Chenguang Yang
274
10
0
18 Jul 2024
SDPT: Synchronous Dual Prompt Tuning for Fusion-based Visual-Language Pre-trained Models
Yang Zhou
Yongjian Wu
Jiya Saiyin
Bingzheng Wei
Maode Lai
Eric Chang
Yan Xu
VLM
187
2
0
16 Jul 2024
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
Penghui Du
Yu Wang
Yifan Sun
Luting Wang
Yue Liao
Qiang Chen
Errui Ding
Yan Wang
Jingdong Wang
Si Liu
VLM, ObjD
229
12
0
16 Jul 2024
Plain-Det: A Plain Multi-Dataset Object Detector
Cheng Shi
Yuchen Zhu
Sibei Yang
ObjD, VLM
204
8
0
14 Jul 2024
3x2: 3D Object Part Segmentation by 2D Semantic Correspondences
Anh Thai
Weiyao Wang
Hao Tang
Stefan Stojanov
Matt Feiszli
James M. Rehg
3DPC
248
11
0
12 Jul 2024
Lite-SAM Is Actually What You Need for Segment Everything
Jianhai Fu
Yuanjie Yu
Ningchuan Li
Yi Zhang
Qichao Chen
Jianping Xiong
Jun Yin
Zhiyu Xiang
VLM
213
5
0
12 Jul 2024
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
Xiaotong Li
Fan Zhang
Haiwen Diao
Yueze Wang
Xinlong Wang
Ling-yu Duan
VLM
300
48
0
11 Jul 2024
Adaptive Parametric Activation: Unifying and Generalising Activation Functions Across Tasks
European Conference on Computer Vision (ECCV), 2024
Konstantinos Panagiotis Alexandridis
Jiankang Deng
A. Nguyen
Shan Luo
265
7
0
11 Jul 2024
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
Hao Fang
Peng Wu
Yawei Li
Xinxin Zhang
Xiankai Lu
VLM
271
19
0
10 Jul 2024
Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge
Sriram Yenamandra
Arun Ramachandran
Mukul Khanna
Karmesh Yadav
Jay Vakil
...
Z. Kira
Dhruv Batra
Roozbeh Mottaghi
Yonatan Bisk
Chris Paxton
LM&Ro
265
9
0
09 Jul 2024
Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images
Zhangyang Qi
Yunhan Yang
Mengchen Zhang
Long Xing
Xiaoyang Wu
Tong Wu
Dahua Lin
Xihui Liu
Jiaqi Wang
Hengshuang Zhao
DiffM
204
23
0
08 Jul 2024
Improving Computer Vision Interpretability: Transparent Two-level Classification for Complex Scenes
Stefan Scholz
Nils B. Weidmann
Zachary C. Steinert-Threlkeld
Eda Keremoğlu
Bastian Goldlücke
146
3
0
04 Jul 2024
Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing
Anushrut Jignasu
Kelly O. Marshall
Ankush Kumar Mishra
Lucas Nerone Rillo
Baskar Ganapathysubramanian
Aditya Balu
Chinmay Hegde
Adarsh Krishnamurthy
217
1
0
04 Jul 2024
Towards Efficient Pixel Labeling for Industrial Anomaly Detection and Localization
Hanxi Li
Jingqi Wu
Lin Yuanbo Wu
Hao Chen
Deyin Liu
Chunhua Shen
221
1
0
03 Jul 2024
Open Scene Graphs for Open World Object-Goal Navigation
Joel Loo
Zhanxin Wu
David Hsu
LM&Ro
218
14
0
02 Jul 2024
When Robots Get Chatty: Grounding Multimodal Human-Robot Conversation and Collaboration
Philipp Allgeuer
Hassan Ali
Stefan Wermter
LM&Ro
220
18
0
29 Jun 2024
Segment Anything without Supervision
Xudong Wang
Jingfeng Yang
Trevor Darrell
VLM
300
24
0
28 Jun 2024
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Yicheng Chen
Xiangtai Li
Yining Li
Yanhong Zeng
Jianzong Wu
Xiangyu Zhao
Kai Chen
VLM, DiffM
383
3
0
28 Jun 2024
Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models
Jierun Chen
Fangyun Wei
Jinjing Zhao
Sizhe Song
Bohuai Wu
Zhuoxuan Peng
S.-H. Gary Chan
Hongyang R. Zhang
229
31
0
24 Jun 2024
ObjectNLQ @ Ego4D Episodic Memory Challenge 2024
Yisen Feng
Haoyu Zhang
Yuquan Xie
Zaijing Li
Meng Liu
Liqiang Nie
253
8
0
22 Jun 2024
DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection
Jia Syuen Lim
Zhuoxiao Chen
Mahsa Baktashmotlagh
Zhi Chen
Xin Yu
Zi Huang
Yadan Luo
VLM, ObjD
357
7
0
21 Jun 2024
TraceNet: Segment one thing efficiently
Mingyuan Wu
Zichuan Liu
Haozhen Zheng
Hongpeng Guo
Bo Chen
Xin Lu
Klara Nahrstedt
289
0
0
21 Jun 2024
Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models?
Gregor Geigle
Radu Timofte
Goran Glavaš
223
2
0
20 Jun 2024
Improving Visual Commonsense in Language Models via Multiple Image Generation
Guy Yariv
Idan Schwartz
Yossi Adi
Sagie Benaim
VLM, LRM
138
1
0
19 Jun 2024
V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results
Jiaqi Wang
Yuhang Zang
Pan Zhang
Tao Chu
Yuhang Cao
...
Kehong Yuan
Yanyan Zu
Jiayao Ha
Qiong Gao
Licheng Jiao
ObjD
234
1
0
17 Jun 2024
Lightweight Model Pre-training via Language Guided Knowledge Distillation
Mingsheng Li
Lin Zhang
Mingzhen Zhu
Zilong Huang
Gang Yu
Jiayuan Fan
Tao Chen
188
4
0
17 Jun 2024
AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection
Lingjie Kong
Kai WU
Xiaobin Hu
Wenhui Han
Jinlong Peng
Chengming Xu
Donghao Luo
Jiangning Zhang
Chengjie Wang
Yanwei Fu
DiffM
227
0
0
17 Jun 2024
Learning from Exemplars for Interactive Image Segmentation
Kun Li
Hao Cheng
G. Vosselman
Michael Ying Yang
VLM
158
3
0
17 Jun 2024
ClawMachine: Learning to Fetch Visual Tokens for Referential Comprehension
Tianren Ma
Lingxi Xie
Yunjie Tian
Boyu Yang
Yuan Zhang
191
0
0
17 Jun 2024
Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP
Shuyang Lin
Tong Jia
Hao Wang
Bowen Ma
Mingyuan Li
Dongyue Chen
VLM, ObjD
191
2
0
16 Jun 2024
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics
Wentao Yuan
Jiafei Duan
Valts Blukis
Wilbert Pumacay
Ranjay Krishna
Adithyavairavan Murali
Arsalan Mousavian
Dieter Fox
LM&Ro
302
151
0
15 Jun 2024
From Pixels to Prose: A Large Dataset of Dense Image Captions
Vasu Singla
Kaiyu Yue
Sukriti Paul
Reza Shirkavand
Mayuka Jayawardhana
Alireza Ganjdanesh
Heng Huang
A. Bhatele
Gowthami Somepalli
Tom Goldstein
3DV, VLM
270
38
0
14 Jun 2024
EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models
Julian Straub
Daniel DeTone
Tianwei Shen
Nan Yang
Chris Sweeney
Richard Newcombe
EgoV
226
14
0
14 Jun 2024
RobustSAM: Segment Anything Robustly on Degraded Images
Computer Vision and Pattern Recognition (CVPR), 2024
Wei-Ting Chen
Yu-Jiet Vong
Sy-Yen Kuo
Sizhuo Ma
Jian Wang
VLM
290
28
0
13 Jun 2024
Language-driven Grasp Detection
An Dinh Vuong
Minh Nhat Vu
Baoru Huang
Nghia Nguyen
Hieu Le
T. Vo
Anh Nguyen
VLM
297
30
0
13 Jun 2024
Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024
Peixi Wu
Bosong Chai
Xuan Nie
Longquan Yan
Zeyu Wang
Qifan Zhou
Boning Wang
Yansong Peng
Hebei Li
ObjD
165
1
0
13 Jun 2024
LaMOT: Language-Guided Multi-Object Tracking
Yunhao Li
Xiaoqiong Liu
Luke Liu
Heng Fan
Libo Zhang
VOT
226
4
0
12 Jun 2024
Advancing Grounded Multimodal Named Entity Recognition via LLM-Based Reformulation and Box-Based Segmentation
Jinyuan Li
Z. Li
Han Li
Jianfei Yu
Rui Xia
Di Sun
Gang Pan
227
4
0
11 Jun 2024