2402.15933
Cited By
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
24 February 2024
Wentao Mo
Yang Liu
Papers citing
"Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA"
4 / 4 papers shown
ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding
Minghang Zheng
Jiahua Zhang
Qingchao Chen
Yuxin Peng
Yang Liu
ObjD
19
2
0
29 Aug 2024
Training-free Video Temporal Grounding using Large-scale Pre-trained Models
Minghang Zheng
Xinhao Cai
Qingchao Chen
Yuxin Peng
Yang Liu
32
4
0
29 Aug 2024
3D Vision and Language Pretraining with Large-Scale Synthetic Data
Dejie Yang
Zhu Xu
Wentao Mo
Qingchao Chen
Siyuan Huang
Yang Liu
24
5
0
08 Jul 2024
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
385
4,010
0
28 Jan 2022