Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.11313
Cited By
CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition
20 March 2023
Deepti Hegde
Jeya Maria Jose Valanarasu
Vishal M. Patel
CLIP
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition"
18 / 18 papers shown
Title
A Review of 3D Object Detection with Vision-Language Models
Ranjan Sapkota
Konstantinos I Roumeliotis
Rahul Harsha Cheppally
Marco Flores Calero
Manoj Karkee
VLM
74
1
0
25 Apr 2025
PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation
Ziyao Zeng
Jingcheng Ni
Daniel Wang
Patrick Rim
Younjoon Chung
Fengyu Yang
Byung-Woo Hong
A. Wong
DiffM
MDE
88
2
0
24 Nov 2024
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
Xiaoxu Xu
Yitian Yuan
Jinlong Li
Qiudan Zhang
Zequn Jie
Lin Ma
Hao Tang
N. Sebe
Xu Wang
38
2
0
13 Jul 2024
Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features
Thomas Wimmer
Peter Wonka
M. Ovsjanikov
11
8
0
29 Nov 2023
Extending Multi-modal Contrastive Representations
Zehan Wang
Ziang Zhang
Luping Liu
Yang Zhao
Haifeng Huang
Tao Jin
Zhou Zhao
13
5
0
13 Oct 2023
OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding
Minghua Liu
Ruoxi Shi
Kaiming Kuang
Yinhao Zhu
Xuanlin Li
Shizhong Han
H. Cai
Fatih Porikli
Hao Su
3DPC
14
115
0
18 May 2023
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM
VLM
51
80
0
03 Oct 2022
CyCLIP: Cyclic Contrastive Language-Image Pretraining
Shashank Goel
Hritik Bansal
S. Bhatia
Ryan A. Rossi
Vishwa Vinay
Aditya Grover
CLIP
VLM
160
131
0
28 May 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Mohit Bansal
CLIP
118
76
0
26 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
PointCLIP: Point Cloud Understanding by CLIP
Renrui Zhang
Ziyu Guo
Wei Zhang
Kunchang Li
Xupeng Miao
Bin Cui
Yu Qiao
Peng Gao
Hongsheng Li
VLM
3DPC
158
428
0
04 Dec 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
301
771
0
18 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
275
3,784
0
18 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Deep Visual Domain Adaptation
G. Csurka
OOD
121
185
0
28 Dec 2020
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
Saining Xie
Jiatao Gu
Demi Guo
C. Qi
Leonidas J. Guibas
Or Litany
3DPC
134
618
0
21 Jul 2020
Joint 2D-3D-Semantic Data for Indoor Scene Understanding
Iro Armeni
S. Sax
Amir Zamir
Silvio Savarese
3DV
3DPC
111
864
0
03 Feb 2017
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas J. Guibas
3DH
3DPC
3DV
PINN
210
13,886
0
02 Dec 2016
1