PointCLIP: Point Cloud Understanding by CLIP
Computer Vision and Pattern Recognition (CVPR), 2022
4 December 2021
Renrui Zhang
Ziyu Guo
Wei Zhang
Kunchang Li
Xupeng Miao
Tengjiao Wang
Yu Qiao
Shiyang Feng
Jiaming Song
VLM
3DPC
Papers citing
"PointCLIP: Point Cloud Understanding by CLIP"
50 / 223 papers shown
ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding
Lingjun Zhao
Yandong Luo
James Hay
Lu Gan
3DGS
176
1
0
03 Dec 2025
Multimodal Robust Prompt Distillation for 3D Point Cloud Models
Xiang Gu
Liming Lu
Xu Zheng
Anan Du
Yongbin Zhou
Shuchao Pang
273
0
0
26 Nov 2025
LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight
Yunze Man
S. S. Wang
Guowen Zhang
Johan Bjorck
Zhiqi Li
Liang-Yan Gui
Jim Fan
Jan Kautz
Yu Wang
Zhiding Yu
148
1
0
25 Nov 2025
CrossJEPA: Cross-Modal Joint-Embedding Predictive Architecture for Efficient 3D Representation Learning from 2D Images
Avishka Perera
Kumal Hewagamage
Saeedha Nazar
Kavishka Abeywardana
Hasitha Gallella
Ranga Rodrigo
Mohamed Afham
3DV
218
0
0
23 Nov 2025
Improving Multimodal Distillation for 3D Semantic Segmentation under Domain Shift
Björn Michele
Alexandre Boulch
Gilles Puy
Tuan-Hung Vu
Renaud Marlet
Nicolas Courty
109
0
0
21 Nov 2025
3DAlign-DAER: Dynamic Attention Policy and Efficient Retrieval Strategy for Fine-grained 3D-Text Alignment at Scale
Yijia Fan
Jusheng Zhang
Kaitong Cai
Jing Yang
Jian Wang
Keze Wang
101
12
0
17 Nov 2025
Point Cloud Quantization through Multimodal Prompting for 3D Understanding
Hongxuan Li
Wencheng Zhu
Huiying Xu
Xinzhong Zhu
Q. Hu
MQ
3DPC
472
0
0
15 Nov 2025
A Systematic Study of Model Extraction Attacks on Graph Foundation Models
Haoyan Xu
Ruizhi Qian
Jiate Li
Yushun Dong
Minghao Lin
...
Qinghua Liu
Junhao Dong
Ruopeng Huang
Yue Zhao
Mengyuan Li
AAML
142
0
0
14 Nov 2025
PointCubeNet: 3D Part-level Reasoning with 3x3x3 Point Cloud Blocks
Da-Yeong Kim
Yeong-Jun Cho
3DPC
3DV
193
0
0
10 Nov 2025
CSGaze: Context-aware Social Gaze Prediction
Surbhi Madan
Shreya Ghosh
Ramanathan Subramanian
Abhinav Dhall
Tom Gedeon
159
0
0
08 Nov 2025
Open-World 3D Scene Graph Generation for Retrieval-Augmented Reasoning
Fei Yu
Quan Deng
Shengeng Tang
Yuehua Li
Lechao Cheng
3DV
LRM
299
0
0
08 Nov 2025
How Many Tokens Do 3D Point Cloud Transformer Architectures Really Need?
Tuan Anh Tran
Duy M. Nguyen
Hoai-Chau Tran
Michael Barz
Khoa D. Doan
Roger Wattenhofer
Ngo Anh Vien
Mathias Niepert
Daniel Sonntag
Paul Swoboda
250
1
0
07 Nov 2025
BlendCLIP: Bridging Synthetic and Real Domains for Zero-Shot 3D Object Classification with Multimodal Pretraining
Ajinkya Khoche
Gergő László Nagy
Maciej K. Wozniak
Thomas Gustafsson
Patric Jensfelt
159
0
0
21 Oct 2025
Towards 3D Objectness Learning in an Open World
Taichi Liu
Zhenyu Wang
Ruofeng Liu
Guang Wang
Desheng Zhang
3DPC
VLM
192
0
0
20 Oct 2025
Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation
Wenyao Zhang
Hongsi Liu
Bohan Li
Jiawei He
Zekun Qi
Yunnan Wang
Shengyang Zhao
Xinqiang Yu
Wenjun Zeng
Jianfeng Dong
VLM
MDE
218
3
0
10 Oct 2025
PIT-QMM: A Large Multimodal Model For No-Reference Point Cloud Quality Assessment
International Conference on Image Processing (ICIP), 2025
Shashank Gupta
Gregoire Phillips
Alan Bovik
101
1
0
09 Oct 2025
MetaFind: Scene-Aware 3D Asset Retrieval for Coherent Metaverse Scene Generation
Zhenyu Pan
Yucheng Lu
Han Liu
VGen
139
1
0
05 Oct 2025
SkyLink: Unifying Street-Satellite Geo-Localization via UAV-Mediated 3D Scene Alignment
Hongyang Zhang
Yinhao Liu
Zhenyu Kuang
191
0
0
29 Sep 2025
GenCAD-3D: CAD Program Generation using Multimodal Latent Space Alignment and Synthetic Dataset Balancing
Nomi Yu
Md Ferdous Alam
A. John Hart
Faez Ahmed
178
1
0
17 Sep 2025
OpenUrban3D: Annotation-Free Open-Vocabulary Semantic Segmentation of Large-Scale Urban Point Clouds
Chongyu Wang
Kunlei Jing
J. Zhu
Di Wang
3DPC
228
0
0
13 Sep 2025
O^3Afford: One-Shot 3D Object-to-Object Affordance Grounding for Generalizable Robotic Manipulation
Tongxuan Tian
Xuhui Kang
Yen-Ling Kuo
137
1
0
07 Sep 2025
PointAD+: Learning Hierarchical Representations for Zero-shot 3D Anomaly Detection
Qihang Zhou
Shibo He
Jiangtao Yan
Wenchao Meng
Jiming Chen
3DPC
265
0
0
03 Sep 2025
OpenM3D: Open Vocabulary Multi-view Indoor 3D Object Detection without Human Annotations
Peng-Hao Hsu
Ke Zhang
Fu-En Wang
Tao Tu
Ming-feng Li
Yu-Lun Liu
Albert Y. C. Chen
Min Sun
Cheng-Hao Kuo
3DPC
VLM
124
5
0
27 Aug 2025
TinyGiantVLM: A Lightweight Vision-Language Architecture for Spatial Reasoning under Resource Constraints
Vinh-Thuan Ly
Hoang M. Truong
Xuan-Huong Nguyen
LRM
84
0
0
25 Aug 2025
Masked Clustering Prediction for Unsupervised Point Cloud Pre-training
Bin Ren
Xiaoshui Huang
Mengyuan Liu
Hong Liu
Fabio Poiesi
Andrii Zadaianchuk
Guofeng Mei
3DPC
ISeg
205
0
0
12 Aug 2025
Propagating Sparse Depth via Depth Foundation Model for Out-of-Distribution Depth Completion
IEEE Transactions on Image Processing (IEEE TIP), 2025
Shenglun Chen
Cheng Wang
Hong Zhang
Haojie Li
Zhihui Wang
VLM
MDE
134
0
0
07 Aug 2025
Describe, Adapt and Combine: Empowering CLIP Encoders for Open-set 3D Object Retrieval
Zhichuan Wang
Yang Zhou
Zhe Liu
Jingbo Xia
Song Bai
Yulong Wang
Xinwei He
Xiang Bai
169
1
0
29 Jul 2025
SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
Computer Vision and Pattern Recognition (CVPR), 2025
Shaoan Xie
Lingjing Kong
Yujia Zheng
Yu Yao
Zeyu Tang
Eric Xing
Guangyi Chen
Kun Zhang
VLM
254
4
0
29 Jul 2025
BANG: Dividing 3D Assets via Generative Exploded Dynamics
ACM Transactions on Graphics (TOG), 2025
Longwen Zhang
Qixuan Zhang
Haoran Jiang
Yinuo Bai
Wei Yang
Lan Xu
Jingyi Yu
228
15
0
29 Jul 2025
Multi-modal Multi-task Pre-training for Improved Point Cloud Understanding
Liwen Liu
Weidong Yang
Lipeng Ma
Ben Fei
3DPC
204
0
0
23 Jul 2025
Principled Multimodal Representation Learning
Xiaohao Liu
Xiaobo Xia
See-Kiong Ng
Tat-Seng Chua
257
11
0
23 Jul 2025
TriCLIP-3D: A Unified Parameter-Efficient Framework for Tri-Modal 3D Visual Grounding based on CLIP
Fan Li
Zanyi Wang
Zeyi Huang
Guang Dai
Jingdong Wang
Mengmeng Wang
295
0
0
20 Jul 2025
Stereo-based 3D Anomaly Object Detection for Autonomous Driving: A New Dataset and Baseline
Shiyi Mu
Zichong Gu
Hanqi Lyu
Yilin Gao
Shugong Xu
3DPC
212
0
0
12 Jul 2025
PointVDP: Learning View-Dependent Projection by Fireworks Rays for 3D Point Cloud Segmentation
Yang Chen
Yueqi Duan
Haowen Sun
Ziwei Wang
Jiwen Lu
Yap-Peng Tan
3DPC
252
0
0
09 Jul 2025
Zero-Shot Skeleton-Based Action Recognition With Prototype-Guided Feature Alignment
IEEE Transactions on Image Processing (IEEE TIP), 2025
Kai Zhou
Shuhai Zhang
Zeng You
Jinwu Hu
Mingkui Tan
Fei Liu
267
1
0
01 Jul 2025
MR-COSMO: Visual-Text Memory Recall and Direct CrOSs-MOdal Alignment Method for Query-Driven 3D Segmentation
Chade Li
Pengju Zhang
Yihong Wu
3DV
254
0
0
26 Jun 2025
TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Scale-Oriented Contrast
Beilei Cui
Yiming Huang
Long Bai
Hongliang Ren
291
0
0
16 Jun 2025
EKPC: Elastic Knowledge Preservation and Compensation for Class-Incremental Learning
Huaijie Wang
De Cheng
Lingfeng He
Yan Li
Jie Li
Nannan Wang
X. Gao
CLL
226
1
0
14 Jun 2025
AntiGrounding: Lifting Robotic Actions into VLM Representation Space for Decision Making
Wenbo Li
Shiyi Wang
Yiteng Chen
Huiping Zhuang
Qingyao Wu
329
0
0
14 Jun 2025
3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Seonho Lee
Jiho Choi
Inha Kang
Jiwook Kim
J. Park
Hyunjung Shim
VLM
222
2
0
11 Jun 2025
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs
Haoyuan Li
Yanpeng Zhou
Yufei Gao
Tao Tang
J. N. Han
Yujie Yuan
Dave Zhenyu Chen
Jiawang Bian
Hang Xu
Xiaodan Liang
398
5
0
05 Jun 2025
FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution
Xiaoyi Liu
Hao Tang
AI4CE
278
1
0
29 May 2025
HuMoCon: Concept Discovery for Human Motion Understanding
Computer Vision and Pattern Recognition (CVPR), 2025
Qihang Fang
Chengcheng Tang
Bugra Tekin
Shugao Ma
Yanchao Yang
244
4
0
27 May 2025
SVL: Spike-based Vision-language Pretraining for Efficient 3D Open-world Understanding
Xuerui Qiu
Peixi Wu
Yaozhi Wen
Shaowei Gu
Yuqi Pan
Xinhao Luo
Bo Xu
Guoqi Li
VLM
414
0
0
23 May 2025
RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation
Naman Patel
Prashanth Krishnamurthy
Farshad Khorrami
308
4
0
21 May 2025
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning
IEEE Access (IEEE Access), 2025
Sangyeon Cho
Jangyeong Jeon
Mingi Kim
Junyeong Kim
CLIP
VLM
529
2
0
30 Apr 2025
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D
Sergio Arnaud
Paul Mcvay
Ada Martin
Arjun Majumdar
Krishna Murthy Jatavallabhula
...
Nicolas Ballas
Mido Assran
Oleksandr Maksymets
Aravind Rajeswaran
Franziska Meier
3DPC
289
15
0
19 Apr 2025
Semantic Consistent Language Gaussian Splatting for Point-Level Open-vocabulary Querying
Hairong Yin
Huangying Zhan
Yi Tian Xu
Raymond A. Yeh
313
3
0
27 Mar 2025
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
Yue Li
Qi Ma
Runyi Yang
Huapeng Li
Mengjiao Ma
...
E. Konukoglu
Theo Gevers
Luc Van Gool
Martin R. Oswald
Danda Pani Paudel
3DGS
VLM
670
22
0
23 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Computer Vision and Pattern Recognition (CVPR), 2025
Jinlong Li
Cristiano Saltori
Fabio Poiesi
Andrii Zadaianchuk
1.1K
8
0
20 Mar 2025