An End-to-End Transformer Model for 3D Object Detection

16 September 2021
Ishan Misra
Rohit Girdhar
Armand Joulin
3DPC, ViT

Papers citing "An End-to-End Transformer Model for 3D Object Detection"

50 / 294 papers shown
M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts
Mingsheng Li
Xin Chen
C. Zhang
Sijin Chen
Erik Cambria
Fukun Yin
Gang Yu
Tao Chen
307
36
0
17 Dec 2023
SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration
AAAI Conference on Artificial Intelligence (AAAI), 2023
Kezheng Xiong
Maoji Zheng
Qingshan Xu
Chenglu Wen
Siqi Shen
Cheng-Yu Wang
3DPC
220
17
0
14 Dec 2023
Cross-BERT for Point Cloud Pretraining
Xin Li
Peng Li
Zeyong Wei
Zhe Zhu
Mingqiang Wei
Junhui Hou
Liangliang Nan
J. Qin
H. Xie
F. Wang
SSL, 3DPC
189
2
0
08 Dec 2023
Uni3DL: Unified Model for 3D and Language Understanding
Xiang Li
Jian Ding
Zhaoyang Chen
Mohamed Elhoseiny
344
9
0
05 Dec 2023
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Computer Vision and Pattern Recognition (CVPR), 2023
Sijin Chen
Xin Chen
C. Zhang
Mingsheng Li
Gang Yu
Hao Fei
Erik Cambria
Jiayuan Fan
Tao Chen
MLLM
316
175
0
30 Nov 2023
Point Cloud Pre-training with Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2023
Xiao Zheng
Xiaoshui Huang
Guofeng Mei
Yuenan Hou
Zhaoyang Lyu
Bo Dai
Wanli Ouyang
Yongshun Gong
245
60
0
25 Nov 2023
Multiple View Geometry Transformers for 3D Human Pose Estimation
Ziwei Liao
Jialiang Zhu
Chunyu Wang
Han Hu
Steven L. Waslander
ViT
206
14
0
18 Nov 2023
Point Cloud Self-supervised Learning via 3D to Multi-view Masked Learner
Zhimin Chen
Yingwei Li
Xiao Guo
Yingwei Li
Longlong Jing
Liang Yang
Bing Li
3DPC
322
9
0
17 Nov 2023
3DifFusionDet: Diffusion Model for 3D Object Detection with Robust LiDAR-Camera Fusion
Xinhao Xiang
Simon Dräger
Jiawei Zhang
189
7
0
07 Nov 2023
FusionViT: Hierarchical 3D Object Detection via LiDAR-Camera Vision Transformer Fusion
Xinhao Xiang
Jiawei Zhang
3DPC, ViT
256
2
0
07 Nov 2023
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation
Yinjie Lei
Zixuan Wang
Feng Chen
Guoqing Wang
Peng Wang
Yang Yang
264
17
0
24 Oct 2023
SoybeanNet: Transformer-Based Convolutional Neural Network for Soybean Pod Counting from Unmanned Aerial Vehicle (UAV) Images
Computers and Electronics in Agriculture (Comput. Electron. Agric.), 2023
Jiajia Li
Raju Thada Magar
Dong Chen
Feng Lin
Dechun Wang
Xiang Yin
Weichao Zhuang
Zhao Li
177
23
0
16 Oct 2023
Multimodal Object Query Initialization for 3D Object Detection
IEEE International Conference on Robotics and Automation (ICRA), 2023
Mathijs R. van Geerenstein
Felicia Ruppel
Klaus C. J. Dietmayer
D. Gavrila
3DPC
259
4
0
16 Oct 2023
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
Haoyi Zhu
Honghui Yang
Xiaoyang Wu
Di Huang
Sha Zhang
...
Hengshuang Zhao
Chunhua Shen
Yu Qiao
Tong He
Wanli Ouyang
SSL
622
56
0
12 Oct 2023
3DS-SLAM: A 3D Object Detection based Semantic SLAM towards Dynamic Indoor Environments
G. S. Krishna
Kundrapu Supriya
S. Baidya
3DPC
210
7
0
10 Oct 2023
Uni3DETR: Unified 3D Detection Transformer
Neural Information Processing Systems (NeurIPS), 2023
Zhenyu Wang
Yali Li
Xi Chen
Hengshuang Zhao
Shengjin Wang
3DPC
327
44
0
09 Oct 2023
Anyview: Generalizable Indoor 3D Object Detection with Variable Frames
Zhenyu Wu
Xiuwei Xu
Ziwei Wang
Chong Xia
Linqing Zhao
Jiwen Lu
Haibin Yan
3DPC
300
4
0
09 Oct 2023
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection
Neural Information Processing Systems (NeurIPS), 2023
Yang Cao
Yihan Zeng
Hang Xu
Dan Xu
3DPC, ObjD
243
53
0
04 Oct 2023
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
Jiawei Han
Guiguang Ding
ViT
280
8
0
27 Sep 2023
Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving
IEEE International Conference on Computer Vision (ICCV), 2023
Mahyar Najibi
Jingwei Ji
Yin Zhou
C. Qi
Xinchen Yan
Scott Ettinger
Drago Anguelov
242
46
0
25 Sep 2023
Regress Before Construct: Regress Autoencoder for Point Cloud Self-supervised Learning
ACM Multimedia (ACM MM), 2023
Yang Liu
Chong Chen
Can Wang
Xulin King
Mengyuan Liu
3DPC
186
13
0
25 Sep 2023
Holistic Geometric Feature Learning for Structured Reconstruction
IEEE International Conference on Computer Vision (ICCV), 2023
Ziqiong Lu
Linxi Huan
Qiyuan Ma
Xianwei Zheng
200
2
0
18 Sep 2023
Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection
Chenming Zhu
Wenwei Zhang
Tai Wang
Xihui Liu
Kai-xiang Chen
3DPC
241
27
0
18 Sep 2023
Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Sijin Chen
Erik Cambria
Mingsheng Li
Xin Chen
Peng Guo
Yinjie Lei
Gang Yu
Taihao Li
Tao Chen
300
40
0
06 Sep 2023
Dense Object Grounding in 3D Scenes
ACM Multimedia (ACM MM), 2023
Wencan Huang
Daizong Liu
Wei Hu
259
24
0
05 Sep 2023
RADIO: Reference-Agnostic Dubbing Video Synthesis
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Dongyeun Lee
Chaewon Kim
Sangjoon Yu
Jaejun Yoo
Gyeong-Moon Park
VGen, DiffM
273
2
0
05 Sep 2023
Mask-Attention-Free Transformer for 3D Instance Segmentation
IEEE International Conference on Computer Vision (ICCV), 2023
Xin Lai
Yuhui Yuan
Ruihang Chu
Yukang Chen
Han Hu
Jiaya Jia
MedIm, ISeg, 3DPC
281
45
0
04 Sep 2023
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
European Conference on Computer Vision (ECCV), 2023
Zhening Huang
Xiaoyang Wu
Xi Chen
Hengshuang Zhao
Lei Zhu
Joan Lasenby
ISeg, 3DPC, VLM
483
82
0
01 Sep 2023
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
Ziyu Guo
Renrui Zhang
Xiangyang Zhu
Yiwen Tang
Xianzheng Ma
...
Ke Chen
Shiyang Feng
Xianzhi Li
Jiaming Song
Pheng-Ann Heng
MLLM
376
189
0
01 Sep 2023
Group Regression for Query Based Object Detection and Tracking
Felicia Ruppel
F. Faion
Claudius Gläser
Klaus C. J. Dietmayer
109
1
0
28 Aug 2023
ImGeoNet: Image-induced Geometry-aware Voxel Representation for Multi-view 3D Object Detection
IEEE International Conference on Computer Vision (ICCV), 2023
Tao Tu
Shun-Po Chuang
Yu-Lun Liu
Cheng Sun
Kecheng Zhang
D. Roy
Cheng-Hao Kuo
Min Sun
3DPC
306
12
0
17 Aug 2023
Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes
Zehan Wang
Haifeng Huang
Yang Zhao
Ziang Zhang
Zhou Zhao
288
109
0
17 Aug 2023
V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection
International Conference on Learning Representations (ICLR), 2023
Yichao Shen
Zigang Geng
Yuhui Yuan
Yutong Lin
Ze Liu
Chunyu Wang
Han Hu
Nanning Zheng
B. Guo
3DPC
186
37
0
08 Aug 2023
Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Runyu Ding
Jihan Yang
Chuhui Xue
Wenqing Zhang
Song Bai
Xiaojuan Qi
3DV, VLM
182
43
0
01 Aug 2023
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models
IEEE International Conference on Computer Vision (ICCV), 2023
Ziyi Wang
Xumin Yu
Yongming Rao
Jie Zhou
Jiwen Lu
DiffM, 3DPC
248
29
0
27 Jul 2023
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zehan Wang
Haifeng Huang
Yang Zhao
Lin Li
Xize Cheng
Yichen Zhu
Aoxiong Yin
Zhou Zhao
3DPC
176
29
0
25 Jul 2023
GaPro: Box-Supervised 3D Point Cloud Instance Segmentation Using Gaussian Processes as Pseudo Labelers
IEEE International Conference on Computer Vision (ICCV), 2023
T.D. Ngo
Binh-Son Hua
Khoi Duc Minh Nguyen
3DPC
236
9
0
25 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Chaoyang Zhu
Long Chen
ObjD, VLM
511
68
0
18 Jul 2023
Towards Open Vocabulary Learning: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD, VLM
406
218
0
28 Jun 2023
UniG3D: A Unified 3D Object Generation Dataset
Qinghong Sun
Yangguang Li
Zexia Liu
Xiaoshui Huang
Fenggang Liu
Xihui Liu
Wanli Ouyang
Jing Shao
208
6
0
19 Jun 2023
Randomized 3D Scene Generation for Generalizable Self-Supervised Pre-Training
Lanxiao Li
M. Heizmann
160
0
0
07 Jun 2023
Multi-View Representation is What You Need for Point-Cloud Pre-Training
International Conference on Learning Representations (ICLR), 2023
Siming Yan
Chen Song
Youkang Kong
Qi-Xing Huang
3DPC
498
6
0
05 Jun 2023
Multi-CLIP: Contrastive Vision-Language Pre-training for Question Answering tasks in 3D Scenes
British Machine Vision Conference (BMVC), 2023
Alexandros Delitzas
Maria Parelli
Nikolas Hars
G. Vlassis
Sotiris Anagnostidis
Gregor Bachmann
Thomas Hofmann
CLIP
200
29
0
04 Jun 2023
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
International Conference on Machine Learning (ICML), 2023
Chaitanya K. Ryali
Yuan-Ting Hu
Daniel Bolya
Chen Wei
Haoqi Fan
...
Omid Poursaeed
Judy Hoffman
Jitendra Malik
Yanghao Li
Christoph Feichtenhofer
3DH
305
304
0
01 Jun 2023
Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast
ACM Multimedia (ACM MM), 2023
Guo Fan
Zekun Qi
Wenkai Shi
Kaisheng Ma
3DPC, SSL
458
18
0
31 May 2023
VoxDet: Voxel Learning for Novel Instance Detection
Neural Information Processing Systems (NeurIPS), 2023
Bowen Li
Jiashun Wang
Yaoyu Hu
Chen Wang
Sebastian Scherer
426
8
0
26 May 2023
Hierarchical Adaptive Voxel-guided Sampling for Real-time Applications in Large-scale Point Clouds
Ju Ouyang
Xiao Liu
Haoyao Chen
3DPC
193
2
0
23 May 2023
Cross3DVG: Cross-Dataset 3D Visual Grounding on Different RGB-D Scans
International Conference on 3D Vision (3DV), 2023
Taiki Miyanishi
Daichi Azuma
Shuhei Kurita
M. Kawanabe
301
11
0
23 May 2023
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models
Neural Information Processing Systems (NeurIPS), 2023
Zhimin Chen
Longlong Jing
Yingwei Li
Bing Li
367
49
0
15 May 2023
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding
Computer Vision and Pattern Recognition (CVPR), 2023
Le Xue
Ning Yu
Shu Zhen Zhang
Artemis Panagopoulou
Junnan Li
...
Jiajun Wu
Caiming Xiong
Ran Xu
Juan Carlos Niebles
Silvio Savarese
380
192
0
14 May 2023