Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities

16 January 2024
Xu Yan
Haiming Zhang
Yingjie Cai
Jingming Guo
Weichao Qiu
Bin-Bin Gao
Kaiqiang Zhou
Yue Zhao
Huan Jin
Jiantao Gao
Zhen Li
Lihui Jiang
Wei Zhang
Hongbo Zhang
Dengxin Dai
Bingbing Liu

Papers citing "Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities"

35 / 35 papers shown
Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting
Yunzhi Yan
Haotong Lin
Chenxu Zhou
Weijie Wang
Haiyang Sun
Kun Zhan
Xianpeng Lang
Xiaowei Zhou
Sida Peng
3DGS
39
35
0
02 Jan 2024
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang
Xu Yan
Dongfeng Bai
Jiantao Gao
Pan Wang
Bingbing Liu
Shuguang Cui
Zhen Li
42
12
0
19 Dec 2023
SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
Yuanhui Huang
Wenzhao Zheng
Borui Zhang
Jie Zhou
Jiwen Lu
3DPC
32
22
0
21 Nov 2023
DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model
Xiaofan Li
Yifu Zhang
Xiaoqing Ye
VGen
48
29
0
11 Oct 2023
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
Hao Sha
Yao Mu
Yuxuan Jiang
Li Chen
Chenfeng Xu
Ping Luo
Shengbo Eben Li
Masayoshi Tomizuka
Wei Zhan
Mingyu Ding
67
79
0
04 Oct 2023
RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision
Mingjie Pan
Jiaming Liu
Renrui Zhang
Peixiang Huang
Xiaoqi Li
Bing Wang
Hongwei Xie
Li Liu
Shanghang Zhang
30
31
0
18 Sep 2023
MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation
Cheng Chen
Juzheng Miao
Dufan Wu
Zhiling Yan
Sekeun Kim
...
Lichao Sun
Xiang Li
Tianming Liu
Pheng-Ann Heng
Quanzheng Li
MedIm
29
27
0
16 Sep 2023
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang
Jinrui Zhang
Junjie Fei
Hao Zheng
Yunlong Tang
Zhe Li
Mingqi Gao
Shanshan Zhao
MLLM
84
53
0
04 May 2023
Zenseact Open Dataset: A large-scale and diverse multimodal dataset for autonomous driving
Mina Alibeigi
William Ljungbergh
Adam Tonderski
Georg Hess
Adam Lilja
Carl Lindström
D. Motorniuk
Junsheng Fu
Jenny Widahl
Christoffer Petersson
VGen
21
32
0
03 May 2023
Scalable Mask Annotation for Video Text Spotting
Haibin He
Jing Zhang
Mengyang Xu
Juhua Liu
Bo Du
Dacheng Tao
64
8
0
02 May 2023
NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields
Junge Zhang
Feihu Zhang
Shaochen Kuang
Li Zhang
3DPC
38
20
0
28 Apr 2023
Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving
Xiaoyu Tian
Tao Jiang
Longfei Yun
Yucheng Mao
Huitong Yang
Yue Wang
Yilun Wang
Hang Zhao
3DPC
3DV
34
92
0
27 Apr 2023
PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
Anthony Chen
Kevin Zhang
Renrui Zhang
Zihan Wang
Yuheng Lu
Yandong Guo
Shanghang Zhang
3DPC
32
39
0
14 Mar 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
185
262
0
08 Mar 2023
OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
Xiaofeng Wang
Zhengbiao Zhu
Wenbo Xu
Yunpeng Zhang
Yi Wei
Xu Chi
Yun Ye
Dalong Du
Jiwen Lu
Xingang Wang
26
92
0
07 Mar 2023
S-NeRF: Neural Radiance Fields for Street Views
Ziyang Xie
Junge Zhang
Wenye Li
Feihu Zhang
Li Zhang
78
59
0
01 Mar 2023
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
234
840
0
05 Oct 2022
DRAMA: Joint Risk Localization and Captioning in Driving
Srikanth Malla
Chiho Choi
Isht Dwivedi
Joonhyang Choi
Jiachen Li
70
56
0
22 Sep 2022
Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe
Hongyang Li
Chonghao Sima
Jifeng Dai
Wenhai Wang
Lewei Lu
...
Xiaosong Jia
Siqian Liu
Jianping Shi
Dahua Lin
Yu Qiao
70
91
0
12 Sep 2022
2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds
Xu Yan
Jiantao Gao
Chaoda Zheng
Chao Zheng
Ruimao Zhang
Shenghui Cui
Zhen Li
3DPC
60
146
0
10 Jul 2022
Flexible Diffusion Modeling of Long Videos
William Harvey
Saeid Naderiparizi
Vaden Masrani
Christian Weilbach
Frank D. Wood
DiffM
BDL
VGen
161
213
0
23 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
375
2,713
0
28 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
5,353
0
11 Nov 2021
A Reinforcement Learning Benchmark for Autonomous Driving in Intersection Scenarios
Yuqi Liu
Qichao Zhang
Dongbin Zhao
OffRL
58
9
0
22 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
311
2,108
0
02 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
280
4,299
0
29 Apr 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
292
587
0
18 Apr 2021
Self-Supervised Pretraining of 3D Features on any Point-Cloud
Zaiwei Zhang
Rohit Girdhar
Armand Joulin
Ishan Misra
3DPC
94
235
0
07 Jan 2021
Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection
Jiajun Deng
Shaoshuai Shi
Pei-Cian Li
Wen-gang Zhou
Yanyong Zhang
Houqiang Li
3DPC
198
660
0
31 Dec 2020
AMVNet: Assertion-based Multi-View Fusion Network for LiDAR Semantic Segmentation
Venice Erin Liong
Thi Ngoc Tho Nguyen
S. Widjaja
Dhananjai Sharma
Z. J. Chong
3DPC
106
97
0
09 Dec 2020
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
Saining Xie
Jiatao Gu
Demi Guo
C. Qi
Leonidas J. Guibas
Or Litany
3DPC
107
525
0
21 Jul 2020
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
227
3,029
0
09 Mar 2020
Deep Generative Modeling of LiDAR Data
Lucas Page-Caccia
H. V. Hoof
Aaron Courville
Joelle Pineau
3DPC
123
55
0
04 Dec 2018
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola
Jun-Yan Zhu
Tinghui Zhou
Alexei A. Efros
SSeg
200
7,006
0
21 Nov 2016
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
261
31,717
0
08 Jun 2015