A Survey on Multimodal Large Language Models for Autonomous Driving

21 November 2023
Can Cui
Yunsheng Ma
Xu Cao
Wenqian Ye
Yang Zhou
Kaizhao Liang
Jintai Chen
Juanwu Lu
Zichong Yang
Kuei-Da Liao
Tianren Gao
Erlong Li
Kun Tang
Zhipeng Cao
Tongxi Zhou
Ao Liu
Xinrui Yan
Shuqi Mei
Jianguo Cao
Ziran Wang
Chao Zheng

Papers citing "A Survey on Multimodal Large Language Models for Autonomous Driving"

50 / 179 papers shown
CoDriveVLM: VLM-Enhanced Urban Cooperative Dispatching and Motion Planning for Future Autonomous Mobility on Demand Systems
Haichao Liu
Ruoyu Yao
Wenru Liu
Zhenmin Huang
Shaojie Shen
Jun Ma
40
1
0
10 Jan 2025
Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions
Doaa Mahmud
Hadeel Hajmohamed
Shamma Almentheri
Shamma Alqaydi
Lameya Aldhaheri
R. A. Khalil
Nasir Saeed
AI4TS
38
4
0
08 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
83
10
0
06 Jan 2025
General Information Metrics for Improving AI Model Training Efficiency
Jianfeng Xu
Congcong Liu
Xiaoying Tan
Xiaojie Zhu
Anpeng Wu
...
Weijun Kong
Chun Li
Hu Xu
Kun Kuang
Fei Wu
62
0
0
02 Jan 2025
Large-scale moral machine experiment on large language models
Muhammad Shahrul Zaim bin Ahmad
Kazuhiro Takemoto
ELM
AILaw
31
1
1
31 Dec 2024
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
Yue Zhang
Ziqiao Ma
Jialu Li
Yanyuan Qiao
Zun Wang
J. Chai
Qi Wu
Mohit Bansal
Parisa Kordjamshidi
LRM
51
17
0
31 Dec 2024
Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving
Hao Pang
Zhenpo Wang
Guoqiang Li
39
1
0
24 Dec 2024
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Ao Wang
Hui Chen
Jianchao Tan
K. Zhang
Xunliang Cai
Zijia Lin
J. Han
Guiguang Ding
VLM
77
3
0
04 Dec 2024
Large Multimodal Agents for Accurate Phishing Detection with Enhanced Token Optimization and Cost Reduction
Fouad Trad
Ali Chehab
LLMAG
70
3
0
03 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
96
14
0
03 Dec 2024
ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection
Zhihao Sun
Haoran Jiang
Haoran Chen
Yixin Cao
Xipeng Qiu
Zuxuan Wu
Yu Jiang
64
1
0
29 Nov 2024
Large Language Model-based Decision-making for COLREGs and the Control of Autonomous Surface Vehicles
Klinsmann Agyei
Pouria Sarhadi
W. Naeem
67
0
0
25 Nov 2024
Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens
Zhangqi Jiang
Junkai Chen
Beier Zhu
Tingjin Luo
Yankun Shen
Xu Yang
88
4
0
23 Nov 2024
On-Board Vision-Language Models for Personalized Autonomous Vehicle Motion Control: System Design and Real-World Validation
Can Cui
Zichong Yang
Yupeng Zhou
Juntong Peng
Sung-Yeon Park
...
Yiheng Feng
Jitesh Panchal
Lingxi Li
Yaobin Chen
Ziran Wang
67
7
0
17 Nov 2024
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization
Yuhan Fu
Ruobing Xie
X. Sun
Zhanhui Kang
Xirong Li
MLLM
33
3
0
15 Nov 2024
Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey
Ao Fu
Yi Zhou
Tao Zhou
Y. Yang
Bojun Gao
Qun Li
Guobin Wu
Ling Shao
VGen
56
2
0
05 Nov 2024
Foundation Models for Rapid Autonomy Validation
Alec Farid
Peter Schleede
Aaron Huang
Christoffer Heckman
32
0
0
22 Oct 2024
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
Chenxi Wang
Xiang Chen
N. Zhang
Bozhong Tian
Haoming Xu
Shumin Deng
H. Chen
MLLM
LRM
29
4
0
15 Oct 2024
SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization
Akrit Mudvari
Yuang Jiang
Leandros Tassiulas
25
0
0
14 Oct 2024
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
Zhijie Wang
Zhehua Zhou
Jiayang Song
Yuheng Huang
Zhan Shu
Lei Ma
21
0
0
07 Oct 2024
ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection
Yibo Yan
Shen Wang
Jiahao Huo
Hang Li
B. Li
...
Kun Wang
Hui Xiong
Philip S. Yu
Xuming Hu
Qingsong Wen
LRM
25
13
0
06 Oct 2024
Consultation on Industrial Machine Faults with Large language Models
Apiradee Boonmee
Kritsada Wongsuwan
Pimchanok Sukjai
AI4CE
20
0
0
04 Oct 2024
Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown
Xingzhou Lou
Dong Yan
Wei Shen
Yuzi Yan
Jian Xie
Junge Zhang
45
21
0
01 Oct 2024
MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios
Jiacheng Ruan
Wenzhen Yuan
Zehao Lin
Ning Liao
Zhiyu Li
Feiyu Xiong
Ting Liu
Yuzhuo Fu
41
5
0
24 Sep 2024
MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models
Mohammad Shahab Sepehri
Zalan Fabian
Maryam Soltanolkotabi
Mahdi Soltanolkotabi
MedIm
32
3
0
23 Sep 2024
Enhancing LLM-based Autonomous Driving Agents to Mitigate Perception Attacks
Ruoyu Song
Muslum Ozgur Ozmen
Hyungsub Kim
Antonio Bianchi
Z. Berkay Celik
AAML
24
5
0
22 Sep 2024
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images
Zhecan Wang
Junzhang Liu
Chia-Wei Tang
Hani Alomari
Anushka Sivakumar
...
Haoxuan You
A. Ishmam
Kai-Wei Chang
Shih-Fu Chang
Chris Thomas
CoGe
VLM
57
2
0
19 Sep 2024
LMMCoDrive: Cooperative Driving with Large Multimodal Model
Haichao Liu
Ruoyu Yao
Zhenmin Huang
Shaojie Shen
Jun Ma
16
3
0
18 Sep 2024
From Words to Wheels: Automated Style-Customized Policy Generation for Autonomous Driving
Xu Han
Xianda Chen
Zhenghan Cai
Pinlong Cai
Meixin Zhu
Xiaowen Chu
34
1
0
18 Sep 2024
Video Token Sparsification for Efficient Multimodal LLMs in Autonomous Driving
Yunsheng Ma
Amr Abdelraouf
Rohit Gupta
Ziran Wang
Kyungtae Han
21
3
0
16 Sep 2024
Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving
Kairui Ding
Boyuan Chen
Yuchen Su
Huan-ang Gao
Bu Jin
...
Wuqiang Zhang
Xiaohui Li
Paul Barsch
Hongyang Li
Hao Zhao
50
3
0
10 Sep 2024
Generative Hierarchical Materials Search
Sherry Yang
Simon L. Batzner
Ruiqi Gao
Muratahan Aykol
Alexander L. Gaunt
Brendan McMorrow
Danilo J. Rezende
Dale Schuurmans
Igor Mordatch
E. D. Cubuk
AI4CE
27
5
0
10 Sep 2024
Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles
Qiujing Lu
Xuanhan Wang
Yiwei Jiang
Guangming Zhao
Mingyue Ma
Shuo Feng
39
7
0
10 Sep 2024
An overview of domain-specific foundation model: key technologies, applications and challenges
Haolong Chen
Hanzhi Chen
Zijian Zhao
Kaifeng Han
Guangxu Zhu
Yichen Zhao
Ying Du
Wei Xu
Qingjiang Shi
ALM
VLM
61
4
0
06 Sep 2024
Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition
Yaozong Gan
Guang Li
Ren Togo
Keisuke Maeda
Takahiro Ogawa
Miki Haseyama
37
0
0
03 Sep 2024
ContextVLM: Zero-Shot and Few-Shot Context Understanding for Autonomous Driving using Vision Language Models
Shounak Sural
Naren
R. Rajkumar
24
1
0
30 Aug 2024
How Could Generative AI Support Compliance with the EU AI Act? A Review for Safe Automated Driving Perception
Mert Keser
Youssef Shoeb
Alois Knoll
30
2
0
30 Aug 2024
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
Xiaoye Qu
Jiashuo Sun
Wei Wei
Yu Cheng
MLLM
LRM
18
13
0
30 Aug 2024
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
Baichuan Zhou
Haote Yang
Dairong Chen
Junyan Ye
Tianyi Bai
Jinhua Yu
Songyang Zhang
Dahua Lin
Conghui He
Weijia Li
VLM
53
3
0
30 Aug 2024
Enhancing AI-Driven Psychological Consultation: Layered Prompts with Large Language Models
Rafael Souza
Jia-Hao Lim
Alexander Davis
LM&MA
AI4MH
18
0
0
29 Aug 2024
Negation Blindness in Large Language Models: Unveiling the NO Syndrome in Image Generation
Mohammad Nadeem
S. Sohail
Erik Cambria
Björn W. Schuller
Amir Hussain
20
3
0
27 Aug 2024
Surprisingly Fragile: Assessing and Addressing Prompt Instability in Multimodal Foundation Models
Ian Stewart
Sameera Horawalavithana
Brendan Kennedy
Sai Munikoti
Karl Pazdernik
AAML
16
2
0
26 Aug 2024
Cross-Domain Foundation Model Adaptation: Pioneering Computer Vision Models for Geophysical Data Analysis
Zhixiang Guo
Xinming Wu
Luming Liang
Hanlin Sheng
Nuo Chen
Zhengfa Bi
AI4CE
40
1
0
22 Aug 2024
Towards Analyzing and Mitigating Sycophancy in Large Vision-Language Models
Yunpu Zhao
Rui Zhang
Junbin Xiao
Changxin Ke
Ruibo Hou
Yifan Hao
Qi Guo
Yunji Chen
21
4
0
21 Aug 2024
From Feature Importance to Natural Language Explanations Using LLMs with RAG
Sule Tekkesinoglu
Lars Kunze
FAtt
14
1
0
30 Jul 2024
Text2LiDAR: Text-guided LiDAR Point Cloud Generation via Equirectangular Transformer
Yang Wu
Kaihua Zhang
Jianjun Qian
Jin Xie
Jian Yang
DiffM
37
4
0
29 Jul 2024
Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles
Zuoyin Tang
Jianhua He
Dashuai Pei
Kezhong Liu
Tao Gao
ELM
33
2
0
24 Jul 2024
A Benchmark Dataset for Multimodal Prediction of Enzymatic Function Coupling DNA Sequences and Natural Language
Yuchen Zhang
Ratish Kumar Chandrakant Jha
Soumya Bharadwaj
Vatsal Sanjaykumar Thakkar
Adrienne Hoarfrost
Jin Sun
17
1
0
21 Jul 2024
Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models?
Ben Yao
Yazhou Zhang
Qiuchi Li
Jing Qin
ReLM
LRM
37
3
0
17 Jul 2024
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
Guanqiao Qu
Qiyuan Chen
Wei Wei
Zheng Lin
Xianhao Chen
Kaibin Huang
33
37
0
09 Jul 2024