2209.10767
DRAMA: Joint Risk Localization and Captioning in Driving
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
22 September 2022
Srikanth Malla
Chiho Choi
Isht Dwivedi
Joonhyang Choi
Jiachen Li
Papers citing
"DRAMA: Joint Risk Localization and Captioning in Driving"
50 / 79 papers shown
Title
The Catastrophic Paradox of Human Cognitive Frameworks in Large Language Model Evaluation: A Comprehensive Empirical Analysis of the CHC-LLM Incompatibility
Mohan Reddy
ELM
136
0
0
23 Nov 2025
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail
Nvidia
Yan Wang
W. Luo
Junjie Bai
Yulong Cao
...
Yurong You
Xiaohui Zeng
Wenyuan Zhang
Boris Ivanovic
Marco Pavone
LRM
104
8
0
30 Oct 2025
MITS: A Large-Scale Multimodal Benchmark Dataset for Intelligent Traffic Surveillance
Image and Vision Computing (IVC), 2025
Kaikai Zhao
Zhaoxiang Liu
Liang Luo
X. Wang
Zhicheng Ma
Yajun Xu
Wenjing Zhang
Yibing Nan
Kai Wang
Shiguo Lian
MLLM
AI4TS
VLM
119
0
0
10 Sep 2025
OmniReason: A Temporal-Guided Vision-Language-Action Framework for Autonomous Driving
Pei Liu
Qingtian Ning
Zehan Zhang
Haipeng Liu
Weiliang Ma
Dangen She
Fu Liu
Xianpeng Lang
Jun Ma
LRM
57
1
0
31 Aug 2025
TRIDE: A Text-assisted Radar-Image weather-aware fusion network for Depth Estimation
Huawei Sun
Zixu Wang
Hao Feng
Julius Ott
Lorenzo Servadei
Robert Wille
96
1
0
11 Aug 2025
SafeDriveRAG: Towards Safe Autonomous Driving with Knowledge Graph-based Retrieval-Augmented Generation
Hao Ye
Mengshi Qi
Zhaohong Liu
Liang Liu
Huadong Ma
120
5
0
29 Jul 2025
BEV-LLM: Leveraging Multimodal BEV Maps for Scene Captioning in Autonomous Driving
Felix Brandstaetter
Erik Schuetz
Katharina Winter
Fabian B. Flohr
109
1
0
25 Jul 2025
RoD-TAL: A Benchmark for Answering Questions in Romanian Driving License Exams
Andrei Vlad Man
Razvan-Alexandru Smadu
Cristian-George Crăciun
Dumitru-Clementin Cercel
Florin-Catalin Pop
Mihaela-Claudia Cercel
80
0
0
25 Jul 2025
InterAct-Video: Reasoning-Rich Video QA for Urban Traffic
Joseph Raj Vishal
Rutuja Patil
Manas Srinivas Gowda
Katha Naik
Yezhou Yang
Bharatesh Chakravarthi
122
0
0
19 Jul 2025
VRU-Accident: A Vision-Language Benchmark for Video Question Answering and Dense Captioning for Accident Scene Understanding
Younggun Kim
Ahmed S. Abdelrahman
Mohamed Abdel-Aty
118
3
0
13 Jul 2025
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving
Mihir Godbole
Xiangbo Gao
Zhengzhong Tu
244
2
0
21 Jun 2025
Domain Specific Benchmarks for Evaluating Multimodal Large Language Models
Khizar Anjuma
Muhammad Arbab Arshad
Kadhim Hayawi
Efstathios Polyzos
A. Tariq
...
Nishith Reddy Mannuru
Ravi Varma Kumar Bevara
Taslim Mahbub
Muhammad Zeeshan Akram
Sakib Shahriar
ELM
LRM
320
2
0
15 Jun 2025
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis
Yuan Gao
Mattia Piccinini
Yuchen Zhang
Dingrui Wang
Korbinian Moller
...
Steven Peters
Andrea Stocco
Bassam Alrifaee
Marco Pavone
Johannes Betz
179
17
0
13 Jun 2025
ABC-FHE : A Resource-Efficient Accelerator Enabling Bootstrappable Parameters for Client-Side Fully Homomorphic Encryption
Design Automation Conference (DAC), 2025
Sungwoong Yune
Hyojeong Lee
Adiwena Putra
Hyunjun Cho
Cuong Duong Manh
Jaeho Jeon
Joo-Young Kim
237
1
0
10 Jun 2025
Super Encoding Network: Recursive Association of Multi-Modal Encoders for Video Understanding
Boyu Chen
Siran Chen
Kunchang Li
Qinglin Xu
Yu Qiao
Yali Wang
VOS
147
5
0
09 Jun 2025
DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models
Yuhan Hao
Zhengning Li
Lei Sun
Weilong Wang
Naixin Yi
Sheng Song
Caihong Qin
Mofan Zhou
Yifei Zhan
Fu Liu
VLM
163
6
0
06 Jun 2025
Structured Labeling Enables Faster Vision-Language Models for End-to-End Autonomous Driving
Hao Jiang
Chuan Hu
Yukang Shi
Yuan He
Ke Wang
X. Zhang
Zhipeng Zhang
3DV
VLM
144
1
0
05 Jun 2025
Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects
Yixin Cui
Haotian Lin
Shuo Yang
Yixiao Wang
Yanjun Huang
Hong Chen
LM&Ro
LRM
ELM
289
6
0
26 May 2025
Real-time Traffic Accident Anticipation with Feature Reuse
IEEE International Conference on Image Processing (ICIP), 2025
Inpyo Song
Jangwon Lee
114
1
0
23 May 2025
Large Language Models and Their Applications in Roadway Safety and Mobility Enhancement: A Comprehensive Review
Muhammad Monjurul Karim
Yan Shi
Shucheng Zhang
Bingzhang Wang
Mehrdad Nasri
Yinhai Wang
147
7
0
19 May 2025
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving
Zongchuang Zhao
Haoyu Fu
Dingkang Liang
Xin Zhou
Dingyuan Zhang
Hongwei Xie
Bing Wang
Xiang Bai
MLLM
VLM
318
3
0
13 May 2025
Deep Learning Advances in Vision-Based Traffic Accident Anticipation: A Comprehensive Review of Methods, Datasets, and Future Directions
Ruonan Lin
Tao Tang
Y. Liu
Wenye Zhou
Xin Yang
Hao Zheng
Jianpu Lin
Yi Zhang
229
1
0
12 May 2025
DriveSOTIF: Advancing Perception SOTIF Through Multimodal Large Language Models
Shucheng Huang
Freda Shi
Chen Sun
Jiaming Zhong
Minghao Ning
Yufeng Yang
Yukun Lu
Hong Wang
A. Khajepour
353
0
0
11 May 2025
Multimodal Large Language Models for Enhanced Traffic Safety: A Comprehensive Review and Future Trends
M. Tami
Mohammed Elhenawy
Huthaifa I. Ashqar
260
1
0
21 Apr 2025
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding
Tong Zeng
Longfeng Wu
Liang Shi
Dawei Zhou
Feng Guo
166
6
0
20 Apr 2025
RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives
Computer Vision and Pattern Recognition (CVPR), 2025
Chirag Parikh
Deepti Rawat
Rakshitha R. T.
Tathagata Ghosh
Ravi Kiran Sarvadevabhatla
113
5
0
27 Mar 2025
Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving
Yue Li
Meng Tian
Zhenyu Lin
Jiangtong Zhu
Dechang Zhu
Haiqiang Liu
Zining Wang
Yueyi Zhang
Zhiwei Xiong
Xinhai Zhao
CoGe
VLM
300
8
0
27 Mar 2025
ATARS: An Aerial Traffic Atomic Activity Recognition and Temporal Segmentation Dataset
Zihao Chen
Hsuanyu Wu
Chi-Hsi Kung
Yi-Ting Chen
Yan-Tsung Peng
209
1
0
24 Mar 2025
DynRsl-VLM: Enhancing Autonomous Driving Perception with Dynamic Resolution Vision-Language Models
Xirui Zhou
Lianlei Shan
Xiaolin Gui
160
14
0
14 Mar 2025
HazardNet: A Small-Scale Vision Language Model for Real-Time Traffic Safety Detection at Edge Devices
International Conference on Multimodal Interaction (ICMI), 2025
M. Tami
Mohammed Elhenawy
Huthaifa I. Ashqar
283
0
0
27 Feb 2025
Distilling Multi-modal Large Language Models for Autonomous Driving
Computer Vision and Pattern Recognition (CVPR), 2025
Deepti Hegde
R. Yasarla
H. Cai
Shizhong Han
Apratim Bhattacharyya
Shweta Mahajan
Litian Liu
Risheek Garrepalli
Vishal M. Patel
Fatih Porikli
175
22
0
17 Jan 2025
Embodied Scene Understanding for Vision Language Models via MetaVQA
Computer Vision and Pattern Recognition (CVPR), 2025
Weizhen Wang
Chenda Duan
Zhenghao Peng
Yuxin Liu
Bolei Zhou
LM&Ro
243
7
0
17 Jan 2025
DriveLM: Driving with Graph Visual Question Answering
European Conference on Computer Vision (ECCV), 2024
Chonghao Sima
Katrin Renz
Kashyap Chitta
Lawrence Yunliang Chen
Hanxue Zhang
Chengen Xie
Jens Beißwenger
Ping Luo
Andreas Geiger
Hongyang Li
690
334
0
17 Jan 2025
TB-Bench: Training and Testing Multi-Modal AI for Understanding Spatio-Temporal Traffic Behaviors from Dashcam Images/Videos
Korawat Charoenpitaks
Van-Quang Nguyen
Masanori Suganuma
Kentaro Arai
Seiji Totsuka
Hiroshi Ino
Takayuki Okatani
VLM
100
2
0
10 Jan 2025
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving
AAAI Conference on Artificial Intelligence (AAAI), 2025
Tian Jin
Yuxiao Luo
Yue Ma
Yu Qiao
Yali Wang
Mamba
234
5
0
08 Jan 2025
doScenes: An Autonomous Driving Dataset with Natural Language Instruction for Human Interaction and Vision-Language Navigation
Parthib Roy
Srinivasa Perisetla
Shashank Shriram
Harsha Krishnaswamy
Aryan Keskar
Ross Greer
VGen
241
3
0
08 Dec 2024
On-Road Object Importance Estimation: A New Dataset and A Model with Multi-Fold Top-Down Guidance
Neural Information Processing Systems (NeurIPS), 2024
Jingjing Jiang
Yilong Chen
Tianfei Zhou
Tao Xiang
266
0
0
26 Nov 2024
Explanation for Trajectory Planning using Multi-modal Large Language Model for Autonomous Driving
Shota Yamazaki
Chenyu Zhang
Takuya Nanri
Akio Shigekane
Siyuan Wang
Jo Nishiyama
Tao Chu
Kohei Yokosawa
LRM
213
1
0
15 Nov 2024
Driving with Regulation: Trustworthy and Interpretable Decision-Making for Autonomous Driving with Retrieval-Augmented Reasoning
Tianhui Cai
Yifan Liu
Zewei Zhou
Haoxuan Ma
Seth Z. Zhao
Zhiwen Wu
Xu Han
Zhiyu Huang
Jiaqi Ma
337
0
0
07 Oct 2024
Video Token Sparsification for Efficient Multimodal LLMs in Autonomous Driving
Yunsheng Ma
Lingxi Li
Rohit Gupta
Ziran Wang
Kyungtae Han
238
7
0
16 Sep 2024
Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving
Conference on Robot Learning (CoRL), 2024
Kairui Ding
Boyuan Chen
Yuchen Su
Huan-ang Gao
Bu Jin
...
Wuqiang Zhang
Xiaohui Li
Paul Barsch
Hongyang Li
Hao Zhao
196
19
0
10 Sep 2024
How Could Generative AI Support Compliance with the EU AI Act? A Review for Safe Automated Driving Perception
International Conference on Vehicular Electronics and Safety (ICVES), 2024
Mert Keser
Youssef Shoeb
Alois Knoll
247
4
0
30 Aug 2024
CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Hidehisa Arai
Keita Miwa
Kento Sasaki
Yu Yamaguchi
Kohei Watanabe
Shunsuke Aoki
Issei Yamamoto
274
42
0
19 Aug 2024
WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding
Quan Kong
Yuki Kawana
Rajat Saini
Ashutosh Kumar
Jingjing Pan
...
Yohei Ozao
Balázs Opra
D. Anastasiu
Yoichi Sato
Norimasa Kobori
VGen
120
18
0
22 Jul 2024
VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions
Seokha Moon
Hyun Woo
Hongbeen Park
Haeji Jung
R. Mahjourian
Hyung-Gun Chi
Hyerin Lim
Sangpil Kim
Jinkyu Kim
239
21
0
17 Jul 2024
WOMD-Reasoning: A Large-Scale Dataset for Interaction Reasoning in Driving
Yiheng Li
Cunxin Fan
Chongjian Ge
Zhihao Zhao
Chenran Li
...
Masayoshi Tomizuka
Bolei Zhou
Chen Tang
Mingyu Ding
Wei Zhan
VGen
LRM
304
1
0
05 Jul 2024
Using Multimodal Large Language Models for Automated Detection of Traffic Safety Critical Events
M. Tami
Huthaifa I. Ashqar
Mohammed Elhenawy
192
6
0
19 Jun 2024
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning
Shihao Wang
Zhiding Yu
Xiaohui Jiang
Shiyi Lan
Min Shi
Nadine Chang
Jan Kautz
Ying Li
Jose M. Alvarez
LRM
201
46
0
02 May 2024
Instance-free Text to Point Cloud Localization with Relative Position Awareness
Lichao Wang
Zhihao Yuan
Jinke Ren
Shuguang Cui
Zhen Li
263
0
0
27 Apr 2024
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models
Zhenyang Ni
Rui Ye
Yuxian Wei
Zhen Xiang
Yanfeng Wang
Siheng Chen
AAML
272
27
0
19 Apr 2024