Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.03948
Cited By
Reading Between the Lanes: Text VideoQA on the Road
8 July 2023
George Tom
Minesh Mathew
Sergi Garcia
Dimosthenis Karatzas
C. V. Jawahar
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Reading Between the Lanes: Text VideoQA on the Road"
9 / 9 papers shown
Title
An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes
Ji Qi
Y. Yao
Yushi Bai
Bin Xu
Juanzi Li
Zhiyuan Liu
Tat-Seng Chua
29
0
0
21 Apr 2025
VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
Haojian Huang
Haodong Chen
Shengqiong Wu
Meng Luo
Jinlan Fu
Xinya Du
H. Zhang
Hao Fei
AI4TS
124
0
0
17 Apr 2025
Vision Technologies with Applications in Traffic Surveillance Systems: A Holistic Survey
Wei Zhou
Lei Zhao
Runyu Zhang
Yifan Cui
Hongpu Huang
Kun Qie
Chen Wang
AI4TS
73
0
0
30 Nov 2024
Scene-Text Grounding for Text-Based Video Question Answering
Sheng Zhou
Junbin Xiao
Xun Yang
Peipei Song
Dan Guo
Angela Yao
Meng Wang
Tat-Seng Chua
116
1
0
22 Sep 2024
Understanding Video Scenes through Text: Insights from Text-based Video Question Answering
Soumya Jahagirdar
Minesh Mathew
Dimosthenis Karatzas
C. V. Jawahar
17
1
0
04 Sep 2023
WildQA: In-the-Wild Video Question Answering
Santiago Castro
Naihao Deng
Pingxuan Huang
Mihai Burzo
Rada Mihalcea
68
7
0
14 Sep 2022
PreSTU: Pre-Training for Scene-Text Understanding
Jihyung Kil
Soravit Changpinyo
Xi Chen
Hexiang Hu
Sebastian Goodman
Wei-Lun Chao
Radu Soricut
VLM
135
29
0
12 Sep 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
390
4,124
0
28 Jan 2022
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
Andreas Veit
Tomas Matera
Lukás Neumann
Jirí Matas
Serge J. Belongie
180
515
0
26 Jan 2016
1