Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.22810
Cited By
VidText: Towards Comprehensive Evaluation for Video Text Understanding
28 May 2025
Zhoufaran Yang
Yan Shu
Zhifei Yang
Yan Zhang
Yu-Hong Li
K. Lu
Gangyan Zeng
Shaohui Liu
Yu Zhou
N. Sebe
CoGe
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"VidText: Towards Comprehensive Evaluation for Video Text Understanding"
7 / 7 papers shown
Title
Visual Text Processing: A Comprehensive Review and Unified Evaluation
Yan Shu
Weichao Zeng
Fangmin Zhao
Zeyu Chen
Zhiyu Li
...
Paolo Rota
Xiang Bai
Lianwen Jin
Xu-Cheng Yin
N. Sebe
CoGe
123
3
0
30 Apr 2025
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding
Xiangrui Liu
Yan Shu
Zhengyang Liang
Ao Li
Yang Tian
Bo Zhao
VGen
VLM
274
9
0
24 Mar 2025
Memory-enhanced Retrieval Augmentation for Long Video Understanding
Huaying Yuan
Zhengyang Liang
Minhao Qin
Hongjin Qian
Yan Shu
Zhicheng Dou
Ji-Rong Wen
N. Sebe
VOS
RALM
VLM
119
5
0
12 Mar 2025
Qwen2.5-VL Technical Report
S. Bai
Keqin Chen
Xuejing Liu
Jialin Wang
Wenbin Ge
...
Zesen Cheng
Hang Zhang
Zhibo Yang
Haiyang Xu
Junyang Lin
VLM
438
699
0
20 Feb 2025
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
Xinhao Li
Yi Wang
Jiashuo Yu
Xiangyu Zeng
Yuhan Zhu
...
Yinan He
Chenting Wang
Yu Qiao
Yali Wang
L. Wang
VLM
230
40
0
31 Dec 2024
Aria: An Open Multimodal Native Mixture-of-Experts Model
Dongxu Li
Yudong Liu
Haoning Wu
Yue Wang
Zhiqi Shen
...
Lihuan Zhang
Hanshu Yan
Guoyin Wang
Bei Chen
Junnan Li
MoE
148
65
0
08 Oct 2024
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
Zuyan Liu
Yuhao Dong
Ziwei Liu
Winston Hu
Jiwen Lu
Yongming Rao
ObjD
218
72
0
19 Sep 2024
1