
VidText: Towards Comprehensive Evaluation for Video Text Understanding
Papers citing "VidText: Towards Comprehensive Evaluation for Video Text Understanding"
Title |
---|
Aria: An Open Multimodal Native Mixture-of-Experts Model. Dongxu Li, Yudong Liu, Haoning Wu, Yue Wang, Zhiqi Shen, ..., Lihuan Zhang, Hanshu Yan, Guoyin Wang, Bei Chen, Junnan Li |