2412.10360
Apollo: An Exploration of Video Understanding in Large Multimodal Models
13 December 2024
Orr Zohar
Xiaohan Wang
Yann Dubois
Nikhil Mehta
Tong Xiao
Philippe Hansen-Estruch
Licheng Yu
Xiaofang Wang
F. Xu
Ning Zhang
Serena Yeung-Levy
Xide Xia
VLM
Papers citing "Apollo: An Exploration of Video Understanding in Large Multimodal Models"
7 / 7 papers shown
Title
StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant
Haibo Wang
Bo Feng
Zhengfeng Lai
Mingze Xu
Shiyu Li
Weifeng Ge
Afshin Dehghan
Meng Cao
Ping-Chia Huang
OffRL
36
3
0
08 May 2025
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
Shuhang Xun
Sicheng Tao
J. Li
Yibo Shi
Zhixin Lin
...
Shikang Wang
Y. Liu
H. Zhang
Ying Ma
Xuming Hu
VLM
LRM
32
0
0
04 May 2025
ActionArt: Advancing Multimodal Large Models for Fine-Grained Human-Centric Video Understanding
Yi-Xing Peng
Q. Yang
Yu-Ming Tang
Shenghao Fu
Kun-Yu Lin
Xihan Wei
Wei-Shi Zheng
40
0
0
25 Apr 2025
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark
Enxin Song
Wenhao Chai
Weili Xu
Jianwen Xie
Yuxuan Liu
Gaoang Wang
54
0
0
20 Apr 2025
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding
Xiangrui Liu
Yan Shu
Zheng Liu
Ao Li
Yang Tian
Bo Zhao
VGen
VLM
86
0
0
24 Mar 2025
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
Yi Wang
Xinhao Li
Ziang Yan
Yinan He
Jiashuo Yu
...
Kai Chen
Wenhai Wang
Yu Qiao
Yali Wang
Limin Wang
61
19
0
21 Jan 2025
Do Language Models Understand Time?
Xi Ding
Lei Wang
149
0
0
18 Dec 2024