2409.18938
Cited By
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding
27 September 2024
Heqing Zou
Tianze Luo
Guiyang Xie
Victor Zhang
Fengmao Lv
Guangcong Wang
Juanyang Chen
Zhuochen Wang
Hansheng Zhang
Huaijian Zhang
VLM
Papers citing "From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding"
2 / 2 papers shown
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes
S. Linok
Vadim Semenov
Anastasia Trunova
Oleg Bulichev
Dmitry A. Yudin
37
0
0
06 May 2025
ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding
Xiao Wang
Qingyi Si
Jianlong Wu
Shiyu Zhu
Li Cao
Liqiang Nie
VLM
67
6
0
29 Dec 2024