arXiv:2501.12380
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2025
21 January 2025
Yilun Zhao
Lujing Xie
Haowei Zhang
Guo Gan
Yitao Long
Zhiyuan Hu
Tongyan Hu
Weiyuan Chen
Chuhan Li
Junyang Song
Zhihao Xu
Chengye Wang
Weifeng Pan
Ziyao Shangguan
Xiangru Tang
Zhenwen Liang
Yongxu Liu
Chen Zhao
Arman Cohan
HuggingFace (86 upvotes)
Papers citing
"MMVU: Measuring Expert-Level Multi-Discipline Video Understanding"
40 / 40 papers shown
Title
Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding
Pengfei Hu
Meng Cao
Y. Wang
Yi Wang
Jiahua Dong
Jun Song
Yu Cheng
Bo Zheng
Xiaodan Liang
LRM
VLM
97
0
0
30 Nov 2025
Video-CoM: Interactive Video Reasoning via Chain of Manipulations
H. Rasheed
Mohammed Zumri
Muhammad Maaz
Ming-Hsuan Yang
Fahad Shahbaz Khan
Salman Khan
AI4TS
LRM
113
0
0
28 Nov 2025
Boosting Reasoning in Large Multimodal Models via Activation Replay
Yun Xing
Xiaobin Hu
Qingdong He
Jiangning Zhang
Shuicheng Yan
Shijian Lu
Yu-Gang Jiang
OffRL
LRM
180
1
0
25 Nov 2025
MASS: Motion-Aware Spatial-Temporal Grounding for Physics Reasoning and Comprehension in Vision-Language Models
Xiyang Wu
Zongxia Li
Jihui Jin
Guangyao Shi
Gouthaman KV
Vishnu Raj
Nilotpal Sinha
Jingxi Chen
Fan Du
Dinesh Manocha
108
0
0
23 Nov 2025
CrossVid: A Comprehensive Benchmark for Evaluating Cross-Video Reasoning in Multimodal Large Language Models
Jingyao Li
Jingyun Wang
Molin Tan
Haochen Wang
Cilin Yan
Likun Shi
Jiayin Cai
Xiaolong Jiang
Yao Hu
VLM
LRM
136
0
0
15 Nov 2025
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark
Ziyu Guo
Xinyan Chen
Renrui Zhang
Ruichuan An
Yu Qi
Dongzhi Jiang
Xiangtai Li
M. Zhang
Jiaming Song
Pheng-Ann Heng
VGen
LRM
140
9
0
30 Oct 2025
Video Reasoning without Training
Deepak Sridhar
K. Bhardwaj
Jeya Pradha Jeyaraj
Nuno Vasconcelos
Ankita Nayak
Harris Teague
OffRL
LRM
180
1
0
19 Oct 2025
ExpVid: A Benchmark for Experiment Video Understanding & Reasoning
Yicheng Xu
Y. Wu
Jiashuo Yu
Ziang Yan
Tianxiang Jiang
...
Kai Chen
Yu Qiao
Limin Wang
Manabu Okumura
Y. Wang
LRM
112
1
0
13 Oct 2025
Answer-Consistent Chain-of-thought Reinforcement Learning For Multi-modal Large Langauge Models
Minbin Huang
Runhui Huang
Chuanyang Zheng
Jingyao Li
Guoxuan Chen
Han Shi
Hong Cheng
KELM
LRM
80
0
0
11 Oct 2025
MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding
Peiran Wu
Zhuorui Yu
Yunze Liu
Chi-Hao Wu
Enmin Zhou
Junxiao Shen
OffRL
VLM
80
1
0
09 Oct 2025
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning
M. Luo
Zihui Xue
Alex Dimakis
Kristen Grauman
VGen
LRM
232
4
0
07 Oct 2025
RefineShot: Rethinking Cinematography Understanding with Foundational Skill Evaluation
Hang Wu
Yujun Cai
Haonan Ge
H. Chen
Ming-Hsuan Yang
Yiwei Wang
CoGe
151
0
0
02 Oct 2025
NeMo: Needle in a Montage for Video-Language Understanding
Zi-Yuan Hu
Shuo Liang
Duo Zheng
Yanyang Li
Yeyao Tao
...
Jianguang Yu
Jing-ling Huang
Meng Fang
Yin Li
Liwei Wang
145
2
0
29 Sep 2025
ReWatch-R1: Boosting Complex Video Reasoning in Large Vision-Language Models through Agentic Data Synthesis
Congzhi Zhang
Zhibin Wang
Yinchao Ma
Jiawei Peng
Y. Wang
Qiang Zhou
Jun Song
Bo Zheng
OffRL
AI4TS
LRM
202
2
0
28 Sep 2025
HyCoVAD: A Hybrid SSL-LLM Model for Complex Video Anomaly Detection
Mohammad Mahdi Hemmatyar
Mahdi Jafari
Mohammad Amin Yousefi
Mohammad Reza Nemati
Mobin Azadani
Hamid Reza Rastad
Amirmohammad Akbari
172
0
0
26 Sep 2025
Kwai Keye-VL 1.5 Technical Report
Biao Yang
Bin Wen
Boyang Ding
Changyi Liu
Chenglong Chu
...
S. Wang
X. Luo
Yan Li
Yuhang Hu
Zixing Zhang
VLM
276
12
0
01 Sep 2025
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
Xiyao Wang
Chunyuan Li
Jianwei Yang
Kai Zhang
B. Liu
Tianyi Xiong
Furong Huang
OffRL
ReLM
LRM
103
6
0
31 Aug 2025
VideoRewardBench: Comprehensive Evaluation of Multimodal Reward Models for Video Understanding
Zhihong Zhang
Xiaojian Huang
Jin Xu
Zhuodong Luo
Xinzhi Wang
Jiansheng Wei
Xuejin Chen
VLM
104
0
0
30 Aug 2025
Droplet3D: Commonsense Priors from Videos Facilitate 3D Generation
Xiaochuan Li
Guoguang Du
Runze Zhang
Liang Jin
Qi Jia
...
Tianqi Wang
Changsheng Li
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
89
0
0
28 Aug 2025
Video-LevelGauge: Investigating Contextual Positional Bias in Large Video Language Models
Hou Xia
Zheren Fu
Fangcan Ling
Jiajun Li
Yi Tu
Zhendong Mao
Yongdong Zhang
153
0
0
27 Aug 2025
HumanPCR: Probing MLLM Capabilities in Diverse Human-Centric Scenes
Keliang Li
Hongze Shen
Hao Shi
Ruibing Hou
Hong Chang
...
Wen Wang
Yiling Wu
Shihong Deng
Shiguang Shan
Xilin Chen
LRM
148
1
0
19 Aug 2025
Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
H. Zhang
Xin Gu
Jiawen Li
Chixiang Ma
Sule Bai
Chubin Zhang
Bowen Zhang
Zhichao Zhou
Dongliang He
Yansong Tang
OffRL
LRM
157
24
0
06 Aug 2025
AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video
Yogesh Kulkarni
Pooyan Fazli
OffRL
LRM
228
4
0
05 Aug 2025
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
Shijie Zhou
Alexander Vilesov
Xuehai He
Ziyu Wan
Shuwang Zhang
Aditya Nagachandra
Di Chang
DongDong Chen
Xin Eric Wang
A. Kadambi
VLM
170
0
0
04 Aug 2025
CausalStep: A Benchmark for Explicit Stepwise Causal Reasoning in Videos
Xuchen Li
Xuzhao Li
Shiyu Hu
Kaiqi Huang
Wentao Zhang
CML
ELM
LRM
241
3
0
22 Jul 2025
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning
Ziyang Wang
Jaehong Yoon
Shoubin Yu
Md. Mohaiminul Islam
Gedas Bertasius
Mohit Bansal
OffRL
LRM
199
5
0
09 Jul 2025
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
Yi Chen
Yuying Ge
Rui Wang
Yixiao Ge
Junhao Cheng
Mingyu Ding
Xihui Liu
OffRL
VLM
LRM
139
21
0
19 Jun 2025
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos
Jiashuo Yu
Y. Wu
Meng Chu
Zhifei Ren
Z. Huang
...
Conghui He
Yu Qiao
Yali Wang
Yi Wang
L. Wang
LRM
391
4
0
12 Jun 2025
DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO
Jinyoung Park
Jeehye Na
Jinyoung Kim
H. Kim
OffRL
308
18
0
09 Jun 2025
ExAct: A Video-Language Benchmark for Expert Action Analysis
Han Yi
Yulu Pan
Feihong He
Xinyu Liu
Benjamin Zhang
Oluwatumininu Oguntola
Gedas Bertasius
184
1
0
06 Jun 2025
VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking
Desen Meng
Rui Huang
Zhilin Dai
Xinhao Li
Yifan Xu
...
Z. Huang
Meng Zhang
L. Zhang
Lu Dong
Limin Wang
OffRL
VLM
LRM
238
12
0
02 Jun 2025
Reinforcing Video Reasoning with Focused Thinking
Jisheng Dang
Jingze Wu
T. Wang
Xuanhui Lin
Nannan Zhu
Hongbo Chen
Wei-Shi Zheng
Meng Wang
Tat-Seng Chua
OffRL
LRM
311
11
0
30 May 2025
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
Y. Liu
Kun Ouyang
Haoning Wu
Yi Liu
Lin Sui
Xinhao Li
Y. Zhong
Y. Charles
Xinyu Zhou
Xu Sun
VLM
LRM
251
4
0
29 May 2025
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
Xingjian Zhang
Siwei Wen
Wenjun Wu
Lei Huang
LRM
307
39
0
13 Apr 2025
VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning
Yukun Qi
Yiming Zhao
Y. Zeng
Xikun Bao
Wenjie Huang
Lin Yen-Chen
Zehui Chen
Jie Zhao
Zhongang Qi
Feng Zhao
LRM
263
17
0
10 Apr 2025
Kimi-VL Technical Report
Kimi Team
Angang Du
B. Yin
Bowei Xing
Bowen Qu
...
Longxiang Zhang
Zhe Chen
Zijia Zhao
Ziwei Chen
Zongyu Lin
MLLM
VLM
MoE
872
133
0
10 Apr 2025
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng
Kaixiong Gong
Yangqiu Song
Zonghao Guo
Yibing Wang
Tianshuo Peng
Jian Wu
Xiaoying Zhang
Benyou Wang
Xiangyu Yue
AI4TS
SyDa
LRM
513
217
0
27 Mar 2025
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models
Meng Cao
Pengfei Hu
Yuhang Han
J. Gu
Haoran Tang
...
Jun Song
Xiang Li
Bo Zheng
Ian Reid
Xiaodan Liang
156
7
0
24 Mar 2025
Neptune: The Long Orbit to Benchmarking Long Video Understanding
Arsha Nagrani
Ruotong Wang
Ramin Mehran
Rachel Hornung
N. B. Gundavarapu
...
Boqing Gong
Cordelia Schmid
Mikhail Sirotenko
Yukun Zhu
Tobias Weyand
389
15
0
12 Dec 2024
Aria: An Open Multimodal Native Mixture-of-Experts Model
Dongxu Li
Yudong Liu
Haoning Wu
Yue Wang
Zhiqi Shen
...
Lihuan Zhang
Hanshu Yan
Guoyin Wang
Bei Chen
Junnan Li
MoE
424
114
0
08 Oct 2024