MMVU: Measuring Expert-Level Multi-Discipline Video Understanding


Computer Vision and Pattern Recognition (CVPR), 2025
21 January 2025
Yilun Zhao
Lujing Xie
Haowei Zhang
Guo Gan
Yitao Long
Zhiyuan Hu
Tongyan Hu
Weiyuan Chen
Chuhan Li
Junyang Song
Zhihao Xu
Chengye Wang
Weifeng Pan
Ziyao Shangguan
Xiangru Tang
Zhenwen Liang
Yongxu Liu
Chen Zhao
Arman Cohan

Papers citing "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding"

40 / 40 papers shown
Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding
Pengfei Hu
Meng Cao
Y. Wang
Yi Wang
Jiahua Dong
Jun Song
Yu Cheng
Bo Zheng
Xiaodan Liang
LRM, VLM
97
0
0
30 Nov 2025
Video-CoM: Interactive Video Reasoning via Chain of Manipulations
H. Rasheed
Mohammed Zumri
Muhammad Maaz
Ming-Hsuan Yang
Fahad Shahbaz Khan
Salman Khan
AI4TS, LRM
113
0
0
28 Nov 2025
Boosting Reasoning in Large Multimodal Models via Activation Replay
Yun Xing
Xiaobin Hu
Qingdong He
Jiangning Zhang
Shuicheng Yan
Shijian Lu
Yu-Gang Jiang
OffRL, LRM
180
1
0
25 Nov 2025
MASS: Motion-Aware Spatial-Temporal Grounding for Physics Reasoning and Comprehension in Vision-Language Models
Xiyang Wu
Zongxia Li
Jihui Jin
Guangyao Shi
Gouthaman KV
Vishnu Raj
Nilotpal Sinha
Jingxi Chen
Fan Du
Dinesh Manocha
108
0
0
23 Nov 2025
CrossVid: A Comprehensive Benchmark for Evaluating Cross-Video Reasoning in Multimodal Large Language Models
Jingyao Li
Jingyun Wang
Molin Tan
Haochen Wang
Cilin Yan
Likun Shi
Jiayin Cai
Xiaolong Jiang
Yao Hu
VLM, LRM
136
0
0
15 Nov 2025
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark
Ziyu Guo
Xinyan Chen
Renrui Zhang
Ruichuan An
Yu Qi
Dongzhi Jiang
Xiangtai Li
M. Zhang
Jiaming Song
Pheng-Ann Heng
VGen, LRM
140
9
0
30 Oct 2025
Video Reasoning without Training
Deepak Sridhar
K. Bhardwaj
Jeya Pradha Jeyaraj
Nuno Vasconcelos
Ankita Nayak
Harris Teague
OffRL, LRM
180
1
0
19 Oct 2025
ExpVid: A Benchmark for Experiment Video Understanding & Reasoning
Yicheng Xu
Y. Wu
Jiashuo Yu
Ziang Yan
Tianxiang Jiang
...
Kai Chen
Yu Qiao
Limin Wang
Manabu Okumura
Y. Wang
LRM
112
1
0
13 Oct 2025
Answer-Consistent Chain-of-thought Reinforcement Learning For Multi-modal Large Langauge Models
Minbin Huang
Runhui Huang
Chuanyang Zheng
Jingyao Li
Guoxuan Chen
Han Shi
Hong Cheng
KELM, LRM
80
0
0
11 Oct 2025
MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding
Peiran Wu
Zhuorui Yu
Yunze Liu
Chi-Hao Wu
Enmin Zhou
Junxiao Shen
OffRL, VLM
80
1
0
09 Oct 2025
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning
M. Luo
Zihui Xue
Alex Dimakis
Kristen Grauman
VGen, LRM
232
4
0
07 Oct 2025
RefineShot: Rethinking Cinematography Understanding with Foundational Skill Evaluation
Hang Wu
Yujun Cai
Haonan Ge
H. Chen
Ming-Hsuan Yang
Yiwei Wang
CoGe
151
0
0
02 Oct 2025
NeMo: Needle in a Montage for Video-Language Understanding
Zi-Yuan Hu
Shuo Liang
Duo Zheng
Yanyang Li
Yeyao Tao
...
Jianguang Yu
Jing-ling Huang
Meng Fang
Yin Li
Liwei Wang
145
2
0
29 Sep 2025
ReWatch-R1: Boosting Complex Video Reasoning in Large Vision-Language Models through Agentic Data Synthesis
Congzhi Zhang
Zhibin Wang
Yinchao Ma
Jiawei Peng
Y. Wang
Qiang Zhou
Jun Song
Bo Zheng
OffRL, AI4TS, LRM
202
2
0
28 Sep 2025
HyCoVAD: A Hybrid SSL-LLM Model for Complex Video Anomaly Detection
Mohammad Mahdi Hemmatyar
Mahdi Jafari
Mohammad Amin Yousefi
Mohammad Reza Nemati
Mobin Azadani
Hamid Reza Rastad
Amirmohammad Akbari
172
0
0
26 Sep 2025
Kwai Keye-VL 1.5 Technical Report
Biao Yang
Bin Wen
Boyang Ding
Changyi Liu
Chenglong Chu
...
S. Wang
X. Luo
Yan Li
Yuhang Hu
Zixing Zhang
VLM
276
12
0
01 Sep 2025
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
Xiyao Wang
Chunyuan Li
Jianwei Yang
Kai Zhang
B. Liu
Tianyi Xiong
Furong Huang
OffRL, ReLM, LRM
103
6
0
31 Aug 2025
VideoRewardBench: Comprehensive Evaluation of Multimodal Reward Models for Video Understanding
Zhihong Zhang
Xiaojian Huang
Jin Xu
Zhuodong Luo
Xinzhi Wang
Jiansheng Wei
Xuejin Chen
VLM
104
0
0
30 Aug 2025
Droplet3D: Commonsense Priors from Videos Facilitate 3D Generation
Xiaochuan Li
Guoguang Du
Runze Zhang
Liang Jin
Qi Jia
...
Tianqi Wang
Changsheng Li
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
89
0
0
28 Aug 2025
Video-LevelGauge: Investigating Contextual Positional Bias in Large Video Language Models
Hou Xia
Zheren Fu
Fangcan Ling
Jiajun Li
Yi Tu
Zhendong Mao
Yongdong Zhang
153
0
0
27 Aug 2025
HumanPCR: Probing MLLM Capabilities in Diverse Human-Centric Scenes
Keliang Li
Hongze Shen
Hao Shi
Ruibing Hou
Hong Chang
...
Wen Wang
Yiling Wu
Shihong Deng
Shiguang Shan
Xilin Chen
LRM
148
1
0
19 Aug 2025
Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
H. Zhang
Xin Gu
Jiawen Li
Chixiang Ma
Sule Bai
Chubin Zhang
Bowen Zhang
Zhichao Zhou
Dongliang He
Yansong Tang
OffRL, LRM
157
24
0
06 Aug 2025
AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video
Yogesh Kulkarni
Pooyan Fazli
OffRL, LRM
228
4
0
05 Aug 2025
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
Shijie Zhou
Alexander Vilesov
Xuehai He
Ziyu Wan
Shuwang Zhang
Aditya Nagachandra
Di Chang
DongDong Chen
Xin Eric Wang
A. Kadambi
VLM
170
0
0
04 Aug 2025
CausalStep: A Benchmark for Explicit Stepwise Causal Reasoning in Videos
Xuchen Li
Xuzhao Li
Shiyu Hu
Kaiqi Huang
Wentao Zhang
CML, ELM, LRM
241
3
0
22 Jul 2025
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning
Ziyang Wang
Jaehong Yoon
Shoubin Yu
Md. Mohaiminul Islam
Gedas Bertasius
Mohit Bansal
OffRL, LRM
199
5
0
09 Jul 2025
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
Yi Chen
Yuying Ge
Rui Wang
Yixiao Ge
Junhao Cheng
Mingyu Ding
Xihui Liu
OffRL, VLM, LRM
139
21
0
19 Jun 2025
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos
Jiashuo Yu
Y. Wu
Meng Chu
Zhifei Ren
Z. Huang
...
Conghui He
Yu Qiao
Yali Wang
Yi Wang
L. Wang
LRM
391
4
0
12 Jun 2025
DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO
Jinyoung Park
Jeehye Na
Jinyoung Kim
H. Kim
OffRL
308
18
0
09 Jun 2025
ExAct: A Video-Language Benchmark for Expert Action Analysis
Han Yi
Yulu Pan
Feihong He
Xinyu Liu
Benjamin Zhang
Oluwatumininu Oguntola
Gedas Bertasius
184
1
0
06 Jun 2025
VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking
Desen Meng
Rui Huang
Zhilin Dai
Xinhao Li
Yifan Xu
...
Z. Huang
Meng Zhang
L. Zhang
Lu Dong
Limin Wang
OffRL, VLM, LRM
238
12
0
02 Jun 2025
Reinforcing Video Reasoning with Focused Thinking
Jisheng Dang
Jingze Wu
T. Wang
Xuanhui Lin
Nannan Zhu
Hongbo Chen
Wei-Shi Zheng
Meng Wang
Tat-Seng Chua
OffRL, LRM
311
11
0
30 May 2025
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
Y. Liu
Kun Ouyang
Haoning Wu
Yi Liu
Lin Sui
Xinhao Li
Y. Zhong
Y. Charles
Xinyu Zhou
Xu Sun
VLM, LRM
251
4
0
29 May 2025
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
Xingjian Zhang
Siwei Wen
Wenjun Wu
Lei Huang
LRM
307
39
0
13 Apr 2025
VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning
Yukun Qi
Yiming Zhao
Y. Zeng
Xikun Bao
Wenjie Huang
Lin Yen-Chen
Zehui Chen
Jie Zhao
Zhongang Qi
Feng Zhao
LRM
263
17
0
10 Apr 2025
Kimi-VL Technical Report
Kimi Team
Angang Du
B. Yin
Bowei Xing
Bowen Qu
...
Longxiang Zhang
Zhe Chen
Zijia Zhao
Ziwei Chen
Zongyu Lin
MLLM, VLM, MoE
872
133
0
10 Apr 2025
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng
Kaixiong Gong
Yangqiu Song
Zonghao Guo
Yibing Wang
Tianshuo Peng
Jian Wu
Xiaoying Zhang
Benyou Wang
Xiangyu Yue
AI4TS, SyDa, LRM
513
217
0
27 Mar 2025
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models
Meng Cao
Pengfei Hu
Yuhang Han
J. Gu
Haoran Tang
...
Jun Song
Xiang Li
Bo Zheng
Ian Reid
Xiaodan Liang
156
7
0
24 Mar 2025
Neptune: The Long Orbit to Benchmarking Long Video Understanding
Arsha Nagrani
Ruotong Wang
Ramin Mehran
Rachel Hornung
N. B. Gundavarapu
...
Boqing Gong
Cordelia Schmid
Mikhail Sirotenko
Yukun Zhu
Tobias Weyand
389
15
0
12 Dec 2024
Aria: An Open Multimodal Native Mixture-of-Experts Model
Dongxu Li
Yudong Liu
Haoning Wu
Yue Wang
Zhiqi Shen
...
Lihuan Zhang
Hanshu Yan
Guoyin Wang
Bei Chen
Junnan Li
MoE
424
114
0
08 Oct 2024