2507.07966
Cited By
v4 (latest)
Scaling RL to Long Videos
10 July 2025
Yukang Chen
Wei Huang
Baifeng Shi
Qinghao Hu
Hanrong Ye
Ligeng Zhu
Zhijian Liu
Pavlo Molchanov
Jan Kautz
Xiaojuan Qi
Sifei Liu
Hongxu Yin
Yao Lu
Song Han
OffRL
AI4TS
VLM
LRM
HuggingFace (129 upvotes)
Github (625★)
Papers citing "Scaling RL to Long Videos"
22 / 22 papers shown
Title
Reinforcement Learning for Large Model: A Survey
Weijia Wu
Chen Gao
Joya Chen
Kevin Lin
Qingwei Meng
Yiming Zhang
Yuke Qiu
Hong Zhou
Mike Zheng Shou
273
2
0
24 Dec 2025
Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding
Pengfei Hu
Meng Cao
Y. Wang
Yi Wang
Jiahua Dong
Jun Song
Yu Cheng
Bo Zheng
Xiaodan Liang
LRM
VLM
113
0
0
30 Nov 2025
Boosting Reasoning in Large Multimodal Models via Activation Replay
Yun Xing
Xiaobin Hu
Qingdong He
Jiangning Zhang
Shuicheng Yan
Shijian Lu
Yu-Gang Jiang
OffRL
LRM
200
1
0
25 Nov 2025
VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning
Boyu Chen
Zikang Wang
Zhengrong Yue
Kainan Yan
Chenyun Yu
...
Yafei Wen
Xiaoxin Chen
Yang Liu
Peng Li
Yali Wang
LLMAG
292
3
0
24 Nov 2025
REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding
Jiaze Li
Hao Yin
Wenhui Tan
Jingyang Chen
Boshen Xu
Yuxun Qu
Yijing Chen
Jianzhong Ju
Zhenbo Luo
Jian Luan
LRM
VLM
222
1
0
17 Nov 2025
Cambrian-S: Towards Spatial Supersensing in Video
Shusheng Yang
J. Yang
Pinzhi Huang
Ellis L Brown
Zihao Yang
...
Daohan Lu
Rob Fergus
Yann LeCun
Li Fei-Fei
Saining Xie
160
12
0
06 Nov 2025
Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence
Kun Ouyang
Yuanxin Liu
Linli Yao
Yishuo Cai
Hao Zhou
Jie Zhou
Fandong Meng
Xu Sun
OffRL
LRM
ReLM
351
1
0
23 Oct 2025
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence
Jiahao Meng
X. Li
Haochen Wang
Yue Tan
Tao Zhang
...
Yunhai Tong
Anran Wang
Zhiyang Teng
Y. Wang
Z. Wang
VGen
LRM
308
6
0
23 Oct 2025
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference
Samir Khaki
Junxian Guo
Jiaming Tang
Shang Yang
Yukang Chen
Konstantinos N. Plataniotis
Yao Lu
Song Han
Zhijian Liu
MLLM
VLM
161
1
0
20 Oct 2025
Video Reasoning without Training
Deepak Sridhar
K. Bhardwaj
Jeya Pradha Jeyaraj
Nuno Vasconcelos
Ankita Nayak
Harris Teague
OffRL
LRM
180
1
0
19 Oct 2025
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Hanrong Ye
Chao-Han Huck Yang
Arushi Goel
Wei Huang
Ligeng Zhu
...
Andrew Tao
Song Han
Jan Kautz
Hongxu Yin
Pavlo Molchanov
170
3
0
17 Oct 2025
Select Less, Reason More: Prioritizing Evidence Purity for Video Reasoning
Xuchen Li
Xuzhao Li
Shiyu Hu
Kaiqi Huang
80
0
0
17 Oct 2025
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
Wei Huang
Y. Ge
S. Yang
Yicheng Xiao
Huizi Mao
...
Hongxu Yin
Yao Lu
Xiaojuan Qi
Song Han
Yukang Chen
OffRL
102
0
0
13 Oct 2025
A Survey on Agentic Multimodal Large Language Models
Huanjin Yao
Ruifei Zhang
Jiaxing Huang
Jingyi Zhang
Yibo Wang
...
Ruolin Zhu
Yongcheng Jing
Shunyu Liu
Guanbin Li
Dacheng Tao
LM&Ro
AIFin
AI4TS
LRM
AI4CE
225
4
0
13 Oct 2025
AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration
Xinlong Chen
Yue Ding
Weihong Lin
Jingyun Hua
Linli Yao
...
Yuanxing Zhang
Qiang Liu
Pengfei Wan
Liang Wang
Tieniu Tan
237
2
0
12 Oct 2025
Video-STAR: Reinforcing Open-Vocabulary Action Recognition with Tools
Zhenlong Yuan
Xiangyan Qu
Chengxuan Qian
Rui Chen
Jing Tang
...
Xiangxiang Chu
Dapeng Zhang
Yiwei Wang
Y. Cai
Shuo Li
VLM
LRM
132
8
0
09 Oct 2025
FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting
Zefeng He
Xiaoye Qu
Yafu Li
Siyuan Huang
Daizong Liu
Yu Cheng
OffRL
VLM
LRM
271
7
0
29 Sep 2025
LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning
Shenghao Fu
Q. Yang
Yuan-Ming Li
Xihan Wei
Xiaohua Xie
Wei-Shi Zheng
LRM
148
5
0
29 Sep 2025
ReWatch-R1: Boosting Complex Video Reasoning in Large Vision-Language Models through Agentic Data Synthesis
Congzhi Zhang
Zhibin Wang
Yinchao Ma
Jiawei Peng
Y. Wang
Qiang Zhou
Jun Song
Bo Zheng
OffRL
AI4TS
LRM
210
2
0
28 Sep 2025
TAR-TVG: Enhancing VLMs with Timestamp Anchor-Constrained Reasoning for Temporal Video Grounding
Chaohong Guo
Xun Mo
Yongwei Nie
Xuemiao Xu
Chao Xu
Fei Richard Yu
Chengjiang Long
LRM
200
3
0
11 Aug 2025
Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
H. Zhang
Xin Gu
Jiawen Li
Chixiang Ma
Sule Bai
Chubin Zhang
Bowen Zhang
Zhichao Zhou
Dongliang He
Yansong Tang
OffRL
LRM
169
24
0
06 Aug 2025
AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video
Yogesh Kulkarni
Pooyan Fazli
OffRL
LRM
228
4
0
05 Aug 2025