Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning

6 August 2025

arXiv: 2508.04416

Papers citing "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"

15 / 15 papers shown

Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding

137

0

0

30 Nov 2025

Video-CoM: Interactive Video Reasoning via Chain of Manipulations

Ming-Hsuan Yang

Fahad Shahbaz Khan

164

0

0

28 Nov 2025

Thinking With Bounding Boxes: Enhancing Spatio-Temporal Video Grounding via Reinforcement Fine-Tuning

327

1

0

26 Nov 2025

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

...

178

5

0

25 Nov 2025

VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning

...

324

3

0

24 Nov 2025

VideoPerceiver: Enhancing Fine-Grained Temporal Perception in Video Multimodal Large Language Models

Fufangchen Zhao

140

0

0

24 Nov 2025

Minimax Multi-Target Conformal Prediction with Applications to Imaging Inverse Problems

Philip Schniter

333

0

0

17 Nov 2025

ViPER: Empowering the Self-Evolution of Visual Perception Abilities in Vision-Language Model

...

212

3

0

28 Oct 2025

Select Less, Reason More: Prioritizing Evidence Purity for Video Reasoning

88

0

0

17 Oct 2025

Video-STAR: Reinforcing Open-Vocabulary Action Recognition with Tools

...

140

8

0

09 Oct 2025

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

...

MLLM OffRL VLM LRM

742

8

0

06 Oct 2025

TimeScope: Towards Task-Oriented Temporal Grounding In Long Videos

Zhengyang Liang

Chen Jason Zhang

319

0

0

30 Sep 2025

TAMA: Tool-Augmented Multimodal Agent for Procedural Activity Understanding

Kimihiro Hasegawa

Wiradee Imrattanatrai

Teruko Mitamura

144

0

0

30 Sep 2025

LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning

164

7

0

29 Sep 2025

ReWatch-R1: Boosting Complex Video Reasoning in Large Vision-Language Models through Agentic Data Synthesis

OffRL AI4TS LRM

230

2

0

28 Sep 2025