arXiv: 2506.01725
VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking

2 June 2025
Desen Meng
Rui Huang
Zhilin Dai
Xinhao Li
Yifan Xu
Jun Zhang
Z. Huang
Meng Zhang
L. Zhang
Lu Dong
Limin Wang
OffRL, VLM, LRM

Papers citing "VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking"

9 / 9 papers shown
A Reason-then-Describe Instruction Interpreter for Controllable Video Generation
Shengqiong Wu
Weicai Ye
Y. Zhang
Jiahao Wang
Quande Liu
Xintao Wang
Pengfei Wan
Kun Gai
Hao Fei
Tat-Seng Chua
VGen, LRM
184
0
0
25 Nov 2025
VDC-Agent: When Video Detailed Captioners Evolve Themselves via Agentic Self-Reflection
Qiang Wang
Xinyuan Gao
Songlin Dong
Jizhou Han
Jiangyang Li
Yuhang He
Yihong Gong
VGen
158
1
0
24 Nov 2025
DynaStride: Dynamic Stride Windowing with MMCoT for Instructional Multi-Scene Captioning
Eddison Pham
Prisha Priyadarshini
Adrian Maliackel
Kanishk Bandi
Cristian Meo
Kevin Zhu
149
0
0
27 Oct 2025
AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration
Xinlong Chen
Yue Ding
Weihong Lin
Jingyun Hua
Linli Yao
...
Yuanxing Zhang
Qiang Liu
Pengfei Wan
Liang Wang
Tieniu Tan
252
2
0
12 Oct 2025
OwlCap: Harmonizing Motion-Detail for Video Captioning via HMD-270K and Caption Set Equivalence Reward
Chunlin Zhong
Qiuxia Hou
Zhangjun Zhou
Shuang Hao
Haonan Lu
Yanhao Zhang
He Tang
Xiang Bai
VGen
173
2
0
26 Aug 2025
Empowering Multimodal LLMs with External Tools: A Comprehensive Survey
Wenbin An
Jiahao Nie
Yaqiang Wu
Feng Tian
Shijian Lu
Q. Zheng
MLLM
181
1
0
14 Aug 2025
AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video
Yogesh Kulkarni
Pooyan Fazli
OffRL, LRM
280
4
0
05 Aug 2025
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks
Peiran Wu
Yunze Liu
Zhengdong Zhu
Enmin Zhou
Junxiao Shen
210
2
0
15 Jul 2025
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning
Ziyang Wang
Jaehong Yoon
Shoubin Yu
Md. Mohaiminul Islam
Gedas Bertasius
Mohit Bansal
OffRL, LRM
258
5
0
09 Jul 2025