ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models

1 October 2024
Mengxue Qu
Xiaodong Chen
Wu Liu
Alicia Li
Yao Zhao

Papers citing "ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models"

23 / 23 papers shown
TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning
Tao Wu
Li Yang
Gen Zhan
Y. Zhang
Yiting Liao
Junlin Li
Deliang Fu
Li Zhang
Limin Wang
AI4TS
LRM
333
3
0
03 Dec 2025
Learning to Refuse: Refusal-Aware Reinforcement Fine-Tuning for Hard-Irrelevant Queries in Video Temporal Grounding
Jin-Seop Lee
SungJoon Lee
SeongJun Jung
Boyang Li
Jee-Hyong Lee
OOD
240
0
0
28 Nov 2025
VideoTG-R1: Boosting Video Temporal Grounding via Curriculum Reinforcement Learning on Reflected Boundary Annotations
Lu Dong
H. Zhang
Han Lin
Ziang Yan
Xiangyu Zeng
...
Yifei Huang
Yi Wang
Z. Ling
Limin Wang
Yali Wang
OffRL
193
1
0
27 Oct 2025
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs
Shraman Pramanick
E. Mavroudi
Yale Song
Rama Chellappa
Lorenzo Torresani
Triantafyllos Afouras
277
3
0
19 Oct 2025
SVAG-Bench: A Large-Scale Benchmark for Multi-Instance Spatio-temporal Video Action Grounding
Tanveer Hannan
Shuaicong Wu
Mark Weber
Antonio Terpin
Jindong Gu
Rajat Koner
Aljosa Osep
Laura Leal-Taixé
Thomas Seidl
349
1
0
14 Oct 2025
Video-in-the-Loop: Span-Grounded Long Video QA with Interleaved Reasoning
C. Wang
Donglin Bai
Yifan Yang
Xiao Jin
Anlan Zhang
...
Jingdong Sun
Chong Luo
Ting Cao
Lili Qiu
Suman Banerjee
333
1
0
05 Oct 2025
OVG-HQ: Online Video Grounding with Hybrid-modal Queries
Runhao Zeng
Jiaqi Mao
Minghao Lai
Minh Hieu Phan
Yanjie Dong
Wei Wang
Qi Chen
Xiping Hu
190
0
0
16 Aug 2025
Empowering Multimodal LLMs with External Tools: A Comprehensive Survey
Wenbin An
Jiahao Nie
Yaqiang Wu
Feng Tian
Shijian Lu
Q. Zheng
MLLM
251
1
0
14 Aug 2025
A Survey on Video Temporal Grounding with Multimodal Large Language Model
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Yue Yu
Wei Liu
Y. Liu
Meng-yang Liu
Liqiang Nie
Zhouchen Lin
C. Chen
AI4TS
VLM
LRM
175
13
0
07 Aug 2025
DaMO: A Data-Efficient Multimodal Orchestrator for Temporal Reasoning with Video LLMs
Bo-Cheng Chiu
Jen-Jee Chen
Yu-Chee Tseng
Feng-Chi Chen
An-Zi Yen
409
0
0
13 Jun 2025
DisTime: Distribution-based Time Representation for Video Large Language Models
Yingsen Zeng
Zepeng Huang
Yujie Zhong
Chengjian Feng
Jie Hu
Lin Ma
Yang Liu
VGen
293
6
0
30 May 2025
MotionPro: A Precise Motion Controller for Image-to-Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Zhongwei Zhang
Fuchen Long
Zhaofan Qiu
Yingwei Pan
Wu Liu
Ting Yao
Tao Mei
DiffM
VGen
437
18
0
26 May 2025
Object-Shot Enhanced Grounding Network for Egocentric Video
Computer Vision and Pattern Recognition (CVPR), 2025
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
316
8
0
07 May 2025
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action
Jen-Hao Cheng
Vivian Wang
Huayu Wang
Huapeng Zhou
Yi-Hao Peng
...
Wenhao Chai
Yi-Ling Chen
Vibhav Vineet
Qin Cai
Lei Li
AI4TS
942
10
0
02 May 2025
Ask2Loc: Learning to Locate Instructional Visual Answers by Asking Questions
Chang Zong
Bin Li
Shoujun Zhou
Jian Wan
Lei Zhang
1.1K
1
0
22 Apr 2025
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation
Junyu Xie
Tengda Han
Max Bain
Arsha Nagrani
Eshika Khandelwal
Gül Varol
Weidi Xie
Andrew Zisserman
DiffM
VGen
489
6
0
01 Apr 2025
VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning
Wenshu Fan
Kevin Qinghong Lin
C. Chen
Mike Zheng Shou
LM&Ro
LRM
1.1K
41
0
17 Mar 2025
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
Yi Wang
Xinhao Li
Ziang Yan
Yinan He
Jiashuo Yu
...
Kai Chen
Wenhai Wang
Yu Qiao
Yali Wang
Limin Wang
687
154
0
21 Jan 2025
T-SVG: Text-Driven Stereoscopic Video Generation
Qiao Jin
Xiaodong Chen
Wu Liu
Tao Mei
Yongdong Zhang
DiffM
VGen
342
4
0
12 Dec 2024
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Joey Tianyi Zhou
Gedas Bertasius
David J. Crandall
606
10
0
12 Dec 2024
TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
Shimin Chen
Xiaohan Lan
Yitian Yuan
Zequn Jie
Lin Ma
VLM
MLLM
372
58
0
27 Nov 2024
ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos
Computer Vision and Pattern Recognition (CVPR), 2024
Tanveer Hannan
Md. Mohaiminul Islam
Jindong Gu
Thomas Seidl
Gedas Bertasius
VLM
268
10
0
22 Nov 2024
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
International Conference on Learning Representations (ICLR), 2024
Xiangyu Zeng
Kunchang Li
Chenting Wang
Xinhao Li
Tianxiang Jiang
...
Zhengrong Yue
Yi Wang
Yali Wang
Yu Qiao
Limin Wang
MLLM
VLM
AI4TS
357
81
0
25 Oct 2024