A Survey on Temporal Sentence Grounding in Videos

16 September 2021
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Zhi Wang
Wenwu Zhu

Papers citing "A Survey on Temporal Sentence Grounding in Videos"

26 / 26 papers shown
Affordance-First Decomposition for Continual Learning in Video-Language Understanding
Mengzhu Xu
Hanzhi Liu
Ningkang Peng
Qianyu Chen
Canran Xiao
CLL
223
2
0
30 Nov 2025
A Survey on Video Temporal Grounding with Multimodal Large Language Model
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Yue Yu
Wei Liu
Y. Liu
Meng-yang Liu
Liqiang Nie
Zhouchen Lin
C. Chen
AI4TS, VLM, LRM
174
13
0
07 Aug 2025
SD-VSum: A Method and Dataset for Script-Driven Video Summarization
Manolis Mylonas
Evlampios Apostolidis
Vasileios Mezaris
452
3
0
06 May 2025
Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization
Zhuo Tao
Liang Li
Qi Chen
Yunbin Tu
Zheng-Jun Zha
Ming-Hsuan Yang
Yuankai Qi
Qingming Huang
271
0
0
22 Mar 2025
Towards Visual Grounding: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
1.1K
41
0
28 Dec 2024
Learning to Unify Audio, Visual and Text for Audio-Enhanced Multilingual Visual Answer Localization
Zhibin Wen
Bin Li
256
3
0
05 Nov 2024
VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding
Neural Information Processing Systems (NeurIPS), 2024
Houlun Chen
Xin Wang
Hong Chen
Zeyang Zhang
Wei Feng
Bin Huang
Jia Jia
Wenwu Zhu
VGen
381
10
0
11 Oct 2024
UAL-Bench: The First Comprehensive Unusual Activity Localization Benchmark
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Hasnat Md Abdullah
Tian Liu
Kangda Wei
Shu Kong
Ruihong Huang
297
5
0
02 Oct 2024
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
Daizong Liu
Yang Liu
Wencan Huang
Wei Hu
LM&Ro
419
35
0
09 Jun 2024
Video sentence grounding with temporally global textual knowledge
Cai Chen
Runzhong Zhang
Jianjun Gao
Kejun Wu
Kim-Hui Yap
Yi Wang
320
1
0
21 Apr 2024
Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos
Houlun Chen
Xin Wang
Hong Chen
Zihan Song
Jia Jia
Wenwu Zhu
LRM
271
19
0
28 Dec 2023
LLM4VG: Large Language Models Evaluation for Video Grounding
Wei Feng
Xin Wang
Hong Chen
Zeyang Zhang
Zihan Song
Yuwei Zhou
Wenwu Zhu
437
11
0
21 Dec 2023
Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval
Love Panta
Prashant Shrestha
Brabeem Sapkota
Amrita Bhattarai
Suresh Manandhar
Anand Kumar Sah
315
7
0
12 Dec 2023
Dense Video Captioning: A Survey of Techniques, Datasets and Evaluation Protocols
ACM Computing Surveys (ACM Comput. Surv.), 2023
Iqra Qasim
Alexander Horsch
Dilip K. Prasad
286
20
0
05 Nov 2023
Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
Computer Vision and Pattern Recognition (CVPR), 2023
Tongtong Yuan
Xuange Zhang
Kun Liu
Bo Liu
Chen Chen
Jian Jin
Zhenzhen Jiao
AI4TS
379
52
0
25 Sep 2023
Temporal Sentence Grounding in Streaming Videos
ACM Multimedia (ACM MM), 2023
Tian Gan
Xiao Wang
Yan Sun
Yue Yu
Qingpei Guo
Liqiang Nie
300
9
0
14 Aug 2023
Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization
ACM Multimedia (ACM MM), 2023
Zezhong Lv
Fuchun Sun
Ji-Rong Wen
300
23
0
10 Aug 2023
Transform-Equivariant Consistency Learning for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Zichuan Xu
Yining Qi
Xing Di
Weining Lu
Yu Cheng
327
12
0
06 May 2023
Text-Visual Prompting for Efficient 2D Temporal Video Grounding
Computer Vision and Pattern Recognition (CVPR), 2023
Yimeng Zhang
Xin Chen
Jinghan Jia
Sijia Liu
Ke Ding
388
33
0
09 Mar 2023
A Simple Transformer-Based Model for Ego4D Natural Language Queries Challenge
Sicheng Mo
Fangzhou Mu
Yin Li
165
8
0
16 Nov 2022
FedVMR: A New Federated Learning method for Video Moment Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yan Wang
Xin Luo
Zhen-Duo Chen
P. Zhang
Meng Liu
Xin-Shun Xu
FedML
225
3
0
28 Oct 2022
Learning to Locate Visual Answer in Video Corpus Using Question
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Bin Li
Yixuan Weng
Bin Sun
Shutao Li
474
8
0
11 Oct 2022
Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Bin Li
Yixuan Weng
Bin Sun
Shutao Li
811
67
0
13 Mar 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
3DGS
470
57
0
20 Jan 2022
MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
Mattia Soldan
Alejandro Pardo
Juan Carlos León Alcázar
Fabian Caba Heilbron
Chen Zhao
Silvio Giancola
Guohao Li
VGen
458
133
0
01 Dec 2021
Fine-grained Iterative Attention Network for Temporal Language Localization in Videos
Xiaoye Qu
Peng Tang
Zhikang Zhou
Yu Cheng
Jianfeng Dong
Pan Zhou
333
102
0
06 Aug 2020