arXiv:1901.06829
Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos
21 January 2019
Dongliang He
Xiang Zhao
Jizhou Huang
Fu Li
Xiao-Chang Liu
Shilei Wen
Papers citing "Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos"
50 / 71 papers shown
Who Can We Trust? Scope-Aware Video Moment Retrieval with Multi-Agent Conflict
Chaochen Wu
Guan Luo
Meiyun Zuo
Zhitao Fan
8
0
0
01 Nov 2025
From Learning to Mastery: Achieving Safe and Efficient Real-World Autonomous Driving with Human-In-The-Loop Reinforcement Learning
Li Zeqiao
Wang Yijing
Wang Haoyu
Li Zheng
Li Peng
Liu Wenfei
Zuo Zhiqiang
56
0
0
07 Oct 2025
Dynamic-Aware Video Distillation: Optimizing Temporal Resolution Based on Video Semantics
Yinjie Zhao
Heng Zhao
Bihan Wen
Yew-Soon Ong
Joey Tianyi Zhou
VGen
91
1
0
28 May 2025
Cross-modal Causal Relation Alignment for Video Question Grounding
Computer Vision and Pattern Recognition (CVPR), 2025
Weixing Chen
Wenshu Fan
Binglin Chen
Jiandong Su
Yongsen Zheng
Liang Lin
BDL
VGen
CML
206
7
0
05 Mar 2025
UAL-Bench: The First Comprehensive Unusual Activity Localization Benchmark
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Hasnat Md Abdullah
Tian Liu
Kangda Wei
Shu Kong
Ruihong Huang
187
6
0
02 Oct 2024
LLM4VG: Large Language Models Evaluation for Video Grounding
Wei Feng
Xin Wang
Hong Chen
Zeyang Zhang
Zihan Song
Yuwei Zhou
Wenwu Zhu
210
10
0
21 Dec 2023
Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
Computer Vision and Pattern Recognition (CVPR), 2023
Tongtong Yuan
Xuange Zhang
Kun Liu
Bo Liu
Chen Chen
Jian Jin
Zhenzhen Jiao
AI4TS
177
32
0
25 Sep 2023
Dual-Path Temporal Map Optimization for Make-up Temporal Video Grounding
Multimedia Systems (MS), 2023
Jiaxiu Li
Kun Li
Jia Li
Guoliang Chen
Dan Guo
Meng Wang
124
3
0
12 Sep 2023
Temporal Sentence Grounding in Streaming Videos
ACM Multimedia (ACM MM), 2023
Tian Gan
Xiao Wang
Yan Sun
Yue Yu
Qingpei Guo
Liqiang Nie
134
9
0
14 Aug 2023
ViGT: Proposal-free Video Grounding with Learnable Token in Transformer
Science China Information Sciences (Sci China Inf Sci), 2023
Kun Li
Dan Guo
Meng Wang
ViT
124
54
0
11 Aug 2023
Look, Remember and Reason: Grounded reasoning in videos with language models
International Conference on Learning Representations (ICLR), 2023
Apratim Bhattacharyya
Sunny Panchal
Mingu Lee
Reza Pourreza
Pulkit Madan
Roland Memisevic
LRM
234
11
0
30 Jun 2023
A Survey on Video Moment Localization
ACM Computing Surveys (ACM CSUR), 2022
Meng Liu
Liqiang Nie
Yunxiao Wang
Meng Wang
Yong Rui
225
33
0
13 Jun 2023
Boundary-Denoising for Video Activity Localization
International Conference on Learning Representations (ICLR), 2023
Mengmeng Xu
Mattia Soldan
Jialin Gao
Shuming Liu
Juan-Manuel Perez-Rua
Guohao Li
136
13
0
06 Apr 2023
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
Computer Vision and Pattern Recognition (CVPR), 2023
WonJun Moon
Sangeek Hyun
S. Park
Dongchan Park
Jae-Pil Heo
ViT
184
167
0
24 Mar 2023
Towards Diverse Temporal Grounding under Single Positive Labels
Hao Zhou
Chongyang Zhang
Yanjun Chen
Chuanping Hu
92
2
0
12 Mar 2023
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Computer Vision and Pattern Recognition (CVPR), 2023
Dezhao Luo
Jiabo Huang
S. Gong
Hailin Jin
Yang Liu
VGen
219
41
0
28 Feb 2023
Interactive Video Corpus Moment Retrieval using Reinforcement Learning
ACM Multimedia (ACM MM), 2022
Zhixin Ma
Chong-Wah Ngo
112
5
0
19 Feb 2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal
E. Mavroudi
Xitong Yang
Sainbayar Sukhbaatar
Leonid Sigal
Matt Feiszli
Lorenzo Torresani
Du Tran
155
8
0
16 Feb 2023
Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Juncheng Li
Siliang Tang
Linchao Zhu
Wenqiao Zhang
Yi Yang
Tat-Seng Chua
Fei Wu
Yueting Zhuang
BDL
153
22
0
22 Jan 2023
MRTNet: Multi-Resolution Temporal Network for Video Sentence Grounding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Wei Ji
Long Chen
Yin-wei Wei
Yiming Wu
Tat-Seng Chua
AI4TS
109
22
0
26 Dec 2022
FedVMR: A New Federated Learning method for Video Moment Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yan Wang
Xin Luo
Zhen-Duo Chen
P. Zhang
Meng Liu
Xin-Shun Xu
FedML
100
3
0
28 Oct 2022
Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Yuechen Wang
Wen-gang Zhou
Houqiang Li
AI4TS
105
14
0
21 Oct 2022
Semantic Video Moments Retrieval at Scale: A New Task and a Baseline
Na Li
159
0
0
15 Oct 2022
See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction
Maria Attarian
Advaya Gupta
Ziyi Zhou
Wei Yu
Igor Gilitschenski
Animesh Garg
LM&Ro
119
8
0
07 Oct 2022
Video-Guided Curriculum Learning for Spoken Video Grounding
ACM Multimedia (ACM MM), 2022
Yan Xia
Zhou Zhao
Shangwei Ye
Yang Zhao
Haoyuan Li
Yi Ren
112
12
0
01 Sep 2022
One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning
Neurocomputing (Neurocomputing), 2022
Zhipeng Zhang
Zhimin Wei
Zhongzhen Huang
Rui Niu
Peng Wang
ObjD
LRM
164
9
0
31 Jul 2022
Position-aware Location Regression Network for Temporal Video Grounding
Advanced Video and Signal Based Surveillance (AVSS), 2021
Sunoh Kim
Kimin Yun
J. Choi
96
4
0
12 Apr 2022
Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding
Ziyue Wu
Junyu Gao
Shucheng Huang
Changsheng Xu
168
6
0
04 Apr 2022
TubeDETR: Spatio-Temporal Video Grounding with Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
236
112
0
30 Mar 2022
AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval
Computer Vision and Pattern Recognition (CVPR), 2022
Riku Togashi
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
T. Sakai
97
1
0
30 Mar 2022
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Long Chen
Zhi Wang
Lin Ma
Wenwu Zhu
CML
112
18
0
10 Mar 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
3DGS
259
47
0
20 Jan 2022
Towards Debiasing Temporal Sentence Grounding in Video
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
134
20
0
08 Nov 2021
Hierarchical Deep Residual Reasoning for Temporal Moment Localization
ACM Multimedia Asia (MA), 2021
Ziyang Ma
Xianjing Han
Xuemeng Song
Yiran Cui
Liqiang Nie
115
10
0
31 Oct 2021
Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos
IEEE Transactions on Image Processing (TIP), 2021
Zongmeng Zhang
Xianjing Han
Xuemeng Song
Yan Yan
Liqiang Nie
187
40
0
12 Oct 2021
Relation-aware Video Reading Comprehension for Temporal Language Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Jialin Gao
Xin Sun
Mengmeng Xu
Xi Zhou
Guohao Li
145
54
0
12 Oct 2021
Procedure Planning in Instructional Videos via Contextual Modeling and Model-based Policy Learning
Jing Bi
Jiebo Luo
Chenliang Xu
209
52
0
05 Oct 2021
End-to-End Dense Video Grounding via Parallel Regression
Computer Vision and Image Understanding (CVIU), 2021
Fengyuan Shi
Weilin Huang
Limin Wang
227
11
0
23 Sep 2021
Natural Language Video Localization with Learnable Moment Proposals
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Shaoning Xiao
Long Chen
Jian Shao
Yueting Zhuang
Jun Xiao
121
45
0
22 Sep 2021
A Survey on Temporal Sentence Grounding in Videos
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Zhi Wang
Wenwu Zhu
212
54
0
16 Sep 2021
On Pursuit of Designing Multi-modal Transformer for Video Grounding
Meng Cao
Long Chen
Mike Zheng Shou
Can Zhang
Yuexian Zou
176
90
0
13 Sep 2021
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zhenzhi Wang
Limin Wang
Tao Wu
Tianhao Li
Gangshan Wu
AI4TS
198
147
0
10 Sep 2021
EVOQUER: Enhancing Temporal Grounding with Video-Pivoted BackQuery Generation
Yanjun Gao
Lulu Liu
Jason Wang
Xin Chen
Huayan Wang
Rui Zhang
127
1
0
10 Sep 2021
Support-Set Based Cross-Supervision for Video Grounding
IEEE International Conference on Computer Vision (ICCV), 2021
Xinpeng Ding
N. Wang
Shiwei Zhang
De Cheng
Xiaomeng Li
Ziyuan Huang
Mingqian Tang
Xinbo Gao
104
44
0
24 Aug 2021
End-to-end Multi-modal Video Temporal Grounding
Yi-Wen Chen
Yi-Hsuan Tsai
Ming-Hsuan Yang
139
57
0
12 Jul 2021
Weakly Supervised Temporal Adjacent Network for Language Grounding
IEEE transactions on multimedia (IEEE Trans. Multimedia), 2021
Yuechen Wang
Jiajun Deng
Wen-gang Zhou
Houqiang Li
148
77
0
30 Jun 2021
Interventional Video Grounding with Dual Contrastive Learning
Computer Vision and Pattern Recognition (CVPR), 2021
Guoshun Nan
Rui Qiao
Yao Xiao
Jun Liu
Sicong Leng
H. Zhang
Wei Lu
170
157
0
21 Jun 2021
Parallel Attention Network with Sequence Matching for Video Grounding
Findings (Findings), 2021
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
175
48
0
18 May 2021
Video Corpus Moment Retrieval with Contrastive Learning
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021
Hao Zhang
Aixin Sun
Wei Jing
Guoshun Nan
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
191
98
0
13 May 2021
Aligning Subtitles in Sign Language Videos
IEEE International Conference on Computer Vision (ICCV), 2021
Hannah Bull
Triantafyllos Afouras
Gül Varol
Samuel Albanie
Liliane Momeni
Andrew Zisserman
SLR
79
33
0
06 May 2021