Localizing Moments in Video with Natural Language

4 August 2017

Papers citing "Localizing Moments in Video with Natural Language"

31 / 181 papers shown

Title
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning Simon Ging Mohammadreza Zolfaghari Hamed Pirsiavash Thomas Brox ViT CLIP 13 168 0 01 Nov 2020
Uncovering Hidden Challenges in Query-Based Video Moment Retrieval Mayu Otani Yuta Nakashima Esa Rahtu J. Heikkilä 13 74 0 01 Sep 2020
Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization Daizong Liu Xiaoye Qu Xiao-Yang Liu Jianfeng Dong Pan Zhou Zichuan Xu 26 129 0 04 Aug 2020
Enriching Video Captions With Contextual Text Philipp Rimle Pelin Dogan Markus Gross 22 3 0 29 Jul 2020
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos Shaoxiang Chen Wenhao Jiang Wei Liu Yu-Gang Jiang 23 101 0 28 Jul 2020
Condensed Movies: Story Based Retrieval with Contextual Embeddings Max Bain Arsha Nagrani A. Brown Andrew Zisserman 28 100 0 08 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training Linjie Li Yen-Chun Chen Yu Cheng Zhe Gan Licheng Yu Jingjing Liu MLLM VLM OffRL AI4TS 41 492 0 01 May 2020
Span-based Localizing Network for Natural Language Video Localization Hao Zhang Aixin Sun Wei Jing Joey Tianyi Zhou 15 311 0 29 Apr 2020
Local-Global Video-Text Interactions for Temporal Grounding Jonghwan Mun Minsu Cho Bohyung Han 20 267 0 16 Apr 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval Jie Lei Licheng Yu Tamara L. Berg Mohit Bansal 108 275 0 24 Jan 2020
Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video Jie Wu Guanbin Li Si Liu Liang Lin OffRL 18 104 0 18 Jan 2020
Action Modifiers: Learning from Adverbs in Instructional Videos Hazel Doughty Ivan Laptev W. Mayol-Cuevas Dima Damen 10 30 0 13 Dec 2019
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos Yitian Yuan Lin Ma Jingwen Wang Wei Liu Wenwu Zhu 16 242 0 31 Oct 2019
A Graph-Based Framework to Bridge Movies and Synopses Yu Xiong Chengyi Zhang Lingfeng Guo Hang Zhou Bolei Zhou Dahua Lin 25 60 0 24 Oct 2019
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning Rohit Girdhar Deva Ramanan 19 176 0 10 Oct 2019
Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels Daniel Y. Fu Will Crichton James Hong Xinwei Yao Haotian Zhang A. Truong A. Narayan Maneesh Agrawala Christopher Ré Kayvon Fatahalian 11 48 0 07 Oct 2019
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval Reuben Tan Huijuan Xu Kate Saenko Bryan A. Plummer 25 67 0 27 Sep 2019
Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention Cristian Rodriguez-Opazo Edison Marrese-Taylor F. Saleh Hongdong Li Stephen Gould 14 147 0 20 Aug 2019
Exploiting Temporal Relationships in Video Moment Localization with Natural Language Songyang Zhang Jinsong Su Jiebo Luo 12 74 0 11 Aug 2019
Use What You Have: Video Retrieval Using Representations From Collaborative Experts Yang Liu Samuel Albanie Arsha Nagrani Andrew Zisserman 28 387 0 31 Jul 2019
Localizing Unseen Activities in Video via Image Query Zhu Zhang Zhou Zhao Zhijie Lin Jingkuan Song Deng Cai ViT 16 13 0 28 Jun 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering Jie Lei Licheng Yu Tamara L. Berg Mohit Bansal 28 227 0 25 Apr 2019
VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research Xin Eric Wang Jiawei Wu Junkun Chen Lei Li Yuan-fang Wang William Yang Wang 15 540 0 06 Apr 2019
Weakly Supervised Video Moment Retrieval From Text Queries Niluthpol Chowdhury Mithun S. Paul A. Roy-Chowdhury 22 192 0 05 Apr 2019
TVQA: Localized, Compositional Video Question Answering Muhammad Abdul Wahab Licheng Yu Mounir Nasr Allah Tamara L. Berg 23 616 0 05 Sep 2018
Attentive Sequence to Sequence Translation for Localizing Clips of Interest by Natural Language Descriptions Ke Ning Linchao Zhu Ming Cai Yi Yang Di Xie Fei Wu 13 2 0 27 Aug 2018
Video Re-localization Yang Feng Lin Ma W. Liu Tong Zhang Jiebo Luo 13 71 0 05 Aug 2018
Motion-Appearance Co-Memory Networks for Video Question Answering J. Gao Runzhou Ge Kan Chen Ram Nevatia 27 240 0 29 Mar 2018
Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment Li Ding Chenliang Xu 19 180 0 28 Mar 2018
Actor and Action Video Segmentation from a Sentence Kirill Gavrilyuk Amir Ghodrati Zhenyang Li Cees G. M. Snoek VLM 17 146 0 20 Mar 2018
Query-Focused Extractive Video Summarization Aidean Sharghi Boqing Gong M. Shah 60 121 0 18 Jul 2016