Move Forward and Tell: A Progressive Generator of Video Descriptions

26 July 2018

Papers citing "Move Forward and Tell: A Progressive Generator of Video Descriptions"

10 / 10 papers shown

Title
Text with Knowledge Graph Augmented Transformer for Video Captioning Xin Gu G. Chen Yufei Wang Libo Zhang Tiejian Luo Longyin Wen 19 47 0 22 Mar 2023
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning Antoine Yang Arsha Nagrani Paul Hongsuck Seo Antoine Miech Jordi Pont-Tuset Ivan Laptev Josef Sivic Cordelia Schmid AI4TS VLM 23 220 0 27 Feb 2023
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning Kashu Yamazaki Khoa T. Vo Sang Truong Bhiksha Raj Ngan Le 21 35 0 28 Nov 2022
DVCFlow: Modeling Information Flow Towards Human-like Video Captioning Xu Yan Zhengcong Fei Shuhui Wang Qingming Huang Qi Tian VGen 28 4 0 19 Nov 2021
End-to-End Dense Video Captioning with Parallel Decoding Teng Wang Ruimao Zhang Zhichao Lu Feng Zheng Ran Cheng Ping Luo 3DV 30 179 0 17 Aug 2021
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks Humam Alwassel Silvio Giancola Bernard Ghanem 28 123 0 23 Nov 2020
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning Simon Ging Mohammadreza Zolfaghari Hamed Pirsiavash Thomas Brox ViT CLIP 13 168 0 01 Nov 2020
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning Jie Lei Liwei Wang Yelong Shen Dong Yu Tamara L. Berg Mohit Bansal 14 186 0 11 May 2020
Recursive Visual Sound Separation Using Minus-Plus Net Xudong Xu Bo Dai Dahua Lin 13 91 0 30 Aug 2019
Grounded Video Description Luowei Zhou Yannis Kalantidis Xinlei Chen Jason J. Corso Marcus Rohrbach 19 190 0 17 Dec 2018