Delving Deeper into the Decoder for Video Captioning

16 January 2020

Papers citing "Delving Deeper into the Decoder for Video Captioning"

8 / 8 papers shown

Title
Question-Answering Dense Video Events Hangyu Qin Junbin Xiao Angela Yao VLM 71 1 0 06 Sep 2024
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian Willy Fitra Hendria 29 2 0 20 Jun 2023
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training Weihong Zhong Mao Zheng Duyu Tang Xuan Luo Heng Gong Xiaocheng Feng Bing Qin 27 8 0 20 Feb 2023
End-to-end Generative Pretraining for Multimodal Video Captioning Paul Hongsuck Seo Arsha Nagrani Anurag Arnab Cordelia Schmid 27 164 0 20 Jan 2022
CLIP4Caption: CLIP for Video Caption Mingkang Tang Zhanyu Wang Zhenhua Liu Fengyun Rao Dian Li Xiu Li CLIP VLM 29 149 0 13 Oct 2021
The MSR-Video to Text Dataset with Clean Annotations Haoran Chen Jianmin Li Simone Frintrop Xiaolin Hu 22 18 0 12 Feb 2021
ECO: Efficient Convolutional Network for Online Video Understanding Mohammadreza Zolfaghari Kamaljeet Singh Thomas Brox 130 496 0 24 Apr 2018
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning Y. Gal Zoubin Ghahramani UQCV BDL 285 9,136 0 06 Jun 2015