
Translating Videos to Natural Language Using Deep Recurrent Neural Networks

North American Chapter of the Association for Computational Linguistics (NAACL), 2014

15 December 2014

Subhashini Venugopalan

Papers citing "Translating Videos to Natural Language Using Deep Recurrent Neural Networks"

50 / 334 papers shown

Title
Poet: Product-oriented Video Captioner for E-commerce Shengyu Zhang Ziqi Tan Jin Yu Zhou Zhao Kun Kuang Jie Liu Jingren Zhou Hongxia Yang Leilei Gan 128 36 0 16 Aug 2020
Vision Meets Wireless Positioning: Effective Person Re-identification with Recurrent Context Propagation. ACM Multimedia (ACM MM), 2020 Yiheng Liu Wen-gang Zhou Mao Xi Sanjing Shen Houqiang Li 182 10 0 10 Aug 2020
Enriching Video Captions With Contextual Text. International Conference on Pattern Recognition (ICPR), 2020 Philipp Rimle Pelin Dogan Markus Gross 141 3 0 29 Jul 2020
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos. European Conference on Computer Vision (ECCV), 2020 Shaoxiang Chen Wenhao Jiang Wei Liu Yu-Gang Jiang 199 111 0 28 Jul 2020
Fully Convolutional Networks for Continuous Sign Language Recognition. European Conference on Computer Vision (ECCV), 2020 Ka Leong Cheng Zhaoyang Yang Qifeng Chen Yu-Wing Tai SLR 207 184 0 24 Jul 2020
Deep Learning Techniques for Future Intelligent Cross-Media Retrieval S. Rehman M. Waqas Shanshan Tu Anis Koubaa O. Rehman Jawad Ahmad Muhammad Hanif Zhu Han 118 7 0 21 Jul 2020
Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training Yingwei Pan Yehao Li Jianjie Luo Jun Xu Ting Yao Tao Mei 167 61 0 05 Jul 2020
Listen carefully and tell: an audio captioning system based on residual learning and gammatone audio representation Sergi Perez-Castanos Javier Naranjo-Alcazar P. Zuccarello M. Cobos 152 12 0 27 Jun 2020
SACT: Self-Aware Multi-Space Feature Composition Transformer for Multinomial Attention for Video Captioning C. Sur 120 7 0 25 Jun 2020
Comprehensive Information Integration Modeling Framework for Video Titling. Knowledge Discovery and Data Mining (KDD), 2020 Shengyu Zhang Ziqi Tan Jin Yu Zhou Zhao Kun Kuang Tan Jiang Jingren Zhou Hongxia Yang Leilei Gan 146 41 0 24 Jun 2020
iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks. Computational Visual Media (CVM), 2020 Vasu Sharma John Britto M. Mani Roja SupR 195 26 0 13 Jun 2020
NITS-VC System for VATEX Video Captioning Challenge 2020 Alok Singh Thoudam Doren Singh Sivaji Bandyopadhyay 122 16 0 07 Jun 2020
A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer Vladimir E. Iashin Esa Rahtu 207 128 0 17 May 2020
Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding Fenglin Liu Xuancheng Ren Guangxiang Zhao Chenyu You Xuewei Ma Xian Wu Xu Sun 358 2 0 16 May 2020
Learning from Noisy Labels with Noise Modeling Network Zhuolin Jiang J. Silovský M. Siu William Hartmann H. Gish Sancar Adali NoLa 71 3 0 01 May 2020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation. Computer Vision and Pattern Recognition (CVPR), 2020 Boxiao Pan Haoye Cai De-An Huang Kuan-Hui Lee Adrien Gaidon Ehsan Adeli Juan Carlos Niebles 179 259 0 31 Mar 2020
Multi-modal Dense Video Captioning Vladimir E. Iashin Esa Rahtu 265 198 0 17 Mar 2020
Video Caption Dataset for Describing Human Actions in Japanese. International Conference on Language Resources and Evaluation (LREC), 2020 Yutaro Shigeto Yuya Yoshikawa Jiaqing Lin A. Takeuchi 68 3 0 10 Mar 2020
OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement Fangyi Zhu Lei Li Zhanyu Ma Guang Chen Jun Guo 160 1 0 08 Mar 2020
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning. AAAI Conference on Artificial Intelligence (AAAI), 2020 Elad Amrani Rami Ben-Ari Daniel Rotman A. Bronstein 288 129 0 06 Mar 2020
Hierarchical Memory Decoding for Video Captioning Aming Wu Yahong Han 123 2 0 27 Feb 2020
CLARA: Clinical Report Auto-completion. The Web Conference (WWW), 2020 Siddharth Biswal Cao Xiao Lucas Glass M. P. M. Brandon Westover Jimeng Sun 196 29 0 26 Feb 2020
Object Relational Graph with Teacher-Recommended Learning for Video Captioning. Computer Vision and Pattern Recognition (CVPR), 2020 Ziqi Zhang Yaya Shi Chunfen Yuan Bing Li Peijin Wang Weiming Hu Zhengjun Zha VLM 188 302 0 26 Feb 2020
Multimodal Matching Transformer for Live Commenting. European Conference on Artificial Intelligence (ECAI), 2020 Chaoqun Duan Lei Cui Shuming Ma Furu Wei Conghui Zhu Tiejun Zhao 85 13 0 07 Feb 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020 A. Cherian Jue Wang Chiori Hori Tim K. Marks AI4TS 109 22 0 17 Jan 2020
Delving Deeper into the Decoder for Video Captioning. European Conference on Artificial Intelligence (ECAI), 2020 Haoran Chen Jianmin Li Xiaolin Hu 147 38 0 16 Jan 2020
Non-Autoregressive Coarse-to-Fine Video Captioning Bang-ju Yang Yuexian Zou Fenglin Liu Can Zhang 352 11 0 27 Nov 2019
Zero-Shot Imitating Collaborative Manipulation Plans from YouTube Cooking Videos Hejia Zhang Jie Zhong Stefanos Nikolaidis LM&Ro 932 2 0 25 Nov 2019
Characterizing the impact of using features extracted from pre-trained models on the quality of video captioning sequence-to-sequence models. International Conferences on Pattern Recognition and Artificial Intelligence (ICCPRAI), 2019 Menatallh Hammad May Hammad Mohamed Elshenawy 81 2 0 22 Nov 2019
Empirical Autopsy of Deep Video Captioning Frameworks Nayyer Aafaq Naveed Akhtar Wei Liu Lin Wang 115 6 0 21 Nov 2019
Crowd Video Captioning Liqi Yan Mingjian Zhu Changbin (Brad) Yu 76 4 0 13 Nov 2019
Video Captioning with Text-based Dynamic Attention and Step-by-Step Learning. Pattern Recognition Letters (PR), 2019 Huanhou Xiao Jinglun Shi 109 26 0 05 Nov 2019
Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019 Tao Jin Siyu Huang Yingming Li Zhongfei Zhang 148 21 0 01 Nov 2019
Orchestrating the Development Lifecycle of Machine Learning-Based IoT Applications: A Taxonomy and Survey Bin Qian Jie Su Z. Wen D. N. Jha Yinhao Li ... Albert Y. Zomaya Omer F. Rana Lizhe Wang Maciej Koutny R. Ranjan 185 4 0 11 Oct 2019
Hear "No Evil", See "Kenansville": Efficient and Transferable Black-Box Attacks on Speech Recognition and Voice Identification Systems H. Abdullah Muhammad Sajidur Rahman Washington Garcia Logan Blue Kevin Warren Anurag Swarnim Yadav T. Shrimpton Patrick Traynor AAML 129 95 0 11 Oct 2019
Explaining and Interpreting LSTMs L. Arras Jose A. Arjona-Medina Michael Widrich G. Montavon Michael Gillhofer K. Müller Sepp Hochreiter Wojciech Samek FAtt AI4TS 143 83 0 25 Sep 2019
Learning Actions from Human Demonstration Video for Robotic Manipulation. IEEE/RSJ International Conference on Intelligent RObots and Systems (IROS), 2019 Shuo Yang Wei Zhang Weizhi Lu Hesheng Wang Yibin Li 90 26 0 10 Sep 2019
Time Series Motion Generation Considering Long Short-Term Motion. IEEE/RSJ International Conference on Intelligent RObots and Systems (IROS), 2019 K. Fujimoto S. Sakaino T. Tsuji 117 15 0 09 Sep 2019
Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network. IEEE International Conference on Computer Vision (ICCV), 2019 Bairui Wang Lin Ma Wei Zhang Wenhao Jiang Jingwen Wang Wei Liu 206 176 0 27 Aug 2019
Autonomous Learning for Face Recognition in the Wild via Ambient Wireless Cues. The Web Conference (WWW), 2019 Chris Xiaoxuan Lu Xuan Kan Bowen Du Changhao Chen Hongkai Wen Andrew Markham A. Trigoni John A. Stankovic CVBM 126 7 0 14 Aug 2019
SF-Net: Structured Feature Network for Continuous Sign Language Recognition Zhaoyang Yang Zhenmei Shi Xiaoyong Shen Yu-Wing Tai SLR 111 71 0 04 Aug 2019
Prediction and Description of Near-Future Activities in Video. Computer Vision and Image Understanding (CVIU), 2019 T. Mahmud Mohammad Billah Mahmudul Hasan Amit K. Roy-Chowdhury 283 17 0 02 Aug 2019
Use What You Have: Video Retrieval Using Representations From Collaborative Experts. British Machine Vision Conference (BMVC), 2019 Yang Liu Samuel Albanie Arsha Nagrani Andrew Zisserman 169 422 0 31 Jul 2019
Deep Multi-Kernel Convolutional LSTM Networks and an Attention-Based Mechanism for Videos. IEEE transactions on multimedia (IEEE TMM), 2019 Sebastian Agethen Winston H. Hsu HAI 125 29 0 30 Jul 2019
Learning Visual Actions Using Multiple Verb-Only Labels. British Machine Vision Conference (BMVC), 2019 Michael Wray Dima Damen 183 7 0 25 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods. Journal of Artificial Intelligence Research (JAIR), 2019 Aditya Mogadala M. Kalimuthu Dietrich Klakow VLM 332 141 0 22 Jul 2019
Watch It Twice: Video Captioning with a Refocused Video Encoder. ACM Multimedia (ACM MM), 2019 Xiangxi Shi Jianfei Cai Shafiq Joty Jiuxiang Gu 134 28 0 21 Jul 2019
Structured Variational Inference in Unstable Gaussian Process State Space Models Silvan Melchior Sebastian Curi Felix Berkenkamp Andreas Krause 261 4 0 16 Jul 2019
Video Question Generation via Cross-Modal Self-Attention Networks Learning. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 Yu-Siang Wang Hung-Ting Su Chen-Hsi Chang Zhe-Yu Liu Winston H. Hsu 135 12 0 05 Jul 2019
A Deep Decoder Structure Based on WordEmbedding Regression for An Encoder-Decoder Based Model for Image Captioning A. Asadi Reza Safabakhsh 66 3 0 26 Jun 2019