1412.4729
Cited By
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
15 December 2014
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Papers citing "Translating Videos to Natural Language Using Deep Recurrent Neural Networks"
50 / 333 papers shown
Title
A Survey on Content-Aware Video Analysis for Sports
H. Shih
14
189
0
03 Mar 2017
Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation
N. Mostafazadeh
Chris Brockett
W. Dolan
Michel Galley
Jianfeng Gao
Georgios P. Spithourakis
Lucy Vanderwende
21
181
0
28 Jan 2017
Attention-Based Multimodal Fusion for Video Description
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
35
359
0
11 Jan 2017
Top-down Visual Saliency Guided by Captions
Vasili Ramanishka
Abir Das
Jianming Zhang
Kate Saenko
19
142
0
21 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
22
88
0
01 Dec 2016
Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Lorenzo Baraldi
C. Grana
Rita Cucchiara
26
191
0
28 Nov 2016
Bidirectional Multirate Reconstruction for Temporal Modeling in Videos
Linchao Zhu
Zhongwen Xu
Yi Yang
24
76
0
28 Nov 2016
Visual Dialog
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra
52
989
0
26 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
36
425
0
23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text
Yunchen Pu
Martin Renqiang Min
Zhe Gan
Lawrence Carin
29
14
0
23 Nov 2016
A dataset and exploration of models for understanding video data through fill-in-the-blank question-answering
Tegan Maharaj
Nicolas Ballas
Anna Rohrbach
Aaron Courville
C. Pal
VGen
11
107
0
23 Nov 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
19
329
0
23 Nov 2016
Recurrent Memory Addressing for describing videos
A. Jain
Abhinav Agarwalla
Kumar Krishna Agrawal
Pabitra Mitra
30
10
0
20 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
13
1,649
0
17 Nov 2016
Multimodal Memory Modelling for Video Captioning
Junbo Wang
Wei Wang
Yan Huang
Liang Wang
T. Tan
32
142
0
17 Nov 2016
Learning long-term dependencies for action recognition with a biologically-inspired deep network
Yemin Shi
Yonghong Tian
Yaowei Wang
Tiejun Huang
21
62
0
16 Nov 2016
Memory-augmented Attention Modelling for Videos
Rasool Fakoor
Abdel-rahman Mohamed
Margaret Mitchell
S. B. Kang
Pushmeet Kohli
35
20
0
07 Nov 2016
Inference Compilation and Universal Probabilistic Programming
T. Le
A. G. Baydin
Frank D. Wood
UQCV
36
142
0
31 Oct 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
27
50
0
17 Oct 2016
Video Fill in the Blank with Merging LSTMs
Amir Mazaheri
Dong-Ming Zhang
M. Shah
16
18
0
13 Oct 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
6
229
0
10 Oct 2016
Prediction of Manipulation Actions
Cornelia Fermuller
Fang Wang
Yezhou Yang
Konstantinos Zampogiannis
Yi Zhang
Francisco Barranco
Michael Pfeiffer
10
51
0
03 Oct 2016
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS
3DV
22
509
0
03 Oct 2016
Recurrent Convolutional Networks for Pulmonary Nodule Detection in CT Imaging
P. Ypsilantis
Giovanni Montana
MedIm
6
31
0
28 Sep 2016
Pose-Selective Max Pooling for Measuring Similarity
Xiang Xiang
T. Tran
CVBM
14
5
0
22 Sep 2016
Deep Learning for Video Classification and Captioning
Zuxuan Wu
Ting Yao
Yanwei Fu
Yu-Gang Jiang
3DV
VLM
13
122
0
22 Sep 2016
Learning to generalize to new compositions in image understanding
Y. Atzmon
Jonathan Berant
Vahid Kezami
Amir Globerson
Gal Chechik
18
67
0
27 Aug 2016
Title Generation for User Generated Videos
Kuo-Hao Zeng
Tseng-Hung Chen
Juan Carlos Niebles
Min Sun
27
69
0
25 Aug 2016
Learning Joint Representations of Videos and Sentences with Web Image Search
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
N. Yokoya
16
94
0
08 Aug 2016
A Comprehensive Survey on Cross-modal Retrieval
K. Wang
Qiyue Yin
Wei Wang
Shu Wu
Liang Wang
34
294
0
21 Jul 2016
Hierarchical Deep Temporal Models for Group Activity Recognition
Mostafa S. Ibrahim
S. Muralidharan
Zhiwei Deng
Arash Vahdat
Greg Mori
77
445
0
09 Jul 2016
Domain Adaptation for Neural Networks by Parameter Augmentation
Yusuke Watanabe
Kazuma Hashimoto
Yoshimasa Tsuruoka
OOD
14
6
0
01 Jul 2016
Bidirectional Long-Short Term Memory for Video Description
Yi Bin
Yang Yang
Zi Huang
Fumin Shen
Xing Xu
Heng Tao Shen
31
60
0
15 Jun 2016
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network
Yu Liu
Jianlong Fu
Tao Mei
C. Chen
11
4
0
02 Jun 2016
Video Summarization with Long Short-term Memory
Ke Zhang
Wei-Lun Chao
Fei Sha
Kristen Grauman
27
682
0
26 May 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
32
353
0
12 May 2016
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
11
101
0
09 May 2016
Dependency Parsing with LSTMs: An Empirical Evaluation
A. Kuncoro
Yu Sawai
Kevin Duh
Yuji Matsumoto
13
3
0
22 Apr 2016
Attributes as Semantic Units between Natural Language and Visual Recognition
Marcus Rohrbach
VLM
14
3
0
12 Apr 2016
TGIF: A New Dataset and Benchmark on Animated GIF Description
Yuncheng Li
Yale Song
Liangliang Cao
Joel R. Tetreault
Larry Goldberg
A. Jaimes
Jiebo Luo
22
269
0
10 Apr 2016
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
Subhashini Venugopalan
Lisa Anne Hendricks
Raymond J. Mooney
Kate Saenko
VLM
20
117
0
06 Apr 2016
Character-Level Neural Translation for Multilingual Media Monitoring in the SUMMA Project
Guntis Barzdins
Steve Renals
D. Gosko
13
5
0
05 Apr 2016
Character-Level Question Answering with Attention
David Golub
Xiaodong He
19
184
0
04 Apr 2016
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
Yevgeni Berzak
Andrei Barbu
Daniel Harari
Boris Katz
S. Ullman
11
34
0
26 Mar 2016
Attentive Contexts for Object Detection
Jianan Li
Yunchao Wei
Xiaodan Liang
Jian Dong
Tingfa Xu
Jiashi Feng
Shuicheng Yan
ObjD
12
221
0
24 Mar 2016
Super Mario as a String: Platformer Level Generation Via LSTMs
A. Summerville
Michael Mateas
17
149
0
02 Mar 2016
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Suraj Srinivas
Ravi Kiran Sarvadevabhatla
Konda Reddy Mopuri
N. Prabhu
S. Kruthiventi
R. Venkatesh Babu
OOD
33
215
0
25 Jan 2016
MovieQA: Understanding Stories in Movies through Question-Answering
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler
18
736
0
09 Dec 2015
A Deep Structured Model with Radius-Margin Bound for 3D Human Activity Recognition
Liang Lin
Keze Wang
W. Zuo
M. Wang
Jiebo Luo
Lei Zhang
HAI
BDL
22
102
0
05 Dec 2015
Stories in the Eye: Contextual Visual Interactions for Efficient Video to Language Translation
Anirudh Goyal
Marius Leordeanu
16
1
0
20 Nov 2015