Translating Videos to Natural Language Using Deep Recurrent Neural Networks

15 December 2014
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko

Papers citing "Translating Videos to Natural Language Using Deep Recurrent Neural Networks"

33 / 333 papers shown
Delving Deeper into Convolutional Networks for Learning Video Representations
Nicolas Ballas
L. Yao
C. Pal
Aaron Courville
MDE
17
692
0
19 Nov 2015
Learning Deep Structure-Preserving Image-Text Embeddings
Liwei Wang
Yin Li
Svetlana Lazebnik
35
780
0
19 Nov 2015
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering
Kan Chen
Jiang Wang
Liang-Chieh Chen
Haoyuan Gao
W. Xu
Ram Nevatia
22
287
0
18 Nov 2015
Structural-RNN: Deep Learning on Spatio-Temporal Graphs
Ashesh Jain
Amir Zamir
Silvio Savarese
Ashutosh Saxena
GNN
46
1,080
0
17 Nov 2015
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data
Lisa Anne Hendricks
Subhashini Venugopalan
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Trevor Darrell
CoGe
16
284
0
17 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Huijuan Xu
Kate Saenko
22
760
0
17 Nov 2015
Oracle performance for visual captioning
L. Yao
Nicolas Ballas
Kyunghyun Cho
John R. Smith
Yoshua Bengio
VLM
31
8
0
14 Nov 2015
Action Recognition using Visual Attention
Shikhar Sharma
Ryan Kiros
Ruslan Salakhutdinov
24
666
0
12 Nov 2015
Deep Gaussian Conditional Random Field Network: A Model-based Deep Network for Discriminative Denoising
Raviteja Vemulapalli
Oncel Tuzel
Ming-Yu Liu
19
69
0
12 Nov 2015
Generative Concatenative Nets Jointly Learn to Write and Classify Reviews
Zachary Chase Lipton
Sharad Vikram
Julian McAuley
BDL
25
32
0
11 Nov 2015
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
Pingbo Pan
Zhongwen Xu
Yi Yang
Fei Wu
Yueting Zhuang
16
385
0
11 Nov 2015
VideoStory Embeddings Recognize Events when Examples are Scarce
A. Habibian
Thomas Mensink
Cees G. M. Snoek
14
11
0
08 Nov 2015
Privacy Prediction of Images Shared on Social Media Sites Using Deep Features
Ashwini Tonge
Cornelia Caragea
14
14
0
29 Oct 2015
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Haonan Yu
Jiang Wang
Zhiheng Huang
Yi Yang
W. Xu
42
560
0
26 Oct 2015
Learning Contextual Dependencies with Convolutional Hierarchical Recurrent Neural Networks
Zhen Zuo
Bing Shuai
G. Wang
Xiao Liu
Xingxing Wang
B. Wang
11
93
0
13 Sep 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks
Kyunghyun Cho
Aaron Courville
Yoshua Bengio
32
411
0
04 Jul 2015
A Survey of Current Datasets for Vision and Language Research
Francis Ferraro
N. Mostafazadeh
Ting-Hao 'Kenneth' Huang
Lucy Vanderwende
Jacob Devlin
Michel Galley
Margaret Mitchell
VLM
20
73
0
23 Jun 2015
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Yukun Zhu
Ryan Kiros
R. Zemel
Ruslan Salakhutdinov
R. Urtasun
Antonio Torralba
Sanja Fidler
22
2,515
0
22 Jun 2015
Learning language through pictures
Grzegorz Chrupała
Ákos Kádár
A. Alishahi
VLM
SSL
29
65
0
11 Jun 2015
The Long-Short Story of Movie Description
Anna Rohrbach
Marcus Rohrbach
Bernt Schiele
VLM
25
110
0
04 Jun 2015
Visual Madlibs: Fill in the blank Image Generation and Question Answering
Licheng Yu
Eunbyung Park
Alexander C. Berg
Tamara L. Berg
VLM
MLLM
24
97
0
31 May 2015
A Multi-scale Multiple Instance Video Description Network
Huijuan Xu
Subhashini Venugopalan
Vasili Ramanishka
Marcus Rohrbach
Kate Saenko
32
64
0
21 May 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
38
534
0
07 May 2015
Ask Your Neurons: A Neural-based Approach to Answering Questions about Images
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
35
595
0
05 May 2015
Sequence to Sequence -- Video to Text
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
22
1,416
0
03 May 2015
Differential Recurrent Neural Networks for Action Recognition
Vivek Veeriah
Naifan Zhuang
Guo-Jun Qi
24
462
0
25 Apr 2015
Evaluating Two-Stream CNN for Video Classification
Hao Ye
Zuxuan Wu
Rui Zhao
Xi Wang
Yu-Gang Jiang
Xiangyang Xue
19
119
0
08 Apr 2015
Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification
Zuxuan Wu
Xi Wang
Yu-Gang Jiang
Hao Ye
Xiangyang Xue
20
448
0
07 Apr 2015
Using Descriptive Video Services to Create a Large Data Source for Video Annotation Research
Atousa Torabi
C. Pal
Hugo Larochelle
Aaron Courville
VGen
31
204
0
03 Mar 2015
Describing Videos by Exploiting Temporal Structure
L. Yao
Atousa Torabi
Kyunghyun Cho
Nicolas Ballas
C. Pal
Hugo Larochelle
Aaron Courville
30
1,062
0
27 Feb 2015
Phrase-based Image Captioning
R. Lebret
Pedro H. O. Pinheiro
R. Collobert
VLM
23
120
0
12 Feb 2015
A Dataset for Movie Description
Anna Rohrbach
Marcus Rohrbach
Niket Tandon
Bernt Schiele
VGen
31
498
0
12 Jan 2015
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
50
6,030
0
17 Nov 2014