1505.01861
Cited By
v1
v2
v3 (latest)
Jointly Modeling Embedding and Translation to Bridge Video and Language
7 May 2015
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
Papers citing
"Jointly Modeling Embedding and Translation to Bridge Video and Language"
49 / 199 papers shown
PassGAN: A Deep Learning Approach for Password Guessing
Briland Hitaj
Paolo Gasti
G. Ateniese
Fernando Perez-Cruz
GAN
238
284
0
01 Sep 2017
Video Captioning with Guidance of Multimodal Latent Topics
Shizhe Chen
Jia Chen
Qin Jin
Alexander G. Hauptmann
207
71
0
31 Aug 2017
Generating Video Descriptions with Topic Guidance
Shizhe Chen
Jia Chen
Qin Jin
156
21
0
31 Aug 2017
Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
VLM
141
154
0
17 Aug 2017
ConvNet Architecture Search for Spatiotemporal Feature Learning
Du Tran
Jamie Ray
Zheng Shou
Shih-Fu Chang
Manohar Paluri
3DPC
196
411
0
16 Aug 2017
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
IEEE International Conference on Computer Vision (ICCV), 2017
Chuang Gan
Yandong Li
Haoxiang Li
Chen Sun
Boqing Gong
243
136
0
15 Aug 2017
Hierarchically-Attentive RNN for Album Summarization and Storytelling
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017
Licheng Yu
Mohit Bansal
Tamara L. Berg
108
69
0
09 Aug 2017
From Deterministic to Generative: Multi-Modal Stochastic RNNs for Video Captioning
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2017
Jingkuan Song
Yuyu Guo
Lianli Gao
Xuelong Li
Alan Hanjalic
Heng Tao Shen
174
228
0
08 Aug 2017
Reinforced Video Captioning with Entailment Rewards
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017
Ramakanth Pasunuru
Mohit Bansal
157
118
0
07 Aug 2017
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
404
1,102
0
04 Aug 2017
Learning Fashion Compatibility with Bidirectional LSTMs
Xintong Han
Zuxuan Wu
Yu-Gang Jiang
L. Davis
181
392
0
18 Jul 2017
Show and Recall: Learning What Makes Videos Memorable
Sumit Shekhar
Dhruv Singal
Harvineet Singh
Manav Kedia
Akhil Shetty
130
43
0
17 Jul 2017
Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning
International Joint Conference on Artificial Intelligence (IJCAI), 2017
Jingkuan Song
Zhao Guo
Lianli Gao
Wu Liu
Dongxiang Zhang
Heng Tao Shen
176
169
0
05 Jun 2017
Weakly supervised 3D Reconstruction with Adversarial Constraint
International Conference on 3D Vision (3DV), 2017
JunYoung Gwak
Chris Choy
Animesh Garg
Manmohan Chandraker
Silvio Savarese
3DV
GAN
219
123
0
31 May 2017
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
534
3,572
0
26 May 2017
Unified Embedding and Metric Learning for Zero-Exemplar Event Detection
Noureldien Hussein
E. Gavves
A. Smeulders
113
15
0
05 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
400
1,439
0
02 May 2017
Multi-Task Video Captioning with Video and Entailment Generation
Ramakanth Pasunuru
Mohit Bansal
186
120
0
24 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li Li
146
334
0
12 Apr 2017
Weakly Supervised Dense Video Captioning
Zhiqiang Shen
Jianguo Li
Zhou Su
Minjun Li
Yurong Chen
Yu-Gang Jiang
Xiangyang Xue
183
140
0
05 Apr 2017
TS-LSTM and Temporal-Inception: Exploiting Spatiotemporal Dynamics for Activity Recognition
Chih-Yao Ma
Min-Hung Chen
Z. Kira
G. Al-Regib
AI4TS
199
252
0
30 Mar 2017
Improving Interpretability of Deep Neural Networks with Semantic Information
Yinpeng Dong
Hang Su
Jun Zhu
Bo Zhang
194
130
0
12 Mar 2017
Contextually Customized Video Summaries via Natural Language
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
168
12
0
06 Feb 2017
Attention-Based Multimodal Fusion for Video Description
IEEE International Conference on Computer Vision (ICCV), 2017
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
319
379
0
11 Jan 2017
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
177
89
0
01 Dec 2016
Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Lorenzo Baraldi
C. Grana
Rita Cucchiara
271
196
0
28 Nov 2016
Bidirectional Multirate Reconstruction for Temporal Modeling in Videos
Linchao Zhu
Zhongwen Xu
Yi Yang
158
78
0
28 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
269
444
0
23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text
Yunchen Pu
Martin Renqiang Min
Zhe Gan
Lawrence Carin
190
14
0
23 Nov 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
166
337
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li Li
VLM
214
177
0
21 Nov 2016
Recurrent Memory Addressing for describing videos
A. Jain
Abhinav Agarwalla
Kumar Krishna Agrawal
Pabitra Mitra
132
10
0
20 Nov 2016
Multimodal Memory Modelling for Video Captioning
Junbo Wang
Wei Wang
Yan Huang
Liang Wang
Tieniu Tan
203
147
0
17 Nov 2016
Learning long-term dependencies for action recognition with a biologically-inspired deep network
Yemin Shi
Yonghong Tian
Yaowei Wang
Tiejun Huang
194
65
0
16 Nov 2016
Leveraging Video Descriptions to Learn Video Question Answering
Kuo-Hao Zeng
Tseng-Hung Chen
Ching-Yao Chuang
Yuan-Hong Liao
Juan Carlos Niebles
Min Sun
266
188
0
12 Nov 2016
Memory-augmented Attention Modelling for Videos
Rasool Fakoor
Abdel-rahman Mohamed
Margaret Mitchell
S. B. Kang
Pushmeet Kohli
261
20
0
07 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
293
648
0
05 Nov 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
213
51
0
17 Oct 2016
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016
Zecheng Xie
Zenghui Sun
Lianwen Jin
Hao Ni
Terry Lyons
188
128
0
09 Oct 2016
Deep Learning for Video Classification and Captioning
Zuxuan Wu
Ting Yao
Yanwei Fu
Yu-Gang Jiang
3DV
VLM
176
139
0
22 Sep 2016
Title Generation for User Generated Videos
European Conference on Computer Vision (ECCV), 2016
Kuo-Hao Zeng
Tseng-Hung Chen
Juan Carlos Niebles
Min Sun
168
71
0
25 Aug 2016
Bidirectional Long-Short Term Memory for Video Description
Yi Bin
Yang Yang
Zi Huang
Fumin Shen
Xing Xu
Heng Tao Shen
159
66
0
15 Jun 2016
Beyond Caption To Narrative: Video Captioning With Multiple Sentences
Andrew Shin
Katsunori Ohnishi
Tatsuya Harada
131
33
0
18 May 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
266
387
0
12 May 2016
TGIF: A New Dataset and Benchmark on Animated GIF Description
Yuncheng Li
Yale Song
Liangliang Cao
Joel R. Tetreault
Larry Goldberg
A. Jaimes
Jiebo Luo
199
295
0
10 Apr 2016
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Suraj Srinivas
Ravi Kiran Sarvadevabhatla
Konda Reddy Mopuri
N. Prabhu
S. Kruthiventi
R. Venkatesh Babu
OOD
181
219
0
25 Jan 2016
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
Pingbo Pan
Zhongwen Xu
Yi Yang
Leilei Gan
Yueting Zhuang
154
391
0
11 Nov 2015
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Haonan Yu
Jiang Wang
Zhiheng Huang
Yi Yang
Wenyuan Xu
360
573
0
26 Oct 2015
The Long-Short Story of Movie Description
German Conference on Pattern Recognition (DAGM), 2015
Anna Rohrbach
Marcus Rohrbach
Bernt Schiele
VLM
143
117
0
04 Jun 2015