arXiv:1412.6632
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
20 December 2014
Junhua Mao
W. Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
Papers citing "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"
50 / 417 papers shown
Title
Improving Classification by Improving Labelling: Introducing Probabilistic Multi-Label Object Interaction Recognition
Michael Wray
Davide Moltisanti
W. Mayol-Cuevas
Dima Damen
25
2
0
24 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation
Chenxi Liu
Zhe-nan Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Alan Yuille
EgoV
36
234
0
23 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
Xiaodan Liang
Zhiting Hu
H. M. Zhang
Chuang Gan
Eric P. Xing
GAN
19
200
0
21 Mar 2017
Person Search with Natural Language Description
Shuang Li
Tong Xiao
Hongsheng Li
Bolei Zhou
Dayu Yue
Xiaogang Wang
19
385
0
19 Feb 2017
Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading
Chunlin Tian
Weijun Ji
22
7
0
16 Jan 2017
Attention-Based Multimodal Fusion for Video Description
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
27
359
0
11 Jan 2017
Learning Visual N-Grams from Web Data
Ang Li
Allan Jabri
Armand Joulin
L. van der Maaten
VLM
12
136
0
29 Dec 2016
Image-Text Multi-Modal Representation Learning by Adversarial Backpropagation
Gwangbeen Park
Woobin Im
GAN
8
25
0
26 Dec 2016
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task
Nan Ding
Sebastian Goodman
Fei Sha
Radu Soricut
VLM
11
9
0
22 Dec 2016
An Empirical Study of Language CNN for Image Captioning
Jiuxiang Gu
G. Wang
Jianfei Cai
Tsuhan Chen
17
132
0
21 Dec 2016
Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering
Hao Liu
Yang Yang
Fumin Shen
Lixin Duan
Heng Tao Shen
30
9
0
15 Dec 2016
Text-guided Attention Model for Image Captioning
Jonghwan Mun
Minsu Cho
Bohyung Han
VLM
10
92
0
12 Dec 2016
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
85
1,442
0
06 Dec 2016
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
25
205
0
03 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
16
232
0
02 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
19
88
0
01 Dec 2016
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images
Junhua Mao
Jiajing Xu
Yushi Jing
Alan Yuille
11
48
0
24 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
36
425
0
23 Nov 2016
Learning Generic Sentence Representations Using Convolutional Neural Networks
Zhe Gan
Yunchen Pu
Ricardo Henao
Chunyuan Li
Xiaodong He
Lawrence Carin
SSL
34
98
0
23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text
Yunchen Pu
Martin Renqiang Min
Zhe Gan
Lawrence Carin
29
14
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
19
169
0
21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs
J. Krause
Justin Johnson
Ranjay Krishna
Li Fei-Fei
VLM
19
373
0
20 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
11
1,649
0
17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
23
103
0
16 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
28
664
0
02 Nov 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
25
235
0
05 Oct 2016
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS
3DV
22
509
0
03 Oct 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
11
848
0
21 Sep 2016
GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion
Ankit Gandhi
Arjun Sharma
Arijit Biswas
Om Deshmukh
AI4TS
19
12
0
17 Sep 2016
Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Mark Harmon
Abdolghani Ebrahimi
P. Lucey
Diego Klabjan
GAN
9
18
0
15 Sep 2016
Multimodal Attention for Neural Machine Translation
Ozan Caglayan
Loïc Barrault
Fethi Bougares
21
75
0
13 Sep 2016
Measuring Machine Intelligence Through Visual Question Answering
C. L. Zitnick
Aishwarya Agrawal
Stanislaw Antol
Margaret Mitchell
Dhruv Batra
Devi Parikh
14
37
0
31 Aug 2016
Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions
Ronghang Hu
Marcus Rohrbach
Subhashini Venugopalan
Trevor Darrell
VLM
17
18
0
30 Aug 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Y. Tan
Chee Seng Chan
VLM
9
29
0
20 Aug 2016
Detecting Sarcasm in Multimodal Social Platforms
Rossano Schifanella
Paloma de Juan
Joel R. Tetreault
Liangliang Cao
15
167
0
08 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
28
1,223
0
31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
29
1,883
0
29 Jul 2016
A Comprehensive Survey on Cross-modal Retrieval
K. Wang
Qiyue Yin
Wei Wang
Shu Wu
Liang Wang
26
294
0
21 Jul 2016
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
A. van den Hengel
19
413
0
20 Jul 2016
Captioning Images with Diverse Objects
Subhashini Venugopalan
Lisa Anne Hendricks
Marcus Rohrbach
Raymond J. Mooney
Trevor Darrell
Kate Saenko
VLM
22
178
0
24 Jun 2016
Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions
F. Carrara
Andrea Esuli
T. Fagni
Fabrizio Falchi
Alejandro Moreo
DiffM
14
31
0
23 Jun 2016
Watch What You Just Said: Image Captioning with Text-Conditional Attention
Luowei Zhou
Chenliang Xu
Parker A. Koch
Jason J. Corso
VLM
8
44
0
15 Jun 2016
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
Jie Zhou
Ying Cao
Xuguang Wang
Peng Li
W. Xu
AIMat
19
215
0
14 Jun 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
149
1,465
0
06 Jun 2016
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network
Yu Liu
Jianlong Fu
Tao Mei
C. Chen
11
4
0
02 Jun 2016
Attention Correctness in Neural Image Captioning
Chenxi Liu
Junhua Mao
Fei Sha
Alan Yuille
3DV
27
220
0
31 May 2016
SNN: Stacked Neural Networks
Milad Mohammadi
Subhasis Das
11
15
0
27 May 2016
Generative Adversarial Text to Image Synthesis
Scott E. Reed
Zeynep Akata
Xinchen Yan
Lajanugen Logeswaran
Bernt Schiele
Honglak Lee
GAN
17
3,124
0
17 May 2016
Learning Deep Representations of Fine-grained Visual Descriptions
Scott E. Reed
Zeynep Akata
Bernt Schiele
Honglak Lee
OCL
VLM
165
840
0
17 May 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
30
353
0
12 May 2016