ResearchTrend.AI
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)


20 December 2014
Junhua Mao
W. Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
    VLM

Papers citing "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"

50 / 417 papers shown
Improving Factor-Based Quantitative Investing by Forecasting Company Fundamentals
J. Alberg
Zachary Chase Lipton
AI4TS
21
48
0
13 Nov 2017
Phrase-based Image Captioning with Hierarchical LSTM Model
Y. Tan
Chee Seng Chan
VLM
19
4
0
11 Nov 2017
A Neural-Symbolic Approach to Design of CAPTCHA
Qiuyuan Huang
P. Smolensky
Xiaodong He
Li Deng
D. Wu
AAML
21
1
0
29 Oct 2017
Learning Social Image Embedding with Deep Multimodal Attention Networks
Feiran Huang
Xiaoming Zhang
Zhoujun Li
Tao Mei
Yueying He
Zhonghua Zhao
14
20
0
18 Oct 2017
Tensor Product Generation Networks for Deep NLP Modeling
Qiuyuan Huang
P. Smolensky
Xiaodong He
Li Deng
D. Wu
14
3
0
26 Sep 2017
Fooling Vision and Language Models Despite Localization and Attention Mechanism
Xiaojun Xu
Xinyun Chen
Chang-rui Liu
Anna Rohrbach
Trevor Darrell
D. Song
AAML
8
41
0
25 Sep 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning
Yang Xian
Yingli Tian
VLM
18
22
0
15 Sep 2017
Stack-Captioning: Coarse-to-Fine Learning for Image Captioning
Jiuxiang Gu
Jianfei Cai
G. Wang
Tsuhan Chen
19
178
0
11 Sep 2017
Predicting Visual Features from Text for Image and Video Caption Retrieval
Jianfeng Dong
Xirong Li
Cees G. M. Snoek
9
223
0
05 Sep 2017
Image2song: Song Retrieval via Bridging Image Content and Lyric Words
Xuelong Li
Di Hu
Xiaoqiang Lu
11
10
0
19 Aug 2017
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
Chuang Gan
Yandong Li
Haoxiang Li
Chen Sun
Boqing Gong
14
126
0
15 Aug 2017
Fluency-Guided Cross-Lingual Image Captioning
Weiyu Lan
Xirong Li
Jianfeng Dong
17
92
0
15 Aug 2017
Learning to Disambiguate by Asking Discriminative Questions
Yining Li
Chen Huang
Xiaoou Tang
Chen Change Loy
18
22
0
09 Aug 2017
Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval
Yuming Shen
Li Liu
Ling Shao
Jingkuan Song
14
49
0
08 Aug 2017
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?
Marc Tanti
Albert Gatt
K. Camilleri
16
56
0
07 Aug 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention
Shuang Li
Tong Xiao
Hongsheng Li
Wei Yang
Xiaogang Wang
12
226
0
07 Aug 2017
Discover and Learn New Objects from Documentaries
Kai-xiang Chen
Hang Song
Chen Change Loy
Dahua Lin
ObjD
17
20
0
30 Jul 2017
Deep Interactive Region Segmentation and Captioning
Ali Sharifi Boroujerdi
M. Khanian
M. Breuß
16
7
0
26 Jul 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
29
4,180
0
25 Jul 2017
Image Pivoting for Learning Multilingual Multimodal Representations
Spandana Gella
Rico Sennrich
Frank Keller
Mirella Lapata
SSL
22
78
0
24 Jul 2017
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts
Xuwang Yin
Vicente Ordonez
VLM
27
55
0
22 Jul 2017
Learning Visually Grounded Sentence Representations
Douwe Kiela
Alexis Conneau
Allan Jabri
Maximilian Nickel
SSL
15
69
0
19 Jul 2017
Order-Free RNN with Visual Attention for Multi-Label Classification
Shang-Fu Chen
Yi-Chen Chen
Chih-Kuan Yeh
Y. Wang
12
142
0
18 Jul 2017
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
Zizhao Zhang
Yuanpu Xie
Fuyong Xing
M. McGough
L. Yang
MedIm
13
301
0
08 Jul 2017
Actor-Critic Sequence Training for Image Captioning
Li Zhang
Flood Sung
Feng Liu
Tao Xiang
S. Gong
Yongxin Yang
Timothy M. Hospedales
16
111
0
29 Jun 2017
Image Captioning with Object Detection and Localization
Zhongliang Yang
Yujin Zhang
S. Rehman
Yongfeng Huang
ObjD
VLM
17
47
0
08 Jun 2017
Order embeddings and character-level convolutions for multimodal alignment
Jonatas Wehrmann
Anderson Mattjie
Rodrigo C. Barros
15
27
0
03 Jun 2017
Listen, Interact and Talk: Learning to Speak via Interaction
Haichao Zhang
Haonan Yu
W. Xu
20
13
0
28 May 2017
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
13
2,855
0
26 May 2017
Deep image representations using caption generators
Konda Reddy Mopuri
Vishal B. Athreya
R. Venkatesh Babu
VLM
SSL
9
1
0
25 May 2017
Attention-based Natural Language Person Retrieval
Tao Zhou
Muhao Chen
Jie Yu
Demetri Terzopoulos
17
14
0
24 May 2017
Object-Level Context Modeling For Scene Classification with Context-CNN
Syed Ashar Javed
A. Nelakanti
VLM
19
10
0
11 May 2017
Image Annotation using Multi-Layer Sparse Coding
Amara Tariq
H. Foroosh
6
2
0
06 May 2017
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
21
799
0
05 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset
Yuya Yoshikawa
Yutaro Shigeto
A. Takeuchi
3DV
8
118
0
02 May 2017
Spatio-temporal Person Retrieval via Natural Language Queries
Masataka Yamaguchi
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
14
57
0
26 Apr 2017
Inception Recurrent Convolutional Neural Network for Object Recognition
Md. Zahangir Alom
Mahmudul Hasan
C. Yakopcic
T. Taha
31
86
0
25 Apr 2017
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
Yufei Wang
Zhe-nan Lin
Xiaohui Shen
Scott D. Cohen
G. Cottrell
11
105
0
23 Apr 2017
Spatial Memory for Context Reasoning in Object Detection
Xinlei Chen
Abhinav Gupta
ObjD
17
164
0
13 Apr 2017
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Y. Zhang
Luyao Yuan
Yijie Guo
Zhiyuan He
I-An Huang
Honglak Lee
ObjD
23
57
0
12 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
23
324
0
12 Apr 2017
Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation
Ngan Le
Kha Gia Quach
Khoa Luu
Marios Savvides
Chenchen Zhu
16
71
0
12 Apr 2017
Creativity: Generating Diverse Questions using Variational Autoencoders
Unnat Jain
Ziyu Zhang
A. Schwing
17
151
0
11 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
19
494
0
11 Apr 2017
Generating Descriptions with Grounded and Co-Referenced People
Anna Rohrbach
Marcus Rohrbach
Siyu Tang
Seong Joon Oh
Bernt Schiele
314
72
0
05 Apr 2017
Weakly Supervised Dense Video Captioning
Zhiqiang Shen
Jianguo Li
Zhou Su
Minjun Li
Yurong Chen
Yu-Gang Jiang
Xiangyang Xue
16
134
0
05 Apr 2017
AMC: Attention guided Multi-modal Correlation Learning for Image Search
Kan Chen
Trung Bui
Chen Fang
Zhaowen Wang
Ram Nevatia
29
38
0
03 Apr 2017
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
Rakshith Shetty
Marcus Rohrbach
Lisa Anne Hendricks
Mario Fritz
Bernt Schiele
9
142
0
30 Mar 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MA
ELM
16
808
0
29 Mar 2017
Where to put the Image in an Image Caption Generator
Marc Tanti
Albert Gatt
K. Camilleri
39
96
0
27 Mar 2017