Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

20 December 2014

Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille

Papers citing "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"

50 / 417 papers shown

Improving Factor-Based Quantitative Investing by Forecasting Company Fundamentals J. Alberg Zachary Chase Lipton AI4TS 21 48 0 13 Nov 2017
Phrase-based Image Captioning with Hierarchical LSTM Model Y. Tan Chee Seng Chan VLM 19 4 0 11 Nov 2017
A Neural-Symbolic Approach to Design of CAPTCHA Qiuyuan Huang P. Smolensky Xiaodong He Li Deng D. Wu AAML 21 1 0 29 Oct 2017
Learning Social Image Embedding with Deep Multimodal Attention Networks Feiran Huang Xiaoming Zhang Zhoujun Li Tao Mei Yueying He Zhonghua Zhao 14 20 0 18 Oct 2017
Tensor Product Generation Networks for Deep NLP Modeling Qiuyuan Huang P. Smolensky Xiaodong He Li Deng D. Wu 14 3 0 26 Sep 2017
Fooling Vision and Language Models Despite Localization and Attention Mechanism Xiaojun Xu Xinyun Chen Chang-rui Liu Anna Rohrbach Trevor Darrell D. Song AAML 8 41 0 25 Sep 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning Yang Xian Yingli Tian VLM 18 22 0 15 Sep 2017
Stack-Captioning: Coarse-to-Fine Learning for Image Captioning Jiuxiang Gu Jianfei Cai G. Wang Tsuhan Chen 19 178 0 11 Sep 2017
Predicting Visual Features from Text for Image and Video Caption Retrieval Jianfeng Dong Xirong Li Cees G. M. Snoek 9 223 0 05 Sep 2017
Image2song: Song Retrieval via Bridging Image Content and Lyric Words Xuelong Li Di Hu Xiaoqiang Lu 11 10 0 19 Aug 2017
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation Chuang Gan Yandong Li Haoxiang Li Chen Sun Boqing Gong 14 126 0 15 Aug 2017
Fluency-Guided Cross-Lingual Image Captioning Weiyu Lan Xirong Li Jianfeng Dong 17 92 0 15 Aug 2017
Learning to Disambiguate by Asking Discriminative Questions Yining Li Chen Huang Xiaoou Tang Chen Change Loy 18 22 0 09 Aug 2017
Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval Yuming Shen Li Liu Ling Shao Jingkuan Song 14 49 0 08 Aug 2017
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator? Marc Tanti Albert Gatt K. Camilleri 16 56 0 07 Aug 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention Shuang Li Tong Xiao Hongsheng Li Wei Yang Xiaogang Wang 12 226 0 07 Aug 2017
Discover and Learn New Objects from Documentaries Kai-xiang Chen Hang Song Chen Change Loy Dahua Lin ObjD 17 20 0 30 Jul 2017
Deep Interactive Region Segmentation and Captioning Ali Sharifi Boroujerdi M. Khanian M. Breuß 16 7 0 26 Jul 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering Peter Anderson Xiaodong He Chris Buehler Damien Teney Mark Johnson Stephen Gould Lei Zhang AIMat 29 4,180 0 25 Jul 2017
Image Pivoting for Learning Multilingual Multimodal Representations Spandana Gella Rico Sennrich Frank Keller Mirella Lapata SSL 22 78 0 24 Jul 2017
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts Xuwang Yin Vicente Ordonez VLM 27 55 0 22 Jul 2017
Learning Visually Grounded Sentence Representations Douwe Kiela Alexis Conneau Allan Jabri Maximilian Nickel SSL 15 69 0 19 Jul 2017
Order-Free RNN with Visual Attention for Multi-Label Classification Shang-Fu Chen Yi-Chen Chen Chih-Kuan Yeh Y. Wang 12 142 0 18 Jul 2017
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network Zizhao Zhang Yuanpu Xie Fuyong Xing M. McGough L. Yang MedIm 13 301 0 08 Jul 2017
Actor-Critic Sequence Training for Image Captioning Li Zhang Flood Sung Feng Liu Tao Xiang S. Gong Yongxin Yang Timothy M. Hospedales 16 111 0 29 Jun 2017
Image Captioning with Object Detection and Localization Zhongliang Yang Yujin Zhang S. Rehman Yongfeng Huang ObjD VLM 17 47 0 08 Jun 2017
Order embeddings and character-level convolutions for multimodal alignment Jonatas Wehrmann Anderson Mattjie Rodrigo C. Barros 15 27 0 03 Jun 2017
Listen, Interact and Talk: Learning to Speak via Interaction Haichao Zhang Haonan Yu W. Xu 20 13 0 28 May 2017
Multimodal Machine Learning: A Survey and Taxonomy T. Baltrušaitis Chaitanya Ahuja Louis-Philippe Morency 13 2,855 0 26 May 2017
Deep image representations using caption generators Konda Reddy Mopuri Vishal B. Athreya R. Venkatesh Babu VLM SSL 9 1 0 25 May 2017
Attention-based Natural Language Person Retrieval Tao Zhou Muhao Chen Jie Yu Demetri Terzopoulos 17 14 0 24 May 2017
Object-Level Context Modeling For Scene Classification with Context-CNN Syed Ashar Javed A. Nelakanti VLM 19 10 0 11 May 2017
Image Annotation using Multi-Layer Sparse Coding Amara Tariq H. Foroosh 6 2 0 06 May 2017
TALL: Temporal Activity Localization via Language Query J. Gao Chen Sun Zhenheng Yang Ram Nevatia 21 799 0 05 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset Yuya Yoshikawa Yutaro Shigeto A. Takeuchi 3DV 8 118 0 02 May 2017
Spatio-temporal Person Retrieval via Natural Language Queries Masataka Yamaguchi Kuniaki Saito Yoshitaka Ushiku Tatsuya Harada 14 57 0 26 Apr 2017
Inception Recurrent Convolutional Neural Network for Object Recognition Md. Zahangir Alom Mahmudul Hasan C. Yakopcic T. Taha 31 86 0 25 Apr 2017
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition Yufei Wang Zhe-nan Lin Xiaohui Shen Scott D. Cohen G. Cottrell 11 105 0 23 Apr 2017
Spatial Memory for Context Reasoning in Object Detection Xinlei Chen Abhinav Gupta ObjD 17 164 0 13 Apr 2017
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries Y. Zhang Luyao Yuan Yijie Guo Zhiyuan He I-An Huang Honglak Lee ObjD 23 57 0 12 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward Zhou Ren Xiaoyu Wang Ning Zhang Xutao Lv Li-Jia Li 23 324 0 12 Apr 2017
Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation Ngan Le Kha Gia Quach Khoa Luu Marios Savvides Chenchen Zhu 16 71 0 12 Apr 2017
Creativity: Generating Diverse Questions using Variational Autoencoders Unnat Jain Ziyu Zhang A. Schwing 17 151 0 11 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks Liwei Wang Yin Li Jing-ling Huang Svetlana Lazebnik VLM 19 494 0 11 Apr 2017
Generating Descriptions with Grounded and Co-Referenced People Anna Rohrbach Marcus Rohrbach Siyu Tang Seong Joon Oh Bernt Schiele 314 72 0 05 Apr 2017
Weakly Supervised Dense Video Captioning Zhiqiang Shen Jianguo Li Zhou Su Minjun Li Yurong Chen Yu-Gang Jiang Xiangyang Xue 16 134 0 05 Apr 2017
AMC: Attention guided Multi-modal Correlation Learning for Image Search Kan Chen Trung Bui Chen Fang Zhaowen Wang Ram Nevatia 29 38 0 03 Apr 2017
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training Rakshith Shetty Marcus Rohrbach Lisa Anne Hendricks Mario Fritz Bernt Schiele 9 142 0 30 Mar 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation Albert Gatt E. Krahmer LM&MA ELM 16 808 0 29 Mar 2017
Where to put the Image in an Image Caption Generator Marc Tanti Albert Gatt K. Camilleri 39 96 0 27 Mar 2017