Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015

Jimmy Ba

Aaron Courville

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,508 papers shown

Title | Authors | Tags | Counts | Date
Recurrent 3D Attentional Networks for End-to-End Active Object Recognition | Min Liu, Yifei Shi, Lintao Zheng, Kai Xu, Hui Huang, Dinesh Manocha | 3DPC | 19 10 0 | 14 Oct 2016
Video Fill in the Blank with Merging LSTMs | Amir Mazaheri, Dong-Ming Zhang, M. Shah | | 16 18 0 | 13 Oct 2016
Generating captions without looking beyond objects | Hendrik Heuer, Christof Monz, A. Smeulders | | 17 16 0 | 12 Oct 2016
Attention and Anticipation in Fast Visual-Inertial Navigation | Luca Carlone, S. Karaman | | 16 76 0 | 11 Oct 2016
Latent Sequence Decompositions | William Chan, Yu Zhang, Quoc V. Le, Navdeep Jaitly | | 14 62 0 | 10 Oct 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering | Youngjae Yu, Hyungjin Ko, Jongwook Choi, Gunhee Kim | | 6 229 0 | 10 Oct 2016
Understanding intermediate layers using linear classifier probes | Guillaume Alain, Yoshua Bengio | FAtt | 34 889 0 | 05 Oct 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges | Kushal Kafle, Christopher Kanan | OOD | 25 235 0 | 05 Oct 2016
A Survey of Multi-View Representation Learning | Yingming Li, Ming Yang, Zhongfei Zhang | AI4TS 3DV | 22 509 0 | 03 Oct 2016
Controlling Output Length in Neural Encoder-Decoders | Yuta Kikuchi, Graham Neubig, Ryohei Sasano, Hiroya Takamura, Manabu Okumura | | 14 242 0 | 30 Sep 2016
Variational Autoencoder for Deep Learning of Images, Labels and Captions | Yunchen Pu, Zhe Gan, Ricardo Henao, Xin Yuan, Chunyuan Li, Andrew Stevens, Lawrence Carin | BDL CoGe | 17 745 0 | 28 Sep 2016
Character Sequence Models for ColorfulWords | Kazuya Kawakami, Chris Dyer, Bryan R. Routledge, Noah A. Smith | 3DV | 20 17 0 | 28 Sep 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language | Atousa Torabi, Niket Tandon, Leonid Sigal | | 14 97 0 | 26 Sep 2016
Visual Fashion-Product Search at SK Planet | Taewan Kim, Seyeong Kim, Sangil Na, Hayoon Kim, Moonki Kim, Beyeongki Jeon | | 9 6 0 | 26 Sep 2016
Language as a Latent Variable: Discrete Generative Models for Sentence Compression | Yishu Miao, Phil Blunsom | | 201 223 0 | 23 Sep 2016
The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA) | Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada | | 44 14 0 | 21 Sep 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge | Oriol Vinyals, Alexander Toshev, Samy Bengio, D. Erhan | | 19 848 0 | 21 Sep 2016
Enhanced LSTM for Natural Language Inference | Qian Chen, Xiao-Dan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen | LRM ReLM | 26 1,126 0 | 20 Sep 2016
Image-to-Markup Generation with Coarse-to-Fine Attention | Yuntian Deng, Anssi Kanervisto, Jeffrey Ling, Alexander M. Rush | | 17 226 0 | 16 Sep 2016
Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories | Mark Harmon, Abdolghani Ebrahimi, P. Lucey, Diego Klabjan | GAN | 14 18 0 | 15 Sep 2016
Multimodal Attention for Neural Machine Translation | Ozan Caglayan, Loïc Barrault, Fethi Bougares | | 26 75 0 | 13 Sep 2016
Read, Tag, and Parse All at Once, or Fully-neural Dependency Parsing | J. Chorowski, Michal Zapotoczny, Paweł Rychlikowski | | 11 5 0 | 12 Sep 2016
The Role of Context Selection in Object Detection | Ruichi Yu, Xi Chen, Vlad I. Morariu, L. Davis | | 14 42 0 | 09 Sep 2016
Optimizing Recurrent Neural Networks Architectures under Time Constraints | Junqi Jin, Ziang Yan, Kun Fu, Nan Jiang, Changshui Zhang | | 11 2 0 | 29 Aug 2016
A Boundary Tilting Persepective on the Phenomenon of Adversarial Examples | T. Tanay, Lewis D. Griffin | AAML | 17 270 0 | 27 Aug 2016
Learning to generalize to new compositions in image understanding | Y. Atzmon, Jonathan Berant, Vahid Kezami, Amir Globerson, Gal Chechik | | 18 67 0 | 27 Aug 2016
Title Generation for User Generated Videos | Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, Min Sun | | 27 69 0 | 25 Aug 2016
Context Gates for Neural Machine Translation | Zhaopeng Tu, Yang Liu, Zhengdong Lu, Xiaohua Liu, Hang Li | | 19 137 0 | 22 Aug 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning | Y. Tan, Chee Seng Chan | VLM | 17 29 0 | 20 Aug 2016
RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism | E. Choi, M. T. Bahadori, Joshua A. Kulas, A. Schuetz, Walter F. Stewart, Jimeng Sun | AI4TS | 13 1,228 0 | 19 Aug 2016
Modeling Human Reading with Neural Attention | Michael Hahn, Frank Keller | | 15 54 0 | 19 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning | Yusuke Sugano, Andreas Bulling | | 16 68 0 | 18 Aug 2016
Temporal Attention Model for Neural Machine Translation | B. Sankaran, Haitao Mi, Yaser Al-Onaizan, Abe Ittycheriah | | 17 62 0 | 09 Aug 2016
End-to-End Localization and Ranking for Relative Attributes | Krishna Kumar Singh, Yong Jae Lee | | 11 76 0 | 09 Aug 2016
Learning Online Alignments with Continuous Rewards Policy Gradient | Yuping Luo, Chung-Cheng Chiu, Navdeep Jaitly, Ilya Sutskever | OffRL | 13 46 0 | 03 Aug 2016
Modeling Context Between Objects for Referring Expression Understanding | Varun K. Nagaraja, Vlad I. Morariu, Larry S. Davis | | 29 143 0 | 01 Aug 2016
Modeling Context in Referring Expressions | Licheng Yu, Patrick Poirson, Shan Yang, Alexander C. Berg, Tamara L. Berg | | 28 1,223 0 | 31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation | Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould | EGVM | 29 1,883 0 | 29 Jul 2016
Salient Object Subitizing | Jianming Zhang, Shugao Ma, M. Sameki, Stan Sclaroff, Margrit Betke, Zhe-nan Lin, Xiaohui Shen, Brian L. Price, R. Měch | | 21 115 0 | 26 Jul 2016
Learning Aligned Cross-Modal Representations from Weakly Aligned Data | Lluis Castrejon, Y. Aytar, Carl Vondrick, Hamed Pirsiavash, Antonio Torralba | SSL DRL AI4TS | 24 166 0 | 25 Jul 2016
An Actor-Critic Algorithm for Sequence Prediction | Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan J. Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio | | 31 633 0 | 24 Jul 2016
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition | Jun Liu, Amir Shahroudy, Dong Xu, Gang Wang | | 35 1,098 0 | 24 Jul 2016
Hierarchical Attention Network for Action Recognition in Videos | Yilin Wang, Suhang Wang, Jiliang Tang, Neil O'Hare, Yi-Ju Chang, Baoxin Li | BDL | 22 82 0 | 21 Jul 2016
Constructing a Natural Language Inference Dataset using Generative Neural Networks | Janez Starc, Dunja Mladenić | | 14 7 0 | 20 Jul 2016
Visual Question Answering: A Survey of Methods and Datasets | Qi Wu, Damien Teney, Peng Wang, Chunhua Shen, A. Dick, A. Hengel | | 27 413 0 | 20 Jul 2016
HeMIS: Hetero-Modal Image Segmentation | Mohammad Havaei, N. Guizard, Nicolas Chapados, Yoshua Bengio | MedIm | 19 258 0 | 18 Jul 2016
Weakly Supervised Learning of Heterogeneous Concepts in Videos | Sohil Shah, K. Kulkarni, Arijit Biswas, Ankit Gandhi, Om Deshmukh, L. Davis | | 24 2 0 | 12 Jul 2016
VideoLSTM Convolves, Attends and Flows for Action Recognition | Zhenyang Li, E. Gavves, Mihir Jain, Cees G. M. Snoek | | 33 463 0 | 06 Jul 2016
Domain Adaptation for Neural Networks by Parameter Augmentation | Yusuke Watanabe, Kazuma Hashimoto, Yoshimasa Tsuruoka | OOD | 14 6 0 | 01 Jul 2016
Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes | Çağlar Gülçehre, A. Chandar, Kyunghyun Cho, Yoshua Bengio | | 12 64 0 | 30 Jun 2016