Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015

Jimmy Ba

Aaron Courville

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,578 papers shown

Learning Deep Structure-Preserving Image-Text Embeddings Liwei Wang Yin Li Svetlana Lazebnik 360 814 0 19 Nov 2015
Active Object Localization with Deep Reinforcement Learning Juan C. Caicedo Svetlana Lazebnik ObjD 192 454 0 18 Nov 2015
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering Kan Chen Jiang Wang Liang-Chieh Chen Haoyuan Gao Wenyuan Xu Ram Nevatia 252 298 0 18 Nov 2015
ACDC: A Structured Efficient Linear Layer Marcin Moczulski Misha Denil J. Appleyard Nando de Freitas 498 103 0 18 Nov 2015
Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction Hyeonwoo Noh Paul Hongsuck Seo Bohyung Han OOD 166 334 0 18 Nov 2015
Compositional Memory for Visual Question Answering Aiwen Jiang Fang Wang Fatih Porikli Yi Li CoGe 101 43 0 18 Nov 2015
Learning Articulated Motion Models from Visual and Lingual Signals Zhengyang Wu Joey Tianyi Zhou Matthew R. Walter 95 0 0 17 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering Huijuan Xu Kate Saenko 378 780 0 17 Nov 2015
Yin and Yang: Balancing and Answering Binary Visual Questions Peng Zhang Yash Goyal D. Summers-Stay Dhruv Batra Devi Parikh CoGe 421 364 0 16 Nov 2015
Sherlock: Scalable Fact Learning in Images Mohamed Elhoseiny Scott D. Cohen W. Chang Brian L. Price Ahmed Elgammal 202 26 0 16 Nov 2015
Neural Programmer: Inducing Latent Programs with Gradient Descent Arvind Neelakantan Quoc V. Le Ilya Sutskever ODL 265 267 0 16 Nov 2015
Uncovering Temporal Context for Video Question and Answering Linchao Zhu Zhongwen Xu Yi Yang Alexander G. Hauptmann BDL 149 45 0 15 Nov 2015
Oracle performance for visual captioning Weitong Chen Nicolas Ballas Dong Wang John R. Smith Yoshua Bengio VLM 399 9 0 14 Nov 2015
Reversible Recursive Instance-level Object Segmentation Xiaodan Liang Yunchao Wei Xiaohui Shen Zequn Jie Jiashi Feng Liang Lin Shuicheng Yan SSeg ISeg 108 60 0 14 Nov 2015
Semantic Object Parsing with Local-Global Long Short-Term Memory Xiaodan Liang Xiaohui Shen Donglai Xiang Jiashi Feng Liang Lin Shuicheng Yan 199 187 0 14 Nov 2015
Natural Language Object Retrieval Ronghang Hu Huazhe Xu Marcus Rohrbach Jiashi Feng Kate Saenko Trevor Darrell ObjD 297 569 0 13 Nov 2015
Action Recognition using Visual Attention Shikhar Sharma Ryan Kiros Ruslan Salakhutdinov 293 677 0 12 Nov 2015
Deep Gaussian Conditional Random Field Network: A Model-based Deep Network for Discriminative Denoising Raviteja Vemulapalli Oncel Tuzel Ming-Yuan Liu 131 71 0 12 Nov 2015
Hand-Object Interaction and Precise Localization in Transitive Action Recognition Amir Rosenfeld S. Ullman 199 8 0 12 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction Anna Rohrbach Marcus Rohrbach Ronghang Hu Trevor Darrell Bernt Schiele 343 510 0 12 Nov 2015
Generative Concatenative Nets Jointly Learn to Write and Classify Reviews Zachary Chase Lipton Sharad Vikram Julian McAuley BDL 270 33 0 11 Nov 2015
Visual7W: Grounded Question Answering in Images Yuke Zhu Oliver Groth Michael S. Bernstein Li Fei-Fei 476 957 0 11 Nov 2015
Attention to Scale: Scale-aware Semantic Image Segmentation Liang-Chieh Chen Yi Yang Jiang Wang Wei Xu Alan Yuille SSeg 261 1,368 0 10 Nov 2015
Detecting events and key actors in multi-person videos Vignesh Ramanathan Jonathan Huang Sami Abu-El-Haija Alexander N. Gorban Kevin Patrick Murphy Li Fei-Fei 202 215 0 09 Nov 2015
Neural Module Networks Jacob Andreas Marcus Rohrbach Trevor Darrell Dan Klein CoGe 537 1,126 0 09 Nov 2015
Generating Images from Captions with Attention Elman Mansimov Emilio Parisotto Jimmy Lei Ba Ruslan Salakhutdinov VLM 240 480 0 09 Nov 2015
Explicit Knowledge-based Reasoning for Visual Question Answering Peng Wang Qi Wu Chunhua Shen Anton Van Den Hengel A. Dick 201 281 0 09 Nov 2015
The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations Felix Hill Antoine Bordes S. Chopra Jason Weston RALM 593 645 0 07 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions Junhua Mao Jonathan Huang Alexander Toshev Oana-Maria Camburu Alan Yuille Kevin Patrick Murphy ObjD 636 1,547 0 07 Nov 2015
Stacked Attention Networks for Image Question Answering Zichao Yang Xiaodong He Jianfeng Gao Li Deng Alex Smola BDL 398 1,971 0 07 Nov 2015
Deep Kernel Learning A. Wilson Zhiting Hu Ruslan Salakhutdinov Eric Xing BDL 514 981 0 06 Nov 2015
RATM: Recurrent Attentive Tracking Model Samira Ebrahimi Kahou Vincent Michalski Roland Memisevic 268 85 0 29 Oct 2015
On End-to-End Program Generation from User Intention by Deep Neural Networks Lili Mou Rui Men Ge Li Jun Liu Zhi Jin 120 47 0 25 Oct 2015
Generic decoding of seen and imagined objects using hierarchical visual features T. Horikawa Y. Kamitani 163 509 0 22 Oct 2015
Multilingual Image Description with Neural Sequence Models Desmond Elliott Stella Frank Eva Hasler VLM 250 77 0 15 Oct 2015
A Diversity-Promoting Objective Function for Neural Conversation Models Jiwei Li Michel Galley Chris Brockett Jianfeng Gao W. Dolan 409 2,556 0 11 Oct 2015
SentiCap: Generating Image Descriptions with Sentiments A. Mathews Lexing Xie Xuming He 207 233 0 06 Oct 2015
Learning Wake-Sleep Recurrent Attention Models Jimmy Ba Roger C. Grosse Ruslan Salakhutdinov B. Frey BDL 162 65 0 22 Sep 2015
Reasoning about Entailment with Neural Attention Tim Rocktaschel Edward Grefenstette Karl Moritz Hermann Tomás Kociský Phil Blunsom NAI 212 772 0 22 Sep 2015
Recurrent Spatial Transformer Networks Søren Kaae Sønderby C. Sønderby Lars Maaløe Ole Winther ViT 150 51 0 17 Sep 2015
Guiding Long-Short Term Memory for Image Caption Generation Xu Jia E. Gavves Basura Fernando Tinne Tuytelaars VLM 132 102 0 16 Sep 2015
What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment. North American Chapter of the Association for Computational Linguistics (NAACL), 2015 Hongyuan Mei Joey Tianyi Zhou Matthew R. Walter 190 292 0 02 Sep 2015
End-to-End Attention-based Large Vocabulary Speech Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 Dzmitry Bahdanau J. Chorowski Dmitriy Serdyuk Philemon Brakel Yoshua Bengio 311 1,186 0 18 Aug 2015
Effective Approaches to Attention-based Neural Machine Translation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015 Thang Luong Hieu H. Pham Christopher D. Manning 1.3K 8,242 0 17 Aug 2015
Listen, Attend and Spell. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 William Chan Navdeep Jaitly Quoc V. Le Oriol Vinyals RALM 399 2,368 0 05 Aug 2015
Artificial Neural Networks Applied to Taxi Destination Prediction A. D. Brébisson Étienne Simon Alex Auvolat Pascal Vincent Yoshua Bengio 197 194 0 31 Jul 2015
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos. International Journal of Computer Vision (IJCV), 2015 Serena Yeung Olga Russakovsky Ning Jin Mykhaylo Andriluka Greg Mori Li Fei-Fei VLM 423 452 0 21 Jul 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks Dong Wang Aaron Courville Yoshua Bengio 184 432 0 04 Jul 2015
Attention-Based Models for Speech Recognition J. Chorowski Dzmitry Bahdanau Dmitriy Serdyuk Dong Wang Yoshua Bengio 346 2,702 0 24 Jun 2015
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing A. Kumar Ozan Irsoy Peter Ondruska Mohit Iyyer James Bradbury Ishaan Gulrajani Victor Zhong Romain Paulus R. Socher 442 1,209 0 24 Jun 2015