Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM
arXiv: 1502.03044

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,580 papers shown
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Asian Conference on Computer Vision (ACCV), 2016
Y. Tan
Chee Seng Chan
VLM
332
33
0
20 Aug 2016
RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism
Neural Information Processing Systems (NeurIPS), 2016
Edward Choi
M. T. Bahadori
Joshua A. Kulas
A. Schuetz
Walter F. Stewart
Jimeng Sun
AI4TS
511
1,394
0
19 Aug 2016
Modeling Human Reading with Neural Attention
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016
Michael Hahn
Frank Keller
188
57
0
19 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Yusuke Sugano
Andreas Bulling
221
72
0
18 Aug 2016
Temporal Attention Model for Neural Machine Translation
B. Sankaran
Haitao Mi
Yaser Al-Onaizan
Abe Ittycheriah
105
63
0
09 Aug 2016
End-to-End Localization and Ranking for Relative Attributes
Krishna Kumar Singh
Yong Jae Lee
210
78
0
09 Aug 2016
Learning Online Alignments with Continuous Rewards Policy Gradient
Yuping Luo
Chung-Cheng Chiu
Navdeep Jaitly
Ilya Sutskever
OffRL
173
47
0
03 Aug 2016
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
304
230
0
01 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
574
1,527
0
31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
422
2,166
0
29 Jul 2016
Salient Object Subitizing
Jianming Zhang
Shugao Ma
M. Sameki
Stan Sclaroff
Margrit Betke
Zhe Lin
Xiaohui Shen
Brian L. Price
R. Měch
161
118
0
26 Jul 2016
Learning Aligned Cross-Modal Representations from Weakly Aligned Data
Lluis Castrejon
Y. Aytar
Carl Vondrick
Hamed Pirsiavash
Antonio Torralba
SSL
DRL
AI4TS
168
177
0
25 Jul 2016
An Actor-Critic Algorithm for Sequence Prediction
Dzmitry Bahdanau
Philemon Brakel
Kelvin Xu
Anirudh Goyal
Ryan J. Lowe
Joelle Pineau
Aaron Courville
Yoshua Bengio
319
660
0
24 Jul 2016
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition
Jun Liu
Amir Shahroudy
Dong Xu
Gang Wang
357
1,185
0
24 Jul 2016
Hierarchical Attention Network for Action Recognition in Videos
Yilin Wang
Suhang Wang
Shucheng Zhou
Neil O'Hare
Yi-Ju Chang
Baoxin Li
BDL
113
84
0
21 Jul 2016
Constructing a Natural Language Inference Dataset using Generative Neural Networks
Janez Starc
Dunja Mladenić
206
7
0
20 Jul 2016
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
322
451
0
20 Jul 2016
HeMIS: Hetero-Modal Image Segmentation
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2016
Mohammad Havaei
N. Guizard
Nicolas Chapados
Yoshua Bengio
MedIm
661
303
0
18 Jul 2016
Weakly Supervised Learning of Heterogeneous Concepts in Videos
European Conference on Computer Vision (ECCV), 2016
Sohil Shah
K. Kulkarni
Arijit Biswas
Ankit Gandhi
Om Deshmukh
L. Davis
195
2
0
12 Jul 2016
VideoLSTM Convolves, Attends and Flows for Action Recognition
Computer Vision and Image Understanding (CVIU), 2016
Zhenyang Li
E. Gavves
Mihir Jain
Cees G. M. Snoek
240
477
0
06 Jul 2016
Domain Adaptation for Neural Networks by Parameter Augmentation
Yusuke Watanabe
Kazuma Hashimoto
Yoshimasa Tsuruoka
OOD
158
6
0
01 Jul 2016
Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes
Çağlar Gülçehre
A. Chandar
Kyunghyun Cho
Yoshua Bengio
302
65
0
30 Jun 2016
"Show me the cup": Reference with Continuous Representations
Conference on Intelligent Text Processing and Computational Linguistics (CICLing), 2016
Gemma Boleda
Sebastian Padó
Marco Baroni
154
3
0
28 Jun 2016
Diversified Visual Attention Networks for Fine-Grained Object Classification
IEEE Transactions on Multimedia (TMM), 2016
Bo Zhao
Xiao-Jun Wu
Jiashi Feng
Qiang Peng
Shuicheng Yan
246
377
0
28 Jun 2016
Sequence-Level Knowledge Distillation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016
Yoon Kim
Alexander M. Rush
482
1,205
0
25 Jun 2016
CUNI System for WMT16 Automatic Post-Editing and Multimodal Translation Tasks
Conference on Machine Translation (WMT), 2016
Jindrich Libovický
Jindřich Helcl
Marek Tlustý
Pavel Pecina
Ondrej Bojar
151
68
0
23 Jun 2016
LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks
Hendrik Strobelt
Sebastian Gehrmann
Hanspeter Pfister
Alexander M. Rush
HAI
147
85
0
23 Jun 2016
Tagger: Deep Unsupervised Perceptual Grouping
Klaus Greff
Antti Rasmus
Mathias Berglund
T. Hao
Jürgen Schmidhuber
Harri Valpola
OCL
301
165
0
21 Jun 2016
Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions
Arijit Ray
Gordon A. Christie
Joey Tianyi Zhou
Dhruv Batra
Devi Parikh
204
60
0
21 Jun 2016
Drawing and Recognizing Chinese Characters with Recurrent Neural Network
Xu-Yao Zhang
Fei Yin
Yanming Zhang
Cheng-Lin Liu
Yoshua Bengio
297
342
0
21 Jun 2016
Using Visual Analytics to Interpret Predictive Machine Learning Models
Josua Krause
Adam Perer
E. Bertini
HAI
148
67
0
17 Jun 2016
FVQA: Fact-based Visual Question Answering
Peng Wang
Qi Wu
Chunhua Shen
Anton van den Hengel
A. Dick
CoGe
478
515
0
17 Jun 2016
Model-Agnostic Interpretability of Machine Learning
Marco Tulio Ribeiro
Sameer Singh
Carlos Guestrin
FAtt
FaML
231
918
0
16 Jun 2016
A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation
Amrita Saha
Mitesh M. Khapra
A. Chandar
Janarthanan Rajendran
Kyunghyun Cho
147
18
0
15 Jun 2016
Unsupervised Learning of Predictors from Unpaired Input-Output Samples
Jianshu Chen
Po-Sen Huang
Xiaodong He
Jianfeng Gao
Li Deng
OOD
SSL
165
8
0
15 Jun 2016
Bidirectional Long-Short Term Memory for Video Description
Yi Bin
Yang Yang
Zi Huang
Fumin Shen
Xing Xu
Heng Tao Shen
159
67
0
15 Jun 2016
Watch What You Just Said: Image Captioning with Text-Conditional Attention
Luowei Zhou
Chenliang Xu
Parker A. Koch
Jason J. Corso
VLM
202
44
0
15 Jun 2016
End-to-End Comparative Attention Networks for Person Re-identification
Hao Liu
Jiashi Feng
Meibin Qi
Jianguo Jiang
Shuicheng Yan
258
599
0
14 Jun 2016
Rationalizing Neural Predictions
Tao Lei
Regina Barzilay
Tommi Jaakkola
268
854
0
13 Jun 2016
Training Recurrent Answering Units with Joint Loss Minimization for VQA
Hyeonwoo Noh
Bohyung Han
221
73
0
12 Jun 2016
Natural Language Generation in Dialogue using Lexicalized and Delexicalized Data
Shikhar Sharma
Jing He
Kaheer Suleman
Hannes Schulz
Philip Bachman
234
30
0
11 Jun 2016
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?
Abhishek Das
Harsh Agrawal
C. L. Zitnick
Devi Parikh
Dhruv Batra
257
479
0
11 Jun 2016
Conditional Generation and Snapshot Learning in Neural Dialogue Systems
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016
Tsung-Hsien Wen
Milica Gasic
N. Mrksic
L. Rojas-Barahona
Pei-hao Su
Stefan Ultes
David Vandyke
S. Young
196
79
0
10 Jun 2016
Sequence-to-Sequence Learning as Beam-Search Optimization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016
Sam Wiseman
Alexander M. Rush
363
611
0
09 Jun 2016
Progressive Attention Networks for Visual Attribute Prediction
Paul Hongsuck Seo
Zhe Lin
Scott D. Cohen
Xiaohui Shen
Bohyung Han
250
42
0
08 Jun 2016
SE3-Nets: Learning Rigid Body Motion using Deep Neural Networks
IEEE International Conference on Robotics and Automation (ICRA), 2016
Arunkumar Byravan
Dieter Fox
3DPC
424
277
0
08 Jun 2016
Iterative Alternating Neural Attention for Machine Reading
Alessandro Sordoni
Philip Bachman
Adam Trischler
Yoshua Bengio
CLL
AIMat
199
122
0
07 Jun 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
599
1,543
0
06 Jun 2016
Attention Correctness in Neural Image Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2016
Chenxi Liu
Junhua Mao
Fei Sha
Alan Yuille
3DV
216
225
0
31 May 2016
End-to-End Instance Segmentation with Recurrent Attention
Mengye Ren
R. Zemel
SSeg
204
67
0
30 May 2016