Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,583 papers shown
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
485
1,793
0
17 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM
Yan Huang
Wei Wang
Liang Wang
220
229
0
17 Nov 2016
DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows
Jason Kuen
Xiangfei Kong
G. Wang
Yap-Peng Tan
209
16
0
17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
174
108
0
16 Nov 2016
A Semi-supervised Framework for Image Captioning
Wenhu Chen
Aurelien Lucchi
Thomas Hofmann
218
9
0
16 Nov 2016
The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives
Mohit Iyyer
Varun Manjunatha
Anupam Guha
Yogarshi Vyas
Jordan L. Boyd-Graber
Hal Daumé
L. Davis
210
113
0
16 Nov 2016
Diversity encouraged learning of unsupervised LSTM ensemble for neural activity video prediction
Yilin Song
J. Viventi
Yao Wang
AI4TS
106
2
0
15 Nov 2016
Hierarchical Object Detection with Deep Reinforcement Learning
Míriam Bellver
Xavier Giró-i-Nieto
F. Marqués
Jordi Torres
160
110
0
11 Nov 2016
Getting Started with Neural Models for Semantic Matching in Web Search
Kezban Dilek Onal
I. S. Altingövde
Pinar Senkul
Maarten de Rijke
VLM, 3DV
155
10
0
08 Nov 2016
Memory-augmented Attention Modelling for Videos
Rasool Fakoor
Abdel-rahman Mohamed
Margaret Mitchell
S. B. Kang
Pushmeet Kohli
274
20
0
07 Nov 2016
Latent Attention For If-Then Program Synthesis
Xinyun Chen
Chang-rui Liu
E. C. Shin
Basel Alomair
Mingcheng Chen
124
72
0
07 Nov 2016
Hierarchical Question Answering for Long Documents
Eunsol Choi
D. Hewlett
Alexandre Lacoste
Illia Polosukhin
Jakob Uszkoreit
Jonathan Berant
RALM
282
170
0
06 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
303
650
0
05 Nov 2016
Categorical Reparameterization with Gumbel-Softmax
Eric Jang
S. Gu
Ben Poole
BDL
1.1K
5,977
0
03 Nov 2016
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
Chris J. Maddison
A. Mnih
Yee Whye Teh
BDL
728
2,726
0
02 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
237
703
0
02 Nov 2016
Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences
Daniel Neil
Michael Pfeiffer
Shih-Chii Liu
AI4TS
223
477
0
29 Oct 2016
Professor Forcing: A New Algorithm for Training Recurrent Networks
Alex Lamb
Anirudh Goyal
Ying Zhang
Saizheng Zhang
Aaron Courville
Yoshua Bengio
GAN
327
649
0
27 Oct 2016
Cross-Modal Scene Networks
Y. Aytar
Lluis Castrejon
Carl Vondrick
Hamed Pirsiavash
Antonio Torralba
SSL
180
117
0
27 Oct 2016
Can Active Memory Replace Attention?
Lukasz Kaiser
Samy Bengio
174
60
0
27 Oct 2016
Jointly Learning to Align and Convert Graphemes to Phonemes with Neural Attention Models
Shubham Toshniwal
Karen Livescu
128
42
0
20 Oct 2016
Lexicon Integrated CNN Models with Attention for Sentiment Analysis
Bonggun Shin
Timothy Lee
Jinho Choi
176
117
0
20 Oct 2016
Using Fast Weights to Attend to the Recent Past
Jimmy Ba
Geoffrey E. Hinton
Volodymyr Mnih
Joel Z Leibo
Catalin Ionescu
297
303
0
20 Oct 2016
Learning Robust Video Synchronization without Annotations
P. Wieschollek
Ido Freeman
Hendrik P. A. Lensch
232
7
0
19 Oct 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
231
51
0
17 Oct 2016
Recurrent 3D Attentional Networks for End-to-End Active Object Recognition
Computational Visual Media (CVM), 2016
Min Liu
Yifei Shi
Lintao Zheng
Kai Xu
Hui Huang
Dinesh Manocha
3DPC
198
10
0
14 Oct 2016
Video Fill in the Blank with Merging LSTMs
Amir Mazaheri
Dong Zhang
M. Shah
144
18
0
13 Oct 2016
Generating captions without looking beyond objects
Hendrik Heuer
Christof Monz
A. Smeulders
97
18
0
12 Oct 2016
Attention and Anticipation in Fast Visual-Inertial Navigation
IEEE International Conference on Robotics and Automation (ICRA), 2016
Luca Carlone
S. Karaman
178
86
0
11 Oct 2016
Latent Sequence Decompositions
International Conference on Learning Representations (ICLR), 2016
William Chan
Yu Zhang
Quoc V. Le
Navdeep Jaitly
355
62
0
10 Oct 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Computer Vision and Pattern Recognition (CVPR), 2016
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
410
239
0
10 Oct 2016
Understanding intermediate layers using linear classifier probes
International Conference on Learning Representations (ICLR), 2016
Guillaume Alain
Yoshua Bengio
FAtt
563
1,187
0
05 Oct 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Computer Vision and Image Understanding (CVIU), 2016
Kushal Kafle
Christopher Kanan
OOD
267
258
0
05 Oct 2016
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS, 3DV
654
587
0
03 Oct 2016
Controlling Output Length in Neural Encoder-Decoders
Yuta Kikuchi
Graham Neubig
Ryohei Sasano
Hiroya Takamura
Manabu Okumura
228
251
0
30 Sep 2016
Variational Autoencoder for Deep Learning of Images, Labels and Captions
Yunchen Pu
Zhe Gan
Ricardo Henao
Xin Yuan
Chunyuan Li
Andrew Stevens
Lawrence Carin
BDL, CoGe
200
815
0
28 Sep 2016
Character Sequence Models for ColorfulWords
Kazuya Kawakami
Chris Dyer
Bryan R. Routledge
Noah A. Smith
3DV
95
20
0
28 Sep 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi
Niket Tandon
Leonid Sigal
162
106
0
26 Sep 2016
Visual Fashion-Product Search at SK Planet
Taewan Kim
Seyeong Kim
Sangil Na
Hayoon Kim
Moonki Kim
Beyeongki Jeon
253
6
0
26 Sep 2016
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
Yishu Miao
Phil Blunsom
502
225
0
23 Sep 2016
The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA)
Andrew Shin
Yoshitaka Ushiku
Tatsuya Harada
171
16
0
21 Sep 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
256
902
0
21 Sep 2016
Enhanced LSTM for Natural Language Inference
Qian Chen
Xiao-Dan Zhu
Zhenhua Ling
Si Wei
Hui Jiang
Diana Inkpen
LRM, ReLM
532
1,173
0
20 Sep 2016
Image-to-Markup Generation with Coarse-to-Fine Attention
Yuntian Deng
Anssi Kanervisto
Jeffrey Ling
Alexander M. Rush
243
260
0
16 Sep 2016
Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Mark Harmon
Abdolghani Ebrahimi
P. Lucey
Diego Klabjan
GAN
299
21
0
15 Sep 2016
Multimodal Attention for Neural Machine Translation
Ozan Caglayan
Loïc Barrault
Fethi Bougares
156
79
0
13 Sep 2016
Read, Tag, and Parse All at Once, or Fully-neural Dependency Parsing
J. Chorowski
Michal Zapotoczny
Paweł Rychlikowski
180
5
0
12 Sep 2016
The Role of Context Selection in Object Detection
Ruichi Yu
Xi Chen
Vlad I. Morariu
L. Davis
125
42
0
09 Sep 2016
Optimizing Recurrent Neural Networks Architectures under Time Constraints
Junqi Jin
Ziang Yan
Kun Fu
Nan Jiang
Changshui Zhang
248
2
0
29 Aug 2016
A Boundary Tilting Persepective on the Phenomenon of Adversarial Examples
T. Tanay
Lewis D. Griffin
AAML
240
282
0
27 Aug 2016