Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Ke Xu
Jimmy Ba
Ryan Kiros
Dong Wang
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,576 papers shown
MAST: Video Polyp Segmentation with a Mixture-Attention Siamese Transformer
Geng Chen
Junqing Yang
Xiaozhou Pu
Ge-Peng Ji
Huan Xiong
Yongsheng Pan
Hengfei Cui
Yong-quan Xia
MedIm, ViT
170
2
0
23 Jan 2024
Unsupervised Learning of Graph from Recipes
Aissatou Diallo
Antonis Bikakis
Luke Dickens
Anthony Hunter
Rob Miller
SSL
114
0
0
22 Jan 2024
Collaborative Position Reasoning Network for Referring Image Segmentation
Jianjian Cao
Beiya Dai
Yulin Li
Xiameng Qin
Jingdong Wang
257
1
0
22 Jan 2024
Spatial-temporal Forecasting for Regions without Observations
Xinyu Su
Jianzhong Qi
E. Tanin
Yanchuan Chang
Majid Sarvi
AI4TS
224
8
0
19 Jan 2024
Enhancing Scalability in Recommender Systems through Lottery Ticket Hypothesis and Knowledge Distillation-based Neural Network Pruning
R. Rajaram
Manoj Bharadhwaj
VS Vasan
N. Pervin
77
1
0
19 Jan 2024
Supervised Fine-tuning in turn Improves Visual Foundation Models
Xiaohu Jiang
Yixiao Ge
Yuying Ge
Dachuan Shi
Chun Yuan
Ying Shan
VLM, CLIP
166
12
0
18 Jan 2024
Jewelry Recognition via Encoder-Decoder Models
José M. Alcalde-Llergo
Enrique Yeguas-Bolivar
Andrea Zingoni
Alejandro Fuerte-Jurado
82
1
0
15 Jan 2024
Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Yunshi Lan
Xinyuan Li
Hanyue Du
Xuesong Lu
Ming Gao
Weining Qian
Aoying Zhou
371
10
0
15 Jan 2024
HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models
Hanzhang Wang
Haoran Wang
Jinze Yang
Zhongrui Yu
Zeke Xie
Lei Tian
Xinyan Xiao
Junjun Jiang
Xianming Liu
Mingming Sun
DiffM
114
4
0
11 Jan 2024
Complementary Information Mutual Learning for Multimodality Medical Image Segmentation
Chuyun Shen
Wenhao Li
Haoqing Chen
Xiaoling Wang
Fengping Zhu
Yuxin Li
Xiangfeng Wang
Bo Jin
164
3
0
05 Jan 2024
Object-oriented backdoor attack against image captioning
Meiling Li
Nan Zhong
Xinpeng Zhang
Zhenxing Qian
Sheng Li
153
9
0
05 Jan 2024
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training
Longtian Qiu
Shan Ning
Xuming He
VLM
162
11
0
04 Jan 2024
Short-Term Multi-Horizon Line Loss Rate Forecasting of a Distribution Network Using Attention-GCN-LSTM
Jie Liu
Yijia Cao
Yong Li
Yixiu Guo
Wei Deng
193
1
0
19 Dec 2023
Satellite Captioning: Large Language Models to Augment Labeling
Grant Rosario
David Noever
294
0
0
18 Dec 2023
Dual Branch Network Towards Accurate Printed Mathematical Expression Recognition
International Conference on Artificial Neural Networks (ICANN), 2023
Yuqing Wang
Zhenyu Weng
Zhaokun Zhou
Shuaijian Ji
Zhongjie Ye
Yuesheng Zhu
140
2
0
14 Dec 2023
See, Say, and Segment: Teaching LMMs to Overcome False Premises
Computer Vision and Pattern Recognition (CVPR), 2023
Tsung-Han Wu
Giscard Biamby
David M. Chan
Lisa Dunlap
Ritwik Gupta
Xudong Wang
Joseph E. Gonzalez
Trevor Darrell
VLM, MLLM
255
33
0
13 Dec 2023
Pain Analysis using Adaptive Hierarchical Spatiotemporal Dynamic Imaging
Issam Serraoui
Eric Granger
Abdenour Hadid
Abdelmalik Taleb-Ahmed
106
0
0
12 Dec 2023
Medical Vision Language Pretraining: A survey
Prashant Shrestha
Sanskar Amgain
Bidur Khanal
Cristian A. Linte
Binod Bhattarai
VLM
214
26
0
11 Dec 2023
Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations
Xiao Zhang
David Yunis
Michael Maire
140
6
0
11 Dec 2023
PixLore: A Dataset-driven Approach to Rich Image Captioning
Diego Bonilla
VLM
76
0
0
08 Dec 2023
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning
Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2023
Xuan Wang
Guanhong Wang
Wenhao Chai
Jiayu Zhou
Gaoang Wang
233
7
0
08 Dec 2023
Adaptive Dependency Learning Graph Neural Networks
Abishek Sriramulu
Nicolas Fourrier
Christoph Bergmeir
AI4TS, AI4CE
179
27
0
06 Dec 2023
Enhancing Image Captioning with Neural Models
Pooja Bhatnagar
Sai Mrunaal
Sachin Kamnure
VLM
91
1
0
01 Dec 2023
Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI
Xuan-Bac Nguyen
Pawan Sinha
Arabinda Kumar Choudhary
Samee U. Khan
Khoa Luu
ViT, MedIm
268
3
0
30 Nov 2023
Improving Interpretation Faithfulness for Vision Transformers
International Conference on Machine Learning (ICML), 2023
Lijie Hu
Yixin Liu
Ninghao Liu
Mengdi Huai
Lichao Sun
Haiyan Zhao
159
12
0
29 Nov 2023
EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
Computer Vision and Pattern Recognition (CVPR), 2023
Jiaxuan Li
D. Vo
Akihiro Sugimoto
Hideki Nakayama
KELM, VLM
221
40
0
27 Nov 2023
Model-agnostic Body Part Relevance Assessment for Pedestrian Detection
Maurice Günder
Sneha Banerjee
R. Sifa
Christian Bauckhage
FAtt
148
0
0
27 Nov 2023
WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole-Slide Images
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023
Pingyi Chen
Honglin Li
Chenglu Zhu
Sunyi Zheng
Honglin Li
Lin Yang
251
15
0
27 Nov 2023
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
European Conference on Computer Vision (ECCV), 2023
Zhen Wang
Xinyun Jiang
Jun Xiao
Tao Chen
Long Chen
DiffM
172
4
0
25 Nov 2023
Unified Medical Image Pre-training in Language-Guided Common Semantic Space
European Conference on Computer Vision (ECCV), 2023
Xiaoxuan He
Yifan Yang
Xinyang Jiang
Xufang Luo
Haoji Hu
Siyun Zhao
Dongsheng Li
Yuqing Yang
Lili Qiu
245
5
0
24 Nov 2023
Causality is all you need
Ning Xu
Yifei Gao
Hongshuo Tian
Yongdong Zhang
An-An Liu
130
0
0
21 Nov 2023
Identifying DNA Sequence Motifs Using Deep Learning
Asmita Poddar
Vladimir Uzun
Elizabeth Tunbridge
W. Haerty
A. Nevado-Holgado
100
1
0
20 Nov 2023
System 2 Attention (is something you might need too)
Jason Weston
Sainbayar Sukhbaatar
RALM, OffRL, LRM
161
77
0
20 Nov 2023
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder
Abdelrahman Mohamed
Fakhraddin Alwajih
El Moatez Billah Nagoudi
Alcides Alcoba Inciarte
Muhammad Abdul-Mageed
VLM, MLLM
131
12
0
15 Nov 2023
The Heat is On: Thermal Facial Landmark Tracking
James Baker
CVBM
96
0
0
14 Nov 2023
FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design
Zhen Huang
Yihao Li
Dong Pei
Jiapeng Zhou
Xuliang Ning
Jianlin Han
Xiaoguang Han
Xuejun Chen
183
3
0
13 Nov 2023
Concept-wise Fine-tuning Matters in Preventing Negative Transfer
IEEE International Conference on Computer Vision (ICCV), 2023
Yunqiao Yang
Long-Kai Huang
Ying Wei
194
2
0
12 Nov 2023
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers
IEEE International Symposium on Biomedical Imaging (ISBI), 2023
S. Sengupta
Donald E. Brown
VLM, MedIm, ViT
145
16
0
10 Nov 2023
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
569
263
0
09 Nov 2023
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models
Yuiga Wada
Kanta Kaneda
Komei Sugiura
184
5
0
07 Nov 2023
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
Yaoxian Song
Yixiang Chen
Haoyu Liu
Li Zhixu
Wei Song
Yanghua Xiao
Xiaofang Zhou
LM&Ro
186
28
0
07 Nov 2023
Complex Organ Mask Guided Radiology Report Generation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Tiancheng Gu
Dongnan Liu
Zhiyuan Li
Weidong Cai
MedIm
212
26
0
04 Nov 2023
RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence Learning
Ziyu Wang
Wenhao Jiang
Zixuan Zhang
Wei Tang
Junchi Yan
153
0
0
03 Nov 2023
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis
medRxiv (medRxiv), 2023
Yingshu Li
Yunyi Liu
Zhanyu Wang
Xinyu Liang
Lei Wang
Lingqiao Liu
Leyang Cui
Zhaopeng Tu
Longyue Wang
Luping Zhou
ELM, LM&MA
260
44
0
31 Oct 2023
Causal Interpretation of Self-Attention in Pre-Trained Transformers
Neural Information Processing Systems (NeurIPS), 2023
R. Y. Rohekar
Yaniv Gurwicz
Shami Nisimov
MILM
189
31
0
31 Oct 2023
The Expressibility of Polynomial based Attention Scheme
Zhao Song
Guangyi Xu
Junze Yin
266
7
0
30 Oct 2023
Semi-Supervised Panoptic Narrative Grounding
ACM Multimedia (ACM MM), 2023
Danni Yang
Jiayi Ji
Xiaoshuai Sun
Haowei Wang
Yinan Li
Yiwei Ma
Rongrong Ji
192
5
0
27 Oct 2023
Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Benjamin Yan
Ruochen Liu
David E. Kuo
Subathra Adithan
Eduardo Pontes Reis
...
V. Venugopal
Chloe P. O'Connell
Agustina Saenz
Pranav Rajpurkar
Michael Moor
MedIm
215
33
0
26 Oct 2023
Cross-modal Active Complementary Learning with Self-refining Correspondence
Neural Information Processing Systems (NeurIPS), 2023
Yang Qin
Yuan Sun
Dezhong Peng
Qiufeng Wang
Xiaocui Peng
Peng Hu
219
32
0
26 Oct 2023
FloCoDe: Unbiased Dynamic Scene Graph Generation with Temporal Consistency and Correlation Debiasing
Anant Khandelwal
359
2
0
24 Oct 2023