ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1502.03044
  4. Cited By
Show, Attend and Tell: Neural Image Caption Generation with Visual
  Attention

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM
ArXivPDFHTML

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,508 papers shown
Title
Pain Analysis using Adaptive Hierarchical Spatiotemporal Dynamic Imaging
Pain Analysis using Adaptive Hierarchical Spatiotemporal Dynamic Imaging
Issam Serraoui
Eric Granger
Abdenour Hadid
Abdelmalik Taleb-Ahmed
18
0
0
12 Dec 2023
Medical Vision Language Pretraining: A survey
Medical Vision Language Pretraining: A survey
Prashant Shrestha
Sanskar Amgain
Bidur Khanal
Cristian A. Linte
Binod Bhattarai
VLM
32
14
0
11 Dec 2023
Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering
  of Layer-Distributed Neural Representations
Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations
Xiao Zhang
David Yunis
Michael Maire
25
2
0
11 Dec 2023
PixLore: A Dataset-driven Approach to Rich Image Captioning
PixLore: A Dataset-driven Approach to Rich Image Captioning
Diego Bonilla
VLM
14
3
0
08 Dec 2023
User-Aware Prefix-Tuning is a Good Learner for Personalized Image
  Captioning
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning
Xuan Wang
Guanhong Wang
Wenhao Chai
Jiayu Zhou
Gaoang Wang
27
4
0
08 Dec 2023
Adaptive Dependency Learning Graph Neural Networks
Adaptive Dependency Learning Graph Neural Networks
Abishek Sriramulu
Nicolas Fourrier
Christoph Bergmeir
AI4TS
AI4CE
27
21
0
06 Dec 2023
Enhancing Image Captioning with Neural Models
Enhancing Image Captioning with Neural Models
Pooja Bhatnagar
Sai Mrunaal
Sachin Kamnure
VLM
34
0
0
01 Dec 2023
Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models
  via fMRI
Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI
Xuan-Bac Nguyen
Xin Li
Pawan Sinha
Samee U. Khan
Khoa Luu
ViT
MedIm
27
0
0
30 Nov 2023
Improving Interpretation Faithfulness for Vision Transformers
Improving Interpretation Faithfulness for Vision Transformers
Lijie Hu
Yixin Liu
Ninghao Liu
Mengdi Huai
Lichao Sun
Di Wang
21
5
0
29 Nov 2023
EVCap: Retrieval-Augmented Image Captioning with External Visual-Name
  Memory for Open-World Comprehension
EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
Jiaxuan Li
D. Vo
Akihiro Sugimoto
Hideki Nakayama
KELM
VLM
39
23
0
27 Nov 2023
Model-agnostic Body Part Relevance Assessment for Pedestrian Detection
Model-agnostic Body Part Relevance Assessment for Pedestrian Detection
Maurice Günder
Sneha Banerjee
R. Sifa
Christian Bauckhage
FAtt
16
0
0
27 Nov 2023
WsiCaption: Multiple Instance Generation of Pathology Reports for
  Gigapixel Whole-Slide Images
WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole-Slide Images
Pingyi Chen
Honglin Li
Chenglu Zhu
Sunyi Zheng
Zhongyi Shui
Lin Yang
21
5
0
27 Nov 2023
DECap: Towards Generalized Explicit Caption Editing via Diffusion
  Mechanism
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
Zhen Wang
Xinyun Jiang
Jun Xiao
Tao Chen
Long Chen
DiffM
20
1
0
25 Nov 2023
Unified Medical Image Pre-training in Language-Guided Common Semantic
  Space
Unified Medical Image Pre-training in Language-Guided Common Semantic Space
Xiaoxuan He
Yifan Yang
Xinyang Jiang
Xufang Luo
Haoji Hu
Siyun Zhao
Dongsheng Li
Yuqing Yang
Lili Qiu
32
1
0
24 Nov 2023
Causality is all you need
Causality is all you need
Ning Xu
Yifei Gao
Hongshuo Tian
Yongdong Zhang
An-An Liu
31
0
0
21 Nov 2023
Identifying DNA Sequence Motifs Using Deep Learning
Identifying DNA Sequence Motifs Using Deep Learning
Asmita Poddar
Vladimir Uzun
Elizabeth Tunbridge
W. Haerty
A. Nevado-Holgado
16
0
0
20 Nov 2023
System 2 Attention (is something you might need too)
System 2 Attention (is something you might need too)
Jason Weston
Sainbayar Sukhbaatar
RALM
OffRL
LRM
22
57
0
20 Nov 2023
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini
  Decoder
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder
Abdelrahman Mohamed
Fakhraddin Alwajih
El Moatez Billah Nagoudi
Alcides Alcoba Inciarte
Muhammad Abdul-Mageed
VLM
MLLM
25
7
0
15 Nov 2023
The Heat is On: Thermal Facial Landmark Tracking
The Heat is On: Thermal Facial Landmark Tracking
James Baker
CVBM
14
0
0
14 Nov 2023
FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and
  Design
FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design
Zhen Huang
Yihao Li
Dong Pei
Jiapeng Zhou
Xuliang Ning
Jianlin Han
Xiaoguang Han
Xuejun Chen
33
3
0
13 Nov 2023
Concept-wise Fine-tuning Matters in Preventing Negative Transfer
Concept-wise Fine-tuning Matters in Preventing Negative Transfer
Yunqiao Yang
Long-Kai Huang
Ying Wei
22
2
0
12 Nov 2023
Automatic Report Generation for Histopathology images using pre-trained
  Vision Transformers
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers
S. Sengupta
Donald E. Brown
VLM
MedIm
ViT
24
8
0
10 Nov 2023
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
129
117
0
09 Nov 2023
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures
  for Image Captioning Models
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models
Yuiga Wada
Kanta Kaneda
Komei Sugiura
23
4
0
07 Nov 2023
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI
Yaoxian Song
Penglei Sun
Haoyu Liu
Li Zhixu
Wei Song
Yanghua Xiao
Xiaofang Zhou
LM&Ro
53
13
0
07 Nov 2023
Complex Organ Mask Guided Radiology Report Generation
Complex Organ Mask Guided Radiology Report Generation
Tiancheng Gu
Dongnan Liu
Zhiyuan Li
Weidong Cai
MedIm
25
14
0
04 Nov 2023
RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence
  Learning
RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence Learning
Ziyu Wang
Wenhao Jiang
Zixuan Zhang
Wei Tang
Junchi Yan
13
0
0
03 Nov 2023
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical
  Image Analysis
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis
Yingshu Li
Yunyi Liu
Zhanyu Wang
Xinyu Liang
Lei Wang
Lingqiao Liu
Leyang Cui
Zhaopeng Tu
Longyue Wang
Luping Zhou
ELM
LM&MA
32
36
0
31 Oct 2023
Causal Interpretation of Self-Attention in Pre-Trained Transformers
Causal Interpretation of Self-Attention in Pre-Trained Transformers
R. Y. Rohekar
Yaniv Gurwicz
Shami Nisimov
MILM
18
14
0
31 Oct 2023
The Expressibility of Polynomial based Attention Scheme
The Expressibility of Polynomial based Attention Scheme
Zhao-quan Song
Guangyi Xu
Junze Yin
30
5
0
30 Oct 2023
Semi-Supervised Panoptic Narrative Grounding
Semi-Supervised Panoptic Narrative Grounding
Danni Yang
Jiayi Ji
Xiaoshuai Sun
Haowei Wang
Yinan Li
Yiwei Ma
Rongrong Ji
24
5
0
27 Oct 2023
Style-Aware Radiology Report Generation with RadGraph and Few-Shot
  Prompting
Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting
Benjamin Yan
Ruochen Liu
David E. Kuo
Subathra Adithan
Eduardo Pontes Reis
...
V. Venugopal
Chloe P. O'Connell
Agustina Saenz
Pranav Rajpurkar
Michael Moor
MedIm
19
25
0
26 Oct 2023
Cross-modal Active Complementary Learning with Self-refining
  Correspondence
Cross-modal Active Complementary Learning with Self-refining Correspondence
Yang Qin
Yuan Sun
Dezhong Peng
Joey Tianyi Zhou
Xiaocui Peng
Peng Hu
21
18
0
26 Oct 2023
FloCoDe: Unbiased Dynamic Scene Graph Generation with Temporal
  Consistency and Correlation Debiasing
FloCoDe: Unbiased Dynamic Scene Graph Generation with Temporal Consistency and Correlation Debiasing
Anant Khandelwal
20
2
0
24 Oct 2023
CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought
  Language Prompting
CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting
Lei Li
18
23
0
24 Oct 2023
PrivImage: Differentially Private Synthetic Image Generation using
  Diffusion Models with Semantic-Aware Pretraining
PrivImage: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining
Kecen Li
Chen Gong
Zhixiang Li
Yuzhong Zhao
Xinwen Hou
Tianhao Wang
25
10
0
19 Oct 2023
Getting aligned on representational alignment
Getting aligned on representational alignment
Ilia Sucholutsky
Lukas Muttenthaler
Adrian Weller
Andi Peng
Andreea Bobu
...
Thomas Unterthiner
Andrew Kyle Lampinen
Klaus-Robert Muller
M. Toneva
Thomas L. Griffiths
56
74
0
18 Oct 2023
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in
  the Real World
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World
Rujie Wu
Xiaojian Ma
Zhenliang Zhang
Wei Wang
Qing Li
Song-Chun Zhu
Yizhou Wang
LRM
VLM
27
7
0
16 Oct 2023
Few-shot Action Recognition with Captioning Foundation Models
Few-shot Action Recognition with Captioning Foundation Models
Xiang Wang
Shiwei Zhang
Hangjie Yuan
Yingya Zhang
Changxin Gao
Deli Zhao
Nong Sang
VLM
26
7
0
16 Oct 2023
Visual Question Generation in Bengali
Visual Question Generation in Bengali
Mahmud Hasan
Labiba Islam
J. Ruma
T. Mayeesha
Rashedur Rahman
19
1
0
12 Oct 2023
CLIP for Lightweight Semantic Segmentation
CLIP for Lightweight Semantic Segmentation
Ke Jin
Wankou Yang
VLM
16
1
0
11 Oct 2023
A Comparative Study of Pre-trained CNNs and GRU-Based Attention for
  Image Caption Generation
A Comparative Study of Pre-trained CNNs and GRU-Based Attention for Image Caption Generation
Rashid Khan
Bingding Huang
Haseeb Hassan
Asim Zaman
Z. Ye
21
2
0
11 Oct 2023
A Lightweight Video Anomaly Detection Model with Weak Supervision and
  Adaptive Instance Selection
A Lightweight Video Anomaly Detection Model with Weak Supervision and Adaptive Instance Selection
Yang Wang
Jiaogen Zhou
Jihong Guan
29
4
0
09 Oct 2023
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable
  Autonomous Driving
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving
Long Chen
Oleg Sinavski
Jan Hünermann
Alice Karnsund
Andrew James Willmott
Danny Birch
Daniel Maund
Jamie Shotton
MLLM
6
180
0
03 Oct 2023
Constructing Image-Text Pair Dataset from Books
Constructing Image-Text Pair Dataset from Books
Yamato Okamoto
Haruto Toyonaga
Yoshihisa Ijiri
Hirokatsu Kataoka
55
2
0
03 Oct 2023
Application of frozen large-scale models to multimodal task-oriented
  dialogue
Application of frozen large-scale models to multimodal task-oriented dialogue
Tatsuki Kawamoto
Takuma Suzuki
Ko Miyama
Takumi Meguro
Tomohiro Takagi
27
0
0
02 Oct 2023
YOLOR-Based Multi-Task Learning
YOLOR-Based Multi-Task Learning
Hung-Shuo Chang
Chien-Yao Wang
Hang Yan
Yukun Zhu
Hongpeng Liao
MoE
VLM
11
17
0
29 Sep 2023
PROSE: Predicting Operators and Symbolic Expressions using Multimodal
  Transformers
PROSE: Predicting Operators and Symbolic Expressions using Multimodal Transformers
Yuxuan Liu
Zecheng Zhang
Hayden Schaeffer
29
18
0
28 Sep 2023
XVO: Generalized Visual Odometry via Cross-Modal Self-Training
XVO: Generalized Visual Odometry via Cross-Modal Self-Training
Tohida Rehman
Ronit Mandal
Jimuyang Zhang
Debarshi Kumar Sanyal
SSL
25
17
0
28 Sep 2023
Social Media Fashion Knowledge Extraction as Captioning
Social Media Fashion Knowledge Extraction as Captioning
Yifei Yuan
Wenxuan Zhang
Yang Deng
Wai Lam
11
1
0
28 Sep 2023
Previous
123...567...697071
Next