1502.03044
Cited By
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,508 papers shown
Pain Analysis using Adaptive Hierarchical Spatiotemporal Dynamic Imaging
Issam Serraoui
Eric Granger
Abdenour Hadid
Abdelmalik Taleb-Ahmed
18
0
0
12 Dec 2023
Medical Vision Language Pretraining: A survey
Prashant Shrestha
Sanskar Amgain
Bidur Khanal
Cristian A. Linte
Binod Bhattarai
VLM
32
14
0
11 Dec 2023
Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations
Xiao Zhang
David Yunis
Michael Maire
25
2
0
11 Dec 2023
PixLore: A Dataset-driven Approach to Rich Image Captioning
Diego Bonilla
VLM
14
3
0
08 Dec 2023
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning
Xuan Wang
Guanhong Wang
Wenhao Chai
Jiayu Zhou
Gaoang Wang
27
4
0
08 Dec 2023
Adaptive Dependency Learning Graph Neural Networks
Abishek Sriramulu
Nicolas Fourrier
Christoph Bergmeir
AI4TS
AI4CE
27
21
0
06 Dec 2023
Enhancing Image Captioning with Neural Models
Pooja Bhatnagar
Sai Mrunaal
Sachin Kamnure
VLM
34
0
0
01 Dec 2023
Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI
Xuan-Bac Nguyen
Xin Li
Pawan Sinha
Samee U. Khan
Khoa Luu
ViT
MedIm
27
0
0
30 Nov 2023
Improving Interpretation Faithfulness for Vision Transformers
Lijie Hu
Yixin Liu
Ninghao Liu
Mengdi Huai
Lichao Sun
Di Wang
21
5
0
29 Nov 2023
EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
Jiaxuan Li
D. Vo
Akihiro Sugimoto
Hideki Nakayama
KELM
VLM
39
23
0
27 Nov 2023
Model-agnostic Body Part Relevance Assessment for Pedestrian Detection
Maurice Günder
Sneha Banerjee
R. Sifa
Christian Bauckhage
FAtt
16
0
0
27 Nov 2023
WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole-Slide Images
Pingyi Chen
Honglin Li
Chenglu Zhu
Sunyi Zheng
Zhongyi Shui
Lin Yang
21
5
0
27 Nov 2023
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
Zhen Wang
Xinyun Jiang
Jun Xiao
Tao Chen
Long Chen
DiffM
20
1
0
25 Nov 2023
Unified Medical Image Pre-training in Language-Guided Common Semantic Space
Xiaoxuan He
Yifan Yang
Xinyang Jiang
Xufang Luo
Haoji Hu
Siyun Zhao
Dongsheng Li
Yuqing Yang
Lili Qiu
32
1
0
24 Nov 2023
Causality is all you need
Ning Xu
Yifei Gao
Hongshuo Tian
Yongdong Zhang
An-An Liu
31
0
0
21 Nov 2023
Identifying DNA Sequence Motifs Using Deep Learning
Asmita Poddar
Vladimir Uzun
Elizabeth Tunbridge
W. Haerty
A. Nevado-Holgado
16
0
0
20 Nov 2023
System 2 Attention (is something you might need too)
Jason Weston
Sainbayar Sukhbaatar
RALM
OffRL
LRM
22
57
0
20 Nov 2023
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder
Abdelrahman Mohamed
Fakhraddin Alwajih
El Moatez Billah Nagoudi
Alcides Alcoba Inciarte
Muhammad Abdul-Mageed
VLM
MLLM
25
7
0
15 Nov 2023
The Heat is On: Thermal Facial Landmark Tracking
James Baker
CVBM
14
0
0
14 Nov 2023
FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design
Zhen Huang
Yihao Li
Dong Pei
Jiapeng Zhou
Xuliang Ning
Jianlin Han
Xiaoguang Han
Xuejun Chen
33
3
0
13 Nov 2023
Concept-wise Fine-tuning Matters in Preventing Negative Transfer
Yunqiao Yang
Long-Kai Huang
Ying Wei
22
2
0
12 Nov 2023
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers
S. Sengupta
Donald E. Brown
VLM
MedIm
ViT
24
8
0
10 Nov 2023
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
129
117
0
09 Nov 2023
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models
Yuiga Wada
Kanta Kaneda
Komei Sugiura
23
4
0
07 Nov 2023
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI
Yaoxian Song
Penglei Sun
Haoyu Liu
Li Zhixu
Wei Song
Yanghua Xiao
Xiaofang Zhou
LM&Ro
53
13
0
07 Nov 2023
Complex Organ Mask Guided Radiology Report Generation
Tiancheng Gu
Dongnan Liu
Zhiyuan Li
Weidong Cai
MedIm
25
14
0
04 Nov 2023
RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence Learning
Ziyu Wang
Wenhao Jiang
Zixuan Zhang
Wei Tang
Junchi Yan
13
0
0
03 Nov 2023
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis
Yingshu Li
Yunyi Liu
Zhanyu Wang
Xinyu Liang
Lei Wang
Lingqiao Liu
Leyang Cui
Zhaopeng Tu
Longyue Wang
Luping Zhou
ELM
LM&MA
32
36
0
31 Oct 2023
Causal Interpretation of Self-Attention in Pre-Trained Transformers
R. Y. Rohekar
Yaniv Gurwicz
Shami Nisimov
MILM
18
14
0
31 Oct 2023
The Expressibility of Polynomial based Attention Scheme
Zhao-quan Song
Guangyi Xu
Junze Yin
30
5
0
30 Oct 2023
Semi-Supervised Panoptic Narrative Grounding
Danni Yang
Jiayi Ji
Xiaoshuai Sun
Haowei Wang
Yinan Li
Yiwei Ma
Rongrong Ji
24
5
0
27 Oct 2023
Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting
Benjamin Yan
Ruochen Liu
David E. Kuo
Subathra Adithan
Eduardo Pontes Reis
...
V. Venugopal
Chloe P. O'Connell
Agustina Saenz
Pranav Rajpurkar
Michael Moor
MedIm
19
25
0
26 Oct 2023
Cross-modal Active Complementary Learning with Self-refining Correspondence
Yang Qin
Yuan Sun
Dezhong Peng
Joey Tianyi Zhou
Xiaocui Peng
Peng Hu
21
18
0
26 Oct 2023
FloCoDe: Unbiased Dynamic Scene Graph Generation with Temporal Consistency and Correlation Debiasing
Anant Khandelwal
20
2
0
24 Oct 2023
CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting
Lei Li
18
23
0
24 Oct 2023
PrivImage: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining
Kecen Li
Chen Gong
Zhixiang Li
Yuzhong Zhao
Xinwen Hou
Tianhao Wang
25
10
0
19 Oct 2023
Getting aligned on representational alignment
Ilia Sucholutsky
Lukas Muttenthaler
Adrian Weller
Andi Peng
Andreea Bobu
...
Thomas Unterthiner
Andrew Kyle Lampinen
Klaus-Robert Muller
M. Toneva
Thomas L. Griffiths
56
74
0
18 Oct 2023
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World
Rujie Wu
Xiaojian Ma
Zhenliang Zhang
Wei Wang
Qing Li
Song-Chun Zhu
Yizhou Wang
LRM
VLM
27
7
0
16 Oct 2023
Few-shot Action Recognition with Captioning Foundation Models
Xiang Wang
Shiwei Zhang
Hangjie Yuan
Yingya Zhang
Changxin Gao
Deli Zhao
Nong Sang
VLM
26
7
0
16 Oct 2023
Visual Question Generation in Bengali
Mahmud Hasan
Labiba Islam
J. Ruma
T. Mayeesha
Rashedur Rahman
19
1
0
12 Oct 2023
CLIP for Lightweight Semantic Segmentation
Ke Jin
Wankou Yang
VLM
16
1
0
11 Oct 2023
A Comparative Study of Pre-trained CNNs and GRU-Based Attention for Image Caption Generation
Rashid Khan
Bingding Huang
Haseeb Hassan
Asim Zaman
Z. Ye
21
2
0
11 Oct 2023
A Lightweight Video Anomaly Detection Model with Weak Supervision and Adaptive Instance Selection
Yang Wang
Jiaogen Zhou
Jihong Guan
29
4
0
09 Oct 2023
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving
Long Chen
Oleg Sinavski
Jan Hünermann
Alice Karnsund
Andrew James Willmott
Danny Birch
Daniel Maund
Jamie Shotton
MLLM
6
180
0
03 Oct 2023
Constructing Image-Text Pair Dataset from Books
Yamato Okamoto
Haruto Toyonaga
Yoshihisa Ijiri
Hirokatsu Kataoka
55
2
0
03 Oct 2023
Application of frozen large-scale models to multimodal task-oriented dialogue
Tatsuki Kawamoto
Takuma Suzuki
Ko Miyama
Takumi Meguro
Tomohiro Takagi
27
0
0
02 Oct 2023
YOLOR-Based Multi-Task Learning
Hung-Shuo Chang
Chien-Yao Wang
Hang Yan
Yukun Zhu
Hongpeng Liao
MoE
VLM
11
17
0
29 Sep 2023
PROSE: Predicting Operators and Symbolic Expressions using Multimodal Transformers
Yuxuan Liu
Zecheng Zhang
Hayden Schaeffer
29
18
0
28 Sep 2023
XVO: Generalized Visual Odometry via Cross-Modal Self-Training
Tohida Rehman
Ronit Mandal
Jimuyang Zhang
Debarshi Kumar Sanyal
SSL
25
17
0
28 Sep 2023
Social Media Fashion Knowledge Extraction as Captioning
Yifei Yuan
Wenxuan Zhang
Yang Deng
Wai Lam
11
1
0
28 Sep 2023