1511.02274
Stacked Attention Networks for Image Question Answering
7 November 2015
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
Papers citing
"Stacked Attention Networks for Image Question Answering"
50 / 217 papers shown
Title
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Florian Strohm
Prajit Dhar
Andreas Bulling
31
19
0
27 Sep 2021
How to find a good image-text embedding for remote sensing visual question answering?
Christel Chappuis
Sylvain Lobry
B. Kellenberger
Bertrand Le Saux
D. Tuia
34
20
0
24 Sep 2021
Improving Joint Learning of Chest X-Ray and Radiology Report by Word Region Alignment
Zhanghexuan Ji
Mohammad Abuzar Shaikh
Dana Moukheiber
S. Srihari
Yifan Peng
Mingchen Gao
SSL
14
20
0
04 Sep 2021
Understanding the computational demands underlying visual reasoning
Mohit Vaishnav
Rémi Cadène
A. Alamia
Drew Linsley
Rufin VanRullen
Thomas Serre
GNN
CoGe
32
16
0
08 Aug 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
196
405
0
13 Jul 2021
Zero-shot Visual Question Answering using Knowledge Graph
Zhuo Chen
Jiaoyan Chen
Yuxia Geng
Jeff Z. Pan
Zonggang Yuan
Huajun Chen
15
70
0
12 Jul 2021
Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs
Daniel Reich
F. Putze
Tanja Schultz
22
2
0
28 Jun 2021
Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering
Ahjeong Seo
Gi-Cheon Kang
J. Park
Byoung-Tak Zhang
13
53
0
19 Jun 2021
Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning
Piotr Piękos
Henryk Michalewski
Mateusz Malinowski
22
28
0
07 Jun 2021
Multiple Meta-model Quantifying for Medical Visual Question Answering
Tuong Khanh Long Do
Binh X. Nguyen
Erman Tjiputra
Minh-Ngoc Tran
Quang-Dieu Tran
A. Nguyen
31
98
0
19 May 2021
InfographicVQA
Minesh Mathew
Viraj Bagal
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
C. V. Jawahar
22
202
0
26 Apr 2021
AttWalk: Attentive Cross-Walks for Deep Mesh Analysis
Ran Ben Izhak
Alon Lahav
A. Tal
3DV
29
10
0
23 Apr 2021
Visual Navigation with Spatial Attention
Bar Mayo
Tamir Hazan
A. Tal
EgoV
19
72
0
20 Apr 2021
Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
Corentin Dancette
Rémi Cadène
Damien Teney
Matthieu Cord
CML
28
75
0
07 Apr 2021
Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
Zhicheng Huang
Zhaoyang Zeng
Yupan Huang
Bei Liu
Dongmei Fu
Jianlong Fu
VLM
ViT
34
271
0
07 Apr 2021
Dual Contrastive Loss and Attention for GANs
Ning Yu
Guilin Liu
Aysegül Dündar
Andrew Tao
Bryan Catanzaro
Larry S. Davis
Mario Fritz
GAN
24
60
0
31 Mar 2021
Incorporating Convolution Designs into Visual Transformers
Kun Yuan
Shaopeng Guo
Ziwei Liu
Aojun Zhou
F. Yu
Wei Wu
ViT
38
467
0
22 Mar 2021
Local Interpretations for Explainable Natural Language Processing: A Survey
Siwen Luo
Hamish Ivison
S. Han
Josiah Poon
MILM
33
48
0
20 Mar 2021
Decoupled Spatial Temporal Graphs for Generic Visual Grounding
Qi Feng
Yunchao Wei
Mingming Cheng
Yi Yang
24
5
0
18 Mar 2021
Causal Attention for Vision-Language Tasks
Xu Yang
Hanwang Zhang
Guojun Qi
Jianfei Cai
CML
28
148
0
05 Mar 2021
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering
Bo Liu
Li-Ming Zhan
Li Xu
Lin Ma
Y. Yang
Xiao-Ming Wu
22
234
0
18 Feb 2021
Biomedical Question Answering: A Survey of Approaches and Challenges
Qiao Jin
Zheng Yuan
Guangzhi Xiong
Qian Yu
Huaiyuan Ying
Chuanqi Tan
Mosha Chen
Songfang Huang
Xiaozhong Liu
Sheng Yu
21
95
0
10 Feb 2021
Answer Questions with Right Image Regions: A Visual Attention Regularization Approach
Y. Liu
Yangyang Guo
Jianhua Yin
Xuemeng Song
Weifeng Liu
Liqiang Nie
24
28
0
03 Feb 2021
Latent Variable Models for Visual Question Answering
Zixu Wang
Yishu Miao
Lucia Specia
25
5
0
16 Jan 2021
Explainability of deep vision-based autonomous driving systems: Review and challenges
Éloi Zablocki
H. Ben-younes
P. Pérez
Matthieu Cord
XAI
37
169
0
13 Jan 2021
ORDNet: Capturing Omni-Range Dependencies for Scene Parsing
Shaofei Huang
Si Liu
Tianrui Hui
Jizhong Han
Bo-wen Li
Jiashi Feng
Shuicheng Yan
3DPC
OffRL
29
15
0
11 Jan 2021
MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification
Te-Lin Wu
Shikhar Singh
S. Paul
Gully A. Burns
Nanyun Peng
22
18
0
16 Dec 2020
WeaQA: Weak Supervision via Captions for Visual Question Answering
Pratyay Banerjee
Tejas Gokhale
Yezhou Yang
Chitta Baral
17
34
0
04 Dec 2020
ATSal: An Attention Based Architecture for Saliency Prediction in 360 Videos
Y. A. D. Djilali
M. Tliba
Kevin McGuinness
Noel E. O'Connor
33
42
0
20 Nov 2020
An Improved Attention for Visual Question Answering
Tanzila Rahman
Shih-Han Chou
Leonid Sigal
Giuseppe Carenini
13
42
0
04 Nov 2020
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games
Yunqiu Xu
Meng Fang
Ling-Hao Chen
Yali Du
Joey Tianyi Zhou
Chengqi Zhang
OffRL
25
44
0
22 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei-Neng Chen
Weiping Wang
Li Liu
M. Lew
VLM
112
31
0
16 Oct 2020
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li
Hanyin Wang
Yuan Luo
6
63
0
03 Sep 2020
Counting from Sky: A Large-scale Dataset for Remote Sensing Object Counting and A Benchmark Method
Guangshuai Gao
Qingjie Liu
Yunhong Wang
13
53
0
28 Aug 2020
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
13
36
0
28 Jul 2020
Category-Specific CNN for Visual-aware CTR Prediction at JD.com
Hu Liu
Jing Lu
Hao Yang
Xiwei Zhao
Sulong Xu
...
Zehua Zhang
Wenjie Niu
Xiaokun Zhu
Yongjun Bao
Weipeng P. Yan
9
31
0
18 Jun 2020
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
Zihao Zhu
J. Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
19
125
0
16 Jun 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
26
487
0
11 Jun 2020
Estimating semantic structure for the VQA answer space
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
18
4
0
10 Jun 2020
Hyperspectral Image Classification with Attention Aided CNNs
Renlong Hang
Zhu Li
Qingshan Liu
Pedram Ghamisi
Shuvra S. Bhattacharyya
7
225
0
25 May 2020
Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods
Arianna Yuan
Y. Li
HAI
17
24
0
07 May 2020
Exploring Self-attention for Image Recognition
Hengshuang Zhao
Jiaya Jia
V. Koltun
SSL
26
772
0
28 Apr 2020
Causal Interpretability for Machine Learning -- Problems, Methods and Evaluation
Raha Moraffah
Mansooreh Karami
Ruocheng Guo
A. Raglin
Huan Liu
CML
ELM
XAI
27
212
0
09 Mar 2020
Adaptive Offline Quintuplet Loss for Image-Text Matching
Tianlang Chen
Jiajun Deng
Jiebo Luo
181
68
0
07 Mar 2020
Dropout: Explicit Forms and Capacity Control
R. Arora
Peter L. Bartlett
Poorya Mianjy
Nathan Srebro
55
37
0
06 Mar 2020
RP-DNN: A Tweet level propagation context based deep neural networks for early rumor detection in Social Media
Jie Gao
Sooji Han
Xingyi Song
F. Ciravegna
8
20
0
28 Feb 2020
An Attention Transfer Model for Human-Assisted Failure Avoidance in Robot Manipulations
Boyi Song
Yu-Tang Peng
Ruijiao Luo
R. Liu
11
2
0
11 Feb 2020
Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Embeddings
Mennatullah Siam
Naren Doraiswamy
Boris N. Oreshkin
Hengshuai Yao
Martin Jägersand
21
8
0
26 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
23
17
0
20 Jan 2020
Human-Aware Motion Deblurring
Ziyi Shen
Wenguan Wang
Xiankai Lu
Jianbing Shen
Haibin Ling
Tingfa Xu
Ling Shao
3DH
19
284
0
19 Jan 2020