Stacked Attention Networks for Image Question Answering

7 November 2015
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
    BDL

Papers citing "Stacked Attention Networks for Image Question Answering"

50 / 217 papers shown
A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects
A. Magassouba
K. Sugiura
Hisashi Kawai
38
9
0
23 Dec 2019
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Vedika Agarwal
Rakshith Shetty
Mario Fritz
CML
AAML
21
155
0
16 Dec 2019
A Real-time Global Inference Network for One-stage Referring Expression Comprehension
Yiyi Zhou
Rongrong Ji
Gen Luo
Xiaoshuai Sun
Jinsong Su
Xinghao Ding
Chia-Wen Lin
Q. Tian
ObjD
24
60
0
07 Dec 2019
Towards Making Deep Transfer Learning Never Hurt
Ruosi Wan
Haoyi Xiong
Xingjian Li
Zhanxing Zhu
Jun Huan
14
21
0
18 Nov 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
X. Jiang
J. Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
17
70
0
17 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
A. Schwing
LRM
ReLM
31
9
0
31 Oct 2019
Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation
Risto Vuorio
Shao-Hua Sun
Hexiang Hu
Joseph J. Lim
27
219
0
30 Oct 2019
Automatic Reminiscence Therapy for Dementia
Mariona Carós
M. Garolera
P. Radeva
Xavier Giró-i-Nieto
14
40
0
25 Oct 2019
Unsupervised High-Resolution Depth Learning From Videos With Dual Networks
Junsheng Zhou
Yuwang Wang
K. Qin
Wenjun Zeng
MDE
21
71
0
20 Oct 2019
Cross Attention Network for Few-shot Classification
Rui Hou
Hong Chang
Bingpeng Ma
Shiguang Shan
Xilin Chen
204
629
0
17 Oct 2019
Multi-modal Deep Analysis for Multimedia
Wenwu Zhu
Xin Eric Wang
Hongzhi Li
19
38
0
11 Oct 2019
REMIND Your Neural Network to Prevent Catastrophic Forgetting
Tyler L. Hayes
Kushal Kafle
Robik Shrestha
Manoj Acharya
Christopher Kanan
CLL
29
294
0
06 Oct 2019
Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do
Thanh-Toan Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
28
59
0
26 Sep 2019
Overcoming Data Limitation in Medical Visual Question Answering
Binh Duc Nguyen
Thanh-Toan Do
Binh X. Nguyen
Tuong Khanh Long Do
Erman Tjiputra
Quang-Dieu Tran
MedIm
13
145
0
26 Sep 2019
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
Zihao W. Wang
Xihui Liu
Hongsheng Li
Lu Sheng
Junjie Yan
Xiaogang Wang
Jing Shao
VLM
23
299
0
12 Sep 2019
Probabilistic framework for solving Visual Dialog
Badri N. Patro
Anupriy
Vinay P. Namboodiri
BDL
24
13
0
11 Sep 2019
PlotQA: Reasoning over Scientific Plots
Nitesh Methani
Pritha Ganguly
Mitesh M. Khapra
Pratyush Kumar
24
206
0
03 Sep 2019
Attention on Attention for Image Captioning
Lun Huang
Wenmin Wang
Jie Chen
Xiao-Yong Wei
22
823
0
19 Aug 2019
SPA-GAN: Spatial Attention GAN for Image-to-Image Translation
H. Emami
Majid Moradi Aliabadi
Ming Dong
R. Chinnam
GAN
23
168
0
19 Aug 2019
What is needed for simple spatial language capabilities in VQA?
A. Kuhnle
Ann A. Copestake
CoGe
18
1
0
17 Aug 2019
U-CAM: Visual Explanation using Uncertainty based Class Activation Maps
Badri N. Patro
Mayank Lunayach
Shivansh Patel
Vinay P. Namboodiri
FAtt
UQCV
21
76
0
17 Aug 2019
VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering
Cătălina Cangea
Eugene Belilovsky
Pietro Lió
Aaron Courville
16
16
0
14 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
35
1,912
0
09 Aug 2019
Question-Agnostic Attention for Visual Question Answering
M. Farazi
Salman H Khan
Nick Barnes
13
10
0
09 Aug 2019
An Empirical Study on Leveraging Scene Graphs for Visual Question Answering
Cheng Zhang
Wei-Lun Chao
D. Xuan
23
50
0
28 Jul 2019
Compact Global Descriptor for Neural Networks
Xiangyu He
Ke Cheng
Qiang Chen
Qinghao Hu
Peisong Wang
Jian Cheng
31
8
0
23 Jul 2019
Gated Recurrent Neural Network Approach for Multilabel Emotion Detection in Microblogs
Prabod Rathnayaka
Supun Abeysinghe
Chamod Samarajeewa
Isura Manchanayake
M. Walpola
Rashmika Nawaratne
T. Bandaragoda
D. Alahakoon
11
21
0
17 Jul 2019
RUBi: Reducing Unimodal Biases in Visual Question Answering
Rémi Cadène
Corentin Dancette
H. Ben-younes
Matthieu Cord
Devi Parikh
CML
19
368
0
24 Jun 2019
Self-Critical Reasoning for Robust Visual Question Answering
Jialin Wu
Raymond J. Mooney
OOD
NAI
24
159
0
24 May 2019
Aggregation Cross-Entropy for Sequence Recognition
Zecheng Xie
Yaoxiong Huang
Yuanzhi Zhu
Lianwen Jin
Yuliang Liu
Lele Xie
17
92
0
17 Apr 2019
DSTP-RNN: a dual-stage two-phase attention-based recurrent neural networks for long-term and multivariate time series prediction
Yeqi Liu
Chuanyang Gong
Ling Yang
Yingyi Chen
AI4TS
19
305
0
16 Apr 2019
MAANet: Multi-view Aware Attention Networks for Image Super-Resolution
Jingcai Guo
Shiheng Ma
Song Guo
SupR
11
5
0
12 Apr 2019
Factor Graph Attention
Idan Schwartz
Seunghak Yu
Tamir Hazan
A. Schwing
19
110
0
11 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
A. Schwing
Tamir Hazan
19
69
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
39
117
0
11 Apr 2019
Multi-vision Attention Networks for On-line Red Jujube Grading
Xiaoye Sun
Liyan Ma
Gongyang Li
9
9
0
31 Mar 2019
Spatiotemporal Pyramid Network for Video Action Recognition
Yunbo Wang
Mingsheng Long
Jianmin Wang
Philip S. Yu
24
226
0
04 Mar 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Xihui Liu
Zihao W. Wang
Jing Shao
Xiaogang Wang
Hongsheng Li
ObjD
19
180
0
03 Mar 2019
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
16
66
0
01 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
17
82
0
01 Mar 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
19
271
0
25 Feb 2019
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
Zhe Gan
Yu Cheng
Ahmed El Kholy
Linjie Li
Jingjing Liu
Jianfeng Gao
11
104
0
01 Feb 2019
Learning Spatial Pyramid Attentive Pooling in Image Synthesis and Image-to-Image Translation
Wei Sun
Tianfu Wu
13
13
0
18 Jan 2019
Toward Multimodal Model-Agnostic Meta-Learning
Risto Vuorio
Shao-Hua Sun
Hexiang Hu
Joseph J. Lim
47
31
0
18 Dec 2018
PiCANet: Pixel-wise Contextual Attention Learning for Accurate Saliency Detection
Nian Liu
Junwei Han
Ming-Hsuan Yang
SSeg
33
99
0
15 Dec 2018
Selective Feature Connection Mechanism: Concatenating Multi-layer CNN Features with a Feature Selector
Chen Du
Chunheng Wang
Yanna Wang
Cunzhao Shi
Baihua Xiao
19
42
0
15 Nov 2018
Semantic Aware Attention Based Deep Object Co-segmentation
Hong Chen
Yifei Huang
Hideki Nakayama
SSeg
13
73
0
16 Oct 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
20
235
0
08 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
32
595
0
04 Oct 2018
How clever is the FiLM model, and how clever can it be?
A. Kuhnle
Huiyuan Xie
Ann A. Copestake
16
6
0
09 Sep 2018