1612.00837
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
2 December 2016
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
Papers citing
"Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering"
50 / 2,273 papers shown
Deep Modular Co-Attention Networks for Visual Question Answering
Computer Vision and Pattern Recognition (CVPR), 2019
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
281
915
0
25 Jun 2019
RUBi: Reducing Unimodal Biases in Visual Question Answering
Neural Information Processing Systems (NeurIPS), 2019
Rémi Cadène
Corentin Dancette
H. Ben-younes
Matthieu Cord
Devi Parikh
CML
266
401
0
24 Jun 2019
Investigating Biases in Textual Entailment Datasets
Shawn Tan
Songlin Yang
Chin-Wei Huang
Aaron Courville
105
8
0
23 Jun 2019
Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects
Gabriel Grand
Yonatan Belinkov
172
70
0
20 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Hyounghun Kim
Joey Tianyi Zhou
CoGe
106
21
0
14 Jun 2019
Mimic and Fool: A Task Agnostic Adversarial Attack
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2019
Akshay Chaturvedi
Utpal Garain
AAML
110
29
0
11 Jun 2019
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
AAAI Conference on Artificial Intelligence (AAAI), 2019
Zhou Yu
D. Xu
Jun-chen Yu
Ting Yu
Zhou Zhao
Yueting Zhuang
Dacheng Tao
283
602
0
06 Jun 2019
Generating Question Relevant Captions to Aid Visual Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Jialin Wu
Zeyuan Hu
Raymond J. Mooney
220
45
0
03 Jun 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Computer Vision and Pattern Recognition (CVPR), 2019
Kenneth Marino
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
550
1,347
0
31 May 2019
Scene Text Visual Question Answering
IEEE International Conference on Computer Vision (ICCV), 2019
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Ernest Valveny
C. V. Jawahar
Dimosthenis Karatzas
404
439
0
31 May 2019
What Makes Training Multi-Modal Classification Networks Hard?
Computer Vision and Pattern Recognition (CVPR), 2019
Weiyao Wang
Du Tran
Matt Feiszli
522
556
0
29 May 2019
Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning
Neural Information Processing Systems (NeurIPS), 2019
Wonjae Kim
Yoonho Lee
191
6
0
28 May 2019
Structure Learning for Neural Module Networks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Vardaan Pahuja
Jie Fu
Sarath Chandar
C. Pal
119
8
0
27 May 2019
Deep Reason: A Strong Baseline for Real-World Visual Reasoning
Chenfei Wu
Yanzhao Zhou
Gen Li
Nan Duan
Duyu Tang
Xiaojie Wang
LRM
NAI
ReLM
182
2
0
24 May 2019
Self-Critical Reasoning for Robust Visual Question Answering
Neural Information Processing Systems (NeurIPS), 2019
Jialin Wu
Raymond J. Mooney
OOD
NAI
213
170
0
24 May 2019
AttentionRNN: A Structured Spatial Attention Mechanism
IEEE International Conference on Computer Vision (ICCV), 2019
Siddhesh Khandelwal
Leonid Sigal
173
3
0
22 May 2019
Multimodal Transformer with Multi-View Visual Representation for Image Captioning
Jun-chen Yu
Jing Li
Zhou Yu
Qingming Huang
ViT
180
424
0
20 May 2019
SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation
IEEE International Conference on Computer Vision (ICCV), 2019
Daniel Gordon
Abhishek Kadian
Devi Parikh
Judy Hoffman
Dhruv Batra
242
79
0
18 May 2019
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations
Neural Information Processing Systems (NeurIPS), 2019
Fenglin Liu
Yuanxin Liu
Xuancheng Ren
Xiaodong He
Xu Sun
VLM
154
90
0
15 May 2019
Misleading Failures of Partial-input Baselines
Shi Feng
Eric Wallace
Jordan L. Boyd-Graber
199
0
0
14 May 2019
Quantifying and Alleviating the Language Prior Problem in Visual Question Answering
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2019
Yangyang Guo
Zhiyong Cheng
Liqiang Nie
Zichen Liu
Yinglong Wang
Mohan Kankanhalli
178
37
0
13 May 2019
Language-Conditioned Graph Networks for Relational Reasoning
IEEE International Conference on Computer Vision (ICCV), 2019
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
182
182
0
10 May 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
202
252
0
25 Apr 2019
Challenges and Prospects in Vision and Language Research
Kushal Kafle
Robik Shrestha
Christopher Kanan
191
42
0
19 Apr 2019
Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts
Julia Kruk
Jonah Lubin
Karan Sikka
Xiaoyu Lin
Dan Jurafsky
Ajay Divakaran
250
106
0
19 Apr 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
573
1,675
0
18 Apr 2019
Learning to Collocate Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Jianfei Cai
123
84
0
18 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
Alex Schwing
Tamir Hazan
186
79
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
216
118
0
11 Apr 2019
Quizbowl: The Case for Incremental Question Answering
Pedro Rodriguez
Shi Feng
Mohit Iyyer
He He
Jordan L. Boyd-Graber
230
54
0
09 Apr 2019
Revisiting EmbodiedQA: A Simple Baseline and Beyond
Yuehua Wu
Lu Jiang
Yi Yang
LM&Ro
175
33
0
08 Apr 2019
Actively Seeking and Learning from Live Data
Damien Teney
Anton Van Den Hengel
OOD
124
22
0
05 Apr 2019
VQD: Visual Query Detection in Natural Scenes
Manoj Acharya
Karan Jariwala
Christopher Kanan
ObjD
180
18
0
04 Apr 2019
Context and Attribute Grounded Dense Captioning
Guojun Yin
Lu Sheng
Bin Liu
Nenghai Yu
Xiaogang Wang
Jing Shao
131
83
0
02 Apr 2019
Relation-Aware Graph Attention Network for Visual Question Answering
Linjie Li
Zhe Gan
Yu Cheng
Jingjing Liu
GNN
423
379
0
29 Mar 2019
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning
Peixi Xiong
Huayi Zhan
Xin Eric Wang
Baivab Sinha
Ying Nian Wu
139
17
0
16 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models
Computer Vision and Pattern Recognition (CVPR), 2019
Robik Shrestha
Kushal Kafle
Christopher Kanan
282
86
0
01 Mar 2019
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
Drew A. Hudson
Christopher D. Manning
CoGe
NAI
234
147
0
25 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
229
295
0
25 Feb 2019
Cycle-Consistency for Robust Visual Question Answering
Meet Shah
Xinlei Chen
Marcus Rohrbach
Devi Parikh
OOD
172
198
0
15 Feb 2019
Can We Automate Diagrammatic Reasoning?
Pattern Recognition (Pattern Recognit.), 2019
Sk. Arif Ahmed
D. P. Dogra
S. Kar
P. Roy
D. Prasad
140
4
0
13 Feb 2019
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded
IEEE International Conference on Computer Vision (ICCV), 2019
Ramprasaath R. Selvaraju
Stefan Lee
Yilin Shen
Hongxia Jin
Shalini Ghosh
Larry Heck
Dhruv Batra
Devi Parikh
FAtt
VLM
251
279
0
11 Feb 2019
EvalAI: Towards Better Evaluation Systems for AI Agents
Deshraj Yadav
Rishabh Jain
Harsh Agrawal
Prithvijit Chattopadhyay
Taranjeet Singh
Akash Jain
Shivkaran Singh
Stefan Lee
Dhruv Batra
ELM
158
66
0
10 Feb 2019
Explanation in Human-AI Systems: A Literature Meta-Review, Synopsis of Key Ideas and Publications, and Bibliography for Explainable AI
Shane T. Mueller
R. Hoffman
W. Clancey
Abigail Emrey
Gary Klein
XAI
213
304
0
05 Feb 2019
VrR-VG: Refocusing Visually-Relevant Relationships
Yuanzhi Liang
Yalong Bai
Wei Zhang
Xueming Qian
Li Zhu
Tao Mei
3DH
235
8
0
01 Feb 2019
BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection
AAAI Conference on Artificial Intelligence (AAAI), 2019
H. Ben-younes
Rémi Cadène
Nicolas Thome
Matthieu Cord
316
230
0
31 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
304
346
0
20 Jan 2019
Evaluating Text-to-Image Matching using Binary Image Selection (BISON)
Hexiang Hu
Ishan Misra
Laurens van der Maaten
162
24
0
19 Jan 2019
Response to "Visual Dialogue without Vision or Dialogue" (Massiceti et al., 2018)
Abhishek Das
Devi Parikh
Dhruv Batra
68
2
0
16 Jan 2019
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions
Runtao Liu
Chenxi Liu
Yutong Bai
Alan Yuille
NAI
ObjD
317
141
0
03 Jan 2019