1612.00837
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
2 December 2016
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
Papers citing
"Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering"
50 / 1,956 papers shown
Title
Misleading Failures of Partial-input Baselines
Shi Feng
Eric Wallace
Jordan L. Boyd-Graber
17
0
0
14 May 2019
Quantifying and Alleviating the Language Prior Problem in Visual Question Answering
Yangyang Guo
Zhiyong Cheng
Liqiang Nie
Y. Liu
Yinglong Wang
Mohan S. Kankanhalli
11
36
0
13 May 2019
Language-Conditioned Graph Networks for Relational Reasoning
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
15
171
0
10 May 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
23
227
0
25 Apr 2019
Challenges and Prospects in Vision and Language Research
Kushal Kafle
Robik Shrestha
Christopher Kanan
14
41
0
19 Apr 2019
Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts
Julia Kruk
Jonah Lubin
Karan Sikka
Xiaoyu Lin
Dan Jurafsky
Ajay Divakaran
8
94
0
19 Apr 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
11
1,111
0
18 Apr 2019
Learning to Collocate Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Jianfei Cai
9
77
0
18 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
A. Schwing
Tamir Hazan
16
69
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
28
117
0
11 Apr 2019
Quizbowl: The Case for Incremental Question Answering
Pedro Rodriguez
Shi Feng
Mohit Iyyer
He He
Jordan L. Boyd-Graber
4
50
0
09 Apr 2019
Revisiting EmbodiedQA: A Simple Baseline and Beyond
Yuehua Wu
Lu Jiang
Yi Yang
LM&Ro
28
30
0
08 Apr 2019
Actively Seeking and Learning from Live Data
Damien Teney
Anton van den Hengel
OOD
27
21
0
05 Apr 2019
VQD: Visual Query Detection in Natural Scenes
Manoj Acharya
Karan Jariwala
Christopher Kanan
ObjD
16
18
0
04 Apr 2019
Context and Attribute Grounded Dense Captioning
Guojun Yin
Lu Sheng
Bin Liu
Nenghai Yu
Xiaogang Wang
Jing Shao
14
75
0
02 Apr 2019
Relation-Aware Graph Attention Network for Visual Question Answering
Linjie Li
Zhe Gan
Yu Cheng
Jingjing Liu
GNN
22
341
0
29 Mar 2019
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning
Peixi Xiong
Huayi Zhan
Xin Eric Wang
Baivab Sinha
Ying Nian Wu
8
16
0
16 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
14
81
0
01 Mar 2019
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
Drew A. Hudson
Christopher D. Manning
CoGe
NAI
11
136
0
25 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
10
271
0
25 Feb 2019
Cycle-Consistency for Robust Visual Question Answering
Meet Shah
Xinlei Chen
Marcus Rohrbach
Devi Parikh
OOD
14
185
0
15 Feb 2019
Can We Automate Diagrammatic Reasoning?
Sk. Arif Ahmed
D. P. Dogra
S. Kar
P. Roy
D. Prasad
6
4
0
13 Feb 2019
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded
Ramprasaath R. Selvaraju
Stefan Lee
Yilin Shen
Hongxia Jin
Shalini Ghosh
Larry Heck
Dhruv Batra
Devi Parikh
FAtt
VLM
12
250
0
11 Feb 2019
EvalAI: Towards Better Evaluation Systems for AI Agents
Deshraj Yadav
Rishabh Jain
Harsh Agrawal
Prithvijit Chattopadhyay
Taranjeet Singh
Akash Jain
Shivkaran Singh
Stefan Lee
Dhruv Batra
ELM
9
56
0
10 Feb 2019
Explanation in Human-AI Systems: A Literature Meta-Review, Synopsis of Key Ideas and Publications, and Bibliography for Explainable AI
Shane T. Mueller
R. Hoffman
W. Clancey
Abigail Emrey
Gary Klein
XAI
10
282
0
05 Feb 2019
VrR-VG: Refocusing Visually-Relevant Relationships
Yuanzhi Liang
Yalong Bai
Wei Zhang
Xueming Qian
Li Zhu
Tao Mei
3DH
14
8
0
01 Feb 2019
BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection
H. Ben-younes
Rémi Cadène
Nicolas Thome
Matthieu Cord
14
218
0
31 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
31
320
0
20 Jan 2019
Evaluating Text-to-Image Matching using Binary Image Selection (BISON)
Hexiang Hu
Ishan Misra
Laurens van der Maaten
8
22
0
19 Jan 2019
Response to "Visual Dialogue without Vision or Dialogue" (Massiceti et al., 2018)
Abhishek Das
Devi Parikh
Dhruv Batra
9
2
0
16 Jan 2019
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions
Runtao Liu
Chenxi Liu
Yutong Bai
Alan Yuille
NAI
ObjD
14
120
0
03 Jan 2019
The meaning of "most" for visual question answering models
A. Kuhnle
Ann A. Copestake
6
4
0
31 Dec 2018
Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering
Zhuoqian Yang
Zengchang Qin
Jing Yu
Yue Hu
GNN
22
16
0
23 Dec 2018
From FiLM to Video: Multi-turn Question Answering with Multi-modal Context
T. Nguyen
Shikhar Sharma
Hannes Schulz
Layla El Asri
10
33
0
17 Dec 2018
Visual Social Relationship Recognition
Junnan Li
Yongkang Wong
Qi Zhao
Mohan S. Kankanhalli
27
27
0
13 Dec 2018
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Peng Gao
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Hongsheng Li
AIMat
8
362
0
13 Dec 2018
Learning Representations of Sets through Optimized Permutations
Yan Zhang
Jonathon S. Hare
Adam Prugel-Bennett
SSL
9
24
0
10 Dec 2018
Learning to Compose Dynamic Tree Structures for Visual Contexts
Kaihua Tang
Hanwang Zhang
Baoyuan Wu
Wenhan Luo
W. Liu
9
490
0
05 Dec 2018
Explainable and Explicit Visual Reasoning over Scene Graphs
Jiaxin Shi
Hanwang Zhang
Juan-Zi Li
OCL
152
230
0
05 Dec 2018
Learning to Explain with Complemental Examples
Atsushi Kanehira
Tatsuya Harada
8
40
0
04 Dec 2018
Multimodal Explanations by Predicting Counterfactuality in Videos
Atsushi Kanehira
Kentaro Takemoto
S. Inayoshi
Tatsuya Harada
18
35
0
04 Dec 2018
Multi-task Learning of Hierarchical Vision-Language Representation
Duy-Kien Nguyen
Takayuki Okatani
15
51
0
03 Dec 2018
Learning to Caption Images through a Lifetime by Asking Questions
Tingke Shen
Amlan Kar
Sanja Fidler
14
31
0
01 Dec 2018
From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts
M. Farazi
Salman H Khan
Nick Barnes
15
13
0
30 Nov 2018
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
12
379
0
29 Nov 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
27
864
0
27 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
13
53
0
26 Nov 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
30
12
0
20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha
Nirat Saini
L. Davis
CML
FAtt
19
92
0
19 Nov 2018
On transfer learning using a MAC model variant
Vincent Marois
T. S. Jayram
V. Albouy
Tomasz Kornuta
Younes Bouhadjar
A. Ozcan
DRL
16
9
0
15 Nov 2018