1612.06890
Cited By
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
20 December 2016
Justin Johnson
B. Hariharan
L. V. D. van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
Papers citing
"CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"
50 / 1,475 papers shown
Title
Information Maximizing Visual Question Generation
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
23
95
0
27 Mar 2019
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning
Peixi Xiong
Huayi Zhan
Xin Wang
Baivab Sinha
Ying Nian Wu
18
16
0
16 Mar 2019
CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog
Satwik Kottur
José M. F. Moura
Devi Parikh
Dhruv Batra
Marcus Rohrbach
28
86
0
07 Mar 2019
RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
Chi Zhang
Feng Gao
Baoxiong Jia
Yixin Zhu
Song-Chun Zhu
AIMat
32
304
0
07 Mar 2019
Multi-Object Representation Learning with Iterative Variational Inference
Klaus Greff
Raphael Lopez Kaufman
Rishabh Kabra
Nicholas Watters
Christopher P. Burgess
Daniel Zoran
Loic Matthey
M. Botvinick
Alexander Lerchner
OCL
SSL
42
499
0
01 Mar 2019
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
26
66
0
01 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
25
82
0
01 Mar 2019
From Visual to Acoustic Question Answering
Jerome Abdelnour
G. Salvi
Jean Rouat
21
3
0
28 Feb 2019
Differentiable Scene Graphs
Moshiko Raboh
Roei Herzig
Gal Chechik
Jonathan Berant
Amir Globerson
OCL
27
34
0
26 Feb 2019
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
Drew A. Hudson
Christopher D. Manning
CoGe
NAI
21
137
0
25 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
19
271
0
25 Feb 2019
Can We Automate Diagrammatic Reasoning?
Sk. Arif Ahmed
D. P. Dogra
S. Kar
P. Roy
D. Prasad
10
4
0
13 Feb 2019
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded
Ramprasaath R. Selvaraju
Stefan Lee
Yilin Shen
Hongxia Jin
Shalini Ghosh
Larry Heck
Dhruv Batra
Devi Parikh
FAtt
VLM
19
252
0
11 Feb 2019
MONet: Unsupervised Scene Decomposition and Representation
Christopher P. Burgess
Loic Matthey
Nicholas Watters
Rishabh Kabra
I. Higgins
M. Botvinick
Alexander Lerchner
OCL
33
516
0
22 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
51
322
0
20 Jan 2019
Robust Change Captioning
Dong Huk Park
Trevor Darrell
Anna Rohrbach
30
5
0
08 Jan 2019
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions
Runtao Liu
Chenxi Liu
Yutong Bai
Alan Yuille
NAI
ObjD
22
122
0
03 Jan 2019
Generating Multiple Objects at Spatially Distinct Locations
Tobias Hinz
Stefan Heinrich
S. Wermter
18
103
0
03 Jan 2019
The meaning of "most" for visual question answering models
A. Kuhnle
Ann A. Copestake
8
4
0
31 Dec 2018
Composing Text and Image for Image Retrieval - An Empirical Odyssey
Nam S. Vo
Lu Jiang
Chen Sun
Kevin Patrick Murphy
Li-Jia Li
Li Fei-Fei
James Hays
CoGe
24
357
0
18 Dec 2018
From FiLM to Video: Multi-turn Question Answering with Multi-modal Context
T. Nguyen
Shikhar Sharma
Hannes Schulz
Layla El Asri
15
33
0
17 Dec 2018
Gold Seeker: Information Gain from Policy Distributions for Goal-oriented Vision-and-Language Reasoning
Ehsan Abbasnejad
Iman Abbasnejad
Qi Wu
Javen Qinfeng Shi
Anton Van Den Hengel
OffRL
33
5
0
16 Dec 2018
Chat-crowd: A Dialog-based Platform for Visual Layout Composition
Paola Cascante-Bonilla
Xuwang Yin
Vicente Ordonez
Song Feng
10
8
0
10 Dec 2018
Spatial Knowledge Distillation to aid Visual Reasoning
Somak Aditya
Rudra Saha
Yezhou Yang
Chitta Baral
34
14
0
10 Dec 2018
Learning to Assemble Neural Module Tree Networks for Visual Grounding
Daqing Liu
Hanwang Zhang
Feng Wu
Zhengjun Zha
11
266
0
08 Dec 2018
StoryGAN: A Sequential Conditional GAN for Story Visualization
Yitong Li
Zhe Gan
Yelong Shen
Jingjing Liu
Yu Cheng
Yuexin Wu
Lawrence Carin
David Carlson
Jianfeng Gao
41
226
0
06 Dec 2018
Auto-Encoding Scene Graphs for Image Captioning
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai
30
693
0
06 Dec 2018
Explainable and Explicit Visual Reasoning over Scene Graphs
Jiaxin Shi
Hanwang Zhang
Juan-Zi Li
OCL
169
230
0
05 Dec 2018
Photo-Realistic Blocksworld Dataset
Masataro Asai
13
11
0
05 Dec 2018
Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning
Aishwarya Agrawal
Mateusz Malinowski
Felix Hill
S. M. Ali Eslami
Oriol Vinyals
Tejas D. Kulkarni
21
4
0
03 Dec 2018
CRAVES: Controlling Robotic Arm with a Vision-based Economic System
Yiming Zuo
Weichao Qiu
Lingxi Xie
Fangwei Zhong
Yizhou Wang
Alan Yuille
9
2
0
03 Dec 2018
Adversarial Domain Randomization
Rawal Khirodkar
Kris Kitani
19
5
0
03 Dec 2018
How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos
Shaojie Wang
Wentian Zhao
Ziyi Kou
Chenliang Xu
11
5
0
02 Dec 2018
Learning to Caption Images through a Lifetime by Asking Questions
Tingke Shen
Amlan Kar
Sanja Fidler
22
31
0
01 Dec 2018
Generating Easy-to-Understand Referring Expressions for Target Identifications
Mikihiro Tanaka
Takayuki Itamochi
Kenichi Narioka
Ikuro Sato
Yoshitaka Ushiku
Tatsuya Harada
21
1
0
29 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
20
53
0
26 Nov 2018
CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning
Jerome Abdelnour
G. Salvi
Jean Rouat
14
14
0
26 Nov 2018
An Interpretable Model for Scene Graph Generation
Ji Zhang
Kevin J. Shih
Andrew Tao
Bryan Catanzaro
Ahmed Elgammal
GNN
28
22
0
21 Nov 2018
Early Fusion for Goal Directed Robotic Vision
Aaron Walsman
Yonatan Bisk
Saadia Gabriel
Dipendra Kumar Misra
Yoav Artzi
Yejin Choi
Dieter Fox
9
9
0
21 Nov 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
41
12
0
20 Nov 2018
Scene Graph Generation via Conditional Random Fields
Weilin Cong
Wei Wang
Wang-Chien Lee
GNN
27
22
0
20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha
Nirat Saini
L. Davis
CML
FAtt
19
92
0
19 Nov 2018
Grasp2Vec: Learning Object Representations from Self-Supervised Grasping
Eric Jang
Coline Devin
Vincent Vanhoucke
Sergey Levine
SSL
26
112
0
16 Nov 2018
On transfer learning using a MAC model variant
Vincent Marois
T. S. Jayram
V. Albouy
Tomasz Kornuta
Younes Bouhadjar
A. Ozcan
DRL
26
9
0
15 Nov 2018
Holistic Multi-modal Memory Network for Movie Question Answering
Anran Wang
Anh Tuan Luu
Chuan-Sheng Foo
Erik Cambria
Yi Tay
V. Chandrasekhar
36
20
0
12 Nov 2018
Bias and Generalization in Deep Generative Models: An Empirical Study
Shengjia Zhao
Hongyu Ren
Arianna Yuan
Jiaming Song
Noah D. Goodman
Stefano Ermon
AI4CE
18
137
0
08 Nov 2018
Compositional Language Understanding with Text-based Relational Reasoning
Koustuv Sinha
Shagun Sodhani
William L. Hamilton
Xin Dang
NAI
9
3
0
07 Nov 2018
Concept Learning with Energy-Based Models
William J. Wilkinson
27
25
0
06 Nov 2018
Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning
Vasili Ramanishka
Yi-Ting Chen
Teruhisa Misu
Kate Saenko
16
277
0
06 Nov 2018
Zero-Shot Transfer VQA Dataset
Yuanpeng Li
Yi Yang
Jianyu Wang
Wei-ping Xu
19
8
0
02 Nov 2018