1612.06890
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
20 December 2016
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
Papers citing
"CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"
50 / 1,475 papers shown
Title
Neuro-Symbolic Spatio-Temporal Reasoning
Pascal Hitzler
Michael Sioutis
Md Kamruzzaman Sarker
Marjan Alirezaie
Aaron Eberhart
Stefan Wermter
NAI
28
0
0
28 Nov 2022
Pitfalls of Conditional Batch Normalization for Contextual Multi-Modal Learning
Ivaxi Sheth
A. Rahman
Mohammad Havaei
Samira Ebrahimi Kahou
11
1
0
28 Nov 2022
Target-Free Text-guided Image Manipulation
Wanshu Fan
Cheng Yang
Chiao-An Yang
Yu-Chiang Frank Wang
DiffM
31
2
0
26 Nov 2022
TPA-Net: Generate A Dataset for Text to Physics-based Animation
Yuxing Qiu
Feng Gao
Minchen Li
Govind Thattai
Yin Yang
Chenfanfu Jiang
PINN
DiffM
VGen
49
0
0
25 Nov 2022
Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning
Aviv Netanyahu
Tianmin Shu
J. Tenenbaum
Pulkit Agrawal
32
5
0
24 Nov 2022
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Tanzila Rahman
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Shweta Mahajan
Leonid Sigal
DiffM
21
68
0
23 Nov 2022
Look, Read and Ask: Learning to Ask Questions by Reading Text in Images
Soumya Jahagirdar
Shankar Gangisetty
Anand Mishra
30
4
0
23 Nov 2022
Towards Human-Interpretable Prototypes for Visual Assessment of Image Classification Models
Poulami Sinhamahapatra
Lena Heidemann
Maureen Monnet
Karsten Roscher
50
5
0
22 Nov 2022
ONeRF: Unsupervised 3D Object Segmentation from Multiple Views
Sheng-Ming Liang
Yichen Liu
Shangzhe Wu
Yu-Wing Tai
Chi-Keung Tang
43
7
0
22 Nov 2022
A Short Survey of Systematic Generalization
Yuanpeng Li
AI4CE
45
1
0
22 Nov 2022
Neural Meta-Symbolic Reasoning and Learning
Zihan Ye
Hikaru Shindo
Devendra Singh Dhami
Kristian Kersting
NAI
LRM
28
1
0
21 Nov 2022
Compositional Scene Modeling with Global Object-Centric Representations
Tonglin Chen
Bin Li
Zhimeng Shen
Xiangyang Xue
OCL
22
2
0
21 Nov 2022
On the Complexity of Bayesian Generalization
Yuge Shi
Manjie Xu
J. Hopcroft
Kun He
J. Tenenbaum
Song-Chun Zhu
Ying Nian Wu
Wenjuan Han
Yixin Zhu
35
4
0
20 Nov 2022
A survey on knowledge-enhanced multimodal learning
Maria Lymperaiou
Giorgos Stamou
46
14
0
19 Nov 2022
CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering
Yao Zhang
Haokun Chen
A. Frikha
Yezi Yang
Denis Krompass
Gengyuan Zhang
Jindong Gu
Volker Tresp
VLM
LRM
16
7
0
19 Nov 2022
RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
Titas Anciukevicius
Zexiang Xu
Matthew Fisher
Paul Henderson
Hakan Bilen
Niloy J. Mitra
Paul Guerrero
53
155
0
17 Nov 2022
Cross-Modal Adapter for Text-Video Retrieval
Haojun Jiang
Jianke Zhang
Rui Huang
Chunjiang Ge
Zanlin Ni
Jiwen Lu
Jie Zhou
S. Song
Gao Huang
53
37
0
17 Nov 2022
MapQA: A Dataset for Question Answering on Choropleth Maps
Shuaichen Chang
David Palzer
Jialin Li
Eric Fosler-Lussier
N. Xiao
19
40
0
15 Nov 2022
Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches
Daniel Fried
Nicholas Tomlin
Jennifer Hu
Roma Patel
Aida Nematzadeh
29
6
0
15 Nov 2022
A Rigorous Study Of The Deep Taylor Decomposition
Leon Sixt
Tim Landgraf
FAtt
AAML
27
4
0
14 Nov 2022
Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense
Zhecan Wang
Haoxuan You
Yicheng He
Wenhao Li
Kai-Wei Chang
Shih-Fu Chang
23
5
0
10 Nov 2022
Can Transformers Reason in Fragments of Natural Language?
Viktor Schlegel
Kamen V. Pavlov
Ian Pratt-Hartmann
LRM
ReLM
37
7
0
10 Nov 2022
Towards Reasoning-Aware Explainable VQA
Rakesh Vaideeswaran
Feng Gao
Abhinav Mathur
Govind Thattai
LRM
46
3
0
09 Nov 2022
CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering
Maitreya Patel
Tejas Gokhale
Chitta Baral
Yezhou Yang
49
9
0
07 Nov 2022
CASA: Category-agnostic Skeletal Animal Reconstruction
Yuefan Wu
Ze-Yin Chen
Shao-Wei Liu
Zhongzheng Ren
Shenlong Wang
33
30
0
04 Nov 2022
lilGym: Natural Language Visual Reasoning with Reinforcement Learning
Anne Wu
Kianté Brantley
Noriyuki Kojima
Yoav Artzi
ReLM
OffRL
LRM
29
3
0
03 Nov 2022
Neural Systematic Binder
Gautam Singh
Yeongbin Kim
Sungjin Ahn
OCL
39
36
0
02 Nov 2022
Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality
Anuj Diwan
Layne Berry
Eunsol Choi
David Harwath
Kyle Mahowald
CoGe
117
41
0
01 Nov 2022
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems
Wang Zhu
Jesse Thomason
Robin Jia
VLM
OOD
NAI
LRM
39
6
0
26 Oct 2022
Multi-Viewpoint and Multi-Evaluation with Felicitous Inductive Bias Boost Machine Abstract Reasoning Ability
Qinglai Wei
Diancheng Chen
Beiming Yuan
34
10
0
26 Oct 2022
A Survey on Deep Generative 3D-aware Image Synthesis
Weihao Xia
Jing-Hao Xue
3DV
51
20
0
25 Oct 2022
Search for Concepts: Discovering Visual Concepts Using Direct Optimization
P. Reddy
Paul Guerrero
Niloy J. Mitra
OCL
26
4
0
25 Oct 2022
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
Ankur Sikarwar
Arkil Patel
Navin Goyal
ViT
38
11
0
23 Oct 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
Yangfan Zhan
Zhitong Xiong
Yuan Yuan
78
108
0
23 Oct 2022
Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns
Laurynas Karazija
Subhabrata Choudhury
Iro Laina
Christian Rupprecht
Andrea Vedaldi
OCL
108
21
0
21 Oct 2022
Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario
Xiao Liu
Yansong Feng
Jizhi Tang
ChenGang Hu
Dongyan Zhao
6
9
0
20 Oct 2022
Solving Reasoning Tasks with a Slot Transformer
Ryan Faulkner
Daniel Zoran
LRM
26
1
0
20 Oct 2022
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
Yu Zhao
Jianguo Wei
Zhichao Lin
Yueheng Sun
Meishan Zhang
Hao Fei
30
16
0
20 Oct 2022
TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation
Pengfei Li
Beiwen Tian
Yongliang Shi
Xiaoxue Chen
Hao Zhao
Guyue Zhou
Ya Zhang
44
20
0
19 Oct 2022
ULN: Towards Underspecified Vision-and-Language Navigation
Weixi Feng
Tsu-Jui Fu
Yujie Lu
William Yang Wang
54
5
0
18 Oct 2022
Non-Contrastive Learning Meets Language-Image Pre-Training
Jinghao Zhou
Li Dong
Zhe Gan
Lijuan Wang
Furu Wei
VLM
CLIP
25
26
0
17 Oct 2022
What Makes Convolutional Models Great on Long Sequence Modeling?
Yuhong Li
Tianle Cai
Yi Zhang
De-huai Chen
Debadeepta Dey
VLM
39
96
0
17 Oct 2022
Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning
Dongze Lian
Daquan Zhou
Jiashi Feng
Xinchao Wang
36
250
0
17 Oct 2022
Neural Attentive Circuits
Nasim Rahaman
M. Weiß
Francesco Locatello
C. Pal
Yoshua Bengio
Bernhard Schölkopf
Erran L. Li
Nicolas Ballas
37
6
0
14 Oct 2022
On the Relationship Between Variational Inference and Auto-Associative Memory
Louis Annabi
Alexandre Pitti
M. Quoy
BDL
35
5
0
14 Oct 2022
The Hidden Uniform Cluster Prior in Self-Supervised Learning
Mahmoud Assran
Randall Balestriero
Quentin Duval
Florian Bordes
Ishan Misra
Piotr Bojanowski
Pascal Vincent
Michael G. Rabbat
Nicolas Ballas
SSL
52
47
0
13 Oct 2022
SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models
Ziyi Wu
Nikita Dvornik
Klaus Greff
Thomas Kipf
Animesh Garg
OCL
BDL
67
91
0
12 Oct 2022
Human Body Measurement Estimation with Adversarial Augmentation
Nataniel Ruiz
Míriam Bellver
Timo Bolkart
Ambuj Arora
Ming-Chia Lin
Javier Romero
Raj Bala
3DH
39
3
0
11 Oct 2022
Robust and Controllable Object-Centric Learning through Energy-based Models
Ruixiang Zhang
Tong Che
Boris Ivanovic
Renhao Wang
Marco Pavone
Yoshua Bengio
Liam Paull
OCL
39
8
0
11 Oct 2022
Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng
Noriyuki Kojima
Alexander M. Rush
DiffM
49
4
0
11 Oct 2022