ResearchTrend.AI
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

20 December 2016
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
    CoGe

Papers citing "CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"

50 / 1,475 papers shown
Neuro-Symbolic Spatio-Temporal Reasoning
Pascal Hitzler
Michael Sioutis
Md Kamruzzaman Sarker
Marjan Alirezaie
Aaron Eberhart
Stefan Wermter
NAI
28
0
0
28 Nov 2022
Pitfalls of Conditional Batch Normalization for Contextual Multi-Modal Learning
Ivaxi Sheth
A. Rahman
Mohammad Havaei
Samira Ebrahimi Kahou
11
1
0
28 Nov 2022
Target-Free Text-guided Image Manipulation
Wanshu Fan
Cheng Yang
Chiao-An Yang
Yu-Chiang Frank Wang
DiffM
31
2
0
26 Nov 2022
TPA-Net: Generate A Dataset for Text to Physics-based Animation
Yuxing Qiu
Feng Gao
Minchen Li
Govind Thattai
Yin Yang
Chenfanfu Jiang
PINN
DiffM
VGen
49
0
0
25 Nov 2022
Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning
Aviv Netanyahu
Tianmin Shu
J. Tenenbaum
Pulkit Agrawal
32
5
0
24 Nov 2022
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Tanzila Rahman
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Shweta Mahajan
Leonid Sigal
DiffM
21
68
0
23 Nov 2022
Look, Read and Ask: Learning to Ask Questions by Reading Text in Images
Soumya Jahagirdar
Shankar Gangisetty
Anand Mishra
30
4
0
23 Nov 2022
Towards Human-Interpretable Prototypes for Visual Assessment of Image Classification Models
Poulami Sinhamahapatra
Lena Heidemann
Maureen Monnet
Karsten Roscher
50
5
0
22 Nov 2022
ONeRF: Unsupervised 3D Object Segmentation from Multiple Views
Sheng-Ming Liang
Yichen Liu
Shangzhe Wu
Yu-Wing Tai
Chi-Keung Tang
43
7
0
22 Nov 2022
A Short Survey of Systematic Generalization
Yuanpeng Li
AI4CE
45
1
0
22 Nov 2022
Neural Meta-Symbolic Reasoning and Learning
Zihan Ye
Hikaru Shindo
Devendra Singh Dhami
Kristian Kersting
NAI
LRM
28
1
0
21 Nov 2022
Compositional Scene Modeling with Global Object-Centric Representations
Tonglin Chen
Bin Li
Zhimeng Shen
Xiangyang Xue
OCL
22
2
0
21 Nov 2022
On the Complexity of Bayesian Generalization
Yuge Shi
Manjie Xu
J. Hopcroft
Kun He
J. Tenenbaum
Song-Chun Zhu
Ying Nian Wu
Wenjuan Han
Yixin Zhu
35
4
0
20 Nov 2022
A survey on knowledge-enhanced multimodal learning
Maria Lymperaiou
Giorgos Stamou
46
14
0
19 Nov 2022
CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering
Yao Zhang
Haokun Chen
A. Frikha
Yezi Yang
Denis Krompass
Gengyuan Zhang
Jindong Gu
Volker Tresp
VLM
LRM
16
7
0
19 Nov 2022
RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
Titas Anciukevicius
Zexiang Xu
Matthew Fisher
Paul Henderson
Hakan Bilen
Niloy J. Mitra
Paul Guerrero
53
155
0
17 Nov 2022
Cross-Modal Adapter for Text-Video Retrieval
Haojun Jiang
Jianke Zhang
Rui Huang
Chunjiang Ge
Zanlin Ni
Jiwen Lu
Jie Zhou
S. Song
Gao Huang
53
37
0
17 Nov 2022
MapQA: A Dataset for Question Answering on Choropleth Maps
Shuaichen Chang
David Palzer
Jialin Li
Eric Fosler-Lussier
N. Xiao
19
40
0
15 Nov 2022
Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches
Daniel Fried
Nicholas Tomlin
Jennifer Hu
Roma Patel
Aida Nematzadeh
29
6
0
15 Nov 2022
A Rigorous Study Of The Deep Taylor Decomposition
Leon Sixt
Tim Landgraf
FAtt
AAML
27
4
0
14 Nov 2022
Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense
Zhecan Wang
Haoxuan You
Yicheng He
Wenhao Li
Kai-Wei Chang
Shih-Fu Chang
23
5
0
10 Nov 2022
Can Transformers Reason in Fragments of Natural Language?
Viktor Schlegel
Kamen V. Pavlov
Ian Pratt-Hartmann
LRM
ReLM
37
7
0
10 Nov 2022
Towards Reasoning-Aware Explainable VQA
Rakesh Vaideeswaran
Feng Gao
Abhinav Mathur
Govind Thattai
LRM
46
3
0
09 Nov 2022
CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering
Maitreya Patel
Tejas Gokhale
Chitta Baral
Yezhou Yang
49
9
0
07 Nov 2022
CASA: Category-agnostic Skeletal Animal Reconstruction
Yuefan Wu
Ze-Yin Chen
Shao-Wei Liu
Zhongzheng Ren
Shenlong Wang
33
30
0
04 Nov 2022
lilGym: Natural Language Visual Reasoning with Reinforcement Learning
Anne Wu
Kianté Brantley
Noriyuki Kojima
Yoav Artzi
ReLM
OffRL
LRM
29
3
0
03 Nov 2022
Neural Systematic Binder
Gautam Singh
Yeongbin Kim
Sungjin Ahn
OCL
39
36
0
02 Nov 2022
Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality
Anuj Diwan
Layne Berry
Eunsol Choi
David Harwath
Kyle Mahowald
CoGe
117
41
0
01 Nov 2022
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems
Wang Zhu
Jesse Thomason
Robin Jia
VLM
OOD
NAI
LRM
39
6
0
26 Oct 2022
Multi-Viewpoint and Multi-Evaluation with Felicitous Inductive Bias Boost Machine Abstract Reasoning Ability
Qinglai Wei
Diancheng Chen
Beiming Yuan
34
10
0
26 Oct 2022
A Survey on Deep Generative 3D-aware Image Synthesis
Weihao Xia
Jing-Hao Xue
3DV
51
20
0
25 Oct 2022
Search for Concepts: Discovering Visual Concepts Using Direct Optimization
P. Reddy
Paul Guerrero
Niloy J. Mitra
OCL
26
4
0
25 Oct 2022
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
Ankur Sikarwar
Arkil Patel
Navin Goyal
ViT
38
11
0
23 Oct 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
Yangfan Zhan
Zhitong Xiong
Yuan Yuan
78
108
0
23 Oct 2022
Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns
Laurynas Karazija
Subhabrata Choudhury
Iro Laina
Christian Rupprecht
Andrea Vedaldi
OCL
108
21
0
21 Oct 2022
Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario
Xiao Liu
Yansong Feng
Jizhi Tang
ChenGang Hu
Dongyan Zhao
6
9
0
20 Oct 2022
Solving Reasoning Tasks with a Slot Transformer
Ryan Faulkner
Daniel Zoran
LRM
26
1
0
20 Oct 2022
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
Yu Zhao
Jianguo Wei
Zhichao Lin
Yueheng Sun
Meishan Zhang
Hao Fei
30
16
0
20 Oct 2022
TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation
Pengfei Li
Beiwen Tian
Yongliang Shi
Xiaoxue Chen
Hao Zhao
Guyue Zhou
Ya Zhang
44
20
0
19 Oct 2022
ULN: Towards Underspecified Vision-and-Language Navigation
Weixi Feng
Tsu-Jui Fu
Yujie Lu
William Yang Wang
54
5
0
18 Oct 2022
Non-Contrastive Learning Meets Language-Image Pre-Training
Jinghao Zhou
Li Dong
Zhe Gan
Lijuan Wang
Furu Wei
VLM
CLIP
25
26
0
17 Oct 2022
What Makes Convolutional Models Great on Long Sequence Modeling?
Yuhong Li
Tianle Cai
Yi Zhang
De-huai Chen
Debadeepta Dey
VLM
39
96
0
17 Oct 2022
Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning
Dongze Lian
Daquan Zhou
Jiashi Feng
Xinchao Wang
36
250
0
17 Oct 2022
Neural Attentive Circuits
Nasim Rahaman
M. Weiß
Francesco Locatello
C. Pal
Yoshua Bengio
Bernhard Schölkopf
Erran L. Li
Nicolas Ballas
37
6
0
14 Oct 2022
On the Relationship Between Variational Inference and Auto-Associative Memory
Louis Annabi
Alexandre Pitti
M. Quoy
BDL
35
5
0
14 Oct 2022
The Hidden Uniform Cluster Prior in Self-Supervised Learning
Mahmoud Assran
Randall Balestriero
Quentin Duval
Florian Bordes
Ishan Misra
Piotr Bojanowski
Pascal Vincent
Michael G. Rabbat
Nicolas Ballas
SSL
52
47
0
13 Oct 2022
SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models
Ziyi Wu
Nikita Dvornik
Klaus Greff
Thomas Kipf
Animesh Garg
OCL
BDL
67
91
0
12 Oct 2022
Human Body Measurement Estimation with Adversarial Augmentation
Nataniel Ruiz
Míriam Bellver
Timo Bolkart
Ambuj Arora
Ming-Chia Lin
Javier Romero
Raj Bala
3DH
39
3
0
11 Oct 2022
Robust and Controllable Object-Centric Learning through Energy-based Models
Ruixiang Zhang
Tong Che
Boris Ivanovic
Renhao Wang
Marco Pavone
Yoshua Bengio
Liam Paull
OCL
39
8
0
11 Oct 2022
Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng
Noriyuki Kojima
Alexander M. Rush
DiffM
49
4
0
11 Oct 2022