CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

20 December 2016
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
    CoGe

Papers citing "CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"

50 / 1,475 papers shown
Title
ViRel: Unsupervised Visual Relations Discovery with Graph-level Analogy
D. Zeng
Tailin Wu
J. Leskovec
GNN
33
1
0
04 Jul 2022
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang
Bin Guo
Y. Zeng
Yasan Ding
Chen Qiu
Ying Zhang
Li Yao
Zhiwen Yu
45
2
0
02 Jul 2022
Modern Question Answering Datasets and Benchmarks: A Survey
Zhen Wang
58
23
0
30 Jun 2022
EBMs vs. CL: Exploring Self-Supervised Visual Pretraining for Visual Question Answering
Violetta Shevchenko
Ehsan Abbasnejad
A. Dick
Anton Van Den Hengel
Damien Teney
51
0
0
29 Jun 2022
Guillotine Regularization: Why removing layers is needed to improve generalization in Self-Supervised Learning
Florian Bordes
Randall Balestriero
Q. Garrido
Adrien Bardes
Pascal Vincent
35
22
0
27 Jun 2022
VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives
Zhuofan Ying
Peter Hase
Joey Tianyi Zhou
LRM
35
13
0
22 Jun 2022
Interactive Visual Reasoning under Uncertainty
Manjie Xu
Guangyuan Jiang
Wei Liang
Song-Chun Zhu
Yixin Zhu
LRM
52
5
0
18 Jun 2022
Conditional Permutation Invariant Flows
Berend Zwartsenberg
Adam Scibior
Matthew Niedoba
Vasileios Lioutas
Yunpeng Liu
Justice Sefas
Setareh Dabiri
J. Lavington
Trevor Campbell
Frank Wood
17
8
0
17 Jun 2022
FiT: Parameter Efficient Few-shot Transfer Learning for Personalized and Federated Image Classification
Aliaksandra Shysheya
J. Bronskill
Massimiliano Patacchiola
Sebastian Nowozin
Richard Turner
3DH
FedML
46
27
0
17 Jun 2022
TUSK: Task-Agnostic Unsupervised Keypoints
Yuhe Jin
Weiwei Sun
J. Hosang
Eduard Trulls
K. M. Yi
19
5
0
16 Jun 2022
Multimodal Dialogue State Tracking
Hung Le
Nancy F. Chen
Guosheng Lin
30
9
0
16 Jun 2022
Object Scene Representation Transformer
Mehdi S. M. Sajjadi
Daniel Duckworth
Aravindh Mahendran
Sjoerd van Steenkiste
Filip Pavetić
Mario Lučić
Leonidas J. Guibas
Klaus Greff
Thomas Kipf
ViT
OCL
38
91
0
14 Jun 2022
Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation
Wouter Van Gansbeke
Simon Vandenhende
Luc Van Gool
47
55
0
13 Jun 2022
A Benchmark for Compositional Visual Reasoning
Aimen Zerroug
Mohit Vaishnav
Julien Colin
Sebastian Musslick
Thomas Serre
OCL
CoGe
36
28
0
11 Jun 2022
Neural Prompt Search
Yuanhan Zhang
Kaiyang Zhou
Ziwei Liu
VPVLM
VLM
55
145
0
09 Jun 2022
On Neural Architecture Inductive Biases for Relational Tasks
Giancarlo Kerg
Sarthak Mittal
David Rolnick
Yoshua Bengio
Blake A. Richards
Guillaume Lajoie
OOD
23
25
0
09 Jun 2022
ObPose: Leveraging Pose for Object-Centric Scene Inference and Generation in 3D
Yizhe Wu
Oiwi Parker Jones
Ingmar Posner
OCL
BDL
18
2
0
07 Jun 2022
cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation
Kshitij Gupta
Devansh Gautam
R. Mamidi
VLM
26
3
0
07 Jun 2022
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge
Dustin Schwenk
Apoorv Khandelwal
Christopher Clark
Kenneth Marino
Roozbeh Mottaghi
16
507
0
03 Jun 2022
Compositional Visual Generation with Composable Diffusion Models
Nan Liu
Shuang Li
Yilun Du
Antonio Torralba
J. Tenenbaum
DiffM
CoGe
37
500
0
03 Jun 2022
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
Wangchunshu Zhou
Yan Zeng
Shizhe Diao
Xinsong Zhang
CoGe
VLM
37
13
0
30 May 2022
Visual Superordinate Abstraction for Robust Concept Learning
Qinjie Zheng
Chaoyue Wang
Dadong Wang
Dacheng Tao
VLM
28
2
0
28 May 2022
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu
Shentong Mo
ViT
OCL
25
15
0
28 May 2022
Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions
Huaizu Jiang
Xiaojian Ma
Weili Nie
Zhiding Yu
Yuke Zhu
Song-Chun Zhu
Anima Anandkumar
VLM
28
36
0
27 May 2022
V-Doc : Visual questions answers with Documents
Yihao Ding
Zhe Huang
Runlin Wang
Yanhang Zhang
Xianru Chen
Yuzhong Ma
Hyunsuk Chung
S. Han
31
15
0
27 May 2022
Effective Abstract Reasoning with Dual-Contrast Network
Tao Zhuo
Mohan S. Kankanhalli
16
40
0
27 May 2022
Learning What and Where: Disentangling Location and Identity Tracking Without Supervision
Manuel Traub
S. Otte
Tobias Menge
Matthias Karlbauer
Jannik Thümmel
Martin Volker Butz
41
20
0
26 May 2022
Unsupervised Multi-object Segmentation Using Attention and Soft-argmax
Bruno Sauvalle
A. de La Fortelle
3DPC
56
12
0
26 May 2022
An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems
Andrea Gesmundo
J. Dean
40
23
0
25 May 2022
Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization
Aishwarya Agrawal
Ivana Kajić
Emanuele Bugliarello
Elnaz Davoodi
Anita Gergely
Phil Blunsom
Aida Nematzadeh
OOD
45
17
0
24 May 2022
Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution
Georgios Tziafas
S. Kasaei
26
2
0
24 May 2022
On the Paradox of Learning to Reason from Data
Honghua Zhang
Liunian Harold Li
Tao Meng
Kai-Wei Chang
Guy Van den Broeck
NAI
ReLM
OOD
LRM
140
105
0
23 May 2022
On the Feasibility and Generality of Patch-based Adversarial Attacks on Semantic Segmentation Problems
Soma Kontár
A. Horváth
AAML
42
1
0
21 May 2022
Visual Concepts Tokenization
Tao Yang
Yuwang Wang
Yan Lu
Nanning Zheng
OCL
ViT
51
12
0
20 May 2022
AIGenC: An AI generalisation model via creativity
Corina Catarau-Cotutiu
Esther Mondragón
Eduardo Alonso
24
1
0
19 May 2022
Voxel-informed Language Grounding
Rodolfo Corona
Shizhan Zhu
Dan Klein
Trevor Darrell
147
12
0
19 May 2022
Color Overmodification Emerges from Data-Driven Learning and Pragmatic Reasoning
Fei Fang
Kunal Sinha
Noah D. Goodman
Christopher Potts
Elisa Kreiss
30
1
0
18 May 2022
Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion
Subhabrata Choudhury
Laurynas Karazija
Iro Laina
Andrea Vedaldi
Christian Rupprecht
OCL
VOS
108
39
0
16 May 2022
A Neuro-Symbolic ASP Pipeline for Visual Question Answering
Thomas Eiter
N. Higuera
J. Oetsch
Michael Pritz
NAI
22
17
0
16 May 2022
Unsupervised Discovery and Composition of Object Light Fields
Cameron Smith
Hong-Xing Yu
Sergey Zakharov
F. Durand
J. Tenenbaum
Jiajun Wu
Vincent Sitzmann
OCL
22
29
0
08 May 2022
QLEVR: A Diagnostic Dataset for Quantificational Language and Elementary Visual Reasoning
Zechen Li
Anders Søgaard
6
6
0
06 May 2022
BlobGAN: Spatially Disentangled Scene Representations
Dave Epstein
Taesung Park
Richard Y. Zhang
Eli Shechtman
Alexei A. Efros
GAN
SSL
OCL
42
42
0
05 May 2022
What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning
Jae Hee Lee
Matthias Kerzel
Kyra Ahrens
C. Weber
S. Wermter
42
9
0
05 May 2022
ComPhy: Compositional Physical Reasoning of Objects and Events from Videos
Zhenfang Chen
Kexin Yi
Yunzhu Li
Mingyu Ding
Antonio Torralba
J. Tenenbaum
Chuang Gan
CoGe
OCL
22
52
0
02 May 2022
Visual Spatial Reasoning
Fangyu Liu
Guy Edward Toh Emerson
Nigel Collier
ReLM
58
160
0
30 Apr 2022
Toward Compositional Generalization in Object-Oriented World Modeling
Linfeng Zhao
Lingzhi Kong
Robin Walters
Lawson L. S. Wong
OCL
26
21
0
28 Apr 2022
Offline Visual Representation Learning for Embodied Navigation
Karmesh Yadav
Ram Ramrakhya
Arjun Majumdar
Vincent-Pierre Berges
Sachit Kuhar
Dhruv Batra
Alexei Baevski
Oleksandr Maksymets
OffRL
SSL
38
72
0
27 Apr 2022
Do Users Benefit From Interpretable Vision? A User Study, Baseline, And Dataset
Leon Sixt
M. Schuessler
Oana-Iuliana Popescu
Philipp Weiß
Tim Landgraf
FAtt
34
14
0
25 Apr 2022
RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
Xiaojian Ma
Weili Nie
Zhiding Yu
Huaizu Jiang
Chaowei Xiao
Yuke Zhu
Song-Chun Zhu
Anima Anandkumar
ViT
LRM
35
19
0
24 Apr 2022
Revealing Occlusions with 4D Neural Fields
Basile Van Hoorick
Purva Tendulkar
Dídac Surís
Dennis Park
Simon Stent
Carl Vondrick
32
16
0
22 Apr 2022