ResearchTrend.AI
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

20 December 2016
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
    CoGe

Papers citing "CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"

50 / 1,475 papers shown
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning
Jack Hessel
Jena D. Hwang
Jinho Park
Rowan Zellers
Chandra Bhagavatula
Anna Rohrbach
Kate Saenko
Yejin Choi
ReLM
158
48
0
10 Feb 2022
The slurk Interaction Server Framework: Better Data for Better Dialog Models
Jana Gotze
Maike Paetzel-Prusmann
Wencke Liermann
Tim Diekmann
David Schlangen
VLM
42
12
0
02 Feb 2022
Adversarial Masking for Self-Supervised Learning
Yuge Shi
N. Siddharth
Philip Torr
Adam R. Kosiorek
SSL
64
83
0
31 Jan 2022
Compositionality as Lexical Symmetry
Ekin Akyürek
Jacob Andreas
CoGe
57
8
0
30 Jan 2022
Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices
Mikołaj Małkiński
Jacek Mańdziuk
127
42
0
28 Jan 2022
Explanatory Learning: Beyond Empiricism in Neural Networks
Antonio Norelli
Giorgio Mariani
Luca Moschella
Andrea Santilli
Giambattista Parascandolo
Simone Melzi
Emanuele Rodolà
16
2
0
25 Jan 2022
A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering
Feng Gao
Q. Ping
Govind Thattai
Aishwarya N. Reganti
Yingting Wu
Premkumar Natarajan
18
17
0
14 Jan 2022
Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
Utku Evci
Vincent Dumoulin
Hugo Larochelle
Michael C. Mozer
30
83
0
10 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
38
20
0
04 Jan 2022
LatteGAN: Visually Guided Language Attention for Multi-Turn Text-Conditioned Image Manipulation
Shoya Matsumori
Yuki Abe
Kosuke Shingyouchi
K. Sugiura
M. Imai
39
9
0
28 Dec 2021
Multi-Image Visual Question Answering
Harsh Raj
Janhavi Dadhania
Akhilesh Bhardwaj
Prabuchandran KJ
11
2
0
27 Dec 2021
SLIP: Self-supervision meets Language-Image Pre-training
Norman Mu
Alexander Kirillov
David Wagner
Saining Xie
VLM
CLIP
63
483
0
23 Dec 2021
Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation
Xu Yan
Zhihao Yuan
Yuhao Du
Yinghong Liao
Yao Guo
Zhen Li
Shuguang Cui
3DPC
CoGe
23
14
0
22 Dec 2021
General Greedy De-bias Learning
Xinzhe Han
Shuhui Wang
Chi Su
Qingming Huang
Qi Tian
11
7
0
20 Dec 2021
3D Question Answering
Shuquan Ye
Dongdong Chen
Songfang Han
Jing Liao
ViT
31
47
0
15 Dec 2021
PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning
Yining Hong
Li Yi
J. Tenenbaum
Antonio Torralba
Chuang Gan
9
39
0
09 Dec 2021
FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh
Ronghang Hu
Vedanuj Goswami
Guillaume Couairon
Wojciech Galuba
Marcus Rohrbach
Douwe Kiela
CLIP
VLM
40
691
0
08 Dec 2021
Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints
Jinyang Yuan
Bin Li
Xiangyang Xue
CoGe
OCL
17
11
0
07 Dec 2021
Embedding Arithmetic of Multimodal Queries for Image Retrieval
Guillaume Couairon
Matthieu Cord
Matthijs Douze
Holger Schwenk
40
23
0
06 Dec 2021
Task2Sim : Towards Effective Pre-training and Transfer from Synthetic Data
Samarth Mishra
Yikang Shen
Cheng Perng Phoo
Chun-Fu Chen
Leonid Karlinsky
Kate Saenko
Venkatesh Saligrama
Rogerio Feris
28
37
0
30 Nov 2021
Recurrent Vision Transformer for Solving Visual Reasoning Problems
Nicola Messina
Giuseppe Amato
F. Carrara
Claudio Gennaro
Fabrizio Falchi
ViT
LRM
22
11
0
29 Nov 2021
Make an Omelette with Breaking Eggs: Zero-Shot Learning for Novel Attribute Synthesis
Yu-Hsuan Li
Tzu-Yin Chao
Ching-Chun Huang
Pin-Yu Chen
Wei-Chen Chiu
VLM
12
1
0
28 Nov 2021
Natural Language and Spatial Rules
Alexandros Haridis
Stella Rossikopoulou Pappa
6
2
0
28 Nov 2021
NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes
Suhani Vora
Noha Radwan
Klaus Greff
H. Meyer
Kyle Genova
Mehdi S. M. Sajjadi
Etienne Pot
Andrea Tagliasacchi
Daniel Duckworth
37
123
0
25 Nov 2021
Conditional Object-Centric Learning from Video
Thomas Kipf
Gamaleldin F. Elsayed
Aravindh Mahendran
Austin Stone
S. Sabour
G. Heigold
Rico Jonschkowski
Alexey Dosovitskiy
Klaus Greff
OCL
48
214
0
24 Nov 2021
Two-stage Rule-induction Visual Reasoning on RPMs with an Application to Video Prediction
Wentao He
Jianfeng Ren
Ruibin Bai
Xudong Jiang
LRM
40
5
0
24 Nov 2021
Multiset-Equivariant Set Prediction with Approximate Implicit Differentiation
Yan Zhang
David W. Zhang
Simon Lacoste-Julien
Gertjan J. Burghouts
Cees G. M. Snoek
BDL
48
21
0
23 Nov 2021
Talk-to-Resolve: Combining scene understanding and spatial dialogue to resolve granular task ambiguity for a collocated robot
Pradip Pramanick
Chayan Sarkar
Snehasis Banerjee
Brojeshwar Bhowmick
24
14
0
22 Nov 2021
TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning
Keng Ji Chow
Samson Tan
Min-Yen Kan
LRM
26
4
0
21 Nov 2021
ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation
Laurynas Karazija
Iro Laina
Christian Rupprecht
3DV
VOS
52
84
0
19 Nov 2021
Learning to Compose Visual Relations
Nan Liu
Shuang Li
Yilun Du
J. Tenenbaum
Antonio Torralba
CoGe
OCL
32
77
0
17 Nov 2021
Compositional Transformers for Scene Generation
Drew A. Hudson
C. L. Zitnick
ViT
34
34
0
17 Nov 2021
Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views
Nanbo Li
Cian Eastwood
Robert B. Fisher
OCL
27
53
0
13 Nov 2021
Object-Centric Representation Learning with Generative Spatial-Temporal Factorization
Nanbo Li
Muhammad Ahmed Raza
Wenbin Hu
Zhaole Sun
Robert B. Fisher
OCL
29
14
0
09 Nov 2021
Visual Question Answering based on Formal Logic
Muralikrishnna G. Sethuraman
Ali Payani
Faramarz Fekri
J. C. Kerce
NAI
21
3
0
08 Nov 2021
Unsupervised Learning of Compositional Energy Concepts
Yilun Du
Shuang Li
Yash Sharma
J. Tenenbaum
Igor Mordatch
CoGe
OCL
32
76
0
04 Nov 2021
Projected GANs Converge Faster
Axel Sauer
Kashyap Chitta
Jens Muller
Andreas Geiger
52
234
0
01 Nov 2021
Unsupervised Foreground Extraction via Deep Region Competition
Peiyu Yu
Sirui Xie
Xiaojian Ma
Yixin Zhu
Ying Nian Wu
Song-Chun Zhu
OCL
36
42
0
29 Oct 2021
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Mingyu Ding
Zhenfang Chen
Tao Du
Ping Luo
J. Tenenbaum
Chuang Gan
VGen
PINN
OCL
43
74
0
28 Oct 2021
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning
Pan Lu
Liang Qiu
Jiaqi Chen
Tony Xia
Yizhou Zhao
Wei Zhang
Zhou Yu
Xiaodan Liang
Song-Chun Zhu
AIMat
43
184
0
25 Oct 2021
Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators
S. Lowe
Robert C. Earle
Jason d'Eon
Thomas Trappenberg
Sageev Oore
23
1
0
22 Oct 2021
Single-Modal Entropy based Active Learning for Visual Question Answering
Dong-Jin Kim
Jae-Won Cho
Jinsoo Choi
Yunjae Jung
In So Kweon
30
12
0
21 Oct 2021
Behavioral Experiments for Understanding Catastrophic Forgetting
Samuel J. Bell
Neil D. Lawrence
37
4
0
20 Oct 2021
StructFormer: Learning Spatial Structure for Language-Guided Semantic Rearrangement of Novel Objects
Weiyu Liu
Chris Paxton
Tucker Hermans
Dieter Fox
40
92
0
19 Oct 2021
Neuro-Symbolic Forward Reasoning
Hikaru Shindo
Devendra Singh Dhami
Kristian Kersting
NAI
LRM
40
22
0
18 Oct 2021
Illiterate DALL-E Learns to Compose
Gautam Singh
Fei Deng
Sungjin Ahn
CoGe
OCL
27
133
0
17 Oct 2021
Towards Language-guided Visual Recognition via Dynamic Convolutions
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Yongjian Wu
Yue Gao
Rongrong Ji
ObjD
33
19
0
17 Oct 2021
Semantically Distributed Robust Optimization for Vision-and-Language Inference
Tejas Gokhale
A. Chaudhary
Pratyay Banerjee
Chitta Baral
Yezhou Yang
54
17
0
14 Oct 2021
Program Transfer for Answering Complex Questions over Knowledge Bases
S. Cao
Jiaxin Shi
Zijun Yao
Xin Lv
Jifan Yu
Lei Hou
Juanzi Li
Zhiyuan Liu
Jinghui Xiao
35
56
0
12 Oct 2021
Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking
Dirk Vath
Pascal Tilli
Ngoc Thang Vu
41
4
0
11 Oct 2021