CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

20 December 2016
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
    CoGe

Papers citing "CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"

50 / 1,475 papers shown
Attention in Reasoning: Dataset, Analysis, and Modeling
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
36
3
0
20 Apr 2022
Inductive Biases for Object-Centric Representations in the Presence of Complex Textures
Samuele Papa
Ole Winther
Andrea Dittadi
OCL
18
14
0
18 Apr 2022
Towards Lightweight Transformer via Group-wise Transformation for Vision-and-Language Tasks
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Yan Wang
Liujuan Cao
Yongjian Wu
Feiyue Huang
Rongrong Ji
ViT
22
43
0
16 Apr 2022
Measuring Compositional Consistency for Video Question Answering
Mona Gandhi
Mustafa Omer Gul
Eva Prakash
Madeleine Grunde-McLaughlin
Ranjay Krishna
Maneesh Agrawala
CoGe
40
15
0
14 Apr 2022
Optimal quadratic binding for relational reasoning in vector symbolic neural architectures
Naoki Hiratani
H. Sompolinsky
38
5
0
14 Apr 2022
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness
Paul Pu Liang
35
4
0
14 Apr 2022
Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge
Brielen Madureira
David Schlangen
28
4
0
14 Apr 2022
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension
Sanjay Subramanian
William Merrill
Trevor Darrell
Matt Gardner
Sameer Singh
Anna Rohrbach
ObjD
46
126
0
12 Apr 2022
Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition
Shaojin Ding
R. Rikhye
Qiao Liang
Yanzhang He
Quan Wang
A. Narayanan
Tom O'Malley
Ian McGraw
29
27
0
08 Apr 2022
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
Tristan Thrush
Ryan Jiang
Max Bartolo
Amanpreet Singh
Adina Williams
Douwe Kiela
Candace Ross
CoGe
48
404
0
07 Apr 2022
An Algebraic Approach to Learning and Grounding
Johanna Björklund
Adam Dahlgren Lindström
F. Drewes
34
0
0
06 Apr 2022
CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations
Leonard Salewski
A. Sophia Koepke
Hendrik P. A. Lensch
Zeynep Akata
LRM
NAI
35
20
0
05 Apr 2022
DT2I: Dense Text-to-Image Generation from Region Descriptions
Stanislav Frolov
Prateek Bansal
Jörn Hees
Andreas Dengel
VLM
27
5
0
05 Apr 2022
Disentangling Abstraction from Statistical Pattern Matching in Human and Machine Learning
Sreejan Kumar
Ishita Dasgupta
Nathaniel D. Daw
Jonathan Cohen
Thomas Griffiths
35
10
0
04 Apr 2022
IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning
Zhenhuan Liu
Jincan Deng
Liang Li
Shaofei Cai
Qianqian Xu
Shuhui Wang
Qingming Huang
49
21
0
02 Apr 2022
FindIt: Generalized Localization with Natural Language Queries
Weicheng Kuo
Fred Bertsch
Wei Li
A. Piergiovanni
M. Saffar
A. Angelova
ObjD
19
17
0
31 Mar 2022
Exploring Visual Prompts for Adapting Large-Scale Models
Hyojin Bahng
Ali Jahanian
S. Sankaranarayanan
Phillip Isola
VLM
VPVLM
LRM
27
256
0
31 Mar 2022
SimVQA: Exploring Simulated Environments for Visual Question Answering
Paola Cascante-Bonilla
Hui Wu
Letao Wang
Rogerio Feris
Vicente Ordonez
19
7
0
31 Mar 2022
FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations
Lingjie Mei
Jiayuan Mao
Ziqi Wang
Chuang Gan
J. Tenenbaum
VLM
29
21
0
30 Mar 2022
Large-scale Bilingual Language-Image Contrastive Learning
ByungSoo Ko
Geonmo Gu
VLM
34
14
0
28 Mar 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
39
136
0
26 Mar 2022
Learning Relational Rules from Rewards
Guillermo Puebla
L. Doumas
22
0
0
25 Mar 2022
Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
Juncheng Li
Junlin Xie
Long Qian
Linchao Zhu
Siliang Tang
Fei Wu
Yi Yang
Yueting Zhuang
Xinze Wang
44
73
0
24 Mar 2022
Complex Scene Image Editing by Scene Graph Comprehension
Zhongping Zhang
Huiwen He
Bryan A. Plummer
Z. Liao
Huayan Wang
DiffM
35
6
0
24 Mar 2022
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge Belongie
Bharath Hariharan
Ser-Nam Lim
VLM
VPVLM
45
1,538
0
23 Mar 2022
Test-time Adaptation with Slot-Centric Models
Mihir Prabhudesai
Anirudh Goyal
S. Paul
Sjoerd van Steenkiste
Mehdi S. M. Sajjadi
Gaurav Aggarwal
Thomas Kipf
Deepak Pathak
Katerina Fragkiadaki
TTA
26
9
0
21 Mar 2022
Discovering Objects that Can Move
Zhipeng Bao
P. Tokmakov
Allan Jabri
Yu-xiong Wang
Adrien Gaidon
M. Hebert
OCL
38
43
0
18 Mar 2022
Context-Dependent Anomaly Detection with Knowledge Graph Embedding Models
Nathan Vaska
Kevin J. Leahy
Victoria Helus
16
1
0
17 Mar 2022
Things not Written in Text: Exploring Spatial Commonsense from Visual Signals
Xiao Liu
Da Yin
Yansong Feng
Dongyan Zhao
LRM
26
45
0
15 Mar 2022
Can you even tell left from right? Presenting a new challenge for VQA
Sairaam Venkatraman
Rishi Rao
S. Balasubramanian
C. Vorugunti
R. R. Sarma
CoGe
13
0
0
15 Mar 2022
CARETS: A Consistency And Robustness Evaluative Test Suite for VQA
Carlos E. Jimenez
Olga Russakovsky
Karthik Narasimhan
CoGe
31
14
0
15 Mar 2022
REX: Reasoning-aware and Grounded Explanation
Shi Chen
Qi Zhao
30
18
0
11 Mar 2022
Towards Self-Supervised Learning of Global and Object-Centric Representations
Federico Baldassarre
Hossein Azizpour
SSL
3DPC
OCL
48
13
0
11 Mar 2022
Rethinking Task Sampling for Few-shot Vision-Language Transfer Learning
Zhenhailong Wang
Hangyeol Yu
Manling Li
Han Zhao
Heng Ji
VLM
36
0
0
09 Mar 2022
A Neuro-vector-symbolic Architecture for Solving Raven's Progressive Matrices
Michael Hersche
Mustafa Zeqiri
Luca Benini
Abu Sebastian
Abbas Rahimi
NAI
40
92
0
09 Mar 2022
Geodesic Multi-Modal Mixup for Robust Fine-Tuning
Changdae Oh
Junhyuk So
Hoyoon Byun
Yongtaek Lim
Minchul Shin
Jong-June Jeon
Kyungwoo Song
38
26
0
08 Mar 2022
Kubric: A scalable dataset generator
Klaus Greff
Francois Belletti
Lucas Beyer
Carl Doersch
Yilun Du
...
Ziyu Wang
Tianhao Wu
K. M. Yi
Fangcheng Zhong
Andrea Tagliasacchi
50
250
0
07 Mar 2022
Do Explanations Explain? Model Knows Best
Ashkan Khakzar
Pedram J. Khorsandi
Rozhin Nobahari
Nassir Navab
XAI
AAML
FAtt
11
23
0
04 Mar 2022
DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations
Yiwei Lyu
Paul Pu Liang
Zihao Deng
Ruslan Salakhutdinov
Louis-Philippe Morency
26
31
0
03 Mar 2022
There is a Time and Place for Reasoning Beyond the Image
Xingyu Fu
Ben Zhou
I. Chandratreya
Carl Vondrick
Dan Roth
89
20
0
01 Mar 2022
On Modality Bias Recognition and Reduction
Yangyang Guo
Liqiang Nie
Harry Cheng
Zhiyong Cheng
Mohan S. Kankanhalli
A. Bimbo
33
25
0
25 Feb 2022
Joint Answering and Explanation for Visual Commonsense Reasoning
Zhenyang Li
Yangyang Guo
Ke-Jyun Wang
Yin-wei Wei
Liqiang Nie
Mohan S. Kankanhalli
29
16
0
25 Feb 2022
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models
Spyridon Mouselinos
Henryk Michalewski
Mateusz Malinowski
26
3
0
24 Feb 2022
Learning Multi-Object Dynamics with Compositional Neural Radiance Fields
Danny Driess
Zhiao Huang
Yunzhu Li
Russ Tedrake
Marc Toussaint
OCL
AI4CE
122
85
0
24 Feb 2022
Improving Systematic Generalization Through Modularity and Augmentation
Laura Ruis
Brenden M. Lake
53
16
0
22 Feb 2022
A Review on Methods and Applications in Multimodal Deep Learning
Summaira Jabeen
Xi Li
Muhammad Shoib Amin
Abdul Jabbar
VLM
HAI
32
88
0
18 Feb 2022
Grammar-Based Grounded Lexicon Learning
Jiayuan Mao
Haoyue Shi
Jiajun Wu
R. Levy
J. Tenenbaum
NAI
31
14
0
17 Feb 2022
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
Priya Goyal
Quentin Duval
Isaac Seessel
Mathilde Caron
Ishan Misra
Levent Sagun
Armand Joulin
Piotr Bojanowski
VLM
SSL
28
110
0
16 Feb 2022
Saving Dense Retriever from Shortcut Dependency in Conversational Search
Sungdong Kim
Gangwoo Kim
30
27
0
15 Feb 2022
An experimental study of the vision-bottleneck in VQA
Pierre Marza
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
30
1
0
14 Feb 2022