CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

20 December 2016
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
    CoGe

Papers citing "CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning"

50 / 1,475 papers shown
Title
EgoTaskQA: Understanding Human Tasks in Egocentric Videos
Baoxiong Jia
Ting Lei
Song-Chun Zhu
Siyuan Huang
EgoV
37
61
0
08 Oct 2022
Promising or Elusive? Unsupervised Object Segmentation from Real-world Single Images
Yafei Yang
Bo Yang
OCL
111
17
0
05 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
40
16
0
05 Oct 2022
RankMe: Assessing the downstream performance of pretrained self-supervised representations by their rank
Q. Garrido
Randall Balestriero
Laurent Najman
Yann LeCun
SSL
68
74
0
05 Oct 2022
Differentiable Mathematical Programming for Object-Centric Representation Learning
Adeel Pervez
Phillip Lippe
E. Gavves
OCL
49
5
0
05 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
45
10
0
04 Oct 2022
Extending Compositional Attention Networks for Social Reasoning in Videos
Christina Sartzetaki
Georgios Paraskevopoulos
Alexandros Potamianos
LRM
31
3
0
03 Oct 2022
Enhancing Interpretability and Interactivity in Robot Manipulation: A Neurosymbolic Approach
Georgios Tziafas
Hamidreza Kasaei
LM&Ro
20
3
0
03 Oct 2022
Unsupervised Multi-View Object Segmentation Using Radiance Field Propagation
Xinhang Liu
Jiaben Chen
Huai Yu
Yu-Wing Tai
Chi-Keung Tang
95
28
0
02 Oct 2022
Multimodal Analogical Reasoning over Knowledge Graphs
Ningyu Zhang
Lei Li
Xiang Chen
Xiaozhuan Liang
Shumin Deng
Huajun Chen
62
26
0
01 Oct 2022
Compositional Semantic Parsing with Large Language Models
Andrew Drozdov
Nathanael Schärli
Ekin Akyürek
Nathan Scales
Xinying Song
Xinyun Chen
Olivier Bousquet
Denny Zhou
ReLM
LRM
208
92
0
29 Sep 2022
A Multiagent Framework for the Asynchronous and Collaborative Extension of Multitask ML Systems
Andrea Gesmundo
29
2
0
29 Sep 2022
On the visual analytic intelligence of neural networks
Stanisław Woźniak
Hlynur Jónsson
G. Cherubini
A. Pantazi
E. Eleftheriou
25
0
0
28 Sep 2022
Towards Faithful Model Explanation in NLP: A Survey
Qing Lyu
Marianna Apidianaki
Chris Callison-Burch
XAI
120
110
0
22 Sep 2022
Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
Hao Li
Jinfa Huang
Peng Jin
Guoli Song
Qi Wu
Jie Chen
44
21
0
21 Sep 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
211
1,134
0
20 Sep 2022
A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems
Andrea Gesmundo
21
18
0
15 Sep 2022
The Embeddings World and Artificial General Intelligence
M. H. Chehreghani
19
1
0
14 Sep 2022
StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation
A. Maharana
Darryl Hannan
Joey Tianyi Zhou
DiffM
39
78
0
13 Sep 2022
MaXM: Towards Multilingual Visual Question Answering
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
33
8
0
12 Sep 2022
Ask Before You Act: Generalising to Novel Environments by Asking Questions
Ross Murphy
S. Mosesov
Javier Leguina Peral
Thymo ter Doest
LRM
32
0
0
10 Sep 2022
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
18
63
0
07 Sep 2022
Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit
G. Sejnova
M. Vavrecka
Karla Stepanova
VGen
28
0
0
07 Sep 2022
Trust in Language Grounding: a new AI challenge for human-robot teams
David M. Bossens
C. Evers
42
1
0
05 Sep 2022
Injecting Image Details into CLIP's Feature Space
Zilun Zhang
Cuifeng Shen
Yuan-Chung Shen
Huixin Xiong
Xinyu Zhou
VLM
CLIP
32
0
0
31 Aug 2022
Shaken, and Stirred: Long-Range Dependencies Enable Robust Outlier Detection with PixelCNN++
Barath Mohan Umapathi
Kushal Chauhan
Pradeep Shenoy
D. Sridharan
37
0
0
29 Aug 2022
LogicRank: Logic Induced Reranking for Generative Text-to-Image Systems
Björn Deiseroth
P. Schramowski
Hikaru Shindo
Devendra Singh Dhami
Kristian Kersting
EGVM
DiffM
24
1
0
29 Aug 2022
Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task
Stan Weixian Lei
Difei Gao
Jay Zhangjie Wu
Yuxuan Wang
Wei Liu
Meng Zhang
Mike Zheng Shou
25
35
0
24 Aug 2022
Neuro-Symbolic Visual Dialog
Adnen Abdessaied
Mihai Bâce
Andreas Bulling
NAI
21
3
0
22 Aug 2022
ILLUME: Rationalizing Vision-Language Models through Human Interactions
Manuel Brack
P. Schramowski
Björn Deiseroth
Kristian Kersting
VLM
MLLM
27
3
0
17 Aug 2022
Patching open-vocabulary models by interpolating weights
Gabriel Ilharco
Mitchell Wortsman
S. Gadre
Shuran Song
Hannaneh Hajishirzi
Simon Kornblith
Ali Farhadi
Ludwig Schmidt
VLM
KELM
37
169
0
10 Aug 2022
CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning
Adam Dahlgren Lindström
Savitha Sam Abraham
19
50
0
10 Aug 2022
ChiQA: A Large Scale Image-based Real-World Question Answering Dataset for Multi-Modal Understanding
Bingning Wang
Feiya Lv
Ting Yao
Yiming Yuan
Jin Ma
Yu Luo
Haijin Liang
31
3
0
05 Aug 2022
Generative Bias for Robust Visual Question Answering
Jae-Won Cho
Dong-Jin Kim
H. Ryu
In So Kweon
OOD
CML
41
19
0
01 Aug 2022
Testing Relational Understanding in Text-Guided Image Generation
C. Conwell
T. Ullman
EGVM
160
65
0
29 Jul 2022
DoRO: Disambiguation of referred object for embodied agents
Pradip Pramanick
Chayan Sarkar
S. Paul
R. Roychoudhury
Brojeshwar Bhowmick
LM&Ro
20
14
0
28 Jul 2022
Unit Testing for Concepts in Neural Networks
Charles Lovering
Ellie Pavlick
25
28
0
28 Jul 2022
Break and Make: Interactive Structural Understanding Using LEGO Bricks
Aaron Walsman
Muru Zhang
Klemen Kotar
Karthik Desingh
Ali Farhadi
Dieter Fox
40
10
0
27 Jul 2022
Neural Groundplans: Persistent Neural Scene Representations from a Single Image
Prafull Sharma
A. Tewari
Yilun Du
Sergey Zakharov
Rares Andrei Ambrus
Adrien Gaidon
William T. Freeman
F. Durand
J. Tenenbaum
Vincent Sitzmann
SSL
OCL
29
16
0
22 Jul 2022
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Andrew M. Saxe
Shagun Sodhani
Sam Lewallen
AI4CE
32
34
0
21 Jul 2022
Semantic-aware Modular Capsule Routing for Visual Question Answering
Yudong Han
Jianhua Yin
Jianlong Wu
Yin-wei Wei
Liqiang Nie
35
7
0
21 Jul 2022
Semantic uncertainty intervals for disentangled latent spaces
S. Sankaranarayanan
Anastasios Nikolas Angelopoulos
Stephen Bates
Yaniv Romano
Phillip Isola
UQCV
45
21
0
20 Jul 2022
Rethinking Data Augmentation for Robust Visual Question Answering
Long Chen
Yuhang Zheng
Jun Xiao
OOD
37
42
0
18 Jul 2022
Semantic Novelty Detection via Relational Reasoning
Francesco Cappio Borlino
S. Bucci
Tatiana Tommasi
17
4
0
18 Jul 2022
Sparse Relational Reasoning with Object-Centric Representations
Alex F Spies
Alessandra Russo
Murray Shanahan
OCL
NAI
25
3
0
15 Jul 2022
Convolutional Bypasses Are Better Vision Transformer Adapters
Shibo Jie
Zhi-Hong Deng
VPVLM
21
132
0
14 Jul 2022
3D Concept Grounding on Neural Fields
Yining Hong
Yilun Du
Chun-Tse Lin
J. Tenenbaum
Chuang Gan
29
19
0
13 Jul 2022
Fine-grained Activities of People Worldwide
J. Byrne
Greg Castañón
Zhongheng Li
G. Ettinger
24
3
0
11 Jul 2022
CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination
Hyounghun Kim
Abhaysinh Zala
Joey Tianyi Zhou
22
6
0
08 Jul 2022
Knowing Earlier what Right Means to You: A Comprehensive VQA Dataset for Grounding Relative Directions via Multi-Task Learning
Kyra Ahrens
Matthias Kerzel
Jae Hee Lee
C. Weber
S. Wermter
21
0
0
06 Jul 2022