CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning

10 October 2019

Papers citing "CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning"

Title
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition Otto Brookes Maksim Kukushkin Majid Mirmehdi Colleen Stephens Paula Dieguez ... Lukas Boesch Thomas Schmid M. Arandjelovic H. Kühl T. Burghardt 41 0 0 28 Feb 2025
Do Language Models Understand Time? Xi Ding Lei Wang 149 0 0 18 Dec 2024
Object-Attribute-Relation Representation Based Video Semantic Communication Qiyuan Du Yiping Duan Qianqian Yang Xiaoming Tao Mérouane Debbah 34 2 0 15 Jun 2024
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering Xingrui Wang Wufei Ma Angtian Wang Shuo Chen Adam Kortylewski Alan L. Yuille 24 3 0 02 Jun 2024
STAR: A Benchmark for Situated Reasoning in Real-World Videos Bo Wu Shoubin Yu Zhenfang Chen Joshua B Tenenbaum Chuang Gan 17 176 0 15 May 2024
TempCompass: Do Video LLMs Really Understand Videos? Yuanxin Liu Shicheng Li Yi Liu Yuxiang Wang Shuhuai Ren Lei Li Sishuo Chen Xu Sun Lu Hou VLM 30 98 0 01 Mar 2024
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos Zhicheng Zheng Xin Yan Zhenfang Chen Jingzhou Wang Qin Zhi Eddie Lim Joshua B. Tenenbaum Chuang Gan LRM 12 6 0 09 Feb 2024
VONet: Unsupervised Video Object Learning With Parallel U-Net Attention and Object-wise Sequential VAE Haonan Yu Wei Xu ViT 17 1 0 20 Jan 2024
Learning to Visually Connect Actions and their Effects Eric Peh Paritosh Parmar Basura Fernando 14 2 0 19 Jan 2024
Learning Object Permanence from Videos via Latent Imaginations Manuel Traub Frederic Becker S. Otte Martin Volker Butz 12 1 0 16 Oct 2023
STUPD: A Synthetic Dataset for Spatial and Temporal Relation Reasoning Palaash Agrawal Haidi Azaman Cheston Tan 22 3 0 13 Sep 2023
Does Visual Pretraining Help End-to-End Reasoning? Chen Sun Calvin Luo Xingyi Zhou Anurag Arnab Cordelia Schmid OCL LRM ViT 15 3 0 17 Jul 2023
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review Hao Wang Bin Guo Y. Zeng Yasan Ding Chen Qiu Ying Zhang Li Yao Zhiwen Yu 17 2 0 02 Jul 2022
SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos Gamaleldin F. Elsayed Aravindh Mahendran Sjoerd van Steenkiste Klaus Greff Michael C. Mozer Thomas Kipf VOS OCL 21 136 0 15 Jun 2022
Simple Unsupervised Object-Centric Learning for Complex and Naturalistic Videos Gautam Singh Yi-Fu Wu Sungjin Ahn OCL 10 113 0 27 May 2022
Conditional Object-Centric Learning from Video Thomas Kipf Gamaleldin F. Elsayed Aravindh Mahendran Austin Stone S. Sabour G. Heigold Rico Jonschkowski Alexey Dosovitskiy Klaus Greff OCL 24 213 0 24 Nov 2021
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language Mingyu Ding Zhenfang Chen Tao Du Ping Luo J. Tenenbaum Chuang Gan VGen PINN OCL 8 74 0 28 Oct 2021
Causal Discovery from Conditionally Stationary Time Series Carles Balsells-Rodas Ruibo Tu Hedvig Kjellström Yingzhen Li Gabriele Schweikert BDL CML AI4TS 24 5 0 12 Oct 2021
Capturing the objects of vision with neural networks B. Peters N. Kriegeskorte OCL 8 56 0 07 Sep 2021
Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations Wolfgang Stammer P. Schramowski Kristian Kersting FAtt 6 105 0 25 Nov 2020
Learning Object Permanence from Video Aviv Shamsian Ofri Kleinfeld Amir Globerson Gal Chechik SSL 18 31 0 23 Mar 2020