1910.01442
CLEVRER: CoLlision Events for Video REpresentation and Reasoning
3 October 2019
Kexin Yi
Chuang Gan
Yunzhu Li
Pushmeet Kohli
Jiajun Wu
Antonio Torralba
J. Tenenbaum
NAI
Papers citing
"CLEVRER: CoLlision Events for Video REpresentation and Reasoning"
50 / 74 papers shown
Title
NSFlow: An End-to-End FPGA Framework with Scalable Dataflow Architecture for Neuro-Symbolic AI
Hanchen Yang
Zishen Wan
Ritik Raj
Joongun Park
Ziwei Li
A. Samajdar
A. Raychowdhury
Tushar Krishna
19
0
0
27 Apr 2025
How Can Objects Help Video-Language Understanding?
Zitian Tang
Shijie Wang
Junho Cho
Jaewook Yoo
Chen Sun
40
0
0
10 Apr 2025
When Counterfactual Reasoning Fails: Chaos and Real-World Complexity
Yahya Aalaila
Gerrit Großmann
Sumantrak Mukherjee
Jonas Wahl
Sebastian Vollmer
CML
LRM
52
0
0
31 Mar 2025
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering
Md. Mohaiminul Islam
Tushar Nagarajan
Huiyu Wang
Gedas Bertasius
Lorenzo Torresani
138
0
0
12 Mar 2025
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
Qingpei Guo
Kaiyou Song
Zipeng Feng
Ziping Ma
Qinglong Zhang
...
Yunxiao Sun
Tai-Wei Chang
Jingdong Chen
Ming Yang
Jun Zhou
MLLM
VLM
82
3
0
26 Feb 2025
Can Hallucination Correction Improve Video-Language Alignment?
Lingjun Zhao
Mingyang Xie
Paola Cascante-Bonilla
Hal Daumé III
Kwonjoon Lee
HILM
VLM
57
0
0
20 Feb 2025
When language and vision meet road safety: leveraging multimodal large language models for video-based traffic accident analysis
Ruixuan Zhang
Beichen Wang
Juexiao Zhang
Zilin Bian
Chen Feng
K. Ozbay
39
2
0
17 Jan 2025
TimeLogic: A Temporal Logic Benchmark for Video QA
S. Swetha
Hilde Kuehne
Mubarak Shah
41
1
0
13 Jan 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao
Shiqian Su
X. Zhu
Chenyu Zhang
Zhe Chen
...
Wenhai Wang
Lewei Lu
Gao Huang
Yu Qiao
Jifeng Dai
MLLM
VLM
104
2
0
20 Dec 2024
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
105
3
0
16 Dec 2024
Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events
Aditya Chinchure
Sahithya Ravi
R. Ng
Vered Shwartz
Boyang Albert Li
Leonid Sigal
ReLM
LRM
VLM
77
2
0
07 Dec 2024
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM
LRM
71
2
0
20 Nov 2024
FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo
Firas Laakom
Yucheng Xu
Wenyi Wang
Jürgen Schmidhuber
AI4TS
119
0
0
28 Oct 2024
TRACE: Temporal Grounding Video LLM via Causal Event Modeling
Yongxin Guo
Jingyu Liu
Mingda Li
Xiaoying Tang
Qingbin Liu
Xiaoying Tang
37
14
0
08 Oct 2024
QTG-VQA: Question-Type-Guided Architectural for VideoQA Systems
Zhixian He
Pengcheng Zhao
Fuwei Zhang
Shujin Lin
36
0
0
14 Sep 2024
Tarsier: Recipes for Training and Evaluating Large Video Description Models
Jiawei Wang
Liping Yuan
Yuchen Zhang
33
52
0
30 Jun 2024
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Xingrui Wang
Wufei Ma
Angtian Wang
Shuo Chen
Adam Kortylewski
Alan L. Yuille
29
3
0
02 Jun 2024
STAR: A Benchmark for Situated Reasoning in Real-World Videos
Bo Wu
Shoubin Yu
Zhenfang Chen
Joshua B Tenenbaum
Chuang Gan
33
176
0
15 May 2024
Unsupervised Dynamics Prediction with Object-Centric Kinematics
Yeon-Ji Song
Suhyung Choi
Jaein Kim
Jin-Hwa Kim
Byoung-Tak Zhang
36
0
0
29 Apr 2024
A Survey on the Integration of Generative AI for Critical Thinking in Mobile Networks
Athanasios Karapantelakis
Alexandros Nikou
Ajay Kattepur
Jean Martins
Leonid Mokrushin
S. Mohalik
Marin Orlic
Aneta Vulgarakis Feljan
24
1
0
10 Apr 2024
TempCompass: Do Video LLMs Really Understand Videos?
Yuanxin Liu
Shicheng Li
Yi Liu
Yuxiang Wang
Shuhuai Ren
Lei Li
Sishuo Chen
Xu Sun
Lu Hou
VLM
41
98
0
01 Mar 2024
OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog
Adnen Abdessaied
Manuel von Hochmeister
Andreas Bulling
35
2
0
20 Feb 2024
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
...
Jilan Xu
Guo Chen
Ping Luo
Limin Wang
Yu Qiao
VLM
MLLM
56
398
0
28 Nov 2023
3D-Aware Visual Question Answering about Parts, Poses and Occlusions
Xingrui Wang
Wufei Ma
Zhuowan Li
Adam Kortylewski
Alan L. Yuille
CoGe
19
12
0
27 Oct 2023
DetermiNet: A Large-Scale Diagnostic Dataset for Complex Visually-Grounded Referencing using Determiners
Clarence Lee
M Ganesh Kumar
Cheston Tan
28
3
0
07 Sep 2023
Does Visual Pretraining Help End-to-End Reasoning?
Chen Sun
Calvin Luo
Xingyi Zhou
Anurag Arnab
Cordelia Schmid
OCL
LRM
ViT
30
3
0
17 Jul 2023
Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text
Zhun Yang
Adam Ishay
Joohyung Lee
LRM
ELM
26
50
0
15 Jul 2023
Learning Differentiable Logic Programs for Abstract Visual Reasoning
Hikaru Shindo
Viktor Pfanschilling
D. Dhami
Kristian Kersting
NAI
19
6
0
03 Jul 2023
Physics-Informed Computer Vision: A Review and Perspectives
C. Banerjee
Kien Nguyen
Clinton Fookes
G. Karniadakis
PINN
AI4CE
30
28
0
29 May 2023
Reusable Slotwise Mechanisms
Trang Nguyen
Amin Mansouri
Kanika Madan
Khuong N. Nguyen
Kartik Ahuja
Dianbo Liu
Yoshua Bengio
OCL
22
4
0
21 Feb 2023
Evaluating Temporal Observation-Based Causal Discovery Techniques Applied to Road Driver Behaviour
Rhys Howard
Lars Kunze
CML
23
7
0
31 Jan 2023
Integrating Earth Observation Data into Causal Inference: Challenges and Opportunities
Connor Jerzak
Fredrik D. Johansson
Adel Daoud
CML
28
11
0
30 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Yikang Shen
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
31
35
0
12 Jan 2023
Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs
Osman Ulger
Julian Wiederer
Mohsen Ghafoorian
Vasileios Belagiannis
Pascal Mettes
35
0
0
06 Dec 2022
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
Zhuowan Li
Xingrui Wang
Elias Stengel-Eskin
Adam Kortylewski
Wufei Ma
Benjamin Van Durme
Alan L. Yuille
OOD
LRM
21
57
0
01 Dec 2022
On the Learning Mechanisms in Physical Reasoning
Shiqian Li
Ke Wu
Chi Zhang
Yixin Zhu
AI4CE
44
13
0
05 Oct 2022
Entropy-driven Unsupervised Keypoint Representation Learning in Videos
A. Younes
Simone Schaub-Meyer
Georgia Chalvatzaki
SSL
19
0
0
30 Sep 2022
Neuro-Symbolic Learning: Principles and Applications in Ophthalmology
Muhammad Hassan
Haifei Guan
Aikaterini Melliou
Yuqi Wang
Qianhui Sun
...
Qi Huang
Jiefu Tan
Qinwang Xing
Peiwu Qin
Dongmei Yu
NAI
29
14
0
31 Jul 2022
Is an Object-Centric Video Representation Beneficial for Transfer?
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
ViT
31
26
0
20 Jul 2022
Reasoning about Actions over Visual and Linguistic Modalities: A Survey
Shailaja Keyur Sampat
Maitreya Patel
Subhasish Das
Yezhou Yang
Chitta Baral
ReLM
LM&Ro
LRM
19
12
0
15 Jul 2022
Interactive Visual Reasoning under Uncertainty
Manjie Xu
Guangyuan Jiang
Wei Liang
Song-Chun Zhu
Yixin Zhu
LRM
42
5
0
18 Jun 2022
VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
Kai Zheng
Xiaotong Chen
Odest Chadwicke Jenkins
X. Wang
LM&Ro
CoGe
9
54
0
17 Jun 2022
Image-based Treatment Effect Heterogeneity
Connor Jerzak
Fredrik D. Johansson
Adel Daoud
19
20
0
13 Jun 2022
Revisiting the "Video" in Video-Language Understanding
S. Buch
Cristobal Eyzaguirre
Adrien Gaidon
Jiajun Wu
L. Fei-Fei
Juan Carlos Niebles
27
155
0
03 Jun 2022
Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions
Huaizu Jiang
Xiaojian Ma
Weili Nie
Zhiding Yu
Yuke Zhu
Song-Chun Zhu
Anima Anandkumar
VLM
26
36
0
27 May 2022
Learning What and Where: Disentangling Location and Identity Tracking Without Supervision
Manuel Traub
S. Otte
Tobias Menge
Matthias Karlbauer
Jannik Thummel
Martin Volker Butz
21
20
0
26 May 2022
When Physics Meets Machine Learning: A Survey of Physics-Informed Machine Learning
Chuizheng Meng
Sungyong Seo
Defu Cao
Sam Griesemer
Yan Liu
PINN
AI4CE
34
55
0
31 Mar 2022
PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning
Yining Hong
Li Yi
J. Tenenbaum
Antonio Torralba
Chuang Gan
9
39
0
09 Dec 2021
Conditional Object-Centric Learning from Video
Thomas Kipf
Gamaleldin F. Elsayed
Aravindh Mahendran
Austin Stone
S. Sabour
G. Heigold
Rico Jonschkowski
Alexey Dosovitskiy
Klaus Greff
OCL
39
214
0
24 Nov 2021
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Mingyu Ding
Zhenfang Chen
Tao Du
Ping Luo
J. Tenenbaum
Chuang Gan
VGen
PINN
OCL
24
74
0
28 Oct 2021