2406.09455
Cited By
Pandora: Towards General World Model with Natural Language Actions and Video States
12 June 2024
Jiannan Xiang
Guangyi Liu
Yi Gu
Qiyue Gao
Yuting Ning
Yuheng Zha
Zeyu Feng
Tianhua Tao
Shibo Hao
Yemin Shi
Zhengzhong Liu
Eric P. Xing
Zhiting Hu
VGen
HuggingFace (15 upvotes)
Papers citing
"Pandora: Towards General World Model with Natural Language Actions and Video States"
47 / 47 papers shown
PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models
Zeqing Wang
Keze Wang
Lei Zhang
VGen
104
0
0
01 Dec 2025
SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds
J. Ren
Yan Zhuang
Xiaokang Ye
Lingjun Mao
Xuhong He
...
Xianrui Zhong
Ziqiao Ma
Tianmin Shu
Zhiting Hu
Lianhui Qin
LLMAG
VGen
212
1
0
30 Nov 2025
In-Video Instructions: Visual Signals as Generative Control
Gongfan Fang
Xinyin Ma
Xinchao Wang
VGen
76
0
0
24 Nov 2025
Counterfactual World Models via Digital Twin-conditioned Video Diffusion
Yiqing Shen
Aiza Maksutova
Chenjia Li
Mathias Unberath
DiffM
VGen
157
0
0
21 Nov 2025
Towards High-Consistency Embodied World Model with Multi-View Trajectory Videos
Taiyi Su
Jian Zhu
Yaxuan Li
Chong Ma
Zitai Huang
Yichen Zhu
Hanli Wang
VGen
250
0
0
17 Nov 2025
Simulating the Visual World with Artificial Intelligence: A Roadmap
Jingtong Yue
Z. Huang
Z. Chen
Xintao Wang
Pengfei Wan
Ziwei Liu
VGen
LM&Ro
387
0
0
11 Nov 2025
A Step Toward World Models: A Survey on Robotic Manipulation
Peng-Fei Zhang
Ying Cheng
Xiaofan Sun
S. Wang
Lei Zhu
Lei Zhu
Heng Tao Shen
LM&Ro
662
2
0
31 Oct 2025
A Comprehensive Survey on World Models for Embodied AI
Xinqing Li
Xin He
Le Zhang
Yun-Hai Liu
Xiaoli Li
Yun-Hai Liu
VGen
LM&Ro
SyDa
228
2
0
19 Oct 2025
Terra: Explorable Native 3D World Model with Point Latents
Yuanhui Huang
Weiliang Chen
Wenzhao Zheng
Xin Tao
Pengfei Wan
Jie Zhou
Jiwen Lu
VGen
114
0
0
16 Oct 2025
MorphoSim: An Interactive, Controllable, and Editable Language-guided 4D World Simulator
Xuehai He
Shijie Zhou
Thivyanth Venkateswaran
Kaizhi Zheng
Ziyu Wan
A. Kadambi
Xin Eric Wang
VGen
SyDa
AI4CE
152
0
0
05 Oct 2025
VIVA+: Human-Centered Situational Decision-Making
Zhe Hu
Yixiao Ren
Guanzhong Liu
Jing Li
Yu Yin
LRM
103
0
0
28 Sep 2025
Learning Primitive Embodied World Models: Towards Scalable Robotic Learning
Qiao Sun
Liujia Yang
Wei Tang
Wei Huang
Kaixin Xu
...
Tong He
Yilun Chen
Xili Dai
Nanyang Ye
Qinying Gu
VGen
LM&Ro
389
1
0
28 Aug 2025
Critiques of World Models
Eric P. Xing
Mingkai Deng
Jinyu Hou
Zhiting Hu
SyDa
198
6
0
07 Jul 2025
GenWorld: Towards Detecting AI-generated Real-world Simulation Videos
Weiliang Chen
Wenzhao Zheng
Yu Zheng
Lei Chen
Jie Zhou
Jiwen Lu
Yueqi Duan
VGen
288
3
0
12 Jun 2025
Long-Context State-Space Video World Models
Ryan Po
Yotam Nitzan
Richard Zhang
Berlin Chen
Tri Dao
Eli Shechtman
Gordon Wetzstein
Xun Huang
308
24
0
26 May 2025
DreamGen: Unlocking Generalization in Robot Learning through Video World Models
Joel Jang
Seonghyeon Ye
Zongyu Lin
Jiannan Xiang
Johan Bjorck
...
Dieter Fox
Jan Kautz
Scott Reed
Yuke Zhu
Linxi Fan
VGen
OffRL
AI4TS
370
0
0
19 May 2025
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
Junhao Cheng
Yuying Ge
Yixiao Ge
Jing Liao
Mingyu Ding
VGen
AI4CE
404
5
0
01 Apr 2025
WorldScore: A Unified Evaluation Benchmark for World Generation
Haoyi Duan
Hong-Xing Yu
Sirui Chen
L. Fei-Fei
Jiajun Wu
VGen
385
42
0
01 Apr 2025
VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior
Xindi Yang
Baolu Li
Yanzhe Zhang
Zhenfei Yin
Lei Bai
...
Zhiyong Wang
Jianfei Cai
Tien-Tsin Wong
Huchuan Lu
Xu Jia
DiffM
VGen
476
13
0
30 Mar 2025
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
Computer Vision and Pattern Recognition (CVPR), 2025
Qingqing Zhao
Yao Lu
Moo Jin Kim
Zipeng Fu
Zhuoyang Zhang
...
Ankur Handa
Xuan Li
Donglai Xiang
Gordon Wetzstein
Nayeon Lee
LM&Ro
LRM
331
192
0
27 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
529
31
0
24 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
525
379
0
18 Mar 2025
WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation
Jing Wang
Ao Ma
Ke Cao
Jun Zheng
Zhanjie Zhang
...
Yuhang Ma
Bo Cheng
Dawei Leng
Yuhui Yin
Xiaodan Liang
VGen
312
29
0
11 Mar 2025
WorldModelBench: Judging Video Generation Models As World Models
Dacheng Li
Yunhao Fang
Yukang Chen
Shuo Yang
Shiyi Cao
...
Hongxu Yin
Alfons Kemper
Ion Stoica
Enze Xie
Yaojie Lu
VGen
210
25
0
28 Feb 2025
Learning Human Skill Generators at Key-Step Levels
Yilu Wu
Chenhui Zhu
Shuai Wang
Hanlin Wang
Jing Wang
Zhaoxiang Zhang
Limin Wang
VGen
382
1
0
12 Feb 2025
DMWM: Dual-Mind World Model with Long-Term Imagination
Lingyi Wang
Rashed Shelim
Walid Saad
Naren Ramakrishnan
LRM
991
4
0
11 Feb 2025
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
Hangliang Ding
Dacheng Li
Runlong Su
Peiyuan Zhang
Zhijie Deng
Eric Liang
Hao Zhang
VGen
345
15
0
10 Feb 2025
Pre-Trained Video Generative Models as World Simulators
Haoran He
Yang Zhang
Guanbin Li
Zhihao Xu
Ling Pan
VGen
364
21
0
10 Feb 2025
DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers
Yuntao Chen
Yuqi Wang
Rundong Wang
977
40
0
24 Dec 2024
Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Yiping Wang
Xuehai He
Kuan-Chieh Wang
Luyao Ma
Jianwei Yang
Shuohang Wang
Simon Shaolei Du
Yelong Shen
VGen
322
9
0
17 Dec 2024
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
Computer Vision and Pattern Recognition (CVPR), 2024
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Pedro M B Rezende
Yasaman Haghighi
...
Mathieu Salzmann
Davide Scaramuzza
Marc Pollefeys
Paolo Favaro
Alexandre Alahi
VLM
VGen
310
38
0
15 Dec 2024
The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control
Ruili Feng
Han Zhang
Zhantao Yang
Jie Xiao
Zhilei Shu
Zhiheng Liu
Andy Zheng
Yukun Huang
Yu Liu
Han Zhang
VGen
270
45
0
04 Dec 2024
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
Computer Vision and Pattern Recognition (CVPR), 2024
Chaojun Ni
Guosheng Zhao
Xiaofeng Wang
Zheng Hua Zhu
Wenkang Qin
...
Kun Zhan
Fu Liu
Xianpeng Lang
Xingang Wang
Wenjun Mei
VGen
806
51
0
29 Nov 2024
Understanding World or Predicting Future? A Comprehensive Survey of World Models
ACM Computing Surveys (ACM CSUR), 2024
Jingtao Ding
Yunke Zhang
Yu Shang
Yuheng Zhang
Zefang Zong
...
Fengli Xu
Yong Li
Chen Gao
Fengli Xu
Yong Li
VGen
SyDa
477
71
0
21 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
482
38
0
08 Nov 2024
GameGen-X: Interactive Open-world Game Video Generation
International Conference on Learning Representations (ICLR), 2024
Haoxuan Che
Xuanhua He
Quande Liu
Cheng Jin
Hao Chen
VGen
361
64
0
01 Nov 2024
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
International Conference on Learning Representations (ICLR), 2024
Yining Hong
Beide Liu
Maxine Wu
Yuanhao Zhai
Kai-Wei Chang
...
Chung-Ching Lin
Jianfeng Wang
Zhiyong Yang
Yingnian Wu
Lijuan Wang
VGen
258
17
0
30 Oct 2024
Multi-Task Interactive Robot Fleet Learning with Visual World Models
Conference on Robot Learning (CoRL), 2024
Huihan Liu
Yu Zhang
Vaarij Betala
Evan Zhang
James Liu
Crystal Ding
Yinlin Zhu
310
18
0
30 Oct 2024
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
Computer Vision and Pattern Recognition (CVPR), 2024
Guosheng Zhao
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
Xinming Zhang
...
Xinze Chen
Boyuan Wang
Youyi Zhang
Wenjun Mei
Xingang Wang
VGen
529
79
0
17 Oct 2024
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation
Fanqing Meng
Jiaqi Liao
Xinyu Tan
Wenqi Shao
Quanfeng Lu
Kaipeng Zhang
Yu Cheng
Dianqi Li
Yu Qiao
Ping Luo
VGen
EGVM
218
64
0
07 Oct 2024
ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction
Hyungjin Chung
Dohun Lee
Jong Chul Ye
VGen
DiffM
183
2
0
07 Oct 2024
AVID: Adapting Video Diffusion Models to World Models
Marc Rigter
Tarun Gupta
Agrin Hilmkil
Chao Ma
VGen
274
18
0
01 Oct 2024
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
Yatian Wang
Yatian Wang
Aosong Cheng
Pengjun Fang
Zeyue Tian
...
Wenhan Luo
Qifeng Chen
Shanghang Zhang
Qi-fei Liu
Yi-Ting Guo
273
8
0
30 Jul 2024
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Zehua Wang
Weixing Chen
Yongjie Bai
Xiaodan Liang
Guanbin Li
Wen Gao
Liang Lin
LM&Ro
SyDa
AI4CE
597
180
0
09 Jul 2024
Learning Action and Reasoning-Centric Image Editing from Videos and Simulations
Benno Krojer
Dheeraj Vattikonda
Luis Lara
Varun Jampani
Eva Portelance
Christopher Pal
Siva Reddy
EGVM
VGen
318
17
0
03 Jul 2024
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
Xuehai He
Weixi Feng
Kaizhi Zheng
Yujie Lu
Wanrong Zhu
...
Zhengyuan Yang
Kevin Lin
William Yang Wang
Lijuan Wang
Xin Eric Wang
VGen
LRM
578
33
0
12 Jun 2024
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang
Zeyuan Wang
Qiushi Lyu
Zheyuan Zhang
Sunli Chen
Tianmin Shu
Yilun Du
Kwonjoon Lee
Yilun Du
Chuang Gan
397
32
0
16 Apr 2024