2403.05131
Sora as a World Model? A Complete Survey on Text-to-Video Generation
8 March 2024
Joseph Cho
Fachrina Dewi Puspitasari
Sheng Zheng
Jingyao Zheng
Noor Ul Eman
Tae-Ho Kim
Choong Seon Hong
Chaoning Zhang
Lik-Hang Lee
Caiyan Qin
Yang Yang
Heng Tao Shen
EGVM
VGen
Github (3579★)
Papers citing
"Sora as a World Model? A Complete Survey on Text-to-Video Generation"
39 / 39 papers shown
Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos
Xavier Thomas
Youngsun Lim
Ananya Srinivasan
Audrey Zheng
Deepti Ghadiyaram
EGVM
VGen
322
0
0
01 Dec 2025
PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models
Zeqing Wang
Keze Wang
Lei Zhang
VGen
139
0
0
01 Dec 2025
Counterfactual World Models via Digital Twin-conditioned Video Diffusion
Yiqing Shen
Aiza Maksutova
Chenjia Li
Mathias Unberath
DiffM
VGen
165
0
0
21 Nov 2025
VEIL: Jailbreaking Text-to-Video Models via Visual Exploitation from Implicit Language
Zonghao Ying
Moyang Chen
Nizhang Li
Zhiqiang Wang
Wenxin Zhang
Quanchen Zou
Zonglei Jing
Aishan Liu
Xianglong Liu
128
0
0
17 Nov 2025
PipeDiT: Accelerating Diffusion Transformers in Video Generation with Task Pipelining and Model Decoupling
S. Wang
Qiang Wang
Shaohuai Shi
VGen
134
0
0
15 Nov 2025
Simulating the Visual World with Artificial Intelligence: A Roadmap
Jingtong Yue
Z. Huang
Z. Chen
Xintao Wang
Pengfei Wan
Ziwei Liu
VGen
LM&Ro
474
1
0
11 Nov 2025
Embodied AI: From LLMs to World Models
Tongtong Feng
Xin Wang
Yu Jiang
Wenwu Zhu
LM&Ro
340
11
0
24 Sep 2025
InPhyRe Discovers: Large Multimodal Models Struggle in Inductive Physical Reasoning
Gautam Sreekumar
Vishnu Boddeti
ReLM
LRM
147
0
0
12 Sep 2025
Video Understanding by Design: How Datasets Shape Architectures and Insights
Lei Wang
Piotr Koniusz
Yongsheng Gao
3DV
VGen
AI4TS
238
0
0
11 Sep 2025
CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models
Xiaoxue Wu
Bingjie Gao
Yu Qiao
Yaohui Wang
Xinyuan Chen
DiffM
VGen
198
5
0
15 Aug 2025
NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation
X. Feng
H. Yu
M. Wu
Shuyan Hu
J. Chen
C. Zhu
J. Wu
X. Chu
K. Huang
DiffM
EGVM
VGen
568
6
0
15 Jul 2025
Vision Technologies with Applications in Traffic Surveillance Systems: A Holistic Survey
ACM Computing Surveys (ACM CSUR), 2024
Wei Zhou
Lei Zhao
Runyu Zhang
Yifan Cui
Hongpu Huang
Kun Qie
Chen Wang
AI4TS
513
8
0
01 Jul 2025
G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models
Tianjiao Zhang
Fei Zhang
Jiangchao Yao
Ya Zhang
Yanfeng Wang
DiffM
342
4
0
02 Jun 2025
MOVi: Training-free Text-conditioned Multi-Object Video Generation
Aimon Rahman
Jiang Liu
Ze Wang
Ximeng Sun
Jialian Wu
Xiaodong Yu
Yusheng Su
Vishal M. Patel
Zicheng Liu
Emad Barsoum
DiffM
VGen
275
1
0
29 May 2025
A Challenge to Build Neuro-Symbolic Video Agents
Sahil Shah
Harsh Goel
Sai Shankar Narasimhan
Minkyu Choi
S P Sharan
Oguzhan Akcin
Sandeep Chinchali
AI4TS
268
1
0
20 May 2025
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
Minkyu Choi
S P Sharan
Harsh Goel
Sahil Shah
Sandeep Chinchali
DiffM
VGen
421
4
0
24 Apr 2025
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments
Chenyu Zhang
Daniil Cherniavskii
Antonios Tragoudaras
Antonios Vozikis
Thijmen Nijdam
Mark Bodracska
Andrii Zadaianchuk
E. Gavves
EGVM
VGen
294
12
0
03 Apr 2025
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Boyuan Wang
Xiaofeng Wang
Chaojun Ni
Guosheng Zhao
Zhiqin Yang
...
Yukun Zhou
Xinze Chen
Guan Huang
Lihong Liu
Xingang Wang
VGen
391
18
0
31 Mar 2025
A Self-supervised Motion Representation for Portrait Video Generation
Qiyuan Zhang
Chenyu Wu
Wenzhang Sun
Huaize Liu
Donglin Di
Wei Chen
Changqing Zou
VGen
309
0
0
13 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
286
10
0
06 Mar 2025
BounTCHA: A CAPTCHA Utilizing Boundary Identification in Guided Generative AI-extended Videos
Lehao Lin
Ke Wang
Maha Abdallah
Wei Cai
AAML
366
0
0
30 Jan 2025
Generative AI for Cel-Animation: A Survey
Yunlong Tang
Junjia Guo
Pinxin Liu
Zhiyuan Wang
Hang Hua
...
Jing Bi
Mingqian Feng
Xuzhao Li
Zeliang Zhang
Chenliang Xu
VGen
706
17
0
08 Jan 2025
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guoquan Zheng
EGVM
533
11
0
25 Nov 2024
Understanding World or Predicting Future? A Comprehensive Survey of World Models
ACM Computing Surveys (ACM CSUR), 2024
Jingtao Ding
Yunke Zhang
Yu Shang
Yuheng Zhang
Zefang Zong
...
Fengli Xu
Yong Li
Chen Gao
VGen
SyDa
517
17
0
21 Nov 2024
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
Xuannan Liu
Xing Cui
Peipei Li
Zekun Li
Huaibo Huang
Shuhan Xia
Miaoxuan Zhang
Yueying Zou
Ran He
AAML
538
24
0
14 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
402
3
0
12 Nov 2024
Survey of User Interface Design and Interaction Techniques in Generative AI Applications
Reuben Luera
Ryan Rossi
Alexa F. Siu
Franck Dernoncourt
Tong Yu
...
Hanieh Salehy
Jian Zhao
Samyadeep Basu
Puneet Mathur
Nedim Lipka
AI4TS
287
5
0
28 Oct 2024
A Transformer Based Generative Chemical Language AI Model for Structural Elucidation of Organic Compounds
Journal of Cheminformatics (J Cheminform), 2024
Xiaofeng Tan
143
4
0
13 Oct 2024
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences
Computer Vision and Pattern Recognition (CVPR), 2024
Zhikai Li
Xuewen Liu
Dongrong Fu
Jianquan Li
Qingyi Gu
Kurt Keutzer
Zhen Dong
EGVM
VGen
DiffM
351
8
0
26 Aug 2024
LessonPlanner: Assisting Novice Teachers to Prepare Pedagogy-Driven Lesson Plans with Large Language Models
ACM Symposium on User Interface Software and Technology (UIST), 2024
Haoxiang Fan
Guanzheng Chen
Xingbo Wang
Zhenhui Peng
AI4Ed
217
12
0
02 Aug 2024
Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model
Zhichao Zhang
Xinyue Li
Wei Sun
Jun Jia
Xiongkuo Min
...
Puyi Wang
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Guangtao Zhai
EGVM
256
0
0
31 Jul 2024
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights
Wentao Lei
Jinting Wang
Fengji Ma
Guanjie Huang
Li Liu
VGen
EGVM
290
16
0
11 Jul 2024
Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space
Peiyu Yu
Dinghuai Zhang
Hengzhi He
Xiaojian Ma
Ruiyao Miao
...
Deqian Kong
Ruiqi Gao
Jianwen Xie
Guang Cheng
Ying Nian Wu
340
10
0
27 May 2024
Sora and V-JEPA Have Not Learned The Complete Real World Model -- A Philosophical Analysis of Video AIs Through the Theory of Productive Imagination
Jianqiu Zhang
VGen
101
0
0
06 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
365
82
0
06 May 2024
DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection
Yan Ju
Chengzhe Sun
Shan Jia
Shuwei Hou
Zhaofeng Si
Soumyya Kanti Datta
Lipeng Ke
Riky Zhou
Anita Nikolich
Siwei Lyu
307
6
0
19 Apr 2024
BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion
Jia Wei
Xingjun Zhang
Witold Pedrycz
DiffM
184
0
0
23 Mar 2024
A Roadmap Towards Automated and Regulated Robotic Systems
Yihao Liu
Mehran Armand
190
3
0
21 Mar 2024
Xception: Deep Learning with Depthwise Separable Convolutions
Computer Vision and Pattern Recognition (CVPR), 2016
François Chollet
MDE
BDL
PINN
3.0K
16,722
0
07 Oct 2016