arXiv: 2403.05131

Sora as a World Model? A Complete Survey on Text-to-Video Generation

8 March 2024

Fachrina Dewi Puspitasari

Noor Ul Eman

Choong Seon Hong

Jingyao Zheng

Sheng Zheng

Lik-Hang Lee

Caiyan Qin

Tae-Ho Kim

Yang Yang

Heng Tao Shen

Papers citing "Sora as a World Model? A Complete Survey on Text-to-Video Generation"

39 / 39 papers shown

Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos

Ananya Srinivasan

Deepti Ghadiyaram

322

0

0

01 Dec 2025

PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models

139

0

0

01 Dec 2025

Counterfactual World Models via Digital Twin-conditioned Video Diffusion

Mathias Unberath

165

0

0

21 Nov 2025

VEIL: Jailbreaking Text-to-Video Models via Visual Exploitation from Implicit Language

128

0

0

17 Nov 2025

PipeDiT: Accelerating Diffusion Transformers in Video Generation with Task Pipelining and Model Decoupling

134

0

0

15 Nov 2025

Simulating the Visual World with Artificial Intelligence: A Roadmap

474

1

0

11 Nov 2025

Embodied AI: From LLMs to World Models

340

11

0

24 Sep 2025

InPhyRe Discovers: Large Multimodal Models Struggle in Inductive Physical Reasoning

Gautam Sreekumar

147

0

0

12 Sep 2025

Video Understanding by Design: How Datasets Shape Architectures and Insights

238

0

0

11 Sep 2025

CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models

198

5

0

15 Aug 2025

NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation

DiffM EGVM VGen

568

6

0

15 Jul 2025

Vision Technologies with Applications in Traffic Surveillance Systems: A Holistic Survey

ACM Computing Surveys (ACM CSUR), 2024

513

8

0

01 Jul 2025

G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models

342

4

0

02 Jun 2025

MOVi: Training-free Text-conditioned Multi-Object Video Generation

Vishal M. Patel

275

1

0

29 May 2025

A Challenge to Build Neuro-Symbolic Video Agents

Sai Shankar Narasimhan

Sandeep Chinchali

268

1

0

20 May 2025

We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback

Sandeep Chinchali

421

4

0

24 Apr 2025

Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments

Daniil Cherniavskii

Antonios Tragoudaras

Antonios Vozikis

Andrii Zadaianchuk

294

12

0

03 Apr 2025

HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation

Computer Vision and Pattern Recognition (CVPR), 2025

...

391

18

0

31 Mar 2025

A Self-supervised Motion Representation for Portrait Video Generation

309

0

0

13 Mar 2025

FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Longguang Zhong

286

10

0

06 Mar 2025

BounTCHA: A CAPTCHA Utilizing Boundary Identification in Guided Generative AI-extended Videos

366

0

0

30 Jan 2025

Generative AI for Cel-Animation: A Survey

...

706

17

0

08 Jan 2025

Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric

...

533

11

0

25 Nov 2024

Understanding World or Predicting Future? A Comprehensive Survey of World Models

ACM Computing Surveys (ACM CSUR), 2024

...

Chen Gao

Fengli Xu

Yong Li

517

17

0

21 Nov 2024

Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey

538

24

0

14 Nov 2024

Artificial Intelligence for Biomedical Video Generation

402

3

0

12 Nov 2024

Survey of User Interface Design and Interaction Techniques in Generative AI Applications

Franck Dernoncourt

...

287

5

0

28 Oct 2024

A Transformer Based Generative Chemical Language AI Model for Structural Elucidation of Organic Compounds

Journal of Cheminformatics (J Cheminform), 2024

Xiaofeng Tan

143

4

0

13 Oct 2024

K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences

Computer Vision and Pattern Recognition (CVPR), 2024

Kurt Keutzer

EGVM VGen DiffM

351

8

0

26 Aug 2024

LessonPlanner: Assisting Novice Teachers to Prepare Pedagogy-Driven Lesson Plans with Large Language Models

ACM Symposium on User Interface Software and Technology (UIST), 2024

217

12

0

02 Aug 2024

Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model

Wei Sun

Jun Jia

Xiongkuo Min

...

Guangtao Zhai

256

0

0

31 Jul 2024

A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights

Wentao Lei

290

16

0

11 Jul 2024

Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space

Xiaojian Ma

...

Ruiqi Gao

340

10

0

27 May 2024

Sora and V-JEPA Have Not Learned The Complete Real World Model -- A Philosophical Analysis of Video AIs Through the Theory of Productive Imagination

101

0

0

06 May 2024

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

...

365

82

0

06 May 2024

DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection

Soumyya Kanti Datta

307

6

0

19 Apr 2024

BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion

Witold Pedrycz

184

0

0

23 Mar 2024

A Roadmap Towards Automated and Regulated Robotic Systems

190

3

0

21 Mar 2024

Xception: Deep Learning with Depthwise Separable Convolutions

Computer Vision and Pattern Recognition (CVPR), 2016

François Chollet

3.0K

16,722

0

07 Oct 2016