2311.12631
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
21 November 2023
Jiaxi Lv
Yi Huang
Mingfu Yan
Jiancheng Huang
Jianzhuang Liu
Yifan Liu
Yafei Wen
Xiaoxin Chen
Shifeng Chen
VGen
DiffM
Papers citing
"GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning"
24 / 24 papers shown
T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation
Xuyang Guo
Jiayan Huo
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
Jiale Zhao
EGVM
VGen
PINN
75
1
0
01 May 2025
Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning
Wang Lin
Liyu Jia
Wentao Hu
Kaihang Pan
Zhongqi Yue
Wei Zhao
Jingyuan Chen
Fei Wu
Hanwang Zhang
VGen
44
0
0
22 Apr 2025
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding
Dianbing Xi
J. Wang
Yuanzhi Liang
Xi Qiu
Yuchi Huo
R. Wang
Chi Zhang
X. Li
DiffM
VGen
62
0
0
15 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
ALM
VGen
62
2
0
05 Apr 2025
MG-Gen: Single Image to Motion Graphics Generation with Layer Decomposition
Takahiro Shirakawa
Tomoyuki Suzuki
Daichi Haraguchi
VGen
36
0
0
03 Apr 2025
Exploring the Evolution of Physics Cognition in Video Generation: A Survey
Minghui Lin
Xiang Wang
Y. Wang
Shu Wang
Fengqi Dai
...
Cunxiang Wang
Zhengrong Zuo
Nong Sang
Siteng Huang
Donglin Wang
EGVM
VGen
78
3
0
27 Mar 2025
Synthetic Video Enhances Physical Fidelity in Video Synthesis
Qi Zhao
Xingyu Ni
Ziyu Wang
Feng Cheng
Ziyan Yang
Lu Jiang
Bohan Wang
VGen
41
2
0
26 Mar 2025
Multi-Object Sketch Animation by Scene Decomposition and Motion Planning
Jingyu Liu
Zijie Xin
Yuhan Fu
Ruixiang Zhao
Bangxiang Lan
Xirong Li
39
0
0
25 Mar 2025
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
Haoxin Li
Yingchen Yu
Qilong Wu
Hanwang Zhang
Boyang Li
Song Bai
3DH
VGen
61
0
0
01 Mar 2025
Dual-Schedule Inversion: Training- and Tuning-Free Inversion for Real Image Editing
Jiancheng Huang
Yi Huang
Jianzhuang Liu
Donghao Zhou
Y. Liu
Shifeng Chen
DiffM
77
0
0
15 Dec 2024
DIVE: Taming DINO for Subject-Driven Video Editing
Yi Huang
Wei Xiong
He Zhang
Chaoqi Chen
Jianzhuang Liu
Mingfu Yan
Shifeng Chen
VGen
DiffM
73
0
0
04 Dec 2024
PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
Qiyao Xue
Xiangyu Yin
Boyuan Yang
Wei Gao
DiffM
VGen
75
9
0
30 Nov 2024
AutoVFX: Physically Realistic Video Editing from Natural Language Instructions
Hao-Yu Hsu
Zhi-Hao Lin
Albert Zhai
Hongchi Xia
Shenlong Wang
VGen
40
9
0
04 Nov 2024
Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning
Penghui Ruan
Pichao Wang
Divya Saxena
Jiannong Cao
Yuhui Shi
DiffM
VGen
24
0
0
31 Oct 2024
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation
Shaowei Liu
Zhongzheng Ren
Saurabh Gupta
Shenlong Wang
VGen
DiffM
PINN
37
33
0
27 Sep 2024
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Zhen Xing
Qi Dai
Zejia Weng
Zuxuan Wu
Yu-Gang Jiang
VGen
39
14
0
10 Jun 2024
From Sora What We Can See: A Survey of Text-to-Video Generation
Rui Sun
Yumin Zhang
Tejal Shah
Jiahao Sun
Shuoying Zhang
Wenqi Li
Haoran Duan
Bo Wei
R. Ranjan
EGVM
76
17
0
17 May 2024
Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation
Joseph Cho
Fachrina Dewi Puspitasari
Sheng Zheng
Jingyao Zheng
Lik-Hang Lee
Tae-Ho Kim
Choong Seon Hong
Chaoning Zhang
EGVM
VGen
36
11
0
08 Mar 2024
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code
Ziniu Hu
Ahmet Iscen
Aashi Jain
Thomas Kipf
Yisong Yue
David A. Ross
Cordelia Schmid
Alireza Fathi
LLMAG
34
23
0
02 Mar 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
He Zhang
Liangliang Cao
EGVM
66
82
0
27 Feb 2024
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
235
556
0
29 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
229
74,467
0
18 May 2015