Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2210.02399
Cited By
Phenaki: Variable Length Video Generation From Open Domain Textual Description
5 October 2022
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Phenaki: Variable Length Video Generation From Open Domain Textual Description"
50 / 287 papers shown
Title
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Qingwen Bu
Y. Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
57
0
0
09 May 2025
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
Minkyu Choi
Sundar Sripada V. S.
Harsh Goel
Sahil Shah
Sandeep P. Chinchali
DiffM
VGen
86
0
0
24 Apr 2025
Solving New Tasks by Adapting Internet Video Knowledge
Calvin Luo
Zilai Zeng
Yilun Du
Chen Sun
21
0
0
21 Apr 2025
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation
Lvmin Zhang
Maneesh Agrawala
DiffM
VGen
70
0
0
17 Apr 2025
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding
Dianbing Xi
J. Wang
Yuanzhi Liang
Xi Qiu
Yuchi Huo
R. Wang
Chi Zhang
X. Li
DiffM
VGen
65
0
0
15 Apr 2025
One-Minute Video Generation with Test-Time Training
Karan Dalal
Daniel Koceja
Gashon Hussein
Jiarui Xu
Yue Zhao
...
Tatsunori Hashimoto
Sanmi Koyejo
Yejin Choi
Yu Sun
Xiaolong Wang
ViT
91
3
0
07 Apr 2025
Exploration-Driven Generative Interactive Environments
N. Savov
Naser Kazemi
Mohammad Mahdi
Danda Pani Paudel
Xi Wang
Luc Van Gool
VGen
3DV
38
0
0
03 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
63
2
0
03 Apr 2025
Mask
2
^2
2
DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
Tianhao Qi
Jianlong Yuan
Wanquan Feng
Shancheng Fang
Jiawei Liu
Siyu Zhou
Qian He
Hongtao Xie
Yongdong Zhang
DiffM
VGen
39
0
0
25 Mar 2025
Halton Scheduler For Masked Generative Image Transformer
Victor Besnier
Mickael Chen
David Hurych
Eduardo Valle
Matthieu Cord
47
1
0
21 Mar 2025
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Y. Wang
Zhijie Lin
Yao Teng
Yuanzhi Zhu
Shuhuai Ren
Jiashi Feng
Xihui Liu
46
0
0
20 Mar 2025
RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models
Yijing Lin
Mengqi Huang
Shuhan Zhuang
Zhendong Mao
VGen
43
0
0
13 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
72
2
0
13 Mar 2025
VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation
Hritik Bansal
Clark Peng
Yonatan Bitton
Roman Goldenberg
Aditya Grover
Kai-Wei Chang
EGVM
VGen
49
2
0
09 Mar 2025
Text2Story: Advancing Video Storytelling with Text Guidance
Taewon Kang
D. Kothandaraman
Ming C. Lin
DiffM
VGen
59
0
0
08 Mar 2025
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Jie Tian
Xiaoye Qu
Zhenyi Lu
Wei Wei
Sichen Liu
Yu-Xi Cheng
DiffM
VGen
44
0
0
02 Mar 2025
Unified Video Action Model
Shuang Li
Yihuai Gao
Dorsa Sadigh
Shuran Song
VGen
44
1
0
28 Feb 2025
ASurvey: Spatiotemporal Consistency in Video Generation
Zhiyu Yin
Kehai Chen
Xuefeng Bai
Ruili Jiang
J. Li
Hongdong Li
Jin Liu
Yang Xiang
Jun Yu
Min Zhang
EGVM
VGen
AI4TS
54
0
0
25 Feb 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
78
1
0
24 Feb 2025
MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching
Yen-Siang Wu
Chi-Pin Huang
Fu-En Yang
Yu-Jie Wang
DiffM
VGen
54
1
0
18 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
72
0
0
18 Feb 2025
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Gaojie Lin
Jianwen Jiang
Jiaqi Yang
Zerong Zheng
Chao Liang
DiffM
VGen
169
11
0
03 Feb 2025
Taming Teacher Forcing for Masked Autoregressive Video Generation
Deyu Zhou
Quan Sun
Yuang Peng
Kun Yan
Runpei Dong
...
Zheng Ge
Nan Duan
Xiangyu Zhang
L. Ni
H. Shum
VGen
51
6
0
21 Jan 2025
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi
Kehang Han
Z. Wang
Arnaud Doucet
Michalis K. Titsias
DiffM
74
62
0
17 Jan 2025
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
Yuzhou Huang
Ziyang Yuan
Quande Liu
Qiulin Wang
Xintao Wang
Ruimao Zhang
Pengfei Wan
Di Zhang
Kun Gai
VGen
DiffM
35
10
0
08 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
122
2
0
03 Jan 2025
VidTwin: Video VAE with Decoupled Structure and Dynamics
Yuchi Wang
Junliang Guo
Xinyi Xie
Tianyu He
Xu Sun
Jiang Bian
DRL
VGen
73
3
0
23 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
142
2
0
14 Dec 2024
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
Hongjie Wang
Chih-Yao Ma
Yen-Cheng Liu
Ji Hou
Tao Xu
...
Peizhao Zhang
Tingbo Hou
Peter Vajda
N. Jha
Xiaoliang Dai
LMTD
DiffM
VGen
VLM
81
5
0
13 Dec 2024
Owl-1: Omni World Model for Consistent Long Video Generation
Yuanhui Huang
Wenzhao Zheng
Yuan Gao
Xin Tao
Pengfei Wan
Di Zhang
Jie Zhou
Jiwen Lu
VGen
VLM
82
0
0
12 Dec 2024
[MASK] is All You Need
Vincent Tao Hu
Bjorn Ommer
DiffM
135
2
0
09 Dec 2024
Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal Latent Alignment
Kim Sung-Bin
Arda Senocak
Hyunwoo Ha
Tae-Hyun Oh
DiffM
68
0
0
09 Dec 2024
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
Yizhuo Li
Yuying Ge
Yixiao Ge
Ping Luo
Ying Shan
DiffM
VGen
90
0
0
05 Dec 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
F. Khan
Mubarak Shah
82
2
0
29 Nov 2024
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
Hui Li
Mingwang Xu
Yun Zhan
Shan Mu
Jiaye Li
...
Y. Chen
Tan Chen
Mao Ye
Jingdong Wang
Siyu Zhu
VGen
99
2
0
28 Nov 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
117
1
0
25 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM
VGen
114
1
0
22 Nov 2024
Towards motion from video diffusion models
Paul Janson
Tiberiu Popa
Eugene Belilovsky
DiffM
VGen
62
0
0
19 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
46
9
0
08 Nov 2024
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Hao Luo
Bohan Zhou
Zongqing Lu
30
1
0
05 Nov 2024
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Yining Hong
Beide Liu
Maxine Wu
Yuanhao Zhai
Kai-Wei Chang
...
Chung-Ching Lin
Jianfeng Wang
Z. Yang
Yingnian Wu
Lijuan Wang
VGen
35
6
0
30 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
24
9
0
28 Oct 2024
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li
Shujie Hu
Shujie Liu
Long Zhou
Jeongsoo Choi
Lingwei Meng
Xun Guo
J. Li
H. Ling
Furu Wei
VGen
DiffM
72
5
0
27 Oct 2024
Your Image is Secretly the Last Frame of a Pseudo Video
Wenlong Chen
Wenlin Chen
Lapo Rastrelli
Yingzhen Li
DiffM
VGen
32
0
0
26 Oct 2024
Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling
Mingtong Zhang
Kaifeng Zhang
Yunzhu Li
3DGS
AI4CE
23
5
0
24 Oct 2024
Latent Action Pretraining from Videos
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
32
27
0
15 Oct 2024
Focused ReAct: Improving ReAct through Reiterate and Early Stop
Shuoqiu Li
Han Xu
Haipeng Chen
ReLM
LRM
28
6
0
14 Oct 2024
Depth Any Video with Scalable Synthetic Data
Honghui Yang
Di Huang
Wei Yin
Chunhua Shen
Haifeng Liu
Xiaofei He
Binbin Lin
Wanli Ouyang
Tong He
VGen
MDE
21
16
0
14 Oct 2024
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction
Jarrid Rector-Brooks
Mohsin Hasan
Zhangzhi Peng
Zachary Quinn
Chenghao Liu
...
Michael Bronstein
Yoshua Bengio
Pranam Chatterjee
Alexander Tong
Avishek Joey Bose
DiffM
42
6
0
10 Oct 2024
ECHOPulse: ECG controlled echocardio-grams video generation
Yiwei Li
Sekeun Kim
Zihao Wu
Hanqi Jiang
Yi Pan
...
Sifan Song
Yucheng Shi
Tianming Liu
Quanzheng Li
Xiang Li
VGen
24
1
0
04 Oct 2024
1
2
3
4
5
6
Next