Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.10709
Cited By
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning
17 November 2023
Rohit Girdhar
Mannat Singh
Andrew Brown
Quentin Duval
S. Azadi
Sai Saketh Rambhatla
Akbar Shah
Xi Yin
Devi Parikh
Ishan Misra
DiffM
VGen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning"
50 / 159 papers shown
Title
T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models
Yunfeng Ge
Jiawei Li
Yiji Zhao
Haomin Wen
Zhao Li
M. Qiu
H. Li
Ming Jin
Shirui Pan
DiffM
46
0
0
05 May 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
68
1
0
24 Apr 2025
VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate
Zhihang Yuan
Rui Xie
Yuzhang Shang
H. Zhang
Siyuan Wang
Shengen Yan
Guohao Dai
Yu Wang
DiffM
VGen
42
0
0
16 Apr 2025
VideoPanda: Video Panoramic Diffusion with Multi-view Attention
Kevin Xie
Amirmojtaba Sabour
Jiahui Huang
Despoina Paschalidou
G. Klár
Umar Iqbal
Sanja Fidler
Xiaohui Zeng
VGen
MDE
34
0
0
15 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
63
2
0
03 Apr 2025
Distilling Multi-view Diffusion Models into 3D Generators
Hao Qin
Luyuan Chen
Ming Kong
Mengxu Lu
Qiang Zhu
3DGS
64
0
0
01 Apr 2025
Semantix: An Energy Guided Sampler for Semantic Style Transfer
Huiang He
Minghui Hu
C. Zheng
Chaoyue Wang
Tat-Jen Cham
DiffM
39
0
0
28 Mar 2025
Can Video Diffusion Model Reconstruct 4D Geometry?
Jinjie Mai
Wenxuan Zhu
Haozhe Liu
Bing Li
Cheng Zheng
Jürgen Schmidhuber
Bernard Ghanem
VGen
MDE
70
0
0
27 Mar 2025
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention
Xuan Ju
Weicai Ye
Quande Liu
Qiulin Wang
Xintao Wang
Pengfei Wan
Di Zhang
Kun Gai
Qiang Xu
VGen
39
1
0
25 Mar 2025
LongDiff: Training-Free Long Video Generation in One Go
Zhuoling Li
Hossein Rahmani
Qiuhong Ke
J. Liu
DiffM
VGen
VLM
56
0
0
23 Mar 2025
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model
Yingying Fan
Quanwei Yang
Kaisiyuan Wang
Hang Zhou
Yingying Li
Haocheng Feng
Errui Ding
Y. Wu
J. Wang
DiffM
42
0
0
21 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
60
0
0
21 Mar 2025
SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation
Chun-Han Yao
Yiming Xie
Vikram S. Voleti
Huaizu Jiang
Varun Jampani
3DGS
VGen
60
0
0
20 Mar 2025
Stitch-a-Recipe: Video Demonstration from Multistep Descriptions
Chi Hsuan Wu
Kumar Ashutosh
Kristen Grauman
DiffM
58
0
0
18 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
72
2
0
13 Mar 2025
WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation
Jing Wang
Ao Ma
Ke Cao
Jun Zheng
Zhanjie Zhang
...
Yuhang Ma
Bo Cheng
Dawei Leng
Yuhui Yin
Xiaodan Liang
VGen
82
3
0
11 Mar 2025
Generative Video Bi-flow
Chen Liu
Tobias Ritschel
DiffM
VGen
44
0
0
09 Mar 2025
Text2Story: Advancing Video Storytelling with Text Guidance
Taewon Kang
D. Kothandaraman
Ming C. Lin
DiffM
VGen
59
0
0
08 Mar 2025
Get In Video: Add Anything You Want to the Video
Shaobin Zhuang
Zhipeng Huang
Binxin Yang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Chong Sun
Zheng-Jun Zha
Chen Li
Y. Wang
DiffM
VGen
47
0
0
08 Mar 2025
FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video
Yue Gao
Hong-Xing Yu
Bo Zhu
Jiajun Wu
VGen
51
1
0
06 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
90
0
0
06 Mar 2025
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models
Jay Zhangjie Wu
Yuxuan Zhang
Haithem Turki
Xuanchi Ren
Jun Gao
Mike Zheng Shou
Sanja Fidler
Zan Gojcic
Huan Ling
53
1
0
03 Mar 2025
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
Haoxin Li
Yingchen Yu
Qilong Wu
Hanwang Zhang
Boyang Li
Song Bai
3DH
VGen
69
0
0
01 Mar 2025
Unified Video Action Model
Shuang Li
Yihuai Gao
Dorsa Sadigh
Shuran Song
VGen
44
1
0
28 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
61
3
0
17 Feb 2025
Multi-subject Open-set Personalization in Video Generation
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Yuwei Fang
Kwot Sin Lee
Ivan Skorokhodov
Kfir Aberman
Jun-Yan Zhu
Ming Yang
Sergey Tulyakov
DiffM
VGen
69
7
0
10 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
119
2
0
03 Jan 2025
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
100
3
0
16 Dec 2024
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Pedro M B Rezende
Yasaman Haghighi
...
Mathieu Salzmann
Davide Scaramuzza
Marc Pollefeys
Paolo Favaro
Alexandre Alahi
VLM
VGen
69
5
0
15 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
121
2
0
14 Dec 2024
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
Hongjie Wang
Chih-Yao Ma
Yen-Cheng Liu
Ji Hou
Tao Xu
...
Peizhao Zhang
Tingbo Hou
Peter Vajda
N. Jha
Xiaoliang Dai
LMTD
DiffM
VGen
VLM
81
5
0
13 Dec 2024
Navigation World Models
Amir Bar
G. Zhou
Danny Tran
Trevor Darrell
Yann LeCun
VGen
EgoV
80
14
0
04 Dec 2024
World-consistent Video Diffusion with Explicit 3D Modeling
Qihang Zhang
Shuangfei Zhai
Miguel Angel Bautista
Kevin Miao
Alexander Toshev
J. Susskind
Jiatao Gu
VGen
75
8
0
02 Dec 2024
InfinityDrive: Breaking Time Limits in Driving World Models
Xi Guo
C. Ding
Haoxuan Dou
Xin Zhang
Weixuan Tang
Wei Yu Wu
VGen
81
5
0
02 Dec 2024
MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models
Xiaomin Li
Xu Jia
Qinghe Wang
Haiwen Diao
Mengmeng Ge
Pengxiang Li
You He
Huchuan Lu
VGen
DiffM
60
3
0
02 Dec 2024
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
Chaojun Ni
Guosheng Zhao
Xiaofeng Wang
Zheng Hua Zhu
Wenkang Qin
...
Kun Zhan
Peng Jia
Xianpeng Lang
Xingang Wang
Wenjun Mei
VGen
98
6
0
29 Nov 2024
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
Long Chen
DiffM
VGen
90
0
0
25 Nov 2024
MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model
Chenjie Cao
Chaohui Yu
Shang Liu
Fan Wang
Xiangyang Xue
Yanwei Fu
87
1
0
25 Nov 2024
Generative Omnimatte: Learning to Decompose Video into Layers
Yao-Chih Lee
Erika Lu
Sarah Rumbley
Michal Geyer
Jia-Bin Huang
Tali Dekel
Forrester Cole
DiffM
VGen
93
4
0
25 Nov 2024
Revelio
\textit{Revelio}
Revelio
: Interpreting and leveraging semantic information in diffusion models
Dahye Kim
Xavier Thomas
Deepti Ghadiyaram
67
0
0
23 Nov 2024
Generating 3D-Consistent Videos from Unposed Internet Photos
Gene Chou
Kai Zhang
Sai Bi
Hao Tan
Zexiang Xu
Fujun Luan
Bharath Hariharan
Noah Snavely
3DGS
VGen
75
3
0
20 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
55
1
0
12 Nov 2024
Grounding Video Models to Actions through Goal Conditioned Exploration
Yunhao Luo
Yilun Du
LM&Ro
VGen
77
1
0
11 Nov 2024
I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
Wanquan Feng
Jiawei Liu
Pengqi Tu
Tianhao Qi
Mingzhen Sun
Tianxiang Ma
Songtao Zhao
Siyu Zhou
Qian He
VGen
47
7
0
10 Nov 2024
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
David Junhao Zhang
Roni Paiss
Shiran Zada
Nikhil Karnad
David E. Jacobs
Yael Pritch
Inbar Mosseri
Mike Zheng Shou
Neal Wadhwa
Nataniel Ruiz
DiffM
VGen
66
15
0
07 Nov 2024
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
Anil Kag
Huseyin Coskun
Jierun Chen
Junli Cao
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
Jian Ren
46
2
0
07 Nov 2024
Fashion-VDM: Video Diffusion Model for Virtual Try-On
J. Karras
Yingwei Li
Nan Liu
Luyang Zhu
Innfarn Yoo
Andreas Lugmayr
Chris Lee
Ira Kemelmacher-Shlizerman
DiffM
VGen
40
4
0
31 Oct 2024
Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models
Giannis Daras
Weili Nie
Karsten Kreis
A. Dimakis
Morteza Mardani
Nikola B. Kovachki
Arash Vahdat
DiffM
33
5
0
21 Oct 2024
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
Guosheng Zhao
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
X. Zhang
...
Xinze Chen
Boyuan Wang
Youyi Zhang
Wenjun Mei
Xingang Wang
VGen
77
24
0
17 Oct 2024
Depth Any Video with Scalable Synthetic Data
Honghui Yang
Di Huang
Wei Yin
Chunhua Shen
Haifeng Liu
Xiaofei He
Binbin Lin
Wanli Ouyang
Tong He
VGen
MDE
21
16
0
14 Oct 2024
1
2
3
4
Next