2311.15127
Cited By
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
HuggingFace (13 upvotes)
Github (25943★)
Papers citing
"Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"
50 / 967 papers shown
Title
CDI3D: Cross-guided Dense-view Interpolation for 3D Reconstruction
Z. Wu
Xibin Song
Senbo Wang
Weizhe Liu
Jiayu Yang
...
Shenzhou Chen
Taizhang Shang
Weixuan Sun
Shan Luo
Pan Ji
DiffM
200
2
0
13 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
356
52
0
13 Mar 2025
DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image
Qi Zhao
Zhan Ma
Pan Zhou
VGen
347
2
0
13 Mar 2025
Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion
Evgeniia Vu
Andrei Boiarov
Dmitry Vetrov
VGen
361
0
0
13 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Yanjie Wang
Yizhi Wang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
297
7
0
13 Mar 2025
NIL: No-data Imitation Learning by Leveraging Pre-trained Video Diffusion Models
Mert Albaba
Chenhao Li
Markos Diomataris
Omid Taheri
Andreas Krause
M. Black
VGen
218
6
0
13 Mar 2025
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
Yasheng Sun
Zhiliang Xu
Hang Zhou
Jiazhi Guan
Quanwei Yang
...
Yingying Li
Haocheng Feng
Jiadong Wang
Ziwei Liu
Koike Hideki
VGen
293
2
0
13 Mar 2025
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
Hao He
Ceyuan Yang
Shanchuan Lin
Yinghao Xu
Meng Wei
Liangke Gui
Qi Zhao
Gordon Wetzstein
Lu Jiang
Hongsheng Li
DiffM
VGen
358
36
0
13 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
1.1K
9
0
12 Mar 2025
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling
Itay Chachy
Guy Yariv
Sagie Benaim
997
3
0
12 Mar 2025
I2V3D: Controllable image-to-video generation with 3D guidance
Zhiyuan Zhang
DongDong Chen
J. Liao
VGen
267
3
0
12 Mar 2025
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
Jing Wang
Fengzhuo Zhang
Xiaoli Li
Vincent Y. F. Tan
Tianyu Pang
Chao Du
Aixin Sun
Zhuoran Yang
VGen
298
10
0
12 Mar 2025
Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
Jian Zhu
Zhengyu Jia
Tian Gao
Jiaxin Deng
Shidi Li
Han Li
Fu Liu
Xianpeng Lang
Xiaolong Sun
VGen
882
4
0
12 Mar 2025
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2025
Kwan Yun
Seokhyeon Hong
Chaelin Kim
Junyong Noh
DiffM
VGen
133
3
0
11 Mar 2025
Identity Preserving Latent Diffusion for Brain Aging Modeling
Gexin Huang
Zhangsihao Yang
Yalin Wang
Guido Gerig
Mengwei Ren
Xiaoxiao Li
MedIm
DiffM
253
0
0
11 Mar 2025
FP3: A 3D Foundation Policy for Robotic Manipulation
Rujia Yang
Geng Chen
Chuan Wen
Yang Gao
LM&Ro
253
18
0
11 Mar 2025
High-Quality 3D Head Reconstruction from Any Single Portrait Image
Jianfu Zhang
Yujie Gao
Jiahui Zhan
Wentao Wang
Yiyi Zhang
H. Zhao
Liqing Zhang
3DH
238
0
0
11 Mar 2025
MVD-HuGaS: Human Gaussians from a Single Image via 3D Human Multi-view Diffusion Prior
K. Xiong
Ying Feng
Qi Zhang
Jianbo Jiao
Yang Zhao
Zhihao Liang
Huachen Gao
Ronggang Wang
3DGS
261
1
0
11 Mar 2025
V2M4: 4D Mesh Animation Reconstruction from a Single Monocular Video
Jianqi Chen
Biao Zhang
Xiangjun Tang
Peter Wonka
VGen
257
11
0
11 Mar 2025
WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation
Jing Wang
Ao Ma
Ke Cao
Jun Zheng
Zhanjie Zhang
...
Yuhang Ma
Bo Cheng
Dawei Leng
Yuhui Yin
Xiaodan Liang
VGen
265
28
0
11 Mar 2025
SARA: Structural and Adversarial Representation Alignment for Training-efficient Diffusion Models
Hesen Chen
Junyan Wang
Zhiyu Tan
Hao Li
241
4
0
11 Mar 2025
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
Computer Vision and Pattern Recognition (CVPR), 2025
Mingzhen Sun
Weining Wang
Gen Li
Jiawei Liu
Jiahui Sun
Wanquan Feng
Shanshan Lao
Siyu Zhou
Qian He
Qingbin Liu
DiffM
VGen
309
24
0
10 Mar 2025
From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
Jiacheng Liu
Chang Zou
Yuanhuiyi Lyu
Junjie Chen
Linfeng Zhang
DiffM
367
26
0
10 Mar 2025
RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories
Computer Vision and Pattern Recognition (CVPR), 2025
Huiyang Shao
Xin Xia
Yanting Yang
Yuxi Ren
Xing Wang
Xuefeng Xiao
341
10
0
10 Mar 2025
DreamRelation: Relation-Centric Video Customization
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Biao Gong
Longxiang Tang
...
Haonan Qiu
Hengjia Li
Shuai Tan
Yujiao Shi
Hongming Shan
VGen
225
14
0
10 Mar 2025
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
Kwanyoung Kim
Byeongsu Sim
DiffM
VLM
378
1
0
10 Mar 2025
Automated Movie Generation via Multi-Agent CoT Planning
Weijia Wu
Zeyu Zhu
Mike Zheng Shou
VGen
279
24
0
10 Mar 2025
LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation
Quanjian Song
Zhihang Lin
Zhanpeng Zeng
Ziyue Zhang
Liujuan Cao
Rongrong Ji
VGen
254
5
0
09 Mar 2025
VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation
Hritik Bansal
Clark Peng
Yonatan Bitton
Roman Goldenberg
Aditya Grover
Kai-Wei Chang
EGVM
VGen
261
30
0
09 Mar 2025
Generative Video Bi-flow
Chen Liu
Tobias Ritschel
DiffM
VGen
200
0
0
09 Mar 2025
Text2Story: Advancing Video Storytelling with Text Guidance
Taewon Kang
D. Kothandaraman
Ming C. Lin
DiffM
VGen
350
3
0
08 Mar 2025
GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation
Ye Tao
Jiawei Zhang
Yahao Shi
Dongqing Zou
Bin Zhou
3DGS
322
0
0
08 Mar 2025
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang
Guoguang Du
Xiaochuan Li
Qi Jia
Liang Jin
...
Zhenhua Guo
Yaqian Zhao
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
282
5
0
08 Mar 2025
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Mark YU
Wenbo Hu
Jinbo Xing
Mingyu Ding
VGen
283
35
0
07 Mar 2025
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice
Hongwei Yi
Tian Ye
Shitong Shao
Xuancheng Yang
Jiantong Zhao
...
Bo Han
Lei Zhu
Wei Li
Michael Lingelbach
Daquan Zhou
VGen
261
6
0
07 Mar 2025
FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video
Computer Vision and Pattern Recognition (CVPR), 2025
Yue Gao
Hong-Xing Yu
Bo Zhu
Jiajun Wu
VGen
623
11
0
06 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
240
9
0
06 Mar 2025
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control
Computer Vision and Pattern Recognition (CVPR), 2025
Xuanchi Ren
Tianchang Shen
Jiahui Huang
Huan Ling
Yifan Lu
Merlin Nimier-David
Thomas Müller
Alexander Keller
Sanja Fidler
Jun Gao
DiffM
VGen
291
114
0
05 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei Zhang
Bo Yang
Hua Chen
561
11
0
05 Mar 2025
SPG: Improving Motion Diffusion by Smooth Perturbation Guidance
Boseong Jeon
DiffM
264
2
0
04 Mar 2025
MindSimulator: Exploring Brain Concept Localization via Synthetic FMRI
International Conference on Learning Representations (ICLR), 2025
Guangyin Bao
Tao Gui
Z. Gong
Zhuojia Wu
Duoqian Miao
192
5
0
04 Mar 2025
KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation
Computer Vision and Pattern Recognition (CVPR), 2025
Antoni Bigata
Michał Stypułkowski
Rodrigo Mira
Stella Bounareli
Konstantinos Vougioukas
Zoe Landgraf
Nikita Drobyshev
Maciej Zięba
Stavros Petridis
Maja Pantic
DiffM
VGen
283
5
0
03 Mar 2025
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Computer Vision and Pattern Recognition (CVPR), 2025
Jamie Wynn
Z. Qureshi
Jakub Powierza
Jamie Watson
Mohamed Sayed
3DGS
DiffM
345
4
0
03 Mar 2025
Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models
International Conference on Learning Representations (ICLR), 2025
Xingzhuo Guo
Yu Zhang
Baixu Chen
Haoran Xu
Chao Guo
Mingsheng Long
DiffM
AI4TS
317
6
0
02 Mar 2025
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Computer Vision and Pattern Recognition (CVPR), 2025
Jie Tian
Xiaoye Qu
Zhenyi Lu
Xiaoye Qu
Sichen Liu
Yu Cheng
DiffM
VGen
182
8
0
02 Mar 2025
FaceShot: Bring Any Character into Life
International Conference on Learning Representations (ICLR), 2025
Junyao Gao
Yanan Sun
Fei Shen
Xin Jiang
Zhening Xing
Kai-xiang Chen
Cairong Zhao
CVBM
3DH
261
10
0
02 Mar 2025
PodAgent: A Comprehensive Framework for Podcast Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yujia Xiao
Lei He
Haohan Guo
Fenglong Xie
Tan Lee
812
2
0
01 Mar 2025
Unified Video Action Model
Shuang Li
Yihuai Gao
Dorsa Sadigh
Shuran Song
VGen
566
58
0
28 Feb 2025
WorldModelBench: Judging Video Generation Models As World Models
Dacheng Li
Yunhao Fang
Yukang Chen
Shuo Yang
Shiyi Cao
...
Hongxu Yin
Alfons Kemper
Ion Stoica
Enze Xie
Yaojie Lu
VGen
198
25
0
28 Feb 2025
Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos
Zhiyu Tan
Junyan Wang
Hao Yang
Luozheng Qin
Hesen Chen
Qiang-feng Zhou
Hao Li
VGen
332
3
0
28 Feb 2025