Communities
Connect sessions
AI calendar
Organizations
Join Slack
Contact Sales
Search
Open menu
Home
Papers
2311.15127
Cited By
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
Re-assign community
ArXiv (abs)
PDF
HTML
HuggingFace (13 upvotes)
Github (25943★)
Papers citing
"Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"
50 / 967 papers shown
Title
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization
Wenchuan Wang
Mengqi Huang
Yijing Tu
Zhendong Mao
VGen
403
4
0
04 May 2025
Generating Animated Layouts as Structured Text Representations
Yeonsang Shin
Jihwan Kim
Yumin Song
Kyungseung Lee
Hyunhee Chung
Taeyoung Na
DiffM
VGen
250
1
0
02 May 2025
VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models
Mohammadreza Teymoorianfard
Siddarth Sitaraman
Shiqing Ma
Amir Houmansadr
WIGM
380
1
0
02 May 2025
FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis
Computer Vision and Pattern Recognition (CVPR), 2025
Jiangtong Tan
Hu Yu
Jie Huang
Jie Xiao
Feng Zhao
289
5
0
02 May 2025
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Antoni Bigata
Rodrigo Mira
Stella Bounareli
Michał Stypułkowski
Konstantinos Vougioukas
Stavros Petridis
Maja Pantic
277
2
0
01 May 2025
Controllable Weather Synthesis and Removal with Video Diffusion Models
Chih-Hao Lin
Liang Luo
Ruofan Liang
Yuxuan Zhang
Sanja Fidler
Shenlong Wang
Zan Gojcic
DiffM
VGen
281
4
0
01 May 2025
T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation
Xuyang Guo
Jiayan Huo
Zhenmei Shi
Zhao Song
Jiahao Zhang
Jiale Zhao
EGVM
VGen
PINN
420
20
0
01 May 2025
Eye2Eye: A Simple Approach for Monocular-to-Stereo Video Synthesis
Michal Geyer
Omer Tov
Linyi Jin
Richard Tucker
Inbar Mosseri
Tali Dekel
Noah Snavely
DiffM
VGen
346
2
0
30 Apr 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM
VGen
374
5
0
30 Apr 2025
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
375
15
0
30 Apr 2025
ADiff4TPP: Asynchronous Diffusion Models for Temporal Point Processes
Amartya Mukherjee
Ruizhi Deng
He Zhao
Yuzhen Mao
Leonid Sigal
Frederick Tung
DiffM
AI4TS
222
0
0
29 Apr 2025
SynergyAmodal: Deocclude Anything with Text Control
Xinyang Li
Chengjie Yi
Jiawei Lai
Mingbao Lin
Yansong Qu
Shengchuan Zhang
Liujuan Cao
DiffM
219
3
0
28 Apr 2025
AnimateAnywhere: Rouse the Background in Human Image Animation
Xiaoyu Liu
Mingshuai Yao
Y. Zhang
Xianhui Lin
Peiran Ren
Xiaochen Li
Ming-Yu Liu
W. Zuo
3DH
DiffM
305
4
0
28 Apr 2025
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer
Junpeng Jiang
Gangyi Hong
Miao Zhang
Hengtong Hu
Kun Zhan
Rui Shao
Liqiang Nie
VGen
201
4
0
28 Apr 2025
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation
Weipeng Tan
Chuming Lin
Chengming Xu
F. Xu
Xiaobin Hu
Xiaozhong Ji
Junwei Zhu
Chengjie Wang
Yanwei Fu
277
3
0
25 Apr 2025
Dynamic Camera Poses and Where to Find Them
Computer Vision and Pattern Recognition (CVPR), 2025
C. Rockwell
Joseph Tung
Nayeon Lee
Xuan Li
David Fouhey
Chen-Hsuan Lin
349
11
0
24 Apr 2025
VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models
Xuming Hu
Haoyang Li
Jiajun Li
Yu Huang
Aiwei Liu
Qi Zheng
Junhao Chen
Aiwei Liu
WIGM
VGen
408
6
0
23 Apr 2025
BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation
Ke Xu
Mingli Zhu
Jiarong Ou
Ruoxin Chen
Xin Tao
Pengfei Wan
Baoyuan Wu
DiffM
AAML
VGen
347
2
0
23 Apr 2025
Subject-driven Video Generation via Disentangled Identity and Motion
Daneul Kim
Jingxu Zhang
W. Jin
Sunghyun Cho
Jingdong Sun
Jaesik Park
Chong Luo
DiffM
VGen
283
4
0
23 Apr 2025
Gaussian Splatting is an Effective Data Generator for 3D Object Detection
F. G. Zanjani
Davide Abati
Auke Wiggers
Dimitris Kalatzis
Jens Petersen
Hong Cai
A. Habibian
3DGS
848
1
0
23 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
Xuzhao Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Yanzhe Zhang
Ji Wan
Jiadong Wang
VGen
347
5
0
22 Apr 2025
Satellite to GroundScape -- Large-scale Consistent Ground View Generation from Satellite Views
Computer Vision and Pattern Recognition (CVPR), 2025
Ningli Xu
R. Qin
DiffM
312
4
0
22 Apr 2025
T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models
Yaning Tan
Jiayang Liu
Jiecheng Zhai
Tianmeng Fang
Rongcheng Tu
A. Liu
Xiaochun Cao
Dacheng Tao
VGen
325
10
0
22 Apr 2025
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation
Chenjie Cao
Jingkai Zhou
Shikai Li
Jingyun Liang
Chaohui Yu
Fan Wang
Xiangyang Xue
Yanwei Fu
VGen
DiffM
356
21
0
21 Apr 2025
VGNC: Reducing the Overfitting of Sparse-view 3DGS via Validation-guided Gaussian Number Control
Lifeng Lin
Rongfeng Lu
Quan Chen
Haofan Ren
Ming Lu
Yaoqi Sun
Chenggang Yan
Anke Xue
3DGS
156
2
0
20 Apr 2025
FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
Kuanting Wu
Kei Ota
Asako Kanezaki
DiffM
VGen
318
0
0
20 Apr 2025
Visual Prompting for One-shot Controllable Video Editing without Inversion
Computer Vision and Pattern Recognition (CVPR), 2025
Zitao Gao
Yuxi Zhou
Duo Peng
Joo-Hwee Lim
Zhigang Tu
De Wen Soh
Lin Geng Foo
DiffM
325
3
0
19 Apr 2025
U-Shape Mamba: State Space Model for faster diffusion
Alex Ergasti
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
Mamba
373
3
0
18 Apr 2025
SkyReels-V2: Infinite-length Film Generative Model
Guibin Chen
D. Lin
Jiangping Yang
Chunze Lin
J. Zhu
...
Di Qiu
Debang Li
Zhengcong Fei
Yang Li
Yahui Zhou
DiffM
VGen
434
71
0
17 Apr 2025
Physical Reservoir Computing in Hook-Shaped Rover Wheel Spokes for Real-Time Terrain Identification
Xiao Jin
Zihan Wang
Zhenhua Yu
Changrak Choi
Kalind Carpenter
T. Nanayakkara
202
0
0
17 Apr 2025
HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation
Wenqi Dong
Bangbang Yang
Zesong Yang
Yuan Li
Tao Hu
Ruixing Wang
Yuewen Ma
Zhaopeng Cui
259
6
0
17 Apr 2025
SOPHY: Learning to Generate Simulation-Ready Objects with Physical Materials
Junyi Cao
Evangelos Kalogerakis
AI4CE
312
0
0
17 Apr 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
International Conference on Learning Representations (ICLR), 2025
Jinfeng Xu
Yuanmin Huang
Baoqi Pei
Junlin Hou
Qingqiu Li
Guo Chen
Yuhui Zhang
Rui Feng
Weidi Xie
DiffM
245
16
0
16 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
428
6
0
16 Apr 2025
VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate
Zhihang Yuan
Rui Xie
Yuzhang Shang
Hao Zhang
Siyuan Wang
Shengen Yan
Guohao Dai
Yu Wang
DiffM
VGen
229
1
0
16 Apr 2025
PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning
Longji Xu
Huiyu Xu
Peng Kuang
Jiacheng Du
Hui Yuan
Yiming Li
Qiu Wang
Kui Ren
WIGM
359
0
0
15 Apr 2025
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding
Dianbing Xi
Jiadong Wang
Yuanzhi Liang
Xi Qiu
Yuchi Huo
Ruiqi Wang
Fangqiu Yi
Xuzhao Li
DiffM
VGen
500
10
0
15 Apr 2025
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors
Yanrui Bin
Wenbo Hu
Haoyuan Wang
Xinya Chen
Bing Wang
DiffM
214
5
0
15 Apr 2025
InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation
Yukang Lin
Y. Hong
Zunnan Xu
Xiaochen Li
Chao Xu
...
Jun Lan
Huijia Zhu
Weiqiang Wang
Jianfu Zhang
Xiu Li
VGen
288
1
0
15 Apr 2025
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
Junke Wang
Zhi Tian
Xinyu Wang
Xinyu Zhang
Weilin Huang
Zuxuan Wu
Yu Jiang
VGen
366
56
0
15 Apr 2025
VideoPanda: Video Panoramic Diffusion with Multi-view Attention
Kevin Xie
Amirmojtaba Sabour
Jiahui Huang
Despoina Paschalidou
G. Klár
Umar Iqbal
Sanja Fidler
Fangyin Wei
VGen
MDE
338
4
0
15 Apr 2025
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang
Sheng Miao
BangBnag Yang
Yuewen Ma
Yiyi Liao
VGen
MDE
530
1
0
15 Apr 2025
Decoupled Diffusion Sparks Adaptive Scene Generation
Yunsong Zhou
Naisheng Ye
William Ljungbergh
Tianyu Li
Jiazhi Yang
Zetong Yang
Hongzi Zhu
Christoffer Petersson
Hongyang Li
219
9
0
14 Apr 2025
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
Yushu Wu
Yanyu Li
Ivan Skorokhodov
Vidit Goel
Willi Menapace
Sharath Girish
Aliaksandr Siarohin
Yanzhi Wang
Sergey Tulyakov
DiffM
VGen
298
4
0
14 Apr 2025
SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models
Stathis Galanakis
Alexandros Lattas
Stylianos Moschoglou
Bernhard Kainz
Stefanos Zafeiriou
DiffM
350
0
0
14 Apr 2025
GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting
Junlin Hao
Peiheng Wang
Haoyang Wang
Xinggong Zhang
Xinggong Zhang
3DGS
VGen
599
0
0
14 Apr 2025
On Equivariance and Fast Sampling in Video Diffusion Models Trained with Warped Noise
Chao Liu
Arash Vahdat
DiffM
VGen
302
5
0
14 Apr 2025
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Xingrui Wang
Jiang-Long Liu
Liang Luo
Xiaodong Yu
Jialian Wu
Xingwu Sun
Yusheng Su
Yaoyao Liu
Zicheng Liu
Emad Barsoum
DiffM
VGen
221
4
0
13 Apr 2025
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Ruineng Li
Daitao Xing
Huiming Sun
Yuanzhou Ha
Jinglin Shen
C. Ho
DiffM
VGen
231
4
0
11 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seawead
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
486
59
0
11 Apr 2025
Previous
1
2
3
...
9
10
11
...
18
19
20
Next