arXiv: 2311.15127
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
HuggingFace (13 upvotes)
Github (25943★)
Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"
50 / 967 papers shown
Generative Video Propagation
Computer Vision and Pattern Recognition (CVPR), 2024
Shaoteng Liu
Tianyu Wang
Jiadong Wang
Qing Liu
Zhifei Zhang
...
Rui Wang
Bei Yu
Zhe Lin
Seunggeun Kim
Jiaya Jia
DiffM
VGen
235
16
0
27 Dec 2024
AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures
Situo Zhang
Hankun Wang
Da Ma
Zichen Zhu
Lu Chen
Kunyao Lan
Kai Yu
214
22
0
25 Dec 2024
DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers
Yuntao Chen
Yuqi Wang
Rundong Wang
925
39
0
24 Dec 2024
VidTwin: Video VAE with Decoupled Structure and Dynamics
Computer Vision and Pattern Recognition (CVPR), 2024
Yuchi Wang
Junliang Guo
Xinyi Xie
Tianyu He
Xu Sun
Li Zhao
DRL
VGen
316
7
0
23 Dec 2024
Enhancing Long Video Generation Consistency without Tuning
Xingyao Li
Fengzhuo Zhang
Jiachun Pan
Yunlong Hou
Vincent Y. F. Tan
Zhuoran Yang
DiffM
VGen
282
0
0
23 Dec 2024
Adapting Image-to-Video Diffusion Models for Large-Motion Frame Interpolation
Luoxu Jin
Hiroshi Watanabe
DiffM
VGen
474
0
0
22 Dec 2024
Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy
AAAI Conference on Artificial Intelligence (AAAI), 2024
Shaoyan Pan
Yikang Liu
Lin Zhao
Eric Z. Chen
Xiao Chen
Terrence Chen
Shanhui Sun
VGen
MedIm
380
1
0
20 Dec 2024
AniDoc: Animation Creation Made Easier
Computer Vision and Pattern Recognition (CVPR), 2024
Yihao Meng
Hao Ouyang
Hanlin Wang
Qiuyu Wang
Wen Wang
Ka Leong Cheng
Zhiheng Liu
Yujun Shen
Huamin Qu
DiffM
VGen
441
13
0
18 Dec 2024
Learning from Massive Human Videos for Universal Humanoid Pose Control
Jiageng Mao
Siheng Zhao
Siqi Song
Tianheng Shi
Junjie Ye
Mingtong Zhang
Haoran Geng
Jitendra Malik
Vitor Campagnolo Guizilini
Yue Wang
295
21
0
18 Dec 2024
VideoDPO: Omni-Preference Alignment for Video Diffusion Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Runtao Liu
Haoyu Wu
Zheng Ziqiang
Chen Wei
Yingqing He
Renjie Pi
Qifeng Chen
VGen
276
59
0
18 Dec 2024
FlexCache: Flexible Approximate Cache System for Video Diffusion
Desen Sun
Henry Tian
Tim Lu
Sihang Liu
DiffM
472
3
0
18 Dec 2024
Move-in-2D: 2D-Conditioned Human Motion Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Hsin-Ping Huang
Yang Zhou
Jui-Hsien Wang
Difan Liu
Feng Liu
Ming-Hsuan Yang
Zhan Xu
VGen
DiffM
168
3
0
17 Dec 2024
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2024
Felix Taubner
Ruihang Zhang
Mathieu Tuli
David B. Lindell
298
19
0
16 Dec 2024
Generative Inbetweening through Frame-wise Conditions-Driven Video Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Tianyi Zhu
Dongwei Ren
Qilong Wang
Xiaohe Wu
W. Zuo
VGen
246
7
0
16 Dec 2024
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2024
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
550
5
0
16 Dec 2024
Wonderland: Navigating 3D Scenes from a Single Image
Computer Vision and Pattern Recognition (CVPR), 2024
Hanwen Liang
Junli Cao
Vidit Goel
Guocheng Qian
Sergei Korolev
Demetri Terzopoulos
Konstantinos N. Plataniotis
Sergey Tulyakov
Jian Ren
VGen
400
49
0
16 Dec 2024
Can video generation replace cinematographers? Research on the cinematic language of generated video
Xuelong Li
Kai Wu
Siyi Yang
YiZhan Qu
Guohua Zhang
...
Mingliang Xiong
Hao Deng
Qingwen Liu
Gang Li
Bin He
VGen
DiffM
343
2
0
16 Dec 2024
VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping
Hao Shao
Shulun Wang
Yang Zhou
Guanglu Song
Dailan He
Shuo Qin
Zhuofan Zong
Bingqi Ma
Wenshu Fan
Jiaming Song
VGen
DiffM
290
2
0
15 Dec 2024
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
Computer Vision and Pattern Recognition (CVPR), 2024
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Pedro M B Rezende
Yasaman Haghighi
...
Mathieu Salzmann
Davide Scaramuzza
Marc Pollefeys
Paolo Favaro
Alexandre Alahi
VLM
VGen
270
35
0
15 Dec 2024
GenLit: Reformulating Single-Image Relighting as Video Generation
Shrisha Bharadwaj
Haiwen Feng
Giorgio Becherini
Victoria Fernandez-Abrevaya
Michael J. Black
VGen
443
5
0
15 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
Computer Vision and Pattern Recognition (CVPR), 2024
Zeyang Zhang
Zihan Wang
Xianrui Li
Xingwu Sun
Fangyi Chen
Jiang Liu
Jiadong Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
544
28
0
14 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
795
7
0
14 Dec 2024
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
Computer Vision and Pattern Recognition (CVPR), 2024
Yushu Wu
Zhixing Zhang
Yanyu Li
Yanwu Xu
Vidit Goel
...
Ju Hu
Dimitris N. Metaxas
Yanzhi Wang
Sergey Tulyakov
Jian Ren
VGen
DiffM
341
15
0
13 Dec 2024
Owl-1: Omni World Model for Consistent Long Video Generation
Yuanhui Huang
Wenzhao Zheng
Yuan Gao
Xin Tao
Pengfei Wan
Di Zhang
Jie Zhou
Jiwen Lu
VGen
VLM
368
11
0
12 Dec 2024
T-SVG: Text-Driven Stereoscopic Video Generation
Qiao Jin
Xiaodong Chen
Wu Liu
Tao Mei
Yongdong Zhang
DiffM
VGen
251
4
0
12 Dec 2024
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Longji Xu
Kuan-Chieh Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Xinze Wang
VGen
610
4
0
12 Dec 2024
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion
Haonan Qiu
Shiwei Zhang
Yujie Wei
Ruihang Chu
Hangjie Yuan
Xinyu Wang
Yujiao Shi
Ziwei Liu
327
17
0
12 Dec 2024
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
International Conference on Learning Representations (ICLR), 2024
Zhen Liu
Tim Z. Xiao
Weiyang Liu
Yoshua Bengio
Dinghuai Zhang
631
18
0
10 Dec 2024
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds
Computer Vision and Pattern Recognition (CVPR), 2024
Z-H. Tang
Yuchen Fan
Dilin Wang
Hongyu Xu
Rakesh Ranjan
Alex Schwing
Zhicheng Yan
3DGS
VGen
3DV
224
75
0
09 Dec 2024
SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations
Zhiwen Chen
Francesco Pinto
Minzhou Pan
Bo Li
283
16
0
09 Dec 2024
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation
Computer Vision and Pattern Recognition (CVPR), 2024
Nicolas Dufour
David Picard
Vicky Kalogeiton
Loic Landrieu
199
13
0
09 Dec 2024
MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models
Shansong Liu
Atin Sakkeer Hussain
Qilong Wu
Chenshuo Sun
Ying Shan
AuLLM
223
12
0
09 Dec 2024
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
Computer Vision and Pattern Recognition (CVPR), 2024
Baorui Ma
Huachen Gao
Haoge Deng
Zhengxiong Luo
Tiejun Huang
Lulu Tang
Xinlong Wang
DiffM
VGen
621
43
0
09 Dec 2024
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Shuwei Shi
Biao Gong
Xi Chen
Dandan Zheng
Shuai Tan
...
Jingwen He
Kecheng Zheng
Jingdong Chen
Ming-Hsuan Yang
Yinqiang Zheng
VGen
DiffM
199
10
0
08 Dec 2024
Birth and Death of a Rose
Computer Vision and Pattern Recognition (CVPR), 2024
Chen Geng
Yunzhi Zhang
Shangzhe Wu
Jiajun Wu
AI4CE
337
2
0
06 Dec 2024
Using Diffusion Priors for Video Amodal Segmentation
Computer Vision and Pattern Recognition (CVPR), 2024
Kaihua Chen
Deva Ramanan
Tarasha Khurana
DiffM
VOS
VGen
172
8
0
05 Dec 2024
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
Yizhuo Li
Yuying Ge
Yixiao Ge
Ping Luo
Mingyu Ding
DiffM
VGen
280
1
0
05 Dec 2024
InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models
Yifan Lu
Xuanchi Ren
Jiawei Yang
Tianchang Shen
Zhangjie Wu
...
Yanjie Wang
Siheng Chen
Mike Chen
Sanja Fidler
Jiahui Huang
VGen
379
32
0
05 Dec 2024
FreeSim: Toward Free-viewpoint Camera Simulation in Driving Scenes
Computer Vision and Pattern Recognition (CVPR), 2024
Lue Fan
Hao Zhang
Qitai Wang
Hongsheng Li
Rundong Wang
VGen
3DGS
165
20
0
04 Dec 2024
MV-Adapter: Multi-view Consistent Image Generation Made Easy
Zehuan Huang
Xu Tan
Haoran Wang
Ran Yi
Lizhuang Ma
Yan-Pei Cao
Lu Sheng
341
63
0
04 Dec 2024
Mimir: Improving Video Diffusion Models for Precise Text Understanding
Computer Vision and Pattern Recognition (CVPR), 2024
Shuai Tan
Biao Gong
Yutong Feng
Kecheng Zheng
Dandan Zheng
Shuwei Shi
Yujun Shen
Jingdong Chen
Ming-Hsuan Yang
VGen
225
12
0
04 Dec 2024
Navigation World Models
Computer Vision and Pattern Recognition (CVPR), 2024
Amir Bar
G. Zhou
Danny Tran
Trevor Darrell
Yann LeCun
VGen
EgoV
470
123
0
04 Dec 2024
Realistic Surgical Simulation from Monocular Videos
Kailing Wang
Chen-Ning Yang
Keyang Zhao
Yunbo Wang
Wei Shen
193
1
0
03 Dec 2024
World-consistent Video Diffusion with Explicit 3D Modeling
Computer Vision and Pattern Recognition (CVPR), 2024
Qihang Zhang
Shuangfei Zhai
Miguel Angel Bautista
Kevin Miao
Alexander Toshev
J. Susskind
Jiatao Gu
VGen
253
28
0
02 Dec 2024
InfinityDrive: Breaking Time Limits in Driving World Models
Xi Guo
C. Ding
Haoxuan Dou
Xin Zhang
Weixuan Tang
Wei Wu
VGen
355
12
0
02 Dec 2024
CPA: Camera-pose-awareness Diffusion Transformer for Video Generation
Yuelei Wang
Jian Zhang
Pengtao Jiang
Hao Zhang
Jinwei Chen
Bo Li
VGen
DiffM
279
9
0
02 Dec 2024
Driving View Synthesis on Free-form Trajectories with Generative Prior
Zeyu Yang
Zijie Pan
Yuankun Yang
Xiatian Zhu
Guang Dai
VGen
402
2
0
02 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-Jun Qi
DiffM
506
17
0
02 Dec 2024
Playable Game Generation
Mingyu Yang
Junyou Li
Zhongbin Fang
Sheng Chen
Yangbin Yu
Qiang Fu
Wei Yang
Deheng Ye
VGen
264
19
0
01 Dec 2024
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
Yatian Pang
Bin Zhu
Bin Lin
Mingzhe Zheng
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
VGen
3DH
256
11
0
30 Nov 2024