Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
ArXiv (abs) · PDF · HTML · HuggingFace (13 upvotes) · Github (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 967 papers shown
Improved Distribution Matching Distillation for Fast Image Synthesis
Tianwei Yin
Michael Gharbi
Taesung Park
Richard Zhang
Eli Shechtman
Frédo Durand
William T. Freeman
DiffM
379
274
0
23 May 2024
Video Diffusion Models are Training-free Motion Interpreter and Controller
Zeqi Xiao
Yifan Zhou
Shuai Yang
Xingang Pan
VGen
235
43
0
23 May 2024
LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
Neural Information Processing Systems (NeurIPS), 2024
Seyedmorteza Sadat
Jakob Buhmann
Derek Bradley
Otmar Hilliges
Romann M. Weber
343
18
0
23 May 2024
Enhanced Creativity and Ideation through Stable Video Synthesis
Elijah Miller
Thomas Dupont
Mingming Wang
VGen
131
2
0
22 May 2024
MagicPose4D: Crafting Articulated Models with Appearance and Motion Control
Hao Zhang
Di Chang
Fang Li
Mohammad Soleymani
Narendra Ahuja
394
16
0
22 May 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Shiyang Feng
Le Zhuo
Ziyi Lin
Ruoyi Du
Xu Luo
...
Weicai Ye
He Tong
Jingwen He
Yu Qiao
Jiaming Song
VGen
286
120
0
09 May 2024
Video Diffusion Models: A Survey
Andrew Melnik
Michal Ljubljanac
Cong Lu
Qi Yan
Weiming Ren
Helge J. Ritter
VGen
321
29
0
06 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen, LM&Ro
298
76
0
06 May 2024
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
Neural Information Processing Systems (NeurIPS), 2024
Yupeng Zhou
Daquan Zhou
Ming-Ming Cheng
Jiashi Feng
Qibin Hou
DiffM, VGen
279
175
0
02 May 2024
Streamlining Image Editing with Layered Diffusion Brushes
Peyman Gholami
Robert Xiao
DiffM
297
1
0
01 May 2024
X-Diffusion: Generating Detailed 3D MRI Volumes From a Single Image Using Cross-Sectional Diffusion Models
Emmanuelle Bourigault
Abdullah Hamdi
Amir Jamaludin
MedIm
360
4
0
30 Apr 2024
FlexiFilm: Long Video Generation with Flexible Conditions
Yichen Ouyang
Jianhao Yuan
Hao Zhao
Gaoang Wang
Bo Zhao
DiffM
183
12
0
29 Apr 2024
What Foundation Models can Bring for Robot Learning in Manipulation : A Survey
Dingzhe Li
Yixiang Jin
A. Yong
Yong A
Hongze Yu
...
Huaping Liu
Gang Hua
F. Sun
Jianwei Zhang
Bin Fang
AI4CE, LM&Ro
803
24
0
28 Apr 2024
Beyond Deepfake Images: Detecting AI-Generated Videos
Danial Samadi Vahdati
Tai D. Nguyen
Aref Azizpour
Matthew C. Stamm
219
22
0
24 Apr 2024
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation
Xuanhua He
Quande Liu
Shengju Qian
Xin Eric Wang
Tao Hu
Ke Cao
K. Yan
Jie Zhang
VGen
314
81
0
23 Apr 2024
X-Ray: A Sequential 3D Representation For Generation
Tao Hu
Wenhang Ge
Yuyang Zhao
Gim Hee Lee
MedIm
224
8
0
22 Apr 2024
RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification
Hai Ci
Pei Yang
Yiren Song
Mike Zheng Shou
416
67
0
22 Apr 2024
Zero-shot High-fidelity and Pose-controllable Character Animation
Bingwen Zhu
Fanyi Wang
Tianyi Lu
Peng Liu
Jingwen Su
Yu Lei
Yanhao Zhang
Zuxuan Wu
Guo-Jun Qi
Yu-Gang Jiang
DiffM, VGen
191
8
0
21 Apr 2024
PCQA: A Strong Baseline for AIGC Quality Assessment Based on Prompt Condition
Xi Fang
Weigang Wang
Xiaoxin Lv
Jun Yan
EGVM
149
5
0
20 Apr 2024
Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
Zichen Liu
Yihao Meng
Ouyang Hao
Yue Yu
Bolin Zhao
Daniel Cohen-Or
Huamin Qu
DiffM
201
7
0
17 Apr 2024
Predicting Long-horizon Futures by Conditioning on Geometry and Time
Tarasha Khurana
Deva Ramanan
AI4TS
171
1
0
17 Apr 2024
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Sicheng Xu
Guojun Chen
Yu-Xiao Guo
Jiaolong Yang
Chong Li
Zhenyu Zang
Yizhong Zhang
Xin Tong
Baining Guo
224
174
0
16 Apr 2024
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang
Zeyuan Wang
Qiushi Lyu
Zheyuan Zhang
Sunli Chen
Tianmin Shu
Yilun Du
Kwonjoon Lee
Yilun Du
Chuang Gan
373
33
0
16 Apr 2024
LoopAnimate: Loopable Salient Object Animation
Fanyi Wang
Peng Liu
Haotian Hu
Dan Meng
Jingwen Su
Jinjin Xu
Yanhao Zhang
Xiaoming Ren
Zhiwang Zhang
VGen
178
3
0
14 Apr 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li-xin Yuan
Jiebo Luo
VGen
399
55
0
07 Apr 2024
Faster Diffusion via Temporal Attention Decomposition
Haozhe Liu
Wentian Zhang
Jinheng Xie
Francesco Faccio
Mengmeng Xu
Tao Xiang
Mike Zheng Shou
Juan-Manuel Perez-Rua
Jürgen Schmidhuber
DiffM
416
37
0
03 Apr 2024
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
International Conference on Learning Representations (ICLR), 2024
En-hao Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Shuaiqi Wang
...
Sergey Yekhanin
Guohao Dai
Huazhong Yang
Yu Wang
MoMe
353
5
0
02 Apr 2024
Denoising Monte Carlo Renders With Diffusion Models
Vaibhav Vavilala
R. Vasanth
David A. Forsyth
DiffM
189
4
0
30 Mar 2024
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Yuda Song
Zehao Sun
Xuanwu Yin
VLM
184
24
0
25 Mar 2024
Opportunities and challenges in the application of large artificial intelligence models in radiology
Liangrui Pan
Zhenyu Zhao
Ying Lu
Kewei Tang
Liyong Fu
Qingchun Liang
Shaoliang Peng
LM&MA, MedIm, AI4CE
209
11
0
24 Mar 2024
Explorative Inbetweening of Time and Space
Haiwen Feng
Zheng Ding
Zhihao Xia
Simon Niklaus
Victoria Fernandez-Abrevaya
Michael J. Black
Xuaner Zhang
DiffM, VGen
173
12
0
21 Mar 2024
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Roberto Henschel
Levon Khachatryan
Daniil Hayrapetyan
Hayk Poghosyan
Vahram Tadevosyan
Zinan Lin
Shant Navasardyan
Humphrey Shi
DiffM, VGen
438
148
0
21 Mar 2024
StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining
Tushar Kataria
Beatrice Knudsen
Shireen Y. Elhabian
DiffM, MedIm
341
16
0
17 Mar 2024
GazeFusion: Saliency-Guided Image Generation
Yunxiang Zhang
Nan Wu
Connor Z. Lin
Gordon Wetzstein
Qi Sun
300
7
0
16 Mar 2024
VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis
Computer Vision and Pattern Recognition (CVPR), 2024
Enric Corona
Andrei Zanfir
Eduard Gabriel Bazavan
Nikos Kolotouros
Thiemo Alldieck
C. Sminchisescu
VGen, DiffM
188
45
0
13 Mar 2024
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation
Guosheng Zhao
Xiaofeng Wang
Zheng Zhu
Xinze Chen
Guan Huang
Xiaoyi Bao
Xingang Wang
VGen
181
129
0
11 Mar 2024
Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation
Joseph Cho
Fachrina Dewi Puspitasari
Sheng Zheng
Jingyao Zheng
Lik-Hang Lee
Tae-Ho Kim
Choong Seon Hong
Chaoning Zhang
EGVM, VGen
234
65
0
08 Mar 2024
Sora Generates Videos with Stunning Geometrical Consistency
Xuanyi Li
Daquan Zhou
Chenxu Zhang
Shaodong Wei
Qibin Hou
Ming-Ming Cheng
EGVM
116
23
0
27 Feb 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
Chentao Song
Liangliang Cao
EGVM
785
187
0
27 Feb 2024
Diffusion Posterior Sampling is Computationally Intractable
International Conference on Machine Learning (ICML), 2024
Shivam Gupta
A. Jalal
Aditya Parulekar
Eric Price
Zhiyang Xun
214
15
0
20 Feb 2024
VGMShield: Mitigating Misuse of Video Generative Models
Yan Pang
Yang Zhang
Tianhao Wang
237
0
0
20 Feb 2024
Using Left and Right Brains Together: Towards Vision and Language Planning
Jun Cen
Chenfei Wu
Xiao Liu
Sheng-Siang Yin
Yixuan Pei
Jinglong Yang
Qifeng Chen
Nan Duan
Jianguo Zhang
222
9
0
16 Feb 2024
Sophia-in-Audition: Virtual Production with a Robot Performer
ACM Multimedia (MM), 2024
Taotao Zhou
Teng Xu
Dong Zhang
Yuyang Jiao
Peijun Xu
Yaoyu He
Lan Xu
Jingyi Yu
221
1
0
10 Feb 2024
Detecting Multimedia Generated by Large AI Models: A Survey
Li Lin
Neeraj Gupta
Yue Zhang
Hainan Ren
Chun-Hao Liu
Feng Ding
Xin Eric Wang
Xin Li
Luisa Verdoliva
Shu Hu
789
88
0
22 Jan 2024
Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation
Changgu Chen
Junwei Shu
Lianggangxu Chen
Gaoqi He
Changbo Wang
VGen
335
21
0
18 Jan 2024
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Zhao Wang
Aoxue Li
Lingting Zhu
Yong Guo
Qi Dou
Zhenguo Li
VGen, DiffM
622
61
0
18 Jan 2024
UniVG: Towards UNIfied-modal Video Generation
Ludan Ruan
Lei Tian
Chuanwei Huang
Xu Zhang
Xinyan Xiao
VGen, DiffM
149
5
0
17 Jan 2024
Efficient4D: Fast Dynamic 3D Object Generation from a Single-view Video
Zijie Pan
Zeyu Yang
Xiatian Zhu
Li Zhang
3DGS
280
44
0
16 Jan 2024
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM, VGen
778
413
0
05 Jan 2024
AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI
BenchCouncil Transactions on Benchmarks, Standards and Evaluations (TBBSE), 2024
Fanda Fan
Chunjie Luo
Wanling Gao
Jianfeng Zhan
298
26
0
03 Jan 2024