ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.15127
  4. Cited By
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large
  Datasets

Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
ArXiv (abs)PDFHTMLHuggingFace (13 upvotes)Github (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 967 papers shown
Title
Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion
Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion
Manuel Kansy
Jacek Naruniec
Christopher Schroers
Markus Gross
Romann M. Weber
DiffMVGen
358
6
0
01 Aug 2024
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Zhenghao Zhang
Junchao Liao
Menghao Li
Zuozhuo Dai
Bingxue Qiu
Hao Hu
Shaowei Cai
Weizhi Wang
VGen
437
104
0
31 Jul 2024
MMTrail: A Multimodal Trailer Video Dataset with Language and Music
  Descriptions
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
Yatian Wang
Yatian Wang
Aosong Cheng
Pengjun Fang
Zeyue Tian
...
Wenhan Luo
Qifeng Chen
Shanghang Zhang
Qi-fei Liu
Yi-Ting Guo
257
8
0
30 Jul 2024
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency
Yiming Xie
Chun-Han Yao
Vikram S. Voleti
Huaizu Jiang
Varun Jampani
VGen
344
94
0
24 Jul 2024
DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion
DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion
Huiguo He
Huan Yang
Zixi Tuo
Yuan Zhou
Qiuyue Wang
Yuhang Zhang
Zeyu Liu
Wenhao Huang
Hongyang Chao
Jian Yin
VGenDiffM
479
27
0
17 Jul 2024
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
Sherwin Bahmani
Ivan Skorokhodov
Aliaksandr Siarohin
Willi Menapace
Guocheng Qian
...
Chaoyang Wang
Jiaxu Zou
Andrea Tagliasacchi
David B. Lindell
Sergey Tulyakov
VGenDiffM
460
102
0
17 Jul 2024
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
Yanqin Jiang
Chaohui Yu
Chenjie Cao
Fan Wang
Weiming Hu
Jin Gao
VGen
154
39
0
16 Jul 2024
Kinetic Typography Diffusion Model
Kinetic Typography Diffusion Model
Seonmi Park
Inhwan Bae
Seunghyun Shin
Hae-Gon Jeon
DiffM
253
5
0
15 Jul 2024
Live2Diff: Live Stream Translation via Uni-directional Attention in
  Video Diffusion Models
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models
Zhening Xing
Gereon Fox
Yanhong Zeng
Xingang Pan
Mohamed A. Elgharib
Christian Theobalt
Kai Chen
VGen
201
4
0
11 Jul 2024
T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models
T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models
Yibo Miao
Yifan Zhu
Yinpeng Dong
Lijia Yu
Jun Zhu
Xiao-Shan Gao
EGVM
263
37
0
08 Jul 2024
A Survey on LoRA of Large Language Models
A Survey on LoRA of Large Language Models
Yuren Mao
Yuhang Ge
Yijiang Fan
Wenyi Xu
Yu Mi
Zhonghao Hu
Yunjun Gao
ALM
562
90
0
08 Jul 2024
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Kepan Nan
Rui Xie
Penghao Zhou
Tiehan Fan
Zhenheng Yang
Zhijie Chen
Xiang Li
Jian Yang
Ying Tai
450
183
0
02 Jul 2024
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Seyedmorteza Sadat
Manuel Kansy
Otmar Hilliges
Romann M. Weber
317
35
0
02 Jul 2024
HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model
HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model
Hieu T. Nguyen
Yiwen Chen
Vikram S. Voleti
Varun Jampani
Huaizu Jiang
226
5
0
28 Jun 2024
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Yuang Zhang
Jiaxi Gu
L. Wang
Han Wang
Junqi Cheng
Yuefeng Zhu
Fangyuan Zou
VGen
392
144
0
28 Jun 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of
  Text-to-Time-lapse Video Generation
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Shenghai Yuan
Jinfa Huang
Yongqi Xu
Yaoyang Liu
Shaofeng Zhang
Yujun Shi
Ruijie Zhu
Xinhua Cheng
Jiebo Luo
Li Yuan
EGVMVGen
344
53
0
26 Jun 2024
DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis
  through Structure Guidance
DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance
Younghyun Kim
Geunmin Hwang
Junyu Zhang
Eunbyung Park
536
24
0
26 Jun 2024
Text-Animator: Controllable Visual Text Video Generation
Text-Animator: Controllable Visual Text Video Generation
Lin Liu
Quande Liu
Shengju Qian
Yuan Zhou
Wengang Zhou
Houqiang Li
Lingxi Xie
Qi Tian
VGen
204
2
0
25 Jun 2024
EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned
  Data for Evaluating Text-to-Image Models
EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models
Zhiyu Tan
Xiaomeng Yang
Luozheng Qin
Mengping Yang
Cheng Zhang
Hao Li
361
13
0
24 Jun 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
415
86
0
24 Jun 2024
Identifying and Solving Conditional Image Leakage in Image-to-Video
  Diffusion Model
Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model
Min Zhao
Hongzhou Zhu
Chendong Xiang
Kaiwen Zheng
Chongxuan Li
Jun Zhu
282
19
0
22 Jun 2024
ExVideo: Extending Video Diffusion Models via Parameter-Efficient
  Post-Tuning
ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning
Zhongjie Duan
Wenmeng Zhou
Cen Chen
Yaliang Li
Weining Qian
VGenDiffM
150
3
0
20 Jun 2024
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Luxi He
Yangsibo Huang
Weijia Shi
Tinghao Xie
Haotian Liu
Yue Wang
Luke Zettlemoyer
Chiyuan Zhang
Danqi Chen
Peter Henderson
340
20
0
20 Jun 2024
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video
  Diffusion Models
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
DiffMVGen
248
30
0
16 Jun 2024
L4GM: Large 4D Gaussian Reconstruction Model
L4GM: Large 4D Gaussian Reconstruction ModelNeural Information Processing Systems (NeurIPS), 2024
Jiawei Ren
Kevin Xie
Ashkan Mirzaei
Hanxue Liang
Xiaohui Zeng
...
Ziwei Liu
Antonio Torralba
Sanja Fidler
Seung Wook Kim
Huan Ling
3DGS
240
91
0
14 Jun 2024
Training-free Camera Control for Video Generation
Training-free Camera Control for Video GenerationInternational Conference on Learning Representations (ICLR), 2024
Chen Hou
Guoqiang Wei
VGenDiffM
502
76
0
14 Jun 2024
LRM-Zero: Training Large Reconstruction Models with Synthesized Data
LRM-Zero: Training Large Reconstruction Models with Synthesized Data
Desai Xie
Sai Bi
Zhixin Shu
Kai Zhang
Zexiang Xu
Yi Zhou
Soren Pirk
Arie E. Kaufman
Xin Sun
Hao Tan
SyDa
275
23
0
13 Jun 2024
WonderWorld: Interactive 3D Scene Generation from a Single Image
WonderWorld: Interactive 3D Scene Generation from a Single Image
Hong-Xing Yu
Haoyi Duan
Charles Herrmann
William T. Freeman
Jiajun Wu
3DGSVGen
529
110
0
13 Jun 2024
GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement
GTR: Improving Large 3D Reconstruction Models through Geometry and Texture RefinementInternational Conference on Learning Representations (ICLR), 2024
Peiye Zhuang
Songfang Han
Chaoyang Wang
Aliaksandr Siarohin
Jiaxu Zou
Michael Vasilkovsky
V. Shakhrai
Sergey Korolev
Sergey Tulyakov
Hsin-Ying Lee
3DV
310
9
0
09 Jun 2024
ShareGPT4Video: Improving Video Understanding and Generation with Better
  Captions
ShareGPT4Video: Improving Video Understanding and Generation with Better CaptionsNeural Information Processing Systems (NeurIPS), 2024
Lin Chen
Xilin Wei
Jinsong Li
Xiaoyi Dong
Pan Zhang
...
Li Yuan
Yu Qiao
Dahua Lin
Feng Zhao
Jiaqi Wang
311
316
0
06 Jun 2024
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image
Stanislaw Szymanowicz
Eldar Insafutdinov
Chuanxia Zheng
Dylan Campbell
João F. Henriques
Christian Rupprecht
Andrea Vedaldi
3DGS
300
89
0
06 Jun 2024
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Rishit Dagli
Shivesh Prakash
Robert Wu
H. Khosravani
321
14
0
06 Jun 2024
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
Hao Wen
Zehuan Huang
Yaohui Wang
Xinyuan Chen
Yu Qiao
321
19
0
05 Jun 2024
Turning Text and Imagery into Captivating Visual Video
Turning Text and Imagery into Captivating Visual Video
Mingming Wang
Elijah Miller
VGen
132
0
0
03 Jun 2024
Learning Temporally Consistent Video Depth from Video Diffusion Priors
Learning Temporally Consistent Video Depth from Video Diffusion Priors
Jiahao Shao
Yuanbo Yang
Hongyu Zhou
Youmin Zhang
Yujun Shen
Vitor Campagnolo Guizilini
Yue Wang
Matteo Poggi
Yiyi Liao
VGenDiffMMDE
476
79
0
03 Jun 2024
EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical
  Data Sharing
EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing
Hadrien Reynaud
Qingjie Meng
Mischa Dombrowski
Arijit Ghosh
Thomas Day
Alberto Gomez
Paul Leeson
Bernhard Kainz
MedIm
253
17
0
02 Jun 2024
Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models
Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models
Xinxi Zhang
Song Wen
Ligong Han
Felix Juefei Xu
Akash Srivastava
Junzhou Huang
Hao Wang
Molei Tao
Dimitris N. Metaxas
DiffM
167
9
0
31 May 2024
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo
  Benchmark
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
Haoxing Chen
Yan Hong
Zizheng Huang
Zhuoer Xu
Zhangxuan Gu
...
Jun Lan
Huijia Zhu
Jianfu Zhang
Weiqiang Wang
Huaxiong Li
Mamba
448
43
0
30 May 2024
Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion
Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion
Jiangkai Wu
Liming Liu
Yunpeng Tan
Junlin Hao
Xinggong Zhang
304
5
0
30 May 2024
EasyAnimate: A High-Performance Long Video Generation Method based on
  Transformer Architecture
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Jiaqi Xu
Xinyi Zou
Kunzhe Huang
Yunkuo Chen
Bo Liu
Mengli Cheng
Xing Shi
Yanjie Liang
VGen
273
78
0
29 May 2024
ToonCrafter: Generative Cartoon Interpolation
ToonCrafter: Generative Cartoon Interpolation
Jinbo Xing
Hanyuan Liu
Menghan Xia
Yong Zhang
Xintao Wang
Ying Shan
Tien-Tsin Wong
215
66
0
28 May 2024
Vista: A Generalizable Driving World Model with High Fidelity and
  Versatile Controllability
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
362
198
0
27 May 2024
Controllable Longer Image Animation with Diffusion Models
Controllable Longer Image Animation with Diffusion Models
Qiang Wang
Minghua Liu
Junjun Hu
Fan Jiang
Mu Xu
VGen
209
2
0
27 May 2024
A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
Kai Wang
Yukun Zhou
Mingjia Shi
Zhihang Yuan
Yuzhang Shang
Yuzhang Shang
Hanwang Zhang
Hanwang Zhang
Yang You
346
22
0
27 May 2024
Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video
  Diffusion Models
Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models
Hanwen Liang
Yuyang Yin
Dejia Xu
Hanxue Liang
Zinan Lin
Konstantinos N. Plataniotis
Yao Zhao
Yunchao Wei
VGen
195
74
0
26 May 2024
A Survey of Multimodal Large Language Model from A Data-centric
  Perspective
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
323
64
0
26 May 2024
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion
  Models
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models
Wenqi Ouyang
Yi Dong
Lei Yang
Jianlou Si
Xingang Pan
VGenDiffM
223
48
0
26 May 2024
Score Distillation via Reparametrized DDIM
Score Distillation via Reparametrized DDIM
Artem Lukoianov
Haitz Sáez de Ocáriz Borde
Kristjan Greenewald
Vitor Campagnolo Guizilini
Timur M. Bagautdinov
Vincent Sitzmann
Justin Solomon
DiffM
250
29
0
24 May 2024
Diffusion Actor-Critic with Entropy Regulator
Diffusion Actor-Critic with Entropy Regulator
Yinuo Wang
Guojian Zhan
Yuxuan Jiang
Wenjun Zou
Tong Liu
...
Wenxuan Wang
Liming Xiao
Jiang Wu
Jingliang Duan
Shengbo Eben Li
DiffM
419
34
0
24 May 2024
NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer
NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer
Meng You
Zhiyu Zhu
Hui Liu
Junhui Hou
VGenDiffM
334
46
0
24 May 2024
Previous
123...17181920
Next