Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
arXiv: 2311.15127 | HuggingFace (13 upvotes) | GitHub (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 970 papers shown
GSFixer: Improving 3D Gaussian Splatting with Reference-Guided Video Diffusion Priors
Xingyilang Yin
Qi Zhang
Jiahao Chang
Ying Feng
Qingnan Fan
X. J. Yang
Chi-Man Pun
Huaqi Zhang
Xiaodong Cun
DiffM, 3DGS, VGen
115
7
0
13 Aug 2025
Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices
Ya Zou
Jingfeng Yao
Siyuan Yu
Shuai Zhang
Wenyu Liu
Xinggang Wang
ViT
120
2
0
12 Aug 2025
Preview WB-DH: Towards Whole Body Digital Human Bench for the Generation of Whole-body Talking Avatar Videos
Chaoyi Wang
Yifan Yang
Jun Pei
Lijie Xia
Jianpo Liu
Xiaobing Yuan
Xinhan Di
VGen
72
0
0
12 Aug 2025
RealisMotion: Decomposed Human Motion Control and Video Generation in the World Space
Jingyun Liang
Jingkai Zhou
Shikai Li
Chenjie Cao
Lei Sun
Yichen Qian
Weihua Chen
Fan Wang
DiffM, VGen
78
2
0
12 Aug 2025
Cut2Next: Generating Next Shot via In-Context Tuning
Jingwen He
Hongbo Liu
Jiajun Li
Ziqi Huang
Botian Shi
Wanli Ouyang
Ziwei Liu
VGen
184
5
0
11 Aug 2025
Matrix-3D: Omnidirectional Explorable 3D World Generation
Zhongqi Yang
Wenhang Ge
Yuqi Li
J. Chen
Haoyuan Li
...
Eric Li
Yang Liu
Yikai Wang
Hao Guo
Yahui Zhou
VGen
103
9
0
11 Aug 2025
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
Bowen Xue
Zheng-Peng Duan
Qixin Yan
Wenjing Wang
Hao Liu
Chun-Le Guo
Chongyi Li
Chen Li
Jing Lyu
DiffM, VGen
123
4
0
11 Aug 2025
CObL: Toward Zero-Shot Ordinal Layering without User Prompting
Aneel Damaraju
D. Hazineh
Todd E. Zickler
BDL
116
0
0
11 Aug 2025
Learning an Implicit Physics Model for Image-based Fluid Simulation
Emily Yue-Ting Jia
Jiageng Mao
Zhiyuan Gao
Yajie Zhao
Yue Wang
3DH, VGen, AI4CE
57
0
0
11 Aug 2025
Splat4D: Diffusion-Enhanced 4D Gaussian Splatting for Temporally and Spatially Consistent Content Creation
Minghao Yin
Yukang Cao
Songyou Peng
Kai Han
3DGS
88
2
0
11 Aug 2025
StableAvatar: Infinite-Length Audio-Driven Avatar Video Generation
S. Tu
Yueming Pan
Y. Huang
Xintong Han
Zhen Xing
Jingdong Sun
Chong Luo
Zuxuan Wu
Yu-Gang Jiang
VGen
132
14
0
11 Aug 2025
Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers
Xin Ma
Yaohui Wang
Genyun Jia
Xinyuan Chen
Tien-Tsin Wong
C. L. P. Chen
VGen
136
0
0
10 Aug 2025
SketchAnimator: Animate Sketch via Motion Customization of Text-to-Video Diffusion Models
Visual Communications and Image Processing (VCIP), 2024
Ruolin Yang
Da Li
Honggang Zhang
Yi-Zhe Song
DiffM, VGen
94
2
0
10 Aug 2025
WeatherDiffusion: Controllable Weather Editing in Intrinsic Space
Yixin Zhu
Zuoliang Zhu
Jian Yang
Jian Yang
J. Xie
Beibei Wang
104
0
0
09 Aug 2025
SwiftVideo: A Unified Framework for Few-Step Video Generation through Trajectory-Distribution Alignment
Yanxiao Sun
Jiafu Wu
Yun Cao
C. Xu
Yabiao Wang
Weijian Cao
Donghao Luo
Chengjie Wang
Yanwei Fu
DiffM, VGen
115
3
0
08 Aug 2025
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation
Yue Liao
Pengfei Zhou
Siyuan Huang
Donglin Yang
Shengcong Chen
...
Jianlan Luo
Liliang Chen
Shuicheng Yan
Maoqing Yao
Maoqing Yao
VGen, LM&Ro
238
22
0
07 Aug 2025
UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation
Wonjun Kang
Byeongkeun Ahn
Minjae Lee
Kevin Galim
Seunghyuk Oh
Hyung Il Koo
N. Cho
DiffM
161
0
0
07 Aug 2025
RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer
Fangyu Du
Taiqing Li
Ziwei Zhang
Qian Qiao
Tan Yu
Dingcheng Zhen
Xu Jia
Yang Yang
Shunshun Yin
Siyuan Liu
VGen
92
2
0
07 Aug 2025
When Deepfake Detection Meets Graph Neural Network: a Unified and Lightweight Learning Framework
Haoyu Liu
Chaoyu Gong
Mengke He
Jiate Li
Kai Han
Siqiang Luo
77
0
0
07 Aug 2025
LayerT2V: Interactive Multi-Object Trajectory Layering for Video Generation
Kangrui Cen
Baixuan Zhao
Yi Xin
Siqi Luo
Guoquan Zheng
Xiaohong Liu
DiffM, VGen
116
0
0
06 Aug 2025
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off
Seungyong Lee
Jeong-gi Kwak
DiffM
164
1
0
06 Aug 2025
QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution
Bowen Chai
Z. Chen
Libo Zhu
Wenbo Li
Yong Guo
Yulun Zhang
MQ, DiffM
129
0
0
06 Aug 2025
Multi-human Interactive Talking Dataset
Zeyu Zhu
Weijia Wu
Mike Zheng Shou
VGen
103
0
0
05 Aug 2025
READ: Real-time and Efficient Asynchronous Diffusion for Audio-driven Talking Head Generation
Haotian Wang
Yuzhe Weng
Jun Du
Haoran Xu
X. Wu
Shan He
Bing Yin
Cong Liu
J. Gao
Qingfeng Liu
DiffM, VGen
216
1
0
05 Aug 2025
V.I.P. : Iterative Online Preference Distillation for Efficient Video Diffusion Models
Jisoo Kim
Wooseok Seo
Junwan Kim
Seungho Park
Sooyeon Park
Youngjae Yu
VGen
167
1
0
05 Aug 2025
DreamVVT: Mastering Realistic Video Virtual Try-On in the Wild via a Stage-Wise Diffusion Transformer Framework
Tongchun Zuo
Zaiyu Huang
Shuliang Ning
Ente Lin
C. Liang
Zerong Zheng
Jianwen Jiang
Yuan Zhang
Mingyuan Gao
Xin Dong
DiffM, VGen
84
4
0
04 Aug 2025
QuaDreamer: Controllable Panoramic Video Generation for Quadruped Robots
Sheng Wu
Fei Teng
Hao Shi
Qi Jiang
Kai Luo
Kaiwei Wang
Kailun Yang
VGen
224
0
0
04 Aug 2025
Forecasting When to Forecast: Accelerating Diffusion Models with Confidence-Gated Taylor
Knowledge-Based Systems (KBS), 2025
Xiaoliu Guan
Lielin Jiang
Hanqi Chen
X. Zhang
Jiaxing Yan
Guanzhong Wang
Yi-Hsueh Liu
Zetao Zhang
Yu Wu
292
2
0
04 Aug 2025
VDEGaussian: Video Diffusion Enhanced 4D Gaussian Splatting for Dynamic Urban Scenes Modeling
Yuru Xiao
Zihan Lin
Chao Lu
Deming Zhai
Kui Jiang
Wenbo Zhao
Wei Zhang
Junjun Jiang
Huanran Wang
Xianming Liu
3DGS
127
0
0
04 Aug 2025
DisCo3D: Distilling Multi-View Consistency for 3D Scene Editing
Yufeng Chi
Huimin Ma
Kafeng Wang
Jianmin Li
3DGS
122
2
0
03 Aug 2025
Versatile Transition Generation with Image-to-Video Diffusion
Zuhao Yang
Jiahui Zhang
Yingchen Yu
Shijian Lu
Song Bai
DiffM, VGen
205
3
0
03 Aug 2025
Effective Damage Data Generation by Fusing Imagery with Human Knowledge Using Vision-Language Models
Jie Wei
Erika Ardiles-Cruz
Aleksey Panasyuk
Erik P. Blasch
VLM
100
1
0
02 Aug 2025
Unraveling Hidden Representations: A Multi-Modal Layer Analysis for Better Synthetic Content Forensics
Tom Or
Omri Azencot
AAML
154
1
0
01 Aug 2025
DreamSat-2.0: Towards a General Single-View Asteroid 3D Reconstruction
Santiago Diaz
Xinghui Hu
Josiane Uwumukiza
Giovanni Lavezzi
Victor Rodríguez-Fernández
Richard Linares
78
0
0
01 Aug 2025
Video Generators are Robot Policies
Junbang Liang
P. Tokmakov
Ruoshi Liu
Sruthi Sudhakar
Paarth Shah
Rares Andrei Ambrus
Carl Vondrick
VGen
243
10
0
01 Aug 2025
Semantic and Temporal Integration in Latent Diffusion Space for High-Fidelity Video Super-Resolution
Yiwen Wang
Xinning Chai
Yuhong Zhang
Zhengxue Cheng
Jun Zhao
Rong Xie
Li Song
DiffM, VGen
70
0
0
01 Aug 2025
D3: Training-Free AI-Generated Video Detection Using Second-Order Features
Chende Zheng
Ruiqi Suo
Chenhao Lin
Zhengyu Zhao
Le Yang
Shuai Liu
Minghui Yang
Cong Wang
Chao Shen
DiffM, VGen
170
3
0
01 Aug 2025
SpA2V: Harnessing Spatial Auditory Cues for Audio-driven Spatially-aware Video Generation
K. T. Pham
Yingqing He
Yazhou Xing
Qifeng Chen
L. Chen
DiffM, VGen
1.1K
1
0
01 Aug 2025
Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis
Bowen Zhang
Sicheng Xu
Chuxin Wang
Jiaolong Yang
Feng Zhao
Dong Chen
Baining Guo
3DGS, VGen
146
7
0
31 Jul 2025
DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion
Qingcheng Zhao
Xiang Zhang
Haiyang Xu
Z. Chen
Jianwen Xie
Yuan Gao
Zhuowen Tu
DiffM, MDE
125
8
0
30 Jul 2025
DiTalker: A Unified DiT-based Framework for High-Quality and Speaking Styles Controllable Portrait Animation
He Feng
Yongjia Ma
Donglin Di
Lei Fan
Tonghua Su
Xiangqian Wu
DiffM, VGen
105
1
0
29 Jul 2025
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
HunyuanWorld Team
Zhenwei Wang
Yuhao Liu
Junta Wu
Zixiao Gu
...
Y. Liu
Linus
Jie Jiang
Tengfei Wang
Chunchao Guo
VGen
227
30
0
29 Jul 2025
JWB-DH-V1: Benchmark for Joint Whole-Body Talking Avatar and Speech Generation Version 1
Xinhan Di
Kristin Qi
Pengqian Yu
DiffM, VGen
182
0
0
28 Jul 2025
Reconstructing 4D Spatial Intelligence: A Survey
Yukang Cao
Jiahao Lu
Z. Huang
Zhuowei Shen
Chengfeng Zhao
...
Z. Chen
Xin Li
Wenping Wang
Yuan Liu
Ziwei Liu
VGen
313
7
0
28 Jul 2025
MagicAnime: A Hierarchically Annotated, Multimodal and Multitasking Dataset with Benchmarks for Cartoon Animation Generation
Shuolin Xu
Bingyuan Wang
Zeyu Cai
Fangteng Fu
Yue Ma
Tongyi Lee
Hongchuan Yu
Zeyu Wang
VGen
153
1
0
27 Jul 2025
AnimeColor: Reference-based Animation Colorization with Diffusion Transformers
Yuhong Zhang
L. Wang
Han Wang
Danni Wu
Zuzeng Lin
Feng Wang
Li Song
DiffM, VGen
91
0
0
27 Jul 2025
TransFlow: Motion Knowledge Transfer from Video Diffusion Models to Video Salient Object Detection
Suhwan Cho
Minhyeok Lee
Jungho Lee
Sunghun Yang
Sangyoun Lee
71
0
0
26 Jul 2025
ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment
Chong Xia
Shengjun Zhang
Fangfu Liu
Chang Liu
Khodchaphun Hirunyaratsameewong
Yueqi Duan
VGen
111
2
0
25 Jul 2025
MVG4D: Image Matrix-Based Multi-View and Motion Generation for 4D Content Creation from a Single Image
Xiaotian Chen
DongFu Yin
Fei Richard Yu
Xuanchen Li
Xinhao Zhang
3DGS
228
0
0
24 Jul 2025
Zero-Shot Dynamic Concept Personalization with Grid-Based LoRA
Rameen Abdal
Or Patashnik
Ekaterina Deyneka
Hao Chen
Aliaksandr Siarohin
Sergey Tulyakov
Daniel Cohen-Or
Kfir Aberman
DiffM, VGen
99
2
0
23 Jul 2025