Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
arXiv 2311.15127 (abs) | PDF | HTML | HuggingFace (13 upvotes) | GitHub (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 967 papers shown
Simulating the Visual World with Artificial Intelligence: A Roadmap
Jingtong Yue
Z. Huang
Z. Chen
Xintao Wang
Pengfei Wan
Ziwei Liu
VGen, LM&Ro
344
0
0
11 Nov 2025
RoboTAG: End-to-end Robot Configuration Estimation via Topological Alignment Graph
Yifan Liu
Fangneng Zhan
Wanhua Li
Haowen Sun
Katerina Fragkiadaki
Hanspeter Pfister
54
0
0
11 Nov 2025
ViPRA: Video Prediction for Robot Actions
Sandeep Routray
Hengkai Pan
Unnat Jain
Shikhar Bahl
Deepak Pathak
198
0
0
11 Nov 2025
DIMO: Diverse 3D Motion Generation for Arbitrary Objects
Linzhan Mou
Jiahui Lei
Chen Wang
Lingjie Liu
Kostas Daniilidis
VGen
166
0
0
10 Nov 2025
Toward the Frontiers of Reliable Diffusion Sampling via Adversarial Sinkhorn Attention Guidance
Kwanyoung Kim
DiffM
154
0
0
10 Nov 2025
4DSTR: Advancing Generative 4D Gaussians with Spatial-Temporal Rectification for High-Quality and Consistent 4D Generation
Mengmeng Liu
Jiuming Liu
Yunpeng Zhang
Jiangtao Li
M. Yang
Francesco Nex
Hao Cheng
3DGS
124
0
0
10 Nov 2025
RelightMaster: Precise Video Relighting with Multi-plane Light Images
Weikang Bian
Xiaoyu Shi
Z. Huang
J. Bai
Qinghe Wang
Xintao Wang
Pengfei Wan
Kun Gai
Jiaming Song
VGen
176
1
0
09 Nov 2025
FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction
Jiang Lin
Xinyu Chen
Song Wu
Zhiqiu Zhang
Jizhi Zhang
Ye Wang
Qiang Tang
Qian Wang
Jian Yang
Zili Yi
DiffM
112
0
0
07 Nov 2025
Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration
Yunghee Lee
Byeonghyun Pak
Junwha Hong
Hoseong Kim
196
0
0
06 Nov 2025
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
Jingqi Tong
Yurong Mou
Hangcheng Li
Mingzhe Li
Y. Yang
...
Y. Zheng
Xinchi Chen
Jun Zhao
Xuanjing Huang
Xipeng Qiu
VGen, LRM
313
7
0
06 Nov 2025
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
Jinlai Liu
J. N. Han
B. Yan
Hui Wu
Fengda Zhu
Xing-Hui Wang
Yi Jiang
Bingyue Peng
Zehuan Yuan
VGen
228
2
0
06 Nov 2025
PhysCorr: Dual-Reward DPO for Physics-Constrained Text-to-Video Generation with Automated Preference Selection
Peiyao Wang
Weining Wang
Qi Li
EGVM, VGen
367
1
0
06 Nov 2025
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
Ropeway Liu
Hangjie Yuan
B. Dong
Jiazheng Xing
Jinwang Wang
Rui Zhao
Yan Xing
Weihua Chen
F. Wang
VGen
126
0
0
03 Nov 2025
MotionStream: Real-Time Video Generation with Interactive Motion Controls
Joonghyuk Shin
Zhengqi Li
Richard Zhang
Jun-Yan Zhu
Jaesik Park
Eli Schechtman
Xun Huang
DiffM, VGen
304
7
0
03 Nov 2025
Driving scenario generation and evaluation using a structured layer representation and foundational models
Arthur Hubert
Gamal Elghazaly
R. Frank
80
0
0
03 Nov 2025
Detail Enhanced Gaussian Splatting for Large-Scale Volumetric Capture
ACM Transactions on Graphics (TOG), 2025
Julien Philip
Li Ma
Pascal Clausen
Wenqi Xian
Ahmet Levent Taşel
...
Xueming Yu
David M George
Ning Yu
Oliver Pilarski
P. Debevec
3DGS, 3DH
233
0
0
31 Oct 2025
DANCER: Dance ANimation via Condition Enhancement and Rendering with diffusion model
Yucheng Xing
Jinxing Yin
Xiaodong Liu
VGen
124
0
0
31 Oct 2025
Co-Evolving Latent Action World Models
Yucen Wang
Fengming Zhang
De-Chuan Zhan
Li Zhao
Kaixin Wang
Jiang Bian
VGen
190
0
0
30 Oct 2025
Rethinking Visual Intelligence: Insights from Video Pretraining
Pablo Acuaviva
A. Davtyan
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Alexandre Alahi
Paolo Favaro
VLM, LRM
193
0
0
28 Oct 2025
Decoupled MeanFlow: Turning Flow Models into Flow Maps for Accelerated Sampling
Kyungmin Lee
Sihyun Yu
Jinwoo Shin
AI4CE
218
3
0
28 Oct 2025
Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation
Junyoung Seo
Rodrigo Mira
A. Haliassos
Stella Bounareli
Honglie Chen
Linh Tran
Seungryong Kim
Zoe Landgraf
Jie Shen
VGen
121
1
0
27 Oct 2025
DiffusionLane: Diffusion Model for Lane Detection
Kunyang Zhou
Yeqin Shao
DiffM
112
0
0
25 Oct 2025
Epipolar Geometry Improves Video Generation Models
Orest Kupyn
Fabian Manhardt
F. Tombari
Christian Rupprecht
VGen
194
0
0
24 Oct 2025
BachVid: Training-Free Video Generation with Consistent Background and Character
Han Yan
Xibin Song
Yifu Wang
Hongdong Li
Pan Ji
Chao Ma
DiffM, VGen
108
0
0
24 Oct 2025
Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
Qixiu Li
Yu Deng
Yaobo Liang
L. Luo
Lei Zhou
...
Hao Chen
Lily Sun
Dong Chen
J. Yang
B. Guo
113
4
0
24 Oct 2025
WorldGrow: Generating Infinite 3D World
Sikuang Li
Chen-Ning Yang
Jiemin Fang
Taoran Yi
Jia Lu
Jiazhong Cen
Lingxi Xie
Wei Shen
Qi Tian
VGen
147
2
0
24 Oct 2025
Improved Training Technique for Shortcut Models
Anh-Tien Nguyen
Viet-Anh Nguyen
D. Vu
T. Dao
Chi Tran
Toan M. Tran
Anh Tran
BDL
203
1
0
24 Oct 2025
AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models
Seunghoon Lee
Jeongwoo Choi
Byunggwan Son
Jaehyeon Moon
Jeimin Jeon
Bumsub Ham
DiffM, MQ
196
0
0
23 Oct 2025
AutoScape: Geometry-Consistent Long-Horizon Scene Generation
Jiacheng Chen
Ziyu Jiang
Mingfu Liang
Bingbing Zhuang
Jong-Chyi Su
Sparsh Garg
Ying Wu
Manmohan Chandraker
VGen
116
0
0
23 Oct 2025
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
Yihao Meng
Hao Ouyang
Yue Yu
Qiuyu Wang
Wen Wang
...
Yixuan Li
Cheng Chen
Yanhong Zeng
Yujun Shen
Huamin Qu
VGen
104
4
0
23 Oct 2025
RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling
Bingjie Gao
Qianli Ma
Xiaoxue Wu
Shuai Yang
Guanzhou Lan
...
Qingyang Liu
Yu Qiao
Xinyuan Chen
Y. Wang
Li Niu
VGen
92
0
0
23 Oct 2025
Advances in 4D Representation: Geometry, Motion, and Interaction
M. Zhao
Sauradip Nag
Kai Wang
Aditya Vora
Guangda Ji
Peter Chun
Ali Mahdavi Amiri
Hao Zhang
192
0
0
22 Oct 2025
Video Consistency Distance: Enhancing Temporal Consistency for Image-to-Video Generation via Reward-Based Fine-Tuning
Takehiro Aoshima
Yusuke Shinohara
Byeongseon Park
EGVM, VGen
319
0
0
22 Oct 2025
MoAlign: Motion-Centric Representation Alignment for Video Diffusion Models
Aritra Bhowmik
Denis Korzhenkov
Cees G. M. Snoek
A. Habibian
Mohsen Ghafoorian
DiffM, VGen
167
4
0
21 Oct 2025
UltraGen: High-Resolution Video Generation with Hierarchical Attention
Teng Hu
Jiangning Zhang
Zihan Su
Ran Yi
DiffM, VGen
182
5
0
21 Oct 2025
GAS: Improving Discretization of Diffusion ODEs via Generalized Adversarial Solver
Aleksandr Oganov
Ilya Bykov
Eva Neudachina
Mishan Aliev
Alexander Tolmachev
Alexander Sidorov
Aleksandr Zuev
Andrey Okhotin
Denis Rakitin
Aibek Alanov
141
0
0
20 Oct 2025
Adaptive Discretization for Consistency Models
Jiayu Bai
Zhanbo Feng
Zhijie Deng
Tianqi Hou
Robert C. Qiu
Zenan Ling
124
0
0
20 Oct 2025
World-in-World: World Models in a Closed-Loop World
Jiahan Zhang
Muqing Jiang
Nanru Dai
Taiming Lu
Arda Uzunoglu
...
Rama Chellappa
Tianmin Shu
Alan Yuille
Yilun Du
Jieneng Chen
VGen, VLM
192
4
0
20 Oct 2025
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models
Yongshun Zhang
Zhongyi Fan
Yonghang Zhang
Zhangzikang Li
Weifeng Chen
Zhongwei Feng
Chaoyue Wang
Peng Hou
Anxiang Zeng
VGen
255
0
0
20 Oct 2025
From Mannequin to Human: A Pose-Aware and Identity-Preserving Video Generation Framework for Lifelike Clothing Display
Xiangyu Mu
Dongliang Zhou
Jie Hou
Haijun Zhang
Weili Guan
DiffM
148
0
0
19 Oct 2025
Advancing Off-Road Autonomous Driving: The Large-Scale ORAD-3D Dataset and Comprehensive Benchmarks
Chen Min
Jilin Mei
Heng Zhai
Shuai Wang
Tong Sun
...
Yiming Nie
Qi Zhu
Liang Xiao
Dawei Zhao
Yu Hu
80
1
0
18 Oct 2025
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
Qingyan Bai
Qiuyu Wang
Hao Ouyang
Yue Yu
Hanlin Wang
...
Yanhong Zeng
Zichen Liu
Yinghao Xu
Yujun Shen
Qifeng Chen
VGen
251
10
0
17 Oct 2025
TGT: Text-Grounded Trajectories for Locally Controlled Video Generation
Guofeng Zhang
Angtian Wang
Jacob Zhiyuan Fang
Liming Jiang
Haotian Yang
...
Yiding Yang
G. Chen
Longyin Wen
Alan Yuille
Chongyang Ma
DiffM, VGen
86
1
0
16 Oct 2025
Adapting Self-Supervised Representations as a Latent Space for Efficient Generation
Ming Gui
Johannes Schusterbauer
Timy Phan
Felix Krause
J. Susskind
Miguel Angel Bautista
Bjorn Ommer
181
1
0
16 Oct 2025
RealDPO: Real or Not Real, that is the Preference
Guo Cheng
Danni Yang
Ziqi Huang
Jianlou Si
Chenyang Si
Ziwei Liu
VGen
278
0
0
16 Oct 2025
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
Meiqi Wu
Jiashu Zhu
Xiaokun Feng
C. L. Philip Chen
Chen Zhu
Bingze Song
Fangyuan Mao
Jiahong Wu
Xiangxiang Chu
Kaiqi Huang
VGen, EGVM, VLM
334
0
0
16 Oct 2025
STANCE: Motion Coherent Video Generation Via Sparse-to-Dense Anchored Encoding
Zhifei Chen
Tianshuo Xu
Leyi Wu
Luozhou Wang
Dongyu Yan
Zihan You
Wenting Luo
Guo Zhang
Yingcong Chen
DiffM, VGen
140
0
0
16 Oct 2025
Terra: Explorable Native 3D World Model with Point Latents
Yuanhui Huang
Weiliang Chen
Wenzhao Zheng
Xin Tao
Pengfei Wan
Jie Zhou
Jiwen Lu
VGen
98
0
0
16 Oct 2025
Virtually Being: Customizing Camera-Controllable Video Diffusion Models with Multi-View Performance Captures
Yuancheng Xu
Wenqi Xian
Li Ma
Julien Philip
Ahmet Levent Taşel
...
Oliver Hermann
Oliver Pilarski
Rahul Garg
P. Debevec
Ning Yu
DiffM, VGen
95
0
0
16 Oct 2025
3D Scene Prompting for Scene-Consistent Camera-Controllable Video Generation
J. Lee
Jaewoo Jung
Jisang Han
Takuya Narihira
Kazumi Fukuda
Junyoung Seo
Sunghwan Hong
Yuki Mitsufuji
Seungryong Kim
VGen
91
1
0
16 Oct 2025