Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
ArXiv (abs) | PDF | HTML | HuggingFace (13 upvotes) | GitHub (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 967 papers shown
I Want It That Way! Specifying Nuanced Camera Motions in Video Editing
P. Guhan
D. Kothandaraman
Tsung-Wei Huang
Guan-Ming Su
Dinesh Manocha
DiffM, VGen
201
0
0
24 Dec 2025
YingVideo-MV: Music-Driven Multi-Stage Video Generation
Jiahui Chen
Weida Wang
Runhua Shi
Huan Yang
Chaofan Ding
Zihao Chen
DiffM, VGen
77
0
0
02 Dec 2025
Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision
Chenshuang Zhang
Kang Zhang
Joon Son Chung
In So Kweon
Junmo Kim
Chengzhi Mao
DiffM
144
0
0
02 Dec 2025
Taming Camera-Controlled Video Generation with Verifiable Geometry Reward
Zhaoqing Wang
Xiaobo Xia
Zhuolin Bie
Jinlin Liu
Dongdong Yu
Jia-Wang Bian
Changhu Wang
EGVM, VGen
113
0
0
02 Dec 2025
ChronosObserver: Taming 4D World with Hyperspace Diffusion Sampling
Qisen Wang
Yifan Zhao
Peisen Shen
Jialu Li
Jia Li
3DGS, VGen
100
0
0
01 Dec 2025
AlignVid: Training-Free Attention Scaling for Semantic Fidelity in Text-Guided Image-to-Video Generation
Yexin Liu
Wen-Jie Shu
Zile Huang
Haoze Zheng
Yueze Wang
Manyuan Zhang
Ser-Nam Lim
Harry Yang
DiffM, VGen
28
0
0
01 Dec 2025
Generative Video Motion Editing with 3D Point Tracks
Yao-Chih Lee
Zhoutong Zhang
Jiahui Huang
Jui-Hsien Wang
Joon-Young Lee
Jia-Bin Huang
Eli Shechtman
Zhengqi Li
DiffM, VGen, 3DPC
188
0
0
01 Dec 2025
Audio-Visual World Models: Towards Multisensory Imagination in Sight and Sound
Jiahua Wang
Shannan Yan
Leqi Zheng
Jialong Wu
Yaoxin Mao
VGen
32
0
0
30 Nov 2025
TalkingPose: Efficient Face and Gesture Animation with Feedback-guided Diffusion Model
Alireza Javanmardi
Pragati Jaiswal
T. Habtegebrial
Christen Millerdurai
Shaoxiang Wang
A. Pagani
Didier Stricker
DiffM, VGen
86
0
0
30 Nov 2025
CC-FMO: Camera-Conditioned Zero-Shot Single Image to 3D Scene Generation with Foundation Model Orchestration
Boshi Tang
Henry Zheng
Rui Huang
Gao Huang
VGen
100
0
0
29 Nov 2025
MVAD : A Comprehensive Multimodal Video-Audio Dataset for AIGC Detection
Mengxue Hu
Yunfeng Diao
Changtao Miao
Jianshu Li
Zhe Li
Joey Tianyi Zhou
VGen
48
0
0
29 Nov 2025
What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards
Minh-Quan Le
Yuanzhi Zhu
Vicky Kalogeiton
Dimitris Samaras
EGVM, VGen
71
0
0
29 Nov 2025
DisMo: Disentangled Motion Representations for Open-World Motion Transfer
Thomas Ressler-Antal
Frank Fundel
Malek Ben Alaya
S. A. Baumann
Felix Krause
Ming Gui
Bjorn Ommer
DiffM, VGen
33
0
0
28 Nov 2025
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation
Hongfei Zhang
Kanghao Chen
Zixin Zhang
Harold Haodong Chen
Yuanhuiyi Lyu
Yuqi Zhang
Shuai Yang
Kun Zhou
Yingcong Chen
DiffM, VGen
72
1
0
28 Nov 2025
Fast Multi-view Consistent 3D Editing with Video Priors
Liyi Chen
Ruihuang Li
Guowen Zhang
Pengfei Wang
Lei Zhang
DiffM, VGen
92
1
0
28 Nov 2025
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
S. Shi
Jing Xu
Zhihang Li
Chunli Peng
Xiaoda Yang
Lijing Lu
Kai Hu
Jiangning Zhang
DiffM
28
0
0
28 Nov 2025
WorldWander: Bridging Egocentric and Exocentric Worlds in Video Generation
Quanjian Song
Yiren Song
Kelly Peng
Yuan Gao
Mike Zheng Shou
DiffM, VGen
36
0
0
27 Nov 2025
Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration
M. Yang
Yanming Yang
Chenyi Xu
Chenxi Song
Yufan Zuo
Tong Zhao
Ruibo Li
Chi Zhang
DiffM
80
0
0
27 Nov 2025
MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices
Shuai Zhang
Bao Tang
Siyuan Yu
Yueting Zhu
Jingfeng Yao
Ya Zou
Shanglin Yuan
Li Yu
Wenyu Liu
Xinggang Wang
DiffM, VGen
177
0
0
26 Nov 2025
TEAR: Temporal-aware Automated Red-teaming for Text-to-Video Models
Jiaming He
Guanyu Hou
Hongwei Li
Zhicong Huang
Kangjie Chen
Yi Yu
Wenbo Jiang
Guowen Xu
Tianwei Zhang
EGVM, VGen
151
0
0
26 Nov 2025
AVFakeBench: A Comprehensive Audio-Video Forgery Detection Benchmark for AV-LMMs
Shuhan Xia
Peipei Li
Xuannan Liu
Dongsen Zhang
Xinyu Guo
Zekun Li
AAML
84
0
0
26 Nov 2025
Exo2EgoSyn: Unlocking Foundation Video Generation Models for Exocentric-to-Egocentric Video Synthesis
Mohammad Mahdi
Yuqian Fu
N. Savov
Jiancheng Pan
Danda Pani Paudel
Luc Van Gool
VGen
152
1
0
25 Nov 2025
PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding
H. Zhang
Tianyu Huang
Zichen Wan
Xiaowei Jin
Hongzhi Zhang
Hui Li
Wangmeng Zuo
VGen
127
0
0
25 Nov 2025
Image Diffusion Models Exhibit Emergent Temporal Propagation in Videos
Youngseo Kim
Dohyun Kim
Geohee Han
Paul Hongsuck Seo
160
0
0
25 Nov 2025
UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers
Min Zhao
Hongzhou Zhu
Y. Wang
Bokai Yan
J. Zhang
Guande He
Ling Yang
Chongxuan Li
Jun-Jie Zhu
96
0
0
25 Nov 2025
Are Image-to-Video Models Good Zero-Shot Image Editors?
Zechuan Zhang
Zhenyuan Chen
Zongxin Yang
Yi Yang
DiffM, VGen
489
0
0
24 Nov 2025
3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion
Minchong Chen
Xiaoyun Yuan
Junzhe Wan
Jianing Zhang
Jun Zhang
142
0
0
24 Nov 2025
One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control
Zhenxing Mi
Yuxin Wang
Dan Xu
VGen
152
0
0
24 Nov 2025
Learning Plug-and-play Memory for Guiding Video Diffusion Models
Selena Song
Ziming Xu
Zijun Zhang
Kun Zhou
Jiaxian Guo
Lianhui Qin
Biwei Huang
VGen
176
0
0
24 Nov 2025
SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation
J. Zhang
Shengming Cao
Rui Li
Xiaotong Zhao
Yutao Cui
...
Gangshan Wu
Haolan Chen
Yu-Syuan Xu
L. xilinx Wang
Kai Ma
VGen
206
0
0
24 Nov 2025
Zero-Shot Video Deraining with Video Diffusion Models
Tuomas Varanka
Juan Luis Gonzalez
Hyeongwoo Kim
Pablo Garrido
Xu Yao
DiffM, VGen
132
0
0
23 Nov 2025
Sequence-Adaptive Video Prediction in Continuous Streams using Diffusion Noise Optimization
Sina Mokhtarzadeh Azar
Emad Bahrami
Enrico Pallotta
Gianpiero Francesca
Radu Timofte
Juergen Gall
DiffM
92
0
0
23 Nov 2025
FeRA: Frequency-Energy Constrained Routing for Effective Diffusion Adaptation Fine-Tuning
Bo Yin
Xiaobin Hu
Xingyu Zhou
Peng-Tao Jiang
Yue Liao
Junwei Zhu
Jiangning Zhang
Ying Tai
Chengjie Wang
Shuicheng Yan
DiffM
125
1
0
22 Nov 2025
EgoControl: Controllable Egocentric Video Generation via 3D Full-Body Poses
Enrico Pallotta
Sina Mokhtarzadeh Azar
Lars Doorenbos
Serdar Ozsoy
Umar Iqbal
Juergen Gall
DiffM, VGen
108
0
0
22 Nov 2025
Consolidating Diffusion-Generated Video Detection with Unified Multimodal Forgery Learning
Xiaohong Liu
Xiufeng Song
Huayu Zheng
Lei Bai
Xiaoming Liu
Guangtao Zhai
DiffM
124
0
0
22 Nov 2025
PostCam: Camera-Controllable Novel-View Video Generation with Query-Shared Cross-Attention
Yipeng Chen
Zhichao Ye
Zhenzhou Fang
Xinyu Chen
Xiaoyu Zhang
Jialing Liu
Nan Wang
Haomin Liu
Guofeng Zhang
DiffM, VGen
146
0
0
21 Nov 2025
EvDiff: High Quality Video with an Event Camera
Weilun Li
Lei-huan Sun
Ruixi Gao
Qi Jiang
Yuqin Ma
Kaiwei Wang
M. Yang
Luc Van Gool
D. Paudel
DiffM, VGen
152
0
0
21 Nov 2025
Loomis Painter: Reconstructing the Painting Process
Markus Pobitzer
Chang Liu
Chenyi Zhuang
Teng Long
Bin Ren
Nicu Sebe
DiffM
143
0
0
21 Nov 2025
Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks
Cheng Yang
Haiyuan Wan
Yiran Peng
Xin Cheng
Quan Shi
...
Junchi Yu
Xinlei Yu
Xiawu Zheng
D. Zhou
Chenglin Wu
ReLM, LRM
274
0
0
19 Nov 2025
First Frame Is the Place to Go for Video Content Customization
Jingxi Chen
Z. Li
Zhichao Liu
Guangyao Shi
Xiyang Wu
Fuxiao Liu
Cornelia Fermüller
Brandon Yushan Feng
Yiannis Aloimonos
DiffM, VGen
173
0
0
19 Nov 2025
BD-Net: Has Depth-Wise Convolution Ever Been Applied in Binary Neural Networks?
DoYoung Kim
Jin-Seop Lee
Noo-Ri Kim
SungJoon Lee
Jee-Hyong Lee
MQ
132
3
0
19 Nov 2025
Gaussian See, Gaussian Do: Semantic 3D Motion Transfer from Multiview Video
Yarin Bekor
Gal Michael Harari
Or Perel
Or Litany
3DGS
97
0
0
18 Nov 2025
Generative Photographic Control for Scene-Consistent Video Cinematic Editing
Huiqiang Sun
Liao Shen
Zhan Peng
Kun Wang
Size Wu
...
Z. Huang
Xingyu Zeng
Zhiguo Cao
Wei Li
Chen Change Loy
DiffM, VGen
150
0
0
17 Nov 2025
Recurrent Autoregressive Diffusion: Global Memory Meets Local Attention
Taiye Chen
Zihan Ding
Anjian Li
Christina Zhang
Zeqi Xiao
Yisen Wang
Chi Jin
VGen
149
1
0
17 Nov 2025
Free-Form Scene Editor: Enabling Multi-Round Object Manipulation like in a 3D Engine
Xincheng Shuai
Zhenyuan Qin
Henghui Ding
Dacheng Tao
DiffM
146
0
0
17 Nov 2025
Neo: Real-Time On-Device 3D Gaussian Splatting with Reuse-and-Update Sorting Acceleration
Changhun Oh
Seongryong Oh
Jinwoo Hwang
Yoonsung Kim
Hardik Sharma
Jongse Park
3DGS
150
0
0
17 Nov 2025
Towards High-Consistency Embodied World Model with Multi-View Trajectory Videos
Taiyi Su
Jian Zhu
Yaxuan Li
Chong Ma
Zitai Huang
Yichen Zhu
Hanli Wang
VGen
230
0
0
17 Nov 2025
DriveLiDAR4D: Sequential and Controllable LiDAR Scene Generation for Autonomous Driving
Kaiwen Cai
Xinze Liu
Xia Zhou
Hengtong Hu
Jie Xiang
Luyao Zhang
Xueyang Zhang
Kun Zhan
Yifei Zhan
Xianpeng Lang
3DPC
222
0
0
17 Nov 2025
From Events to Clarity: The Event-Guided Diffusion Framework for Dehazing
Ling Wang
Yunfan Lu
Wenzong Ma
Huizai Yao
Pengteng Li
Hui Xiong
DiffM
107
0
0
14 Nov 2025
SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control
Zhengyi Luo
Ye Yuan
Tingwu Wang
Chenran Li
Sirui Chen
...
Jan Kautz
Yan Chang
Umar Iqbal
Linxi Fan
Yuke Zhu
117
4
0
11 Nov 2025