
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
ArXiv (abs) | PDF | HTML | HuggingFace (13 upvotes) | Github (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 967 papers shown
Title
Mobius: Text to Seamless Looping Video Generation via Latent Shift
Xiuli Bi
Jianfei Yuan
Bo Liu
Yanmei Zhang
Xiaodong Cun
Chi-Man Pun
Bin Xiao
DiffM, VGen
159
0
0
27 Feb 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Computer Vision and Pattern Recognition (CVPR), 2025
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
315
3
0
27 Feb 2025
High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model
Computer Vision and Pattern Recognition (CVPR), 2025
Mingtao Guo
Guanyu Xing
Yanli Liu
DiffM, VGen
215
4
0
27 Feb 2025
TransVDM: Motion-Constrained Video Diffusion Model for Transparent Video Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Menghao Li
Zhenghao Zhang
Junchao Liao
Long Qin
Weizhi Wang
DiffM, VGen
199
1
0
26 Feb 2025
X-Dancer: Expressive Music to Human Dance Video Generation
Zeyuan Chen
Hongyi Xu
Guoxian Song
You Xie
Chenxu Zhang
Xiusi Chen
Chao Wang
Di Chang
Linjie Luo
VGen
289
8
0
24 Feb 2025
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
International Conference on Learning Representations (ICLR), 2024
Zhengqing Wang
Jiacheng Chen
Yasutaka Furukawa
332
15
0
24 Feb 2025
RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers
Min Zhao
Guande He
Yixiao Chen
Hongzhou Zhu
Chong Li
Jun Zhu
VGen
383
35
0
21 Feb 2025
Text-to-Image Rectified Flow as Plug-and-Play Priors
International Conference on Learning Representations (ICLR), 2024
Xiaofeng Yang
Cheng Chen
Xulei Yang
Fayao Liu
Guosheng Lin
DiffM
348
21
0
21 Feb 2025
Accelerating Diffusion Transformers with Token-wise Feature Caching
International Conference on Learning Representations (ICLR), 2024
Chang Zou
Xuyang Liu
Ting Liu
Siteng Huang
Linfeng Zhang
359
56
0
20 Feb 2025
CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image
ACM Transactions on Graphics (TOG), 2025
Kaixin Yao
Longwen Zhang
Xinhao Yan
Yan Zeng
Qixuan Zhang
Wei Yang
Lan Xu
Jiayuan Gu
Jingyi Yu
349
39
0
18 Feb 2025
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xinlong Chen
Yuanxing Zhang
Chongling Rao
Yushuo Guan
Qingbin Liu
Fuzheng Zhang
Chengru Song
Qiang Liu
Di Zhang
Tieniu Tan
286
13
0
18 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM, VGen
305
5
0
17 Feb 2025
MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
Computer Vision and Pattern Recognition (CVPR), 2025
Jingcheng Ni
Yuxin Guo
Yichen Liu
Rui Chen
Lewei Lu
Z. Wu
DiffM, VGen
277
17
0
17 Feb 2025
Phantom: Subject-consistent video generation via cross-modal alignment
Lijie Liu
Tianxiang Ma
Bingchuan Li
Zhuowei Chen
Jiawei Liu
Qian He
Xinglong Wu
DiffM, VGen
411
43
0
16 Feb 2025
Learning Human Skill Generators at Key-Step Levels
Yilu Wu
Chenhui Zhu
Shuai Wang
Hanlin Wang
Jing Wang
Zhaoxiang Zhang
Limin Wang
VGen
346
1
0
12 Feb 2025
History-Guided Video Diffusion
Kiwhan Song
Boyuan Chen
Max Simchowitz
Yilun Du
Russ Tedrake
Vincent Sitzmann
VGen
497
61
0
10 Feb 2025
Pre-Trained Video Generative Models as World Simulators
Haoran He
Yang Zhang
Guanbin Li
Zhihao Xu
Ling Pan
VGen
336
21
0
10 Feb 2025
Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance
Li Hu
Guangyuan Wang
Zhen Shen
Xin Gao
Dechao Meng
Lian Zhuo
Peng Zhang
Bang Zhang
Liefeng Bo
DiffM, VGen
343
34
0
10 Feb 2025
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer
Xinyu Liu
Ailing Zeng
Wei Xue
Harry Yang
Wenhan Luo
Qifeng Liu
VGen
1.0K
7
0
09 Feb 2025
A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction
Yongfan Chen
Xiuwen Zhu
Tianyu Li
EGVM, VGen
498
3
0
08 Feb 2025
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
Yueying Zou
Peipei Li
Zekun Li
Huaibo Huang
Xing Cui
Xuannan Liu
Chenghanyu Zhang
Ran He
DeLMO
605
10
0
07 Feb 2025
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation
Jinbo Xing
Long Mai
Cusuh Ham
Jiahui Huang
Aniruddha Mahapatra
Chi-Wing Fu
T. Wong
Feng Liu
DiffM, VGen
544
21
0
06 Feb 2025
Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
Yunuo Chen
Junli Cao
Vidit Goel
Sergei Korolev
Jian Ren
Sergey Tulyakov
DiffM, VGen
334
7
0
05 Feb 2025
MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation
Haibo Tong
Zhaoyang Wang
Zhe Chen
Haonian Ji
Shi Qiu
...
Peng Xia
Mingyu Ding
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
EGVM, VGen
599
8
0
03 Feb 2025
Dissecting Submission Limit in Desk-Rejections: A Mathematical Analysis of Fairness in AI Conference Policies
Yuefan Cao
Xiaoyu Li
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao Song
Jiahao Zhang
287
12
0
02 Feb 2025
Consistent Video Colorization via Palette Guidance
Han Wang
Yuang Zhang
Yuhong Zhang
Lingxiao Lu
Li Song
DiffM, VGen
280
2
0
31 Jan 2025
Improving Tropical Cyclone Forecasting With Video Diffusion Models
Zhibo Ren
Pritthijit Nath
Pancham Shukla
323
0
0
27 Jan 2025
VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
International Conference on Learning Representations (ICLR), 2025
Runyi Hu
Jing Zhang
You Li
Jiwei Li
Qing Guo
Han Qiu
Tianwei Zhang
WIGM, VGen
432
16
0
24 Jan 2025
Improving Video Generation with Human Feedback
Jie Liu
Gongye Liu
Jiajun Liang
Ziyang Yuan
Xiaokun Liu
...
Fei Yang
Pengfei Wan
Di Zhang
Kun Gai
Yujiu Yang
VGen, EGVM
406
96
0
23 Jan 2025
PreciseCam: Precise Camera Control for Text-to-Image Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Edurne Bernal-Berdun
Ana Serrano
B. Masiá
Matheus Gadelha
Yannick Hold-Geoffroy
Xin Sun
Diego F. F. Gutierrez
DiffM, VGen
187
9
0
22 Jan 2025
Towards Affordance-Aware Articulation Synthesis for Rigged Objects
Yu-Chu Yu
C. Lin
Hsin-Ying Lee
Chaoyang Wang
Longji Xu
Ming-Hsuan Yang
DiffM, AI4CE
255
0
0
21 Jan 2025
GPS as a Control Signal for Image Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Chao Feng
Ziyang Chen
Aleksander Holyñski
Alexei A. Efros
Andrew Owens
DiffM
176
2
0
21 Jan 2025
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
Computer Vision and Pattern Recognition (CVPR), 2025
Sili Chen
Hengkai Guo
Shengnan Zhu
Feihu Zhang
Zilong Huang
Jiashi Feng
Bingyi Kang
MDE, VLM, AuLLM
532
98
0
21 Jan 2025
Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yu Yang
Jianbiao Mei
Yukai Ma
Siliang Du
Wenqing Chen
Yijie Qian
Yuxiang Feng
Yong Liu
427
38
0
20 Jan 2025
Joint Learning of Depth and Appearance for Portrait Image Animation
Xinya Ji
Gaspard Zoss
Prashanth Chandran
Lingchen Yang
Xun Cao
B. Solenthaler
D. Bradley
3DH, MDE
309
1
0
15 Jan 2025
BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations
Computer Vision and Pattern Recognition (CVPR), 2025
Weixi Feng
Chao Liu
Sifei Liu
William Yang Wang
Arash Vahdat
Weili Nie
VGen, DiffM
178
10
0
13 Jan 2025
Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2025
Maomao Li
Lijian Lin
Yunfei Liu
Ye Zhu
Yu Li
DiffM, VGen
346
1
0
11 Jan 2025
MEt3R: Measuring Multi-View Consistency in Generated Images
Computer Vision and Pattern Recognition (CVPR), 2025
Mohammad Asim
Christopher Wewer
Thomas Wimmer
Bernt Schiele
J. E. Lenssen
EGVM, 3DGS, VGen
223
36
0
10 Jan 2025
Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Guy Yariv
Yuval Kirstain
Amit Zohar
Shelly Sheynin
Yaniv Taigman
Yossi Adi
Sagie Benaim
Adam Polyak
VGen, DiffM
150
9
0
06 Jan 2025
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
Rui Xie
Yinhong Liu
Penghao Zhou
Chen Zhao
Jun Zhou
Lucas Beerens
Zhenru Zhang
Jian Yang
Zhiyong Yang
Ying Tai
VGen, DiffM
299
21
0
06 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Jiayi Zhang
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
422
32
0
06 Jan 2025
Pointmap-Conditioned Diffusion for Consistent Novel View Synthesis
Thang-Anh-Quan Nguyen
Nathan Piasco
Luis Roldão
Moussâb Bennehar
D. Tsishkou
Laurent Caraffa
J. Tarel
R. Brémond
DiffM
293
3
0
06 Jan 2025
GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking
Weikang Bian
Zhaoyang Huang
Xiaoyu Shi
Yijin Li
Fu-Yun Wang
Jiaming Song
3DGS, VGen, DiffM
320
24
0
05 Jan 2025
TDM: Temporally-Consistent Diffusion Model for All-in-One Real-World Video Restoration
Conference on Multimedia Modeling (MMM), 2025
Yizhou Li
Zihua Liu
Yusuke Monno
Masatoshi Okutomi
DiffM, VGen
181
2
0
04 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Computer Vision and Pattern Recognition (CVPR), 2024
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
361
9
0
03 Jan 2025
RORem: Training a Robust Object Remover with Human-in-the-Loop
Computer Vision and Pattern Recognition (CVPR), 2025
Ruibin Li
Tao Yang
Song Guo
Guang Dai
389
11
0
01 Jan 2025
AKiRa: Augmentation Kit on Rays for optical video generation
Computer Vision and Pattern Recognition (CVPR), 2024
Xi Wang
Robin Courant
Marc Christie
Vicky Kalogeiton
VGen
379
10
0
31 Dec 2024
DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT
Xiaotao Hu
Wei Yin
Mingkai Jia
Junyuan Deng
Xiaoyang Guo
Qian Zhang
Xiaoxiao Long
Ping Tan
VGen
322
33
0
31 Dec 2024
Edicho: Consistent Image Editing in the Wild
Qingyan Bai
Hao Ouyang
Yinghao Xu
Qiuyu Wang
Ceyuan Yang
Ka Leong Cheng
Yujun Shen
Qifeng Chen
DiffM
479
5
0
30 Dec 2024
PERSE: Personalized 3D Generative Avatars from A Single Portrait
Computer Vision and Pattern Recognition (CVPR), 2024
Hyunsoo Cha
Inhee Lee
Hanbyul Joo
3DGS
230
7
0
30 Dec 2024