2311.15127
Cited By
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
HuggingFace (13 upvotes)
Github (25943★)
Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"
50 / 938 papers shown
TEAR: Temporal-aware Automated Red-teaming for Text-to-Video Models
Jiaming He
Guanyu Hou
Hongwei Li
Zhicong Huang
Kangjie Chen
Yi Yu
Wenbo Jiang
Guowen Xu
Tianwei Zhang
EGVM
VGen
111
0
0
26 Nov 2025
MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices
Shuai Zhang
Bao Tang
Siyuan Yu
Yueting Zhu
Jingfeng Yao
Ya Zou
Shanglin Yuan
Li Yu
Wenyu Liu
Xinggang Wang
DiffM
VGen
145
0
0
26 Nov 2025
Image Diffusion Models Exhibit Emergent Temporal Propagation in Videos
Youngseo Kim
Dohyun Kim
Geohee Han
Paul Hongsuck Seo
132
0
0
25 Nov 2025
PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding
H. Zhang
Tianyu Huang
Zichen Wan
Xiaowei Jin
Hongzhi Zhang
Hui Li
Wangmeng Zuo
VGen
115
0
0
25 Nov 2025
UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers
Min Zhao
Hongzhou Zhu
Y. Wang
Bokai Yan
J. Zhang
Guande He
Ling Yang
Chongxuan Li
Jun-Jie Zhu
84
0
0
25 Nov 2025
Exo2EgoSyn: Unlocking Foundation Video Generation Models for Exocentric-to-Egocentric Video Synthesis
Mohammad Mahdi
Yuqian Fu
N. Savov
Jiancheng Pan
Danda Pani Paudel
Luc Van Gool
VGen
128
1
0
25 Nov 2025
3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion
Minchong Chen
Xiaoyun Yuan
Junzhe Wan
Jianing Zhang
Jun Zhang
114
0
0
24 Nov 2025
One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control
Zhenxing Mi
Yuxin Wang
Dan Xu
VGen
120
0
0
24 Nov 2025
SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation
J. Zhang
Shengming Cao
Rui Li
Xiaotong Zhao
Yutao Cui
...
Gangshan Wu
Haolan Chen
Yu-Syuan Xu
L. Wang
Kai Ma
VGen
190
0
0
24 Nov 2025
Are Image-to-Video Models Good Zero-Shot Image Editors?
Zechuan Zhang
Zhenyuan Chen
Zongxin Yang
Yi Yang
DiffM
VGen
445
0
0
24 Nov 2025
Zero-Shot Video Deraining with Video Diffusion Models
Tuomas Varanka
Juan Luis Gonzalez
Hyeongwoo Kim
Pablo Garrido
Xu Yao
DiffM
VGen
112
0
0
23 Nov 2025
Consolidating Diffusion-Generated Video Detection with Unified Multimodal Forgery Learning
Xiaohong Liu
Xiufeng Song
Huayu Zheng
Lei Bai
Xiaoming Liu
Guangtao Zhai
DiffM
92
0
0
22 Nov 2025
FeRA: Frequency-Energy Constrained Routing for Effective Diffusion Adaptation Fine-Tuning
Bo Yin
Xiaobin Hu
Xingyu Zhou
Peng-Tao Jiang
Yue Liao
Junwei Zhu
Jiangning Zhang
Ying Tai
Chengjie Wang
Shuicheng Yan
DiffM
109
1
0
22 Nov 2025
EgoControl: Controllable Egocentric Video Generation via 3D Full-Body Poses
Enrico Pallotta
Sina Mokhtarzadeh Azar
Lars Doorenbos
Serdar Ozsoy
Umar Iqbal
Juergen Gall
DiffM
VGen
72
0
0
22 Nov 2025
Loomis Painter: Reconstructing the Painting Process
Markus Pobitzer
Chang Liu
Chenyi Zhuang
Teng Long
Bin Ren
Nicu Sebe
DiffM
131
0
0
21 Nov 2025
PostCam: Camera-Controllable Novel-View Video Generation with Query-Shared Cross-Attention
Yipeng Chen
Zhichao Ye
Zhenzhou Fang
Xinyu Chen
Xiaoyu Zhang
Jialing Liu
Nan Wang
Haomin Liu
Guofeng Zhang
DiffM
VGen
130
0
0
21 Nov 2025
EvDiff: High Quality Video with an Event Camera
Weilun Li
Lei-huan Sun
Ruixi Gao
Qi Jiang
Yuqin Ma
Kaiwei Wang
M. Yang
Luc Van Gool
D. Paudel
DiffM
VGen
112
0
0
21 Nov 2025
Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks
Cheng Yang
Haiyuan Wan
Yiran Peng
Xin Cheng
Zhaoyang Yu
...
Junchi Yu
Xinlei Yu
Xiawu Zheng
D. Zhou
Chenglin Wu
ReLM
LRM
234
0
0
19 Nov 2025
First Frame Is the Place to Go for Video Content Customization
Jingxi Chen
Z. Li
Zhichao Liu
Guangyao Shi
Xiyang Wu
Fuxiao Liu
Cornelia Fermüller
Brandon Yushan Feng
Yiannis Aloimonos
DiffM
VGen
157
0
0
19 Nov 2025
BD-Net: Has Depth-Wise Convolution Ever Been Applied in Binary Neural Networks?
DoYoung Kim
Jin-Seop Lee
Noo-Ri Kim
SungJoon Lee
Jee-Hyong Lee
MQ
108
3
0
19 Nov 2025
Gaussian See, Gaussian Do: Semantic 3D Motion Transfer from Multiview Video
Yarin Bekor
Gal Michael Harari
Or Perel
Or Litany
3DGS
85
0
0
18 Nov 2025
Recurrent Autoregressive Diffusion: Global Memory Meets Local Attention
Taiye Chen
Zihan Ding
Anjian Li
Christina Zhang
Zeqi Xiao
Yisen Wang
Chi Jin
VGen
129
0
0
17 Nov 2025
DriveLiDAR4D: Sequential and Controllable LiDAR Scene Generation for Autonomous Driving
Kaiwen Cai
Xinze Liu
Xia Zhou
Hengtong Hu
Jie Xiang
Luyao Zhang
Xueyang Zhang
Kun Zhan
Yifei Zhan
Xianpeng Lang
3DPC
190
0
0
17 Nov 2025
Towards High-Consistency Embodied World Model with Multi-View Trajectory Videos
Taiyi Su
Jian Zhu
Yaxuan Li
Chong Ma
Zitai Huang
Yichen Zhu
Hanli Wang
VGen
230
0
0
17 Nov 2025
Generative Photographic Control for Scene-Consistent Video Cinematic Editing
Huiqiang Sun
Liao Shen
Zhan Peng
Kun Wang
Size Wu
...
Z. Huang
Xingyu Zeng
Zhiguo Cao
Wei Li
Chen Change Loy
DiffM
VGen
134
0
0
17 Nov 2025
Free-Form Scene Editor: Enabling Multi-Round Object Manipulation like in a 3D Engine
Xincheng Shuai
Zhenyuan Qin
Henghui Ding
Dacheng Tao
DiffM
138
0
0
17 Nov 2025
Neo: Real-Time On-Device 3D Gaussian Splatting with Reuse-and-Update Sorting Acceleration
Changhun Oh
Seongryong Oh
Jinwoo Hwang
Yoonsung Kim
Hardik Sharma
Jongse Park
3DGS
118
0
0
17 Nov 2025
From Events to Clarity: The Event-Guided Diffusion Framework for Dehazing
Ling Wang
Yunfan Lu
Wenzong Ma
Huizai Yao
Pengteng Li
Hui Xiong
DiffM
79
0
0
14 Nov 2025
SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control
Zhengyi Luo
Ye Yuan
Tingwu Wang
Chenran Li
Sirui Chen
...
Jan Kautz
Yan Chang
Umar Iqbal
Linxi Fan
Yuke Zhu
105
4
0
11 Nov 2025
RoboTAG: End-to-end Robot Configuration Estimation via Topological Alignment Graph
Yifan Liu
Fangneng Zhan
Wanhua Li
Haowen Sun
Katerina Fragkiadaki
Hanspeter Pfister
50
0
0
11 Nov 2025
Simulating the Visual World with Artificial Intelligence: A Roadmap
Jingtong Yue
Z. Huang
Z. Chen
Xintao Wang
Pengfei Wan
Ziwei Liu
VGen
LM&Ro
328
0
0
11 Nov 2025
ViPRA: Video Prediction for Robot Actions
Sandeep Routray
Hengkai Pan
Unnat Jain
Shikhar Bahl
Deepak Pathak
186
0
0
11 Nov 2025
DIMO: Diverse 3D Motion Generation for Arbitrary Objects
Linzhan Mou
Jiahui Lei
Chen Wang
Lingjie Liu
Kostas Daniilidis
VGen
162
0
0
10 Nov 2025
4DSTR: Advancing Generative 4D Gaussians with Spatial-Temporal Rectification for High-Quality and Consistent 4D Generation
Mengmeng Liu
Jiuming Liu
Yunpeng Zhang
Jiangtao Li
M. Yang
Francesco Nex
Hao Cheng
3DGS
120
0
0
10 Nov 2025
Toward the Frontiers of Reliable Diffusion Sampling via Adversarial Sinkhorn Attention Guidance
Kwanyoung Kim
DiffM
126
0
0
10 Nov 2025
RelightMaster: Precise Video Relighting with Multi-plane Light Images
Weikang Bian
Xiaoyu Shi
Z. Huang
J. Bai
Qinghe Wang
Xintao Wang
Pengfei Wan
Kun Gai
Jiaming Song
VGen
160
1
0
09 Nov 2025
FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction
Jiang Lin
Xinyu Chen
Song Wu
Zhiqiu Zhang
Jizhi Zhang
Ye Wang
Qiang Tang
Qian Wang
Jian Yang
Zili Yi
DiffM
92
0
0
07 Nov 2025
PhysCorr: Dual-Reward DPO for Physics-Constrained Text-to-Video Generation with Automated Preference Selection
Peiyao Wang
Weining Wang
Qi Li
EGVM
VGen
339
1
0
06 Nov 2025
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
Jinlai Liu
J. N. Han
B. Yan
Hui Wu
Fengda Zhu
Xing-Hui Wang
Yi Jiang
Bingyue Peng
Zehuan Yuan
VGen
200
0
0
06 Nov 2025
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
Jingqi Tong
Yurong Mou
Hangcheng Li
Mingzhe Li
Y. Yang
...
Y. Zheng
Xinchi Chen
Jun Zhao
Xuanjing Huang
Xipeng Qiu
VGen
LRM
293
6
0
06 Nov 2025
Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration
Yunghee Lee
Byeonghyun Pak
Junwha Hong
Hoseong Kim
168
0
0
06 Nov 2025
MotionStream: Real-Time Video Generation with Interactive Motion Controls
Joonghyuk Shin
Zhengqi Li
Richard Zhang
Jun-Yan Zhu
Jaesik Park
Eli Schechtman
Xun Huang
DiffM
VGen
252
6
0
03 Nov 2025
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
Ropeway Liu
Hangjie Yuan
B. Dong
Jiazheng Xing
Jinwang Wang
Rui Zhao
Yan Xing
Weihua Chen
F. Wang
VGen
110
0
0
03 Nov 2025
Driving scenario generation and evaluation using a structured layer representation and foundational models
Arthur Hubert
Gamal Elghazaly
R. Frank
76
0
0
03 Nov 2025
DANCER: Dance ANimation via Condition Enhancement and Rendering with diffusion model
Yucheng Xing
Jinxing Yin
Xiaodong Liu
VGen
116
0
0
31 Oct 2025
Co-Evolving Latent Action World Models
Yucen Wang
Fengming Zhang
De-Chuan Zhan
Li Zhao
Kaixin Wang
Jiang Bian
VGen
166
0
0
30 Oct 2025
Decoupled MeanFlow: Turning Flow Models into Flow Maps for Accelerated Sampling
Kyungmin Lee
Sihyun Yu
Jinwoo Shin
AI4CE
210
3
0
28 Oct 2025
Rethinking Visual Intelligence: Insights from Video Pretraining
Pablo Acuaviva
A. Davtyan
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Alexandre Alahi
Paolo Favaro
VLM
LRM
153
0
0
28 Oct 2025
Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation
Junyoung Seo
Rodrigo Mira
A. Haliassos
Stella Bounareli
Honglie Chen
Linh Tran
Seungryong Kim
Zoe Landgraf
Jie Shen
VGen
109
1
0
27 Oct 2025
DiffusionLane: Diffusion Model for Lane Detection
Kunyang Zhou
Yeqin Shao
DiffM
84
0
0
25 Oct 2025