ResearchTrend.AI


Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

18 April 2023
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
    3DGS
    VGen

Papers citing "Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models"

50 / 827 papers shown
MSG score: A Comprehensive Evaluation for Multi-Scene Video Generation
Daewon Yoon
Hyungsuk Lee
Wonsik Shin
VGen
EGVM
DiffM
68
0
0
28 Nov 2024
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
Hui Li
Mingwang Xu
Yun Zhan
Shan Mu
Jiaye Li
...
Y. Chen
Tan Chen
Mao Ye
Jingdong Wang
Siyu Zhu
VGen
99
2
0
28 Nov 2024
PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors
Guangshun Wei
Yuan Feng
Long Ma
Chen Wang
Yuanfeng Zhou
Changjian Li
125
0
0
28 Nov 2024
MatchDiffusion: Training-free Generation of Match-cuts
Alejandro Pardo
Fabio Pizzati
Tong Zhang
Alexander Pondaven
Philip H. S. Torr
Juan C. Pérez
Bernard Ghanem
DiffM
VGen
75
1
0
27 Nov 2024
Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models
Yiming Wu
Huan Wang
Zhenghao Chen
Dong Xu
DiffM
VGen
77
1
0
27 Nov 2024
HiFiVFS: High Fidelity Video Face Swapping
Xu Chen
Keke He
Junwei Zhu
Yanhao Ge
Wei Li
Chengjie Wang
VGen
DiffM
78
1
0
27 Nov 2024
MotionCharacter: Identity-Preserving and Motion Controllable Human Video Generation
Haopeng Fang
Di Qiu
Binjie Mao
Pengfei Yan
He Tang
VGen
DiffM
70
4
0
27 Nov 2024
Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop
Zhaofang Qian
Abolfazl Sharifi
Tucker Carroll
Ser-Nam Lim
VGen
74
0
0
26 Nov 2024
VideoDirector: Precise Video Editing via Text-to-Video Models
Yukun Wang
Longguang Wang
Zhiyuan Ma
Qibin Hu
Kai Xu
Yulan Guo
VGen
DiffM
86
0
0
26 Nov 2024
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
Long Chen
DiffM
VGen
93
0
0
25 Nov 2024
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Xiaozhong Ji
Xiaobin Hu
Zhihong Xu
Junwei Zhu
Chuming Lin
...
Donghao Luo
Yi Chen
Qin Lin
Qinglin Lu
Chengjie Wang
VGen
73
3
0
25 Nov 2024
Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification
Sundar Sripada V. S.
Minkyu Choi
Sahil Shah
Harsh Goel
Mohammad Omama
Sandeep P. Chinchali
EGVM
108
2
0
22 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM
VGen
114
1
0
22 Nov 2024
Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge
Qinglong Cao
Ding Wang
Xirui Li
Yuntian Chen
Chao Ma
Xiaokang Yang
DiffM
VGen
113
2
0
18 Nov 2024
SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input
Zhen Lv
Yangqi Long
Congzhentao Huang
Cao Li
Chengfei Lv
Hao Ren
Dian Zheng
DiffM
VGen
MDE
112
5
0
18 Nov 2024
OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models
Mathis Koroglu
Hugo Caselles-Dupré
Guillaume Jeanneret Sanmiguel
Matthieu Cord
VGen
DiffM
20
1
0
15 Nov 2024
Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules
Binxu Wang
Jiaqi Shang
Haim Sompolinsky
DiffM
33
1
0
12 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
58
1
0
12 Nov 2024
I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
Wanquan Feng
Jiawei Liu
Pengqi Tu
Tianhao Qi
Mingzhen Sun
Tianxiang Ma
Songtao Zhao
Siyu Zhou
Qian He
VGen
47
7
0
10 Nov 2024
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
David Junhao Zhang
Roni Paiss
Shiran Zada
Nikhil Karnad
David E. Jacobs
Yael Pritch
Inbar Mosseri
Mike Zheng Shou
Neal Wadhwa
Nataniel Ruiz
DiffM
VGen
69
15
0
07 Nov 2024
Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification
Mischa Dombrowski
Hadrien Reynaud
Bernhard Kainz
DiffM
42
1
0
07 Nov 2024
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration
Panwen Hu
Jin Jiang
Jianqi Chen
Mingfei Han
Shengcai Liao
Xiaojun Chang
Xiaodan Liang
VGen
DiffM
36
5
0
07 Nov 2024
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
Koichi Namekata
Sherwin Bahmani
Ziyi Wu
Yash Kant
Igor Gilitschenski
David B. Lindell
VGen
57
13
0
07 Nov 2024
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Hao Luo
Bohan Zhou
Zongqing Lu
30
0
0
05 Nov 2024
Diffusion-based Generative Multicasting with Intent-aware Semantic Decomposition
Xinkai Liu
Mahdi Boloursaz Mashhadi
Li Qiao
Yi Ma
Rahim Tafazolli
Mehdi Bennis
DiffM
50
1
0
04 Nov 2024
Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation
Zhenbin Wang
Lei Zhang
Lituan Wang
Minjuan Zhu
Zhenwei Zhang
VGen
MedIm
57
1
0
03 Nov 2024
Infinite-Resolution Integral Noise Warping for Diffusion Models
Yitong Deng
Winnie Lin
Lingxiao Li
Dmitriy Smirnov
Ryan Burgert
Ning Yu
Vincent Dedun
Mohammad H. Taghavi
31
2
0
02 Nov 2024
Fast and Memory-Efficient Video Diffusion Using Streamlined Inference
Zheng Zhan
Yushu Wu
Yifan Gong
Zichong Meng
Zhenglun Kong
Changdi Yang
Geng Yuan
Pu Zhao
Wei Niu
Yanzhi Wang
VGen
39
4
0
02 Nov 2024
X-Drive: Cross-modality consistent multi-sensor data synthesis for driving scenarios
Yichen Xie
Chenfeng Xu
C-T.John Peng
Shuqi Zhao
Nhat Ho
Alexander T. Pham
Mingyu Ding
M. Tomizuka
W. Zhan
DiffM
31
2
0
02 Nov 2024
Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning
Penghui Ruan
Pichao Wang
Divya Saxena
Jiannong Cao
Yuhui Shi
DiffM
VGen
31
0
0
31 Oct 2024
Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts
Xiang Deng
Youxin Pang
Xiaochen Zhao
Chao Xu
Lizhen Wang
Hongjiang Xiao
Shi Yan
Hongwen Zhang
Yebin Liu
DiffM
VGen
38
1
0
31 Oct 2024
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Yining Hong
Beide Liu
Maxine Wu
Yuanhao Zhai
Kai-Wei Chang
...
Chung-Ching Lin
Jianfeng Wang
Z. Yang
Yingnian Wu
Lijuan Wang
VGen
40
6
0
30 Oct 2024
Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images
Hanlin Wu
Jiangwei Mo
Xiaohui Sun
Jie Ma
31
1
0
30 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
29
9
0
28 Oct 2024
Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences
Zhihao Zhao
Junjie Yang
Shahrooz Faghihroohi
Yinzheng Zhao
Daniel Zapp
Kai-Qi Huang
Nassir Navab
M. A. Nasseri
DiffM
MedIm
59
0
0
28 Oct 2024
Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework
V. Arkhipkin
Viacheslav Vasilev
Andrei Filatov
Igor Pavlov
Julia Agafonova
...
Evelina Mironova
Anton Bukashkin
Konstantin Kulikov
Andrey Kuznetsov
Denis Dimitrov
DiffM
31
3
0
28 Oct 2024
SCube: Instant Large-Scale Scene Reconstruction using VoxSplats
Xuanchi Ren
Y. Lu
Hanxue Liang
Zhangjie Wu
Huan Ling
Mike Chen
Sanja Fidler
Francis Williams
Jiahui Huang
3DGS
40
8
0
26 Oct 2024
GiVE: Guiding Visual Encoder to Perceive Overlooked Information
Junjie Li
Jianghong Ma
Xiaofeng Zhang
Yuhang Li
Jianyang Shi
23
0
0
26 Oct 2024
Adversarial Environment Design via Regret-Guided Diffusion Models
Hojun Chung
Junseo Lee
Minsoo Kim
Dohyeong Kim
Songhwai Oh
24
0
0
25 Oct 2024
NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction
Z. Gong
Guangyin Bao
Qi Zhang
Zhongwei Wan
Duoqian Miao
...
Changwei Wang
Rongtao Xu
Liang Hu
Ke Liu
Yu Zhang
DiffM
VGen
49
8
0
25 Oct 2024
Framer: Interactive Frame Interpolation
Wen Wang
Qiuyu Wang
Kecheng Zheng
Hao Ouyang
Zhekai Chen
Biao Gong
Hao Chen
Yujun Shen
Chunhua Shen
VGen
61
5
0
24 Oct 2024
VISAGE: Video Synthesis using Action Graphs for Surgery
Yousef Yeganeh
Rachmadio Lazuardi
Amir Shamseddin
Emine Dari
Yash Thirani
Nassir Navab
Azade Farshad
MedIm
23
1
0
23 Oct 2024
DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes
Hengwei Bian
Lingdong Kong
Haozhe Xie
Liang Pan
Yu Qiao
Ziwei Liu
3DPC
32
1
0
23 Oct 2024
Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models
Giannis Daras
Weili Nie
Karsten Kreis
A. Dimakis
Morteza Mardani
Nikola B. Kovachki
Arash Vahdat
DiffM
33
5
0
21 Oct 2024
Allegro: Open the Black Box of Commercial-Level Video Generation Model
Yuan Zhou
Qiuyue Wang
Yuxuan Cai
Huan Yang
VGen
VLM
77
25
0
20 Oct 2024
MedDiff-FM: A Diffusion-based Foundation Model for Versatile Medical Image Applications
Yongrui Yu
Yannian Gu
S. Zhang
Xiaofan Zhang
MedIm
36
2
0
20 Oct 2024
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Xiang Wang
Haonan Qiu
...
F. Liu
Zhizhong Huang
Jiaxin Ye
Yingya Zhang
Hongming Shan
DiffM
VGen
72
14
0
17 Oct 2024
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
Guosheng Zhao
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
X. Zhang
...
Xinze Chen
Boyuan Wang
Youyi Zhang
Wenjun Mei
Xingang Wang
VGen
83
24
0
17 Oct 2024
SDI-Paste: Synthetic Dynamic Instance Copy-Paste for Video Instance Segmentation
Sahir Shrestha
Weihao Li
Gao Zhu
Nick Barnes
DiffM
31
0
0
16 Oct 2024
Hessian-Informed Flow Matching
Christopher Iliffe Sprague
Arne Elofsson
Hossein Azizpour
18
0
0
15 Oct 2024