ResearchTrend.AI
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

18 April 2023
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
    3DGS
    VGen

Papers citing "Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models"

50 / 827 papers shown
Title
Enhanced Creativity and Ideation through Stable Video Synthesis
Elijah Miller
Thomas Dupont
Mingming Wang
VGen
28
0
0
22 May 2024
MagicPose4D: Crafting Articulated Models with Appearance and Motion Control
Hao Zhang
Di Chang
Fang Li
Mohammad Soleymani
Narendra Ahuja
41
6
0
22 May 2024
Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images
Xiaofei Yu
Yitong Li
Jie Ma
DiffM
47
0
0
21 May 2024
FIFO-Diffusion: Generating Infinite Videos from Text without Training
Jihwan Kim
Junoh Kang
Jinyoung Choi
Bohyung Han
DiffM
VGen
58
24
0
19 May 2024
On the Trajectory Regularity of ODE-based Diffusion Sampling
Defang Chen
Zhenyu Zhou
Can Wang
Chunhua Shen
Siwei Lyu
35
14
0
18 May 2024
From Sora What We Can See: A Survey of Text-to-Video Generation
Rui Sun
Yumin Zhang
Tejal Shah
Jiahao Sun
Shuoying Zhang
Wenqi Li
Haoran Duan
Bo Wei
R. Ranjan
EGVM
79
20
0
17 May 2024
CAT3D: Create Anything in 3D with Multi-View Diffusion Models
Ruiqi Gao
Aleksander Holynski
Philipp Henzler
Arthur Brussee
Ricardo Martín Brualla
Pratul P. Srinivasan
Jonathan T. Barron
Ben Poole
29
150
0
16 May 2024
Sakuga-42M Dataset: Scaling Up Cartoon Research
Zhenglin Pan
Yu Zhu
Yuxuan Mu
30
6
0
13 May 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao
Le Zhuo
Ziyi Lin
Ruoyi Du
Xu Luo
...
Weicai Ye
He Tong
Jingwen He
Yu Qiao
Hongsheng Li
VGen
30
82
0
09 May 2024
TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation
Hritik Bansal
Yonatan Bitton
Michal Yarom
Idan Szpektor
Aditya Grover
Kai-Wei Chang
DiffM
47
11
0
07 May 2024
Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models
Fan Bao
Chendong Xiang
Gang Yue
Guande He
Hongzhou Zhu
Kaiwen Zheng
Min Zhao
Shilong Liu
Yaole Wang
Jun Zhu
VGen
110
51
0
07 May 2024
MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View
Emmanuelle Bourigault
Pauline Bourigault
29
2
0
06 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
79
36
0
06 May 2024
Video Diffusion Models: A Survey
Andrew Melnik
Michal Ljubljanac
Cong Lu
Qi Yan
Weiming Ren
Helge J. Ritter
VGen
66
12
0
06 May 2024
Matten: Video Generation with Mamba-Attention
Yu Gao
Jiancheng Huang
Xiaopeng Sun
Zequn Jie
Yujie Zhong
Lin Ma
64
12
0
05 May 2024
CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding
Kaiyuan Chen
Xingzhuo Guo
Yu Zhang
Jianmin Wang
Mingsheng Long
DiffM
33
1
0
03 May 2024
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
Yupeng Zhou
Daquan Zhou
Ming-Ming Cheng
Jiashi Feng
Qibin Hou
DiffM
VGen
32
88
0
02 May 2024
TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models
Teng Zhou
Yongchuan Tang
DiffM
40
2
0
30 Apr 2024
FlexiFilm: Long Video Generation with Flexible Conditions
Yichen Ouyang
Jianhao Yuan
Hao Zhao
Gaoang Wang
Bo-Lu Zhao
DiffM
42
6
0
29 Apr 2024
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
Haomiao Ni
Bernhard Egger
Suhas Lohit
A. Cherian
Ye Wang
T. Koike-Akino
S. X. Huang
Tim K. Marks
DiffM
31
12
0
25 Apr 2024
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Amirmojtaba Sabour
Sanja Fidler
Karsten Kreis
DiffM
32
24
0
22 Apr 2024
Zero-shot High-fidelity and Pose-controllable Character Animation
Bingwen Zhu
Fanyi Wang
Tianyi Lu
Peng Liu
Jingwen Su
Jinxiu Liu
Yanhao Zhang
Zuxuan Wu
Guo-Jun Qi
Yu-Gang Jiang
DiffM
VGen
50
6
0
21 Apr 2024
Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap
Bowen Qu
Xiaoyu Liang
Shangkun Sun
Wei-Nan Gao
EGVM
30
6
0
21 Apr 2024
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
Tianyuan Zhang
Hong-Xing Yu
Rundi Wu
Brandon Yushan Feng
Changxi Zheng
Noah Snavely
Jiajun Wu
William T. Freeman
AI4CE
VGen
77
61
0
19 Apr 2024
Training-and-prompt-free General Painterly Harmonization Using Image-wise Attention Sharing
Teng-Fang Hsiao
Bo-Kai Ruan
Hong-Han Shuai
30
2
0
19 Apr 2024
Detecting Out-Of-Distribution Earth Observation Images with Diffusion Models
Georges Le Bellier
Nicolas Audebert
32
4
0
19 Apr 2024
On the Content Bias in Fréchet Video Distance
Jason S. Hoffman
Aniruddha Mahapatra
Gaurav Parmar
Jun-Yan Zhu
Jia-Bin Huang
EGVM
50
15
0
18 Apr 2024
VideoGigaGAN: Towards Detail-rich Video Super-Resolution
Yiran Xu
Taesung Park
Richard Zhang
Yang Zhou
Eli Shechtman
Feng Liu
Jia-Bin Huang
Difan Liu
SupR
93
10
0
18 Apr 2024
InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior
Zhiheng Liu
Ouyang Hao
Qiuyu Wang
Ka Leong Cheng
Jie Xiao
Kai Zhu
Nan Xue
Yu Liu
Yujun Shen
Yang Cao
DiffM
3DGS
41
20
0
17 Apr 2024
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Sicheng Xu
Guojun Chen
Yu-Xiao Guo
Jiaolong Yang
Chong Li
Zhenyu Zang
Yizhong Zhang
Xin Tong
Baining Guo
40
86
0
16 Apr 2024
Four-hour thunderstorm nowcasting using deep diffusion models of satellite
Kuai Dai
Xutao Li
Junying Fang
Yunming Ye
Demin Yu
Di Xian
Danyu Qin
Jingsong Wang
AI4Cl
32
1
0
16 Apr 2024
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang
Zeyuan Wang
Qiushi Lyu
Zheyuan Zhang
Sunli Chen
Tianmin Shu
Kwonjoon Lee
Yilun Du
Chuang Gan
41
12
0
16 Apr 2024
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies
Benjue Weng
LM&MA
35
7
0
13 Apr 2024
S3Editor: A Sparse Semantic-Disentangled Self-Training Framework for Face Video Editing
Guangzhi Wang
Tianyi Chen
Kamran Ghasedi
HsiangTao Wu
Tianyu Ding
Chris Nuesmeyer
Ilya Zharkov
Mohan S. Kankanhalli
Luming Liang
32
1
0
11 Apr 2024
Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
Tuomas Kynkaanniemi
M. Aittala
Tero Karras
S. Laine
Timo Aila
J. Lehtinen
19
57
0
11 Apr 2024
Quantum State Generation with Structure-Preserving Diffusion Model
Yuchen Zhu
Tianrong Chen
Evangelos A. Theodorou
Xie Chen
Molei Tao
DiffM
32
6
0
09 Apr 2024
DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation
Junkai Yan
Yipeng Gao
Q. Yang
Xihan Wei
Xuansong Xie
Ancong Wu
Wei-Shi Zheng
35
1
0
09 Apr 2024
Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models
Saman Motamed
Wouter Van Gansbeke
Luc Van Gool
VGen
DiffM
35
1
0
08 Apr 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li-xin Yuan
Jiebo Luo
VGen
73
33
0
07 Apr 2024
Many-to-many Image Generation with Auto-regressive Diffusion Models
Ying Shen
Yizhe Zhang
Shuangfei Zhai
Lifu Huang
J. Susskind
Jiatao Gu
38
6
0
03 Apr 2024
LidarDM: Generative LiDAR Simulation in a Generated World
Vlas Zyrianov
Henry Che
Zhijian Liu
Shenlong Wang
VGen
25
20
0
03 Apr 2024
Upsample Guidance: Scale Up Diffusion Models without Training
Juno Hwang
Yong-Hyun Park
Junghyo Jo
29
12
0
02 Apr 2024
Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery
Christian Limberg
Artur Gonçalves
Bastien Rigault
Helmut Prendinger
27
5
0
02 Apr 2024
Video Interpolation with Diffusion Models
Siddhant Jain
Daniel Watson
Eric Tabellion
Aleksander Hołyński
Ben Poole
Janne Kontkanen
SupR
VGen
DiffM
30
32
0
01 Apr 2024
GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling
Bowen Zhang
Yiji Cheng
Jiaolong Yang
Chunyu Wang
Feng Zhao
Yansong Tang
Dong Chen
Baining Guo
3DGS
37
8
0
28 Mar 2024
Frame by Familiar Frame: Understanding Replication in Video Diffusion Models
Aimon Rahman
Malsha V. Perera
Vishal M. Patel
VGen
43
7
0
28 Mar 2024
SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control
Binyuan Huang
Yuqing Wen
Yucheng Zhao
Yaosi Hu
Yingfei Liu
...
Tiancai Wang
Chi Zhang
Chang Wen Chen
Zhenzhong Chen
Xiangyu Zhang
38
15
0
28 Mar 2024
MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation
Seyeon Kim
Siyoon Jin
Jihye Park
Kihong Kim
Jiyoung Kim
Jisu Nam
Seungryong Kim
DiffM
VGen
58
3
0
28 Mar 2024
TC4D: Trajectory-Conditioned Text-to-4D Generation
Sherwin Bahmani
Xian Liu
Yifan Wang
Ivan Skorokhodov
Victor Rong
...
Jeong Joon Park
Sergey Tulyakov
Gordon Wetzstein
Andrea Tagliasacchi
David B. Lindell
97
35
0
26 Mar 2024
AnimateMe: 4D Facial Expressions via Diffusion Models
Dimitrios Gerogiannis
Foivos Paraperas-Papantoniou
Rolandos Alexandros Potamias
Alexandros Lattas
Stylianos Moschoglou
Stylianos Ploumpis
S. Zafeiriou
30
3
0
25 Mar 2024