ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.08818
  4. Cited By
Align your Latents: High-Resolution Video Synthesis with Latent
  Diffusion Models

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

18 April 2023
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
    3DGS
    VGen
ArXivPDFHTML

Papers citing "Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models"

50 / 827 papers shown
Title
Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for
  Text-to-Video Generation Task
Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task
Yiran Yang
Jinchao Zhang
Ying Deng
Jie Zhou
DiffM
23
0
0
09 Jul 2024
MiraData: A Large-Scale Video Dataset with Long Durations and Structured
  Captions
MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions
Xuan Ju
Yiming Gao
Zhaoyang Zhang
Ziyang Yuan
Xintao Wang
Ailing Zeng
Yu Xiong
Qiang Xu
Ying Shan
VGen
64
39
0
08 Jul 2024
VIMI: Grounding Video Generation through Multi-modal Instruction
VIMI: Grounding Video Generation through Multi-modal Instruction
Yuwei Fang
Willi Menapace
Aliaksandr Siarohin
Tsai-Shien Chen
Kuan-Chien Wang
Ivan Skorokhodov
Graham Neubig
Sergey Tulyakov
VGen
58
2
0
08 Jul 2024
T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models
T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models
Yibo Miao
Yifan Zhu
Yinpeng Dong
Lijia Yu
Jun Zhu
Xiao-Shan Gao
EGVM
31
12
0
08 Jul 2024
LaSe-E2V: Towards Language-guided Semantic-Aware Event-to-Video
  Reconstruction
LaSe-E2V: Towards Language-guided Semantic-Aware Event-to-Video Reconstruction
Kanghao Chen
Hangyu Li
Jiazhou Zhou
Zeyu Wang
Lin Wang
DiffM
VGen
36
1
0
08 Jul 2024
TimeLDM: Latent Diffusion Model for Unconditional Time Series Generation
TimeLDM: Latent Diffusion Model for Unconditional Time Series Generation
Jian Qian
Miao Sun
Sifan Zhou
Biao Wan
Minhao Li
Patrick Chiang
31
7
0
05 Jul 2024
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
Yilun Xu
Gabriele Corso
Tommi Jaakkola
Arash Vahdat
Karsten Kreis
29
12
0
03 Jul 2024
No Training, No Problem: Rethinking Classifier-Free Guidance for
  Diffusion Models
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Seyedmorteza Sadat
Manuel Kansy
Otmar Hilliges
Romann M. Weber
29
10
0
02 Jul 2024
Boosting Consistency in Story Visualization with Rich-Contextual
  Conditional Diffusion Models
Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models
Fei Shen
Hu Ye
Sibo Liu
Jun Zhang
Cong Wang
Xiao Han
Wei Yang
87
34
0
02 Jul 2024
GVDIFF: Grounded Text-to-Video Generation with Diffusion Models
GVDIFF: Grounded Text-to-Video Generation with Diffusion Models
Huanzhang Dou
Ruixiang Li
Wei Su
Xi Li
DiffM
34
1
0
02 Jul 2024
Evaluation of Text-to-Video Generation Models: A Dynamics Perspective
Evaluation of Text-to-Video Generation Models: A Dynamics Perspective
Mingxiang Liao
Hannan Lu
Xinyu Zhang
Fang Wan
Tianyu Wang
Yuzhong Zhao
W. Zuo
Qixiang Ye
Jingdong Wang
VGen
EGVM
61
17
0
01 Jul 2024
Blind Inversion using Latent Diffusion Priors
Blind Inversion using Latent Diffusion Priors
Weimin Bai
Siyi Chen
Wenzheng Chen
He Sun
DiffM
26
4
0
01 Jul 2024
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
Peng Dai
Feitong Tan
Qiangeng Xu
David Futschik
Ruofei Du
S. Fanello
Xiaojuan Qi
Yinda Zhang
VGen
21
4
0
29 Jun 2024
What Matters in Detecting AI-Generated Videos like Sora?
What Matters in Detecting AI-Generated Videos like Sora?
Chirui Chang
Zhengzhe Liu
Xiaoyang Lyu
Xiaojuan Qi
DiffM
VGen
85
7
0
27 Jun 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of
  Text-to-Time-lapse Video Generation
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Shenghai Yuan
Jinfa Huang
Yongqi Xu
Yaoyang Liu
Shaofeng Zhang
Yujun Shi
Ruijie Zhu
Xinhua Cheng
Jiebo Luo
Li Yuan
EGVM
VGen
69
34
0
26 Jun 2024
DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis
  through Structure Guidance
DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance
Younghyun Kim
Geunmin Hwang
Junyu Zhang
Eunbyung Park
40
6
0
26 Jun 2024
Diffusion Model-Based Video Editing: A Survey
Diffusion Model-Based Video Editing: A Survey
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Dacheng Tao
VGen
58
22
0
26 Jun 2024
MotionBooth: Motion-Aware Customized Text-to-Video Generation
MotionBooth: Motion-Aware Customized Text-to-Video Generation
Jianzong Wu
Xiangtai Li
Yanhong Zeng
J. J. Zhang
Qianyu Zhou
Yining Li
Yunhai Tong
Kai Chen
DiffM
VGen
70
40
0
25 Jun 2024
FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models
FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models
Haonan Qiu
Zhaoxi Chen
Zhouxia Wang
Yingqing He
Menghan Xia
Ziwei Liu
VGen
DiffM
34
17
0
24 Jun 2024
Dreamitate: Real-World Visuomotor Policy Learning via Video Generation
Dreamitate: Real-World Visuomotor Policy Learning via Video Generation
Junbang Liang
Ruoshi Liu
Ege Ozguroglu
Sruthi Sudhakar
Achal Dave
P. Tokmakov
Shuran Song
Carl Vondrick
VGen
40
22
0
24 Jun 2024
Towards a Science Exocortex
Towards a Science Exocortex
Kevin G. Yager
74
0
0
24 Jun 2024
MVOC: a training-free multiple video object composition method with
  diffusion models
MVOC: a training-free multiple video object composition method with diffusion models
Wei Wang
Yaosen Chen
Yuegen Liu
Qi Yuan
Shubin Yang
Yanru Zhang
DiffM
63
2
0
22 Jun 2024
Identifying and Solving Conditional Image Leakage in Image-to-Video
  Diffusion Model
Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model
Min Zhao
Hongzhou Zhu
Chendong Xiang
Kaiwen Zheng
Chongxuan Li
Jun Zhu
67
8
0
22 Jun 2024
Image Conductor: Precision Control for Interactive Video Synthesis
Image Conductor: Precision Control for Interactive Video Synthesis
Yaowei Li
Xintao Wang
Zhaoyang Zhang
Zhouxia Wang
Ziyang Yuan
Liangbin Xie
Yuexian Zou
Ying Shan
VGen
42
23
0
21 Jun 2024
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human
  Feedback for Video Generation
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
Xuan He
Dongfu Jiang
Ge Zhang
Max W.F. Ku
Achint Soni
...
Yaswanth Narsupalli
Rongqi Fan
Zhiheng Lyu
Yuchen Lin
Wenhu Chen
EGVM
VGen
ALM
43
41
0
21 Jun 2024
A3D: Does Diffusion Dream about 3D Alignment?
A3D: Does Diffusion Dream about 3D Alignment?
Savva Ignatyev
Nina Konovalova
Daniil Selikhanovych
Nikolay Patakin
Nikolay Patakin
...
Anton Konushin
Peter Wonka
Alexander Filippov
Peter Wonka
Evgeny Burnaev
DiffM
60
0
0
21 Jun 2024
IRASim: Learning Interactive Real-Robot Action Simulators
IRASim: Learning Interactive Real-Robot Action Simulators
Fangqi Zhu
Hongtao Wu
Song Guo
Yuxiao Liu
Chilam Cheang
Tao Kong
75
13
0
20 Jun 2024
4K4DGen: Panoramic 4D Generation at 4K Resolution
4K4DGen: Panoramic 4D Generation at 4K Resolution
Renjie Li
Panwang Pan
Bangbang Yang
Dejia Xu
Shijie Zhou
Xuanyang Zhang
Zeming Li
A. Kadambi
Zhangyang Wang
Zhiwen Fan
VGen
54
16
0
19 Jun 2024
Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models
Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models
Paul Henderson
Melonie de Almeida
D. Ivanova
Titas Anciukevicius
3DGS
43
4
0
18 Jun 2024
ARTIST: Improving the Generation of Text-rich Images by Disentanglement
ARTIST: Improving the Generation of Text-rich Images by Disentanglement
Jianyi Zhang
Yufan Zhou
Jiuxiang Gu
Curtis Wigington
Tong Yu
Yiran Chen
Tong Sun
Ruiyi Zhang
75
0
0
17 Jun 2024
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of
  99%
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%
Lei Zhu
Fangyun Wei
Yanye Lu
Dong Chen
VLM
41
33
0
17 Jun 2024
Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers
Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers
Omer Sahin Tas
Royden Wagner
47
1
0
17 Jun 2024
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video
  Diffusion Models
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
DiffM
VGen
65
12
0
16 Jun 2024
SatDiffMoE: A Mixture of Estimation Method for Satellite Image
  Super-resolution with Latent Diffusion Models
SatDiffMoE: A Mixture of Estimation Method for Satellite Image Super-resolution with Latent Diffusion Models
Zhaoxu Luo
Bowen Song
Liyue Shen
34
1
0
14 Jun 2024
L4GM: Large 4D Gaussian Reconstruction Model
L4GM: Large 4D Gaussian Reconstruction Model
Jiawei Ren
Kevin Xie
Ashkan Mirzaei
Hanxue Liang
Xiaohui Zeng
...
Ziwei Liu
Antonio Torralba
Sanja Fidler
Seung Wook Kim
Huan Ling
3DGS
27
37
0
14 Jun 2024
Training-free Camera Control for Video Generation
Training-free Camera Control for Video Generation
Chen Hou
Guoqiang Wei
VGen
DiffM
70
30
0
14 Jun 2024
Rethinking Score Distillation as a Bridge Between Image Distributions
Rethinking Score Distillation as a Bridge Between Image Distributions
David McAllister
Songwei Ge
Jia-Bin Huang
David W. Jacobs
Alexei A. Efros
Aleksander Holyñski
Angjoo Kanazawa
DiffM
54
14
0
13 Jun 2024
SimGen: Simulator-conditioned Driving Scene Generation
SimGen: Simulator-conditioned Driving Scene Generation
Yunsong Zhou
Michael Simon
Zhenghao Peng
Sicheng Mo
Hongzi Zhu
Minyi Guo
Bolei Zhou
VGen
44
11
0
13 Jun 2024
Language-driven Grasp Detection
Language-driven Grasp Detection
An Dinh Vuong
Minh Nhat Vu
Baoru Huang
Nghia Nguyen
Hieu Le
T. Vo
Anh Nguyen
VLM
31
18
0
13 Jun 2024
COVE: Unleashing the Diffusion Feature Correspondence for Consistent
  Video Editing
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing
Jiangshan Wang
Yue Ma
Jiayi Guo
Yicheng Xiao
Gao Huang
Xiu Li
DiffM
23
17
0
13 Jun 2024
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing
  Reliability,Reproducibility, and Practicality
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability,Reproducibility, and Practicality
Tianle Zhang
Langtian Ma
Yuchen Yan
Yuchen Zhang
Kai Wang
...
Wenqi Shao
Yang You
Yu Qiao
Ping Luo
Kaipeng Zhang
VGen
61
2
0
13 Jun 2024
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image
  Animation
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Mingwang Xu
Hui Li
Qingkun Su
Hanlin Shang
Liwei Zhang
Ce Liu
Jingdong Wang
Yao Yao
Siyu Zhu
VGen
29
67
0
13 Jun 2024
Vivid-ZOO: Multi-View Video Generation with Diffusion Model
Vivid-ZOO: Multi-View Video Generation with Diffusion Model
Bing Li
Cheng Zheng
Wenxuan Zhu
Jinjie Mai
Biao Zhang
Peter Wonka
Bernard Ghanem
40
16
0
12 Jun 2024
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and
  Image-to-Video Generation
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation
Weixi Feng
Jiachen Li
Michael Stephen Saxon
Tsu-jui Fu
Wenhu Chen
William Yang Wang
EGVM
VGen
34
9
0
12 Jun 2024
Diffusion-Promoted HDR Video Reconstruction
Diffusion-Promoted HDR Video Reconstruction
Yuanshen Guan
Ruikang Xu
Mingde Yao
Ruisheng Gao
Lizhi Wang
Zhiwei Xiong
43
2
0
12 Jun 2024
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
Ivan Skorokhodov
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
VGen
40
10
0
12 Jun 2024
HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction
  Awareness
HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness
Zihui Xue
Mi Luo
Changan Chen
Kristen Grauman
DiffM
22
6
0
11 Jun 2024
Flow Map Matching
Flow Map Matching
Nicholas M. Boffi
M. S. Albergo
Eric Vanden-Eijnden
27
4
0
11 Jun 2024
4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion
  Models
4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models
Heng Yu
Chaoyang Wang
Peiye Zhuang
Willi Menapace
Aliaksandr Siarohin
Junli Cao
László A. Jeni
Sergey Tulyakov
Hsin-Ying Lee
VGen
43
23
0
11 Jun 2024
Visual Representation Learning with Stochastic Frame Prediction
Visual Representation Learning with Stochastic Frame Prediction
Huiwon Jang
Dongyoung Kim
Junsu Kim
Jinwoo Shin
Pieter Abbeel
Younggyo Seo
34
2
0
11 Jun 2024
Previous
123...789...151617
Next