LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

26 September 2023
Yaohui Wang
Xinyuan Chen
Xin Ma
Shangchen Zhou
Ziqi Huang
Yi Wang
Ceyuan Yang
Yinan He
Jiashuo Yu
Peiqing Yang
Yuwei Guo
Tianxing Wu
Chenyang Si
Yuming Jiang
Cunjian Chen
Chen Change Loy
Bo Dai
Dahua Lin
Yu Qiao
Ziwei Liu
    VGen
    DiffM

Papers citing "LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models"

39 / 39 papers shown
Title
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations for Synthetic Videos
Zongxia Li
Xiyang Wu
Yubin Qin
Guangyao Shi
Hongyang Du
Dinesh Manocha
Tianyi Zhou
Jordan Boyd-Graber
MLLM
41
0
0
02 May 2025
Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution
Luigi Sigillo
Christian Bianchi
A. Uncini
Danilo Comminiello
46
0
0
01 May 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM
VGen
75
0
0
30 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
71
0
0
16 Apr 2025
VACT: A Video Automatic Causal Testing System and a Benchmark
Haotong Yang
Qingyuan Zheng
Yunjian Gao
Yongkun Yang
Yangbo He
Zhouchen Lin
Muhan Zhang
VGen
CML
59
0
0
08 Mar 2025
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat
Zhipeng Huang
Shaobin Zhuang
Canmiao Fu
Binxin Yang
Ying Zhang
Chong Sun
Zhizheng Zhang
Yali Wang
Chen Li
Zheng-Jun Zha
DiffM
69
1
0
03 Mar 2025
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Wenhao Wang
Y. Yang
DiffM
VGen
84
0
0
03 Mar 2025
When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding
Pingping Zhang
Jinlong Li
Kecheng Chen
Meng Wang
Long Xu
Haoliang Li
N. Sebe
Sam Kwong
Shiqi Wang
VGen
115
3
0
17 Feb 2025
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
Hangliang Ding
Dacheng Li
Runlong Su
Peiyuan Zhang
Zhijie Deng
Ion Stoica
Hao Zhang
VGen
65
4
0
10 Feb 2025
PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
Qiyao Xue
Xiangyu Yin
Boyuan Yang
Wei Gao
DiffM
VGen
75
9
0
30 Nov 2024
Investigating Memorization in Video Diffusion Models
C. L. P. Chen
Enhuai Liu
Daochang Liu
M. Shah
Chang Xu
VGen
DiffM
76
1
0
29 Oct 2024
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Xiang Wang
Haonan Qiu
...
F. Liu
Zhizhong Huang
Jiaxin Ye
Yingya Zhang
Hongming Shan
DiffM
VGen
69
14
0
17 Oct 2024
T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design
Jiachen Li
Qian Long
Jian Zheng
Xiaofeng Gao
Robinson Piramuthu
Wenhu Chen
William Yang Wang
VGen
25
22
0
08 Oct 2024
Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation
Chenyu Wang
Shuo Yan
Yixuan Chen
Yujiang Wang
Mingzhi Dong
...
Qin Lv
Fan Yang
Tun Lu
Ning Gu
Li Shang
DiffM
VGen
30
0
0
19 Sep 2024
Phy124: Fast Physics-Driven 4D Content Generation from a Single Image
Jiajing Lin
Zhenzhong Wang
Yongjie Hou
Yuzhou Tang
Min Jiang
VGen
24
6
0
11 Sep 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
...
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM
VGen
72
389
0
12 Aug 2024
Unlearning Concepts from Text-to-Video Diffusion Models
Shiqi Liu
Yihua Tan
DiffM
29
0
0
19 Jul 2024
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Kepan Nan
Rui Xie
Penghao Zhou
Tiehan Fan
Zhenheng Yang
Zhijie Chen
Xiang Li
Jian Yang
Ying Tai
73
68
0
02 Jul 2024
Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance
Kuan Heng Lin
Sicheng Mo
Ben Klingher
Fangzhou Mu
Bolei Zhou
DiffM
26
15
0
11 Jun 2024
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
Hao Wen
Zehuan Huang
Yaohui Wang
Xinyuan Chen
Yu Qiao
97
7
0
05 Jun 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
60
75
0
27 May 2024
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
Yupeng Zhou
Daquan Zhou
Ming-Ming Cheng
Jiashi Feng
Qibin Hou
DiffM
VGen
30
86
0
02 May 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li-xin Yuan
Jiebo Luo
VGen
73
33
0
07 Apr 2024
VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
Yumeng Li
William H. Beluch
M. Keuper
Dan Zhang
Anna Khoreva
DiffM
VGen
71
5
0
20 Mar 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
He Zhang
Liangliang Cao
EGVM
66
84
0
27 Feb 2024
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation
Zhiwu Qing
Shiwei Zhang
Jiayu Wang
Xiang Wang
Yujie Wei
Yingya Zhang
Changxin Gao
Nong Sang
VGen
DiffM
24
37
0
07 Dec 2023
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei
Shiwei Zhang
Zhiwu Qing
Hangjie Yuan
Zhiheng Liu
Yu Liu
Yingya Zhang
Jingren Zhou
Hongming Shan
DiffM
VGen
11
89
0
07 Dec 2023
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
V.Ya. Arkhipkin
Zein Shaheen
Viacheslav Vasilev
E. Dakhova
Andrey Kuznetsov
Denis Dimitrov
DiffM
VGen
16
5
0
22 Nov 2023
Breathing Life Into Sketches Using Text-to-Video Priors
Rinon Gal
Yael Vinker
Yuval Alaluf
Amit H. Bermano
Daniel Cohen-Or
Ariel Shamir
Gal Chechik
VGen
DiffM
27
28
0
21 Nov 2023
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Jiaxi Lv
Yi Huang
Mingfu Yan
Jiancheng Huang
Jianzhuang Liu
Yifan Liu
Yafei Wen
Xiaoxin Chen
Shifeng Chen
VGen
DiffM
23
23
0
21 Nov 2023
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Shiwei Zhang
Jiayu Wang
Yingya Zhang
Kang Zhao
Hangjie Yuan
Z. Qin
Xiang Wang
Deli Zhao
Jingren Zhou
DiffM
VGen
21
196
0
07 Nov 2023
MotionDirector: Motion Customization of Text-to-Video Diffusion Models
Rui Zhao
Yuchao Gu
Jay Zhangjie Wu
David Junhao Zhang
Jia-Wei Liu
Weijia Wu
Jussi Keppo
Mike Zheng Shou
DiffM
VGen
20
103
0
12 Oct 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jinren Zhou
Tien-Ping Tan
DiffM
VGen
132
215
0
15 Mar 2023
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
243
556
0
29 May 2022
BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
Kelvin C. K. Chan
Shangchen Zhou
Xiangyu Xu
Chen Change Loy
149
388
0
27 Apr 2021
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
242
482
0
20 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
InMoDeGAN: Interpretable Motion Decomposition Generative Adversarial Network for Video Generation
Yaohui Wang
F. Brémond
A. Dantcheva
VGen
GAN
138
24
0
08 Jan 2021
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
262
10,183
0
12 Dec 2018