arXiv: 2401.03048
Latte: Latent Diffusion Transformer for Video Generation


5 January 2024
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Z. Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
    DiffM
    VGen

Papers citing "Latte: Latent Diffusion Transformer for Video Generation"

50 / 186 papers shown
FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios
Shiyi Zhang
Junhao Zhuang
Zhaoyang Zhang
Ying Shan
Yansong Tang
VGen
19
0
0
06 May 2025
ADiff4TPP: Asynchronous Diffusion Models for Temporal Point Processes
Amartya Mukherjee
Ruizhi Deng
He Zhao
Yuzhen Mao
Leonid Sigal
Frederick Tung
DiffM
AI4TS
32
0
0
29 Apr 2025
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
Haoran Geng
Feishi Wang
Songlin Wei
Y. Li
Bangjun Wang
...
Hao Dong
Siyuan Huang
Yue Wang
Jitendra Malik
Pieter Abbeel
43
130
0
26 Apr 2025
PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
Yingjie Xi
J. J. Zhang
Xiaosong Yang
19
1
0
23 Apr 2025
Latent Video Dataset Distillation
Ning Li
Antai Andy Liu
Jingran Zhang
Justin Cui
DD
VGen
55
43
0
23 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
X. Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Y. Zhang
Ji Wan
J. Wang
VGen
49
71
0
22 Apr 2025
DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation
Weijie He
Mushui Liu
Yunlong Yu
Zhao Wang
Chao Wu
DiffM
VGen
30
0
0
21 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
44
0
0
16 Apr 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
J. Xu
Y. Huang
Baoqi Pei
Junlin Hou
Qingqiu Li
Guo Chen
Y. Zhang
Rui Feng
Weidi Xie
DiffM
30
0
0
16 Apr 2025
WORLDMEM: Long-term Consistent World Simulation with Memory
Zeqi Xiao
Yushi Lan
Yifan Zhou
Wenqi Ouyang
Shuai Yang
Yanhong Zeng
Xingang Pan
63
0
0
16 Apr 2025
ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration
Yongsheng Yu
Haitian Zheng
Zhifei Zhang
Jianming Zhang
Yuqian Zhou
Connelly Barnes
Y. Liu
Wei Xiong
Zhe Lin
Jiebo Luo
22
0
0
11 Apr 2025
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
Jialu Li
Shoubin Yu
Han Lin
Jaemin Cho
Jaehong Yoon
Mohit Bansal
DiffM
VGen
27
0
0
11 Apr 2025
Cellular Development Follows the Path of Minimum Action
Rohola Zandie
Farhan Khodaee
Yufan Xia
Elazer R. Edelman
28
0
0
10 Apr 2025
RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism
E. Peruzzo
Dejia Xu
Xingqian Xu
Humphrey Shi
N. Sebe
DiffM
VGen
27
0
0
09 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
24
0
0
09 Apr 2025
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration
Boyuan Wang
Runqi Ouyang
Xiaofeng Wang
Zheng Zhu
Guosheng Zhao
Chaojun Ni
Guan Huang
Lihong Liu
Xingang Wang
3DGS
32
0
0
04 Apr 2025
OmniCam: Unified Multimodal Video Generation via Camera Control
Xiaoda Yang
Jiayang Xu
Kaixuan Luan
Xinyu Zhan
Hongshun Qiu
...
Shuai Yang
Li Zhang
Checheng Yu
Cewu Lu
Lixin Yang
DiffM
VGen
42
0
0
03 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
50
1
0
03 Apr 2025
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
Boyuan Wang
Xiaofeng Wang
Chaojun Ni
Guosheng Zhao
Zhiqin Yang
...
Yukun Zhou
Xinze Chen
Guan Huang
Lihong Liu
Xingang Wang
VGen
27
2
0
31 Mar 2025
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
Hadrien Reynaud
Alberto Gomez
Paul Leeson
Qingjie Meng
B. Kainz
MedIm
36
0
0
28 Mar 2025
Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
Haitong Liu
Kuofeng Gao
Yang Bai
Jinmin Li
Jinxiao Shan
Tao Dai
Shu-Tao Xia
AAML
38
0
0
26 Mar 2025
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
49
1
0
25 Mar 2025
EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models
Yufei Cai
Hu Han
Yuxiang Wei
Shiguang Shan
Xilin Chen
DiffM
VGen
44
0
0
25 Mar 2025
AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset
Haiyu Zhang
Xinyuan Chen
Yaohui Wang
Xihui Liu
Yunhong Wang
Yu Qiao
VGen
38
0
0
25 Mar 2025
TransAnimate: Taming Layer Diffusion to Generate RGBA Video
Xuewei Chen
Zhimin Chen
Yiren Song
VGen
44
0
0
23 Mar 2025
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
Bhishma Dedhia
David Bourgin
Krishna Kumar Singh
Yuheng Li
Yan Kang
Zhan Xu
N. Jha
Y. Liu
DiffM
VGen
49
0
0
21 Mar 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan
Zhengfeng Lai
Y. Sun
Peng Zhang
Wei Liu
Kieran Liu
Meng Cao
Ruihua Song
VGen
36
0
0
21 Mar 2025
BlockDance: Reuse Structurally Similar Spatio-Temporal Features to Accelerate Diffusion Transformers
Hui Zhang
Tingwei Gao
Jie Shao
Zuxuan Wu
48
0
0
20 Mar 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
DiffM
VGen
46
0
0
20 Mar 2025
ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos
Haolin Yang
Feilong Tang
Ming Hu
Yulong Li
Junjie Guo
Yexin Liu
Zelin Peng
Junjun He
Zongyuan Ge
VGen
DiffM
67
0
0
20 Mar 2025
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation
Hongyu Zhang
Yufan Deng
Shenghai Yuan
Peng Jin
Zesen Cheng
Yian Zhao
Chang-Shu Liu
Jie Chen
DiffM
VGen
71
0
0
18 Mar 2025
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations
Quang-Trung Truong
Wong Yuk Kwan
Duc Thanh Nguyen
Binh-Son Hua
Sai-Kit Yeung
VGen
38
0
0
17 Mar 2025
Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation
Tongxuan Tian
Haoyang Li
Bo Ai
Xiaodi Yuan
Zhiao Huang
H. Su
DiffM
AI4CE
45
0
0
15 Mar 2025
DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image
Qi Zhao
Zhan Ma
Pan Zhou
VGen
41
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
48
5
0
13 Mar 2025
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
Hao He
Ceyuan Yang
Shanchuan Lin
Yinghao Xu
Meng Wei
Liangke Gui
Qi Zhao
Gordon Wetzstein
Lu Jiang
Hongsheng Li
DiffM
VGen
63
5
0
13 Mar 2025
Semantic Latent Motion for Portrait Video Generation
Qiyuan Zhang
Chenyu Wu
Wenzhang Sun
Huaize Liu
Donglin Di
Wei Chen
Changqing Zou
VGen
36
0
0
13 Mar 2025
TPDiff: Temporal Pyramid Video Diffusion Model
L. Ran
Mike Zheng Shou
52
0
0
12 Mar 2025
Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption
Luozheng Qin
Zhiyu Tan
Mengping Yang
Xiaomeng Yang
Hao Li
53
0
0
12 Mar 2025
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
Jing Wang
Fengzhuo Zhang
Xiaoli Li
Vincent Y. F. Tan
Tianyu Pang
Chao Du
Aixin Sun
Zhuoran Yang
VGen
29
1
0
12 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM
VOS
VGen
46
0
0
11 Mar 2025
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
Mingzhen Sun
Weining Wang
Gen Li
Jiawei Liu
Jiahui Sun
Wanquan Feng
Shanshan Lao
Siyu Zhou
Qian He
J. Liu
DiffM
VGen
43
1
0
10 Mar 2025
DreamRelation: Relation-Centric Video Customization
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Biao Gong
Longxiang Tang
...
Haonan Qiu
Hengjia Li
Shuai Tan
Y. Zhang
Hongming Shan
VGen
50
1
0
10 Mar 2025
TR-DQ: Time-Rotation Diffusion Quantization
Yihua Shao
Deyang Lin
Fanhu Zeng
Minxi Yan
M. Zhang
...
Haozhe Wang
J. Guo
Yan Wang
Haotong Qin
Hao Tang
MQ
DiffM
54
1
0
09 Mar 2025
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant
Y. Huang
Jilan Xu
Baoqi Pei
Yuping He
Guo Chen
...
Xinyuan Chen
Yaohui Wang
Yali Wang
Yu Qiao
Limin Wang
40
1
0
06 Mar 2025
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Rui Zhao
Weijia Mao
Mike Zheng Shou
34
0
0
05 Mar 2025
GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning
Zhun Mou
Bin Xia
Zhengchao Huang
Wenming Yang
Jiaya Jia
VGen
ELM
LRM
35
0
0
04 Mar 2025
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Wenhao Wang
Y. Yang
DiffM
VGen
53
0
0
03 Mar 2025
Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos
Zhiyu Tan
Junyan Wang
Hao Yang
Luozheng Qin
Hesen Chen
Qiang-feng Zhou
Hao Li
VGen
45
0
0
28 Feb 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
61
0
0
27 Feb 2025