NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023
22 March 2023
Sheng-Siang Yin
Chenfei Wu
Huan Yang
Jianfeng Wang
Xiaodong Wang
Minheng Ni
Zhengyuan Yang
Linjie Li
Shuguang Liu
Fan Yang
Jianlong Fu
Gong Ming
Lijuan Wang
Zicheng Liu
Houqiang Li
Nan Duan
    VGen

Papers citing "NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation"

27 / 77 papers shown
FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention
Neural Information Processing Systems (NeurIPS), 2024
Yu Lu
Yuanzhi Liang
Linchao Zhu
Yi Yang
DiffM, VGen
311
59
0
29 Jul 2024
Unlearning Concepts from Text-to-Video Diffusion Models
Shiqi Liu
Yihua Tan
DiffM
225
2
0
19 Jul 2024
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Yuang Zhang
Jiaxi Gu
L. Wang
Han Wang
Junqi Cheng
Yuefeng Zhu
Fangyuan Zou
VGen
429
153
0
28 Jun 2024
Text Prompting for Multi-Concept Video Customization by Autoregressive Generation
D. Kothandaraman
Kihyuk Sohn
Ruben Villegas
P. Voigtlaender
Dinesh Manocha
Mohammad Babaeizadeh
VGen, DiffM
232
3
0
22 May 2024
Video Diffusion Models: A Survey
Andrew Melnik
Michal Ljubljanac
Cong Lu
Qi Yan
Weiming Ren
Helge J. Ritter
VGen
340
36
0
06 May 2024
FlexiFilm: Long Video Generation with Flexible Conditions
Yichen Ouyang
Jianhao Yuan
Hao Zhao
Gaoang Wang
Bo Zhao
DiffM
223
12
0
29 Apr 2024
Predicting Long-horizon Futures by Conditioning on Geometry and Time
Tarasha Khurana
Deva Ramanan
AI4TS
213
1
0
17 Apr 2024
Frame by Familiar Frame: Understanding Replication in Video Diffusion Models
Aimon Rahman
Malsha V. Perera
Vishal M. Patel
VGen
227
11
0
28 Mar 2024
Sora as a World Model? A Complete Survey on Text-to-Video Generation
Joseph Cho
Fachrina Dewi Puspitasari
Sheng Zheng
Jingyao Zheng
Noor Ul Eman
...
Caiyan Qin
Tae-Ho Kim
Choong Seon Hong
Yang Yang
Heng Tao Shen
EGVM, VGen
284
66
0
08 Mar 2024
DreamFrame: Enhancing Video Understanding via Automatically Generated QA and Style-Consistent Keyframes
Zhende Song
Chenchen Wang
Jiamu Sheng
C. Zhang
Gang Yu
Jiayuan Fan
Tao Chen
VGen
469
21
0
03 Mar 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
Chentao Song
Liangliang Cao
EGVM
871
196
0
27 Feb 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace
Aliaksandr Siarohin
Ivan Skorokhodov
Ekaterina Deyneka
Tsai-Shien Chen
...
Yuwei Fang
A. Stoliar
Elisa Ricci
Jian Ren
Sergey Tulyakov
VGen
340
97
0
22 Feb 2024
Using Left and Right Brains Together: Towards Vision and Language Planning
Jun Cen
Chenfei Wu
Xiao Liu
Sheng-Siang Yin
Yixuan Pei
Jinglong Yang
Qifeng Chen
Nan Duan
Jianguo Zhang
276
10
0
16 Feb 2024
The Essential Role of Causality in Foundation World Models for Embodied AI
Tarun Gupta
Wenbo Gong
Chao Ma
Nick Pawlowski
Agrin Hilmkil
...
Jianfeng Gao
Stefan Bauer
Danica Kragic
Bernhard Schölkopf
Cheng Zhang
287
28
0
06 Feb 2024
Vlogger: Make Your Dream A Vlog
Computer Vision and Pattern Recognition (CVPR), 2024
Shaobin Zhuang
Kunchang Li
Xinyuan Chen
Yaohui Wang
Ziwei Liu
Yu Qiao
Yali Wang
VGen, DiffM
147
63
0
17 Jan 2024
Generating Illustrated Instructions
Sachit Menon
Ishan Misra
Rohit Girdhar
DiffM
286
7
0
07 Dec 2023
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Jiaxi Lv
Yi Huang
Mingfu Yan
Jiancheng Huang
Jianzhuang Liu
Yifan Liu
Yafei Wen
Xiaoxin Chen
Shifeng Chen
VGen, DiffM
375
50
0
21 Nov 2023
Make Pixels Dance: High-Dynamic Video Generation
Yan Zeng
Guoqiang Wei
Jiani Zheng
Jiaxin Zou
Yang Wei
Yuchen Zhang
Hang Li
DiffM, VGen
241
149
0
18 Nov 2023
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
International Conference on Learning Representations (ICLR), 2023
Xinyuan Chen
Yaohui Wang
Lingjun Zhang
Shaobin Zhuang
Xin Ma
Jiashuo Yu
Yali Wang
Dahua Lin
Yu Qiao
Ziwei Liu
VGen, DiffM
337
206
0
31 Oct 2023
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
International Journal of Computer Vision (IJCV), 2023
David Junhao Zhang
Jay Zhangjie Wu
Jia-Wei Liu
Rui Zhao
L. Ran
Yuchao Gu
Difei Gao
Mike Zheng Shou
DiffM, VGen
609
290
0
27 Sep 2023
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning
Han Lin
Abhaysinh Zala
Jaemin Cho
Joey Tianyi Zhou
LM&Ro, VGen, DiffM
440
111
0
26 Sep 2023
Hierarchical Masked 3D Diffusion Model for Video Outpainting
ACM Multimedia (ACM MM), 2023
Fanda Fan
Chaoxu Guo
Litong Gong
Biao Wang
Bo Xiao
Yuning Jiang
Chunjie Luo
Jianfeng Zhan
DiffM, VGen
256
24
0
05 Sep 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Lin Geng Foo
Hossein Rahmani
Jing Liu
746
46
0
27 Aug 2023
DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
Sheng-Siang Yin
Chenfei Wu
Jian Liang
Jie Shi
Houqiang Li
Gong Ming
Nan Duan
VGen
250
213
0
16 Aug 2023
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
International Conference on Learning Representations (ICLR), 2023
Yuwei Guo
Ceyuan Yang
Anyi Rao
Zhengyang Liang
Yaohui Wang
Yu Qiao
Maneesh Agrawala
Dahua Lin
Bo Dai
VGen
944
1,284
0
10 Jul 2023
Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2023
Chang-rui Liu
Haoning Wu
Yujie Zhong
Xiaoyu Zhang
Yanfeng Wang
Weidi Xie
DiffM, VLM
295
67
0
01 Jun 2023
Text-driven Video Prediction
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) (TOMM), 2022
Haijun Shan
Yue Yu
B. Zhu
Yu-Gang Jiang
VGen
164
4
0
06 Oct 2022