Phenaki: Variable Length Video Generation From Open Domain Textual Description

5 October 2022
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
    DiffM
    VGen

Papers citing "Phenaki: Variable Length Video Generation From Open Domain Textual Description"

50 / 287 papers shown
Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
Zibo Zhao
Wen Liu
Xin Chen
Xi Zeng
Rui Wang
Pei Cheng
Bin-Bin Fu
Tao Chen
Gang Yu
Shenghua Gao
DiffM
20
87
0
29 Jun 2023
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
Rishabh Agarwal
Nino Vieillard
Yongchao Zhou
Piotr Stańczyk
Sabela Ramos
Matthieu Geist
Olivier Bachem
35
84
0
23 Jun 2023
Impacts and Risk of Generative AI Technology on Cyber Defense
Subash Neupane
Ivan A. Fernandez
Sudip Mittal
Shahram Rahimi
14
16
0
22 Jun 2023
Meta-Personalizing Vision-Language Models to Find Named Instances in Video
Chun-Hsiao Yeh
Bryan C. Russell
Josef Sivic
Fabian Caba Heilbron
Simon Jenni
VLM
MLLM
44
9
0
16 Jun 2023
VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing
Paul Couairon
Clément Rambour
Jean-Emmanuel Haugeard
Nicolas Thome
DiffM
VGen
4
29
0
14 Jun 2023
Generating Images with 3D Annotations Using Diffusion Models
Wufei Ma
Qihao Liu
Jiahao Wang
Angtian Wang
Xiaoding Yuan
...
Ruxiao Duan
Yongrui Qi
Adam Kortylewski
Yaoyao Liu
Alan Yuille
DiffM
21
4
0
13 Jun 2023
MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images
Junchen Zhu
Huan Yang
Huiguo He
Wenjing Wang
Zixi Tuo
Wen-Huang Cheng
Lianli Gao
Jingkuan Song
Jianlong Fu
VGen
DiffM
27
39
0
12 Jun 2023
The Age of Synthetic Realities: Challenges and Opportunities
J. P. Cardenuto
Jing Yang
Rafael Padilha
Renjie Wan
Daniel Moreira
Haoliang Li
Shiqi Wang
Fernanda A. Andaló
Sébastien Marcel
Anderson de Rezende Rocha
DeLMO
42
29
0
09 Jun 2023
Probabilistic Adaptation of Text-to-Video Models
Mengjiao Yang
Yilun Du
Bo Dai
Dale Schuurmans
J. Tenenbaum
Pieter Abbeel
VGen
DiffM
32
23
0
02 Jun 2023
Denoising Diffusion Semantic Segmentation with Mask Prior Modeling
Zeqiang Lai
Yuchen Duan
Jifeng Dai
Ziheng Li
Ying Fu
Hongsheng Li
Yu Qiao
Wen Wang
DiffM
16
17
0
02 Jun 2023
StyleDrop: Text-to-Image Generation in Any Style
Kihyuk Sohn
Nataniel Ruiz
Kimin Lee
Daniel Castro Chin
Irina Blok
...
Yuanzhen Li
Yuan Hao
Irfan Essa
Michael Rubinstein
Dilip Krishnan
4
141
0
01 Jun 2023
Learning Disentangled Prompts for Compositional Image Synthesis
Kihyuk Sohn
Albert Eaton Shaw
Yuan Hao
Han Zhang
Luisa F. Polanía
Huiwen Chang
Lu Jiang
Irfan Essa
VLM
13
6
0
01 Jun 2023
SAVE: Spectral-Shift-Aware Adaptation of Image Diffusion Models for Text-driven Video Editing
Nazmul Karim
Umar Khalid
M. Joneidi
Chen Chen
Nazanin Rahnavard
DiffM
VGen
19
5
0
30 May 2023
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu Lee Wang
Wenshuo Chen
Guanglu Song
Han-Jia Ye
Yu Liu
Hongsheng Li
VGen
DiffM
33
88
0
29 May 2023
Towards Consistent Video Editing with Text-to-Image Diffusion Models
Zicheng Zhang
Bonan Li
Xuecheng Nie
Congying Han
Tiande Guo
Luoqi Liu
DiffM
18
24
0
27 May 2023
Data-Driven Optimization for Deposition with Degradable Tools
Tony Zheng
Monimoy Bujarbaruah
Francesco Borrelli
38
0
0
26 May 2023
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
Ibrahim Ethem Hamamci
Sezgin Er
Anjany Sekuboyina
Enis Simsar
A. Tezcan
...
Hadrien Reynaud
Sarthak Pati
Christian Bluethgen
M. K. Özdemir
Bjoern H. Menze
DiffM
MedIm
32
16
0
25 May 2023
Vision + Language Applications: A Survey
Yutong Zhou
N. Shimada
VLM
18
5
0
24 May 2023
Video Prediction Models as Rewards for Reinforcement Learning
Alejandro Escontrela
Ademi Adeniji
Wilson Yan
Ajay Jain
Xue Bin Peng
Ken Goldberg
Youngwoon Lee
Danijar Hafner
Pieter Abbeel
34
52
0
23 May 2023
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation
Susung Hong
Junyoung Seo
Heeseong Shin
Sung‐Jin Hong
Seung Wook Kim
DiffM
VGen
23
34
0
23 May 2023
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
Weifeng Chen
Yatai Ji
Jie Wu
Hefeng Wu
Pan Xie
Jiashi Li
Xin Xia
Xuefeng Xiao
Liang Lin
VGen
121
6
0
23 May 2023
ControlVideo: Training-free Controllable Text-to-Video Generation
Yabo Zhang
Yuxiang Wei
Dongsheng Jiang
Xiaopeng Zhang
W. Zuo
Qi Tian
VGen
DiffM
25
236
0
22 May 2023
GEST: the Graph of Events in Space and Time as a Common Representation between Vision and Language
Mihai Masala
Nicolae Cudlenco
Traian Rebedea
Marius Leordeanu
9
0
0
22 May 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Rada Mihalcea
LRM
24
6
0
21 May 2023
InstructVid2Vid: Controllable Video Editing with Natural Language Instructions
Bosheng Qin
Juncheng Li
Siliang Tang
Tat-Seng Chua
Yueting Zhuang
VGen
DiffM
21
16
0
21 May 2023
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
Wenjing Wang
Huan Yang
Zixi Tuo
Huiguo He
Junchen Zhu
Jianlong Fu
Jiaying Liu
DiffM
VGen
40
113
0
18 May 2023
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
Songwei Ge
Seungjun Nah
Guilin Liu
Tyler Poon
Andrew Tao
Bryan Catanzaro
David Jacobs
Jia-Bin Huang
Ming-Yu Liu
Yogesh Balaji
DiffM
VGen
35
252
0
17 May 2023
SoundStorm: Efficient Parallel Audio Generation
Zalan Borsos
Matthew Sharifi
Damien Vincent
Eugene Kharitonov
Neil Zeghidour
Marco Tagliasacchi
15
97
0
16 May 2023
Integrating Generative Artificial Intelligence in Intelligent Vehicle Systems
Lukas Stappen
J. Dillmann
S. Striegel
Hans-Jörg Vögel
Nicolas Flores-Herr
Björn W. Schuller
22
9
0
15 May 2023
LEO: Generative Latent Image Animator for Human Video Synthesis
Yaohui Wang
Xin Ma
Xinyuan Chen
A. Dantcheva
Bo Dai
Yu Qiao
DiffM
59
30
0
06 May 2023
Collaborative Diffusion for Multi-Modal Face Generation and Editing
Ziqi Huang
Kelvin C. K. Chan
Yuming Jiang
Ziwei Liu
DiffM
26
104
0
20 Apr 2023
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
3DGS
VGen
60
1,010
0
18 Apr 2023
Generative Disco: Text-to-Video Generation for Music Visualization
Vivian Liu
Tao Long
Nathan Raw
Lydia B. Chilton
VGen
11
33
0
17 Apr 2023
Text2Performer: Text-Driven Human Video Generation
Yuming Jiang
Shuai Yang
Tong Liang Koh
Wayne Wu
Chen Change Loy
Ziwei Liu
DiffM
VGen
43
48
0
17 Apr 2023
Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation
Jie An
Songyang Zhang
Harry Yang
Sonal Gupta
Jia-Bin Huang
Jiebo Luo
Xiaoyue Yin
DiffM
VGen
27
106
0
17 Apr 2023
Synthetic Data from Diffusion Models Improves ImageNet Classification
Shekoofeh Azizi
Simon Kornblith
Chitwan Saharia
Mohammad Norouzi
David J. Fleet
VLM
DiffM
20
288
0
17 Apr 2023
Video Generation Beyond a Single Clip
Hsin-Ping Huang
Yu-Chuan Su
Ming Yang
VLM
DiffM
VGen
16
3
0
15 Apr 2023
M2T: Masking Transformers Twice for Faster Decoding
Fabian Mentzer
E. Agustsson
Michael Tschannen
8
17
0
14 Apr 2023
ChatGPT is all you need to decolonize sub-Saharan Vocational Education
Isidora Chara Tourni
G. Grigorakis
Isidoros Marougkas
Konstantinos M. Dafnis
Vassiliki‐Panagiota Tassopoulou
16
0
0
11 Apr 2023
Diffusion Models as Masked Autoencoders
Chen Wei
K. Mangalam
Po-Yao (Bernie) Huang
Yanghao Li
Haoqi Fan
Hu Xu
Huiyu Wang
Cihang Xie
Alan Yuille
Christoph Feichtenhofer
DiffM
SyDa
31
47
0
06 Apr 2023
GINA-3D: Learning to Generate Implicit Neural Assets in the Wild
Bokui Shen
Xinchen Yan
C. Qi
Mahyar Najibi
Boyang Deng
Leonidas J. Guibas
Yin Zhou
Drago Anguelov
3DV
22
20
0
04 Apr 2023
The Vector Grounding Problem
Dimitri Coelho Mollo
Raphael Milliere
23
26
0
04 Apr 2023
Scientists' Perspectives on the Potential for Generative AI in their Fields
Meredith Ringel Morris
AI4CE
25
36
0
04 Apr 2023
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
Wen Wang
Yan Jiang
K. Xie
Zide Liu
Hao Chen
Yue Cao
Xinlong Wang
Chunhua Shen
DiffM
VGen
29
112
0
30 Mar 2023
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
Kim Sung-Bin
Arda Senocak
H. Ha
Andrew Owens
Tae-Hyun Oh
DiffM
VGen
27
35
0
30 Mar 2023
Your Diffusion Model is Secretly a Zero-Shot Classifier
Alexander C. Li
Mihir Prabhudesai
Shivam Duggal
Ellis L Brown
Deepak Pathak
DiffM
VLM
33
223
0
28 Mar 2023
Fine-grained Audible Video Description
Xuyang Shen
Dong Li
Jinxing Zhou
Zhen Qin
Bowen He
...
Yuchao Dai
Lingpeng Kong
Meng Wang
Yu Qiao
Yiran Zhong
VGen
36
11
0
27 Mar 2023
CelebV-Text: A Large-Scale Facial Text-Video Dataset
Jianhui Yu
Hao Zhu
Liming Jiang
Chen Change Loy
Weidong (Tom) Cai
Wayne Wu
12
55
0
26 Mar 2023
ReVersion: Diffusion-Based Relation Inversion from Images
Ziqi Huang
Tianxing Wu
Yuming Jiang
Kelvin C. K. Chan
Ziwei Liu
25
65
0
23 Mar 2023
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Levon Khachatryan
A. Movsisyan
Vahram Tadevosyan
Roberto Henschel
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
VGen
27
541
0
23 Mar 2023