Phenaki: Variable Length Video Generation From Open Domain Textual Description

5 October 2022
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
    DiffM
    VGen

Papers citing "Phenaki: Variable Length Video Generation From Open Domain Textual Description"

50 / 287 papers shown
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
46
23
0
03 Oct 2024
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation
Shaowei Liu
Zhongzheng Ren
Saurabh Gupta
Shenlong Wang
VGen
DiffM
PINN
42
33
0
27 Sep 2024
Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation
Chenyu Wang
Shuo Yan
Yixuan Chen
Yujiang Wang
Mingzhi Dong
...
Qin Lv
Fan Yang
Tun Lu
Ning Gu
Li Shang
DiffM
VGen
33
0
0
19 Sep 2024
Learning Generative Interactive Environments By Trained Agent Exploration
Naser Kazemi
N. Savov
Danda Paudel
Luc Van Gool
26
2
0
10 Sep 2024
Dynamic Motion Synthesis: Masked Audio-Text Conditioned Spatio-Temporal Transformers
Sohan Anisetty
James Hays
33
0
0
03 Sep 2024
Compositional 3D-aware Video Generation with LLM Director
Hanxin Zhu
Tianyu He
Anni Tang
Junliang Guo
Zhibo Chen
Jiang Bian
DiffM
VGen
31
7
0
31 Aug 2024
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving
Yongjie Fu
Anmol Jain
Xuan Di
Xu Chen
Zhaobin Mo
VGen
34
4
0
29 Aug 2024
GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model
Yongjie Fu
Yunlong Li
Xuan Di
VGen
33
2
0
28 Aug 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
...
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM
VGen
72
389
0
12 Aug 2024
Training-Free Condition Video Diffusion Models for single frame Spatial-Semantic Echocardiogram Synthesis
Van Phi Nguyen
Tri Nhan Luong Ha
Huy Hieu Pham
Quoc Long Tran
VGen
DiffM
MedIm
27
2
0
06 Aug 2024
Fine-gained Zero-shot Video Sampling
Dengsheng Chen
Jie Hu
Javier Segovia-Aguas
Enhua Wu
VGen
DiffM
24
0
0
31 Jul 2024
Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model
Zhichao Zhang
Xinyue Li
Wei Sun
Jun Jia
Xiongkuo Min
...
Puyi Wang
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Guangtao Zhai
EGVM
45
5
0
31 Jul 2024
Fréchet Video Motion Distance: A Metric for Evaluating Motion Consistency in Videos
Jiahe Liu
Youran Qu
Qi Yan
Xiaohui Zeng
Lele Wang
Renjie Liao
VGen
EGVM
44
12
0
23 Jul 2024
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion
Boyang Deng
Richard Tucker
Zhengqi Li
Leonidas J. Guibas
Noah Snavely
Gordon Wetzstein
VGen
3DGS
DiffM
32
11
0
18 Jul 2024
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
Wentao Zhang
Junliang Guo
Tianyu He
Li Zhao
Linli Xu
Jiang Bian
34
3
0
10 Jul 2024
GVDIFF: Grounded Text-to-Video Generation with Diffusion Models
Huanzhang Dou
Ruixiang Li
Wei Su
Xi Li
DiffM
34
1
0
02 Jul 2024
Efficient World Models with Context-Aware Tokenization
Vincent Micheli
Eloi Alonso
François Fleuret
OffRL
VLM
32
4
0
27 Jun 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Shenghai Yuan
Jinfa Huang
Yongqi Xu
Yaoyang Liu
Shaofeng Zhang
Yujun Shi
Ruijie Zhu
Xinhua Cheng
Jiebo Luo
Li Yuan
EGVM
VGen
69
34
0
26 Jun 2024
FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models
Haonan Qiu
Zhaoxi Chen
Zhouxia Wang
Yingqing He
Menghan Xia
Ziwei Liu
VGen
DiffM
34
17
0
24 Jun 2024
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
DiffM
VGen
65
12
0
16 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi-Xin Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
78
36
0
13 Jun 2024
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality
Tianle Zhang
Langtian Ma
Yuchen Yan
Yuchen Zhang
Kai Wang
...
Wenqi Shao
Yang You
Yu Qiao
Ping Luo
Kaipeng Zhang
VGen
61
2
0
13 Jun 2024
Image and Video Tokenization with Binary Spherical Quantization
Yue Zhao
Yuanjun Xiong
Philipp Krahenbuhl
28
17
0
11 Jun 2024
Instant 3D Human Avatar Generation using Image Diffusion Models
Nikos Kolotouros
Thiemo Alldieck
Enric Corona
Eduard Gabriel Bazavan
C. Sminchisescu
40
7
0
11 Jun 2024
Visual Representation Learning with Stochastic Frame Prediction
Huiwon Jang
Dongyoung Kim
Junsu Kim
Jinwoo Shin
Pieter Abbeel
Younggyo Seo
34
2
0
11 Jun 2024
CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion
Xingrui Wang
Xin Li
Zhibo Chen
DiffM
42
1
0
07 Jun 2024
Zero-Shot Video Editing through Adaptive Sliding Score Distillation
Lianghan Zhu
Yanqi Bao
Jing Huo
Jing Wu
Yu-Kun Lai
Wenbin Li
Yang Gao
VGen
23
2
0
07 Jun 2024
SF-V: Single Forward Video Generation Model
Zhixing Zhang
Yanyu Li
Yushu Wu
Yanwu Xu
Anil Kag
...
Aliaksandr Siarohin
Junli Cao
Dimitris N. Metaxas
Sergey Tulyakov
Jian Ren
DiffM
VGen
31
9
0
06 Jun 2024
VideoPhy: Evaluating Physical Commonsense for Video Generation
Hritik Bansal
Zongyu Lin
Tianyi Xie
Zeshun Zong
Michal Yarom
Yonatan Bitton
Chenfanfu Jiang
Yizhou Sun
Kai-Wei Chang
Aditya Grover
EGVM
VGen
32
36
0
05 Jun 2024
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
Shengpeng Ji
Jia-li Zuo
Minghui Fang
Siqi Zheng
Qian Chen
...
Ziyue Jiang
Hai Huang
Xize Cheng
Rongjie Huang
Zhou Zhao
45
8
0
03 Jun 2024
CV-VAE: A Compatible Video VAE for Latent Generative Video Models
Sijie Zhao
Yong Zhang
Xiaodong Cun
Shaoshu Yang
Muyao Niu
Xiaoyu Li
Wenbo Hu
Ying Shan
DiffM
59
23
0
30 May 2024
Text Prompting for Multi-Concept Video Customization by Autoregressive Generation
D. Kothandaraman
Kihyuk Sohn
Ruben Villegas
P. Voigtlaender
Dinesh Manocha
Mohammad Babaeizadeh
VGen
DiffM
35
2
0
22 May 2024
CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers
Andrew Marmon
Grant Schindler
José Lezama
Dan Kondratyuk
Bryan Seybold
Irfan Essa
VGen
ViT
DiffM
26
3
0
21 May 2024
DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control
Hong Chen
Xin Wang
Yipeng Zhang
Yuwei Zhou
Zeyang Zhang
Siao Tang
Wenwu Zhu
VGen
DiffM
39
9
0
21 May 2024
Diffusion for World Modeling: Visual Details Matter in Atari
Eloi Alonso
Adam Jelley
Vincent Micheli
Anssi Kanervisto
Amos Storkey
Tim Pearce
François Fleuret
39
39
0
20 May 2024
FIFO-Diffusion: Generating Infinite Videos from Text without Training
Jihwan Kim
Junoh Kang
Jinyoung Choi
Bohyung Han
DiffM
VGen
58
24
0
19 May 2024
From Sora What We Can See: A Survey of Text-to-Video Generation
Rui Sun
Yumin Zhang
Tejal Shah
Jiahao Sun
Shuoying Zhang
Wenqi Li
Haoran Duan
Bo Wei
R. Ranjan
EGVM
79
19
0
17 May 2024
LatentColorization: Latent Diffusion-Based Speaker Video Colorization
Rory Ward
Dan Bigioi
Shubhajit Basak
John G. Breslin
Peter Corcoran
VGen
DiffM
24
2
0
09 May 2024
TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation
Hritik Bansal
Yonatan Bitton
Michal Yarom
Idan Szpektor
Aditya Grover
Kai-Wei Chang
DiffM
47
11
0
07 May 2024
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance
Kelvin C. K. Chan
Yang Zhao
Xuhui Jia
Ming-Hsuan Yang
Huisheng Wang
22
3
0
02 May 2024
FlexiFilm: Long Video Generation with Flexible Conditions
Yichen Ouyang
Jianhao Yuan
Hao Zhao
Gaoang Wang
Bo-Lu Zhao
DiffM
42
6
0
29 Apr 2024
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
Haomiao Ni
Bernhard Egger
Suhas Lohit
A. Cherian
Ye Wang
T. Koike-Akino
S. X. Huang
Tim K. Marks
DiffM
31
12
0
25 Apr 2024
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings
Olivia Wiles
Chuhan Zhang
Isabela Albuquerque
Ivana Kajić
Su Wang
...
Anant Nawalgaria
Jordi Pont-Tuset
Aida Nematzadeh
EGVM
120
13
0
25 Apr 2024
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
Tianyuan Zhang
Hong-Xing Yu
Rundi Wu
Brandon Yushan Feng
Changxi Zheng
Noah Snavely
Jiajun Wu
William T. Freeman
AI4CE
VGen
77
61
0
19 Apr 2024
AniClipart: Clipart Animation with Text-to-Video Priors
Rong Wu
Wanchao Su
Kede Ma
Jing Liao
24
4
0
18 Apr 2024
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang
Zeyuan Wang
Qiushi Lyu
Zheyuan Zhang
Sunli Chen
Tianmin Shu
Kwonjoon Lee
Yilun Du
Chuang Gan
41
12
0
16 Apr 2024
σ-GPTs: A New Approach to Autoregressive Models
Arnaud Pannatier
Evann Courdier
François Fleuret
AI4TS
26
7
0
15 Apr 2024
Contextual Chart Generation for Cyber Deception
David D. Nguyen
David Liebowitz
Surya Nepal
S. Kanhere
Sharif Abuadbba
41
0
0
07 Apr 2024
AI Royalties -- an IP Framework to Compensate Artists & IP Holders for AI-Generated Content
Pablo Ducru
Jonathan Raiman
Ronaldo Lemos
Clay Garner
George He
Hanna Balcha
Gabriel Souto
Sergio Branco
Celina Bottino
25
3
0
05 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi-Xin Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
25
248
0
03 Apr 2024