Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.03638
Cited By
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
7 April 2022
Songwei Ge
Thomas Hayes
Harry Yang
Xiaoyue Yin
Guan Pang
David Jacobs
Jia-Bin Huang
Devi Parikh
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer"
29 / 179 papers shown
Title
Shape-aware Text-driven Layered Video Editing
Yao-Chih Lee
Ji-Ze Jang
Yi-Ting Chen
Elizabeth Qiu
Jia-Bin Huang
VGen
DiffM
28
51
0
30 Jan 2023
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Jay Zhangjie Wu
Yixiao Ge
Xintao Wang
Weixian Lei
Yuchao Gu
Yufei Shi
W. Hsu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
VGen
16
690
0
22 Dec 2022
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Ludan Ruan
Y. Ma
Huan Yang
Huiguo He
Bei Liu
Jianlong Fu
Nicholas Jing Yuan
Qin Jin
B. Guo
DiffM
VGen
20
168
0
19 Dec 2022
Towards Smooth Video Composition
Qihang Zhang
Ceyuan Yang
Yujun Shen
Yinghao Xu
Bolei Zhou
VGen
31
14
0
14 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
20
223
0
10 Dec 2022
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Yuchao Gu
Xintao Wang
Yixiao Ge
Ying Shan
Xiaohu Qie
Mike Zheng Shou
DiffM
11
20
0
06 Dec 2022
VIDM: Video Implicit Diffusion Models
Kangfu Mei
Vishal M. Patel
DiffM
VGen
14
78
0
01 Dec 2022
Latent Video Diffusion Models for High-Fidelity Long Video Generation
Yin-Yin He
Tianyu Yang
Yong Zhang
Ying Shan
Qifeng Chen
DiffM
VGen
8
200
0
23 Nov 2022
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
VGen
22
37
0
23 Nov 2022
SSGVS: Semantic Scene Graph-to-Video Synthesis
Yuren Cong
Jinhui Yi
Bodo Rosenhahn
M. Yang
60
7
0
11 Nov 2022
Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation
Firas Khader
Gustav Mueller-Franzes
Soroosh Tayebi Arasteh
T. Han
Christoph Haarburger
...
Johannes Stegmaier
Christiane Kuhl
S. Nebelung
Jakob Nikolas Kather
Daniel Truhn
DiffM
MedIm
14
63
0
07 Nov 2022
Text-driven Video Prediction
Xue Song
Jingjing Chen
B. Zhu
Yu-Gang Jiang
VGen
10
4
0
06 Oct 2022
Temporally Consistent Transformers for Video Generation
Wilson Yan
Danijar Hafner
Stephen James
Pieter Abbeel
DiffM
11
27
0
05 Oct 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xiaoyue Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM
VGen
20
1,339
0
29 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
8
561
0
07 Sep 2022
Generating Long Videos of Dynamic Scenes
Tim Brooks
Janne Hellsten
M. Aittala
Ting-Chun Wang
Timo Aila
J. Lehtinen
Ming-Yu Liu
Alexei A. Efros
Tero Karras
SyDa
4
100
0
07 Jun 2022
MS-RNN: A Flexible Multi-Scale Framework for Spatiotemporal Predictive Learning
Zhifeng Ma
Hao Zhang
Jie Liu
HAI
AI4CE
18
12
0
07 Jun 2022
Unveiling The Mask of Position-Information Pattern Through the Mist of Image Features
C. Lin
Hsin-Ying Lee
Hung-Yu Tseng
M. Singh
Ming-Hsuan Yang
14
3
0
02 Jun 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
235
556
0
29 May 2022
Flexible Diffusion Modeling of Long Videos
William Harvey
Saeid Naderiparizi
Vaden Masrani
Christian Weilbach
Frank D. Wood
DiffM
BDL
VGen
167
284
0
23 May 2022
Transframer: Arbitrary Frame Prediction with Generative Models
C. Nash
João Carreira
Jacob Walker
Iain Barr
Andrew Jaegle
Mateusz Malinowski
Peter W. Battaglia
ViT
6
37
0
17 Mar 2022
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
237
482
0
20 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Sound2Sight: Generating Visual Dynamics from Sound and Context
A. Cherian
Moitreya Chatterjee
N. Ahuja
VGen
59
35
0
23 Jul 2020
On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
O. Kayhan
J. C. V. Gemert
187
231
0
16 Mar 2020
Transformation-based Adversarial Video Prediction on Large-Scale Data
Pauline Luc
Aidan Clark
Sander Dieleman
Diego de Las Casas
Yotam Doron
Albin Cassirer
Karen Simonyan
VGen
212
86
0
09 Mar 2020
How Much Position Information Do Convolutional Neural Networks Encode?
Md. Amirul Islam
Sen Jia
Neil D. B. Bruce
SSL
189
343
0
22 Jan 2020
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
262
10,183
0
12 Dec 2018
Imagine This! Scripts to Compositions to Videos
Tanmay Gupta
Dustin Schwenk
Ali Farhadi
Derek Hoiem
Aniruddha Kembhavi
CoGe
VGen
109
87
0
10 Apr 2018
Previous
1
2
3
4