2103.01950
Cited By
Predicting Video with VQVAE
2 March 2021
Jacob Walker
Ali Razavi
Aaron van den Oord
DRL
Papers citing "Predicting Video with VQVAE"
50 / 55 papers shown
Title
Vector Quantized-Elites: Unsupervised and Problem-Agnostic Quality-Diversity Optimization
Constantinos Tsakonas
Konstantinos Chatzilygeroudis
34
0
0
10 Apr 2025
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
58
1
0
12 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
46
9
0
08 Nov 2024
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Hao Luo
Bohan Zhou
Zongqing Lu
30
0
0
05 Nov 2024
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li
Cristina Mata
J. Park
Kumara Kahatapitiya
Yoo Sung Jang
...
Kanchana Ranasinghe
R. Burgert
Mu Cai
Yong Jae Lee
Michael S. Ryoo
LM&Ro
62
25
0
28 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi-Xin Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
78
36
0
13 Jun 2024
Genie: Generative Interactive Environments
Jake Bruce
Michael Dennis
Ashley D. Edwards
Jack Parker-Holder
Yuge Shi
...
Konrad Zolna
Jeff Clune
Nando de Freitas
Satinder Singh
Tim Rocktaschel
VGen
VLM
64
142
0
23 Feb 2024
Sign Language Production with Latent Motion Transformer
Pan Xie
Taiying Peng
Yao Du
Qipeng Zhang
SLR
19
3
0
20 Dec 2023
Earthfarseer: Versatile Spatio-Temporal Dynamical Systems Modeling in One Model
Hao Wu
Yuxuan Liang
Wei Xiong
Zhengyang Zhou
Wei-Ming Huang
Shilong Wang
Kun Wang
AI4TS
50
10
0
13 Dec 2023
ART⋅V: Auto-Regressive Text-to-Video Generation with Diffusion Models
Wenming Weng
Ruoyu Feng
Yanhui Wang
Qi Dai
Chunyu Wang
...
Jianmin Bao
Yuhui Yuan
Chong Luo
Yueyi Zhang
Zhiwei Xiong
VGen
25
32
0
30 Nov 2023
Neural Vector Fields: Generalizing Distance Vector Fields by Codebooks and Zero-Curl Regularization
Xianghui Yang
Guosheng Lin
Zhenghao Chen
Luping Zhou
34
2
0
04 Sep 2023
How Safe Am I Given What I See? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy
Zhenjiang Mao
Carson Sobolewski
I. Ruchkin
16
8
0
23 Aug 2023
DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles
Tal Daniel
Aviv Tamar
DiffM
17
7
0
09 Jun 2023
Disentanglement via Latent Quantization
Kyle Hsu
W. Dorrell
James C. R. Whittington
Jiajun Wu
Chelsea Finn
DRL
13
24
0
28 May 2023
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
Jaehoon Yoo
Semin Kim
Doyup Lee
Chiheon Kim
Seunghoon Hong
21
3
0
20 Mar 2023
Neural Vector Fields: Implicit Representation by Explicit Learning
Xianghui Yang
Guosheng Lin
Zhenghao Chen
Luping Zhou
AI4CE
44
17
0
08 Mar 2023
Self-Organising Neural Discrete Representation Learning à la Kohonen
Kazuki Irie
Róbert Csordás
Jürgen Schmidhuber
SSL
19
1
0
15 Feb 2023
CHeart: A Conditional Spatio-Temporal Generative Model for Cardiac Anatomy
Mengyun Qiao
Shuo Wang
Huaqi Qiu
A. de Marvao
D. O’Regan
Daniel Rueckert
Wenjia Bai
MedIm
21
14
0
30 Jan 2023
Scalable Adaptive Computation for Iterative Generation
Allan Jabri
David Fleet
Ting-Li Chen
DiffM
19
106
0
22 Dec 2022
Towards Smooth Video Composition
Qihang Zhang
Ceyuan Yang
Yujun Shen
Yinghao Xu
Bolei Zhou
VGen
31
14
0
14 Dec 2022
Unifying conditional and unconditional semantic image synthesis with OCO-GAN
Marlene Careil
Stéphane Lathuilière
Camille Couprie
Jakob Verbeek
VLM
23
0
0
25 Nov 2022
Latent Video Diffusion Models for High-Fidelity Long Video Generation
Yin-Yin He
Tianyu Yang
Yong Zhang
Ying Shan
Qifeng Chen
DiffM
VGen
16
202
0
23 Nov 2022
INR-V: A Continuous Representation Space for Video-based Generative Tasks
Bipasha Sen
Aditya Agarwal
Vinay P. Namboodiri
C. V. Jawahar
VGen
44
6
0
29 Oct 2022
PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting
Thomas Lucas
Fabien Baradel
Philippe Weinzaepfel
Grégory Rogez
22
68
0
19 Oct 2022
JukeDrummer: Conditional Beat-aware Audio-domain Drum Accompaniment Generation via Transformer VQ-VAE
Yueh-Kao Wu
Ching-Yu Chiu
Yi-Hsuan Yang
ViT
19
14
0
12 Oct 2022
Compressed Vision for Efficient Video Understanding
Olivia Wiles
João Carreira
Iain Barr
Andrew Zisserman
Mateusz Malinowski
9
7
0
06 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
43
371
0
05 Oct 2022
Temporally Consistent Transformers for Video Generation
Wilson Yan
Danijar Hafner
Stephen James
Pieter Abbeel
DiffM
22
27
0
05 Oct 2022
HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator
Younggyo Seo
Kimin Lee
Fangchen Liu
Stephen James
Pieter Abbeel
VGen
21
28
0
15 Sep 2022
SketchBetween: Video-to-Video Synthesis for Sprite Animation via Sketches
Dagmar Lukka Loftsdóttir
Matthew J. Guzdial
VGen
15
3
0
01 Sep 2022
Symbolic Music Loop Generation with Neural Discrete Representations
Sangjun Han
H. Ihm
Moontae Lee
Woohyung Lim
11
8
0
11 Aug 2022
3D-Aware Video Generation
Sherwin Bahmani
Jeong Joon Park
Despoina Paschalidou
H. Tang
Gordon Wetzstein
Leonidas J. Guibas
Luc Van Gool
Radu Timofte
26
20
0
29 Jun 2022
Forecasting of depth and ego-motion with transformers and self-supervision
Houssem-eddine Boulahbal
A. Voicila
Andrew I. Comport
ViT
MDE
19
3
0
15 Jun 2022
Patch-based Object-centric Transformers for Efficient Video Generation
Wilson Yan
Ryogo Okumura
Stephen James
Pieter Abbeel
DiffM
ViT
23
6
0
08 Jun 2022
Generating Long Videos of Dynamic Scenes
Tim Brooks
Janne Hellsten
M. Aittala
Ting-Chun Wang
Timo Aila
J. Lehtinen
Ming-Yu Liu
Alexei A. Efros
Tero Karras
SyDa
4
101
0
07 Jun 2022
Unsupervised Image Representation Learning with Deep Latent Particles
Tal Daniel
Aviv Tamar
OCL
SSL
11
9
0
31 May 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM
VGen
27
1,503
0
07 Apr 2022
Transframer: Arbitrary Frame Prediction with Generative Models
C. Nash
João Carreira
Jacob Walker
Iain Barr
Andrew Jaegle
Mateusz Malinowski
Peter W. Battaglia
ViT
14
37
0
17 Mar 2022
Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image
Xuanchi Ren
Xiaolong Wang
VGen
19
58
0
17 Mar 2022
Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
Ligong Han
Jian Ren
Hsin-Ying Lee
Francesco Barbieri
Kyle Olszewski
Shervin Minaee
Dimitris N. Metaxas
Sergey Tulyakov
DiffM
VGen
26
41
0
04 Mar 2022
StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
Ivan Skorokhodov
Sergey Tulyakov
Mohamed Elhoseiny
VGen
10
278
0
29 Dec 2021
A model of semantic completion in generative episodic memory
Zahra Fayyaz
Aya Altamimi
Sen Cheng
Laurenz Wiskott
16
21
0
26 Nov 2021
Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models
Phil Chen
Masha Itkina
Ransalu Senanayake
Mykel J. Kochenderfer
20
6
0
27 Oct 2021
KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms
Chien-Feng Liao
Jen-Yu Liu
Yi-Hsuan Yang
11
5
0
08 Oct 2021
Pragmatic Image Compression for Human-in-the-Loop Decision-Making
S. Reddy
Anca Dragan
Sergey Levine
OffRL
31
13
0
07 Jul 2021
Transflower: probabilistic autoregressive dance generation with multimodal attention
Guillermo Valle Pérez
G. Henter
Jonas Beskow
A. Holzapfel
Pierre-Yves Oudeyer
Simon Alexanderson
16
42
0
25 Jun 2021
FitVid: Overfitting in Pixel-Level Video Prediction
Mohammad Babaeizadeh
M. Saffar
Suraj Nair
Sergey Levine
Chelsea Finn
D. Erhan
VLM
34
81
0
24 Jun 2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning
Hao Tan
Jie Lei
Thomas Wolf
Mohit Bansal
31
65
0
21 Jun 2021
NWT: Towards natural audio-to-video generation with representation learning
Rayhane Mama
Marc S. Tyndel
Hashiam Kadhim
Cole Clifford
Ragavan Thurairatnam
VGen
8
12
0
08 Jun 2021
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
Chenfei Wu
Lun Huang
Qianxi Zhang
Binyang Li
Lei Ji
Fan Yang
Guillermo Sapiro
Nan Duan
DiffM
VGen
16
233
0
30 Apr 2021