ResearchTrend.AI
Predicting Video with VQVAE

2 March 2021
Jacob Walker
Ali Razavi
Aaron van den Oord
    DRL

Papers citing "Predicting Video with VQVAE"

50 / 55 papers shown
Vector Quantized-Elites: Unsupervised and Problem-Agnostic Quality-Diversity Optimization
Constantinos Tsakonas
Konstantinos Chatzilygeroudis
34
0
0
10 Apr 2025
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
58
1
0
12 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
46
9
0
08 Nov 2024
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Hao Luo
Bohan Zhou
Zongqing Lu
30
0
0
05 Nov 2024
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li
Cristina Mata
J. Park
Kumara Kahatapitiya
Yoo Sung Jang
...
Kanchana Ranasinghe
R. Burgert
Mu Cai
Yong Jae Lee
Michael S. Ryoo
LM&Ro
62
25
0
28 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi-Xin Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
78
36
0
13 Jun 2024
Genie: Generative Interactive Environments
Jake Bruce
Michael Dennis
Ashley D. Edwards
Jack Parker-Holder
Yuge Shi
...
Konrad Zolna
Jeff Clune
Nando de Freitas
Satinder Singh
Tim Rocktäschel
VGen
VLM
64
142
0
23 Feb 2024
Sign Language Production with Latent Motion Transformer
Pan Xie
Taiying Peng
Yao Du
Qipeng Zhang
SLR
19
3
0
20 Dec 2023
Earthfarseer: Versatile Spatio-Temporal Dynamical Systems Modeling in One Model
Hao Wu
Yuxuan Liang
Wei Xiong
Zhengyang Zhou
Wei-Ming Huang
Shilong Wang
Kun Wang
AI4TS
50
10
0
13 Dec 2023
ART⋅V: Auto-Regressive Text-to-Video Generation with Diffusion Models
Wenming Weng
Ruoyu Feng
Yanhui Wang
Qi Dai
Chunyu Wang
...
Jianmin Bao
Yuhui Yuan
Chong Luo
Yueyi Zhang
Zhiwei Xiong
VGen
25
32
0
30 Nov 2023
Neural Vector Fields: Generalizing Distance Vector Fields by Codebooks and Zero-Curl Regularization
Xianghui Yang
Guosheng Lin
Zhenghao Chen
Luping Zhou
34
2
0
04 Sep 2023
How Safe Am I Given What I See? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy
Zhenjiang Mao
Carson Sobolewski
I. Ruchkin
16
8
0
23 Aug 2023
DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles
Tal Daniel
Aviv Tamar
DiffM
17
7
0
09 Jun 2023
Disentanglement via Latent Quantization
Kyle Hsu
W. Dorrell
James C. R. Whittington
Jiajun Wu
Chelsea Finn
DRL
13
24
0
28 May 2023
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
Jaehoon Yoo
Semin Kim
Doyup Lee
Chiheon Kim
Seunghoon Hong
21
3
0
20 Mar 2023
Neural Vector Fields: Implicit Representation by Explicit Learning
Xianghui Yang
Guosheng Lin
Zhenghao Chen
Luping Zhou
AI4CE
44
17
0
08 Mar 2023
Self-Organising Neural Discrete Representation Learning à la Kohonen
Kazuki Irie
Róbert Csordás
Jürgen Schmidhuber
SSL
19
1
0
15 Feb 2023
CHeart: A Conditional Spatio-Temporal Generative Model for Cardiac Anatomy
Mengyun Qiao
Shuo Wang
Huaqi Qiu
A. de Marvao
D. O’Regan
Daniel Rueckert
Wenjia Bai
MedIm
21
14
0
30 Jan 2023
Scalable Adaptive Computation for Iterative Generation
Allan Jabri
David Fleet
Ting-Li Chen
DiffM
19
106
0
22 Dec 2022
Towards Smooth Video Composition
Qihang Zhang
Ceyuan Yang
Yujun Shen
Yinghao Xu
Bolei Zhou
VGen
31
14
0
14 Dec 2022
Unifying conditional and unconditional semantic image synthesis with OCO-GAN
Marlene Careil
Stéphane Lathuilière
Camille Couprie
Jakob Verbeek
VLM
23
0
0
25 Nov 2022
Latent Video Diffusion Models for High-Fidelity Long Video Generation
Yin-Yin He
Tianyu Yang
Yong Zhang
Ying Shan
Qifeng Chen
DiffM
VGen
16
202
0
23 Nov 2022
INR-V: A Continuous Representation Space for Video-based Generative Tasks
Bipasha Sen
Aditya Agarwal
Vinay P. Namboodiri
C. V. Jawahar
VGen
44
6
0
29 Oct 2022
PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting
Thomas Lucas
Fabien Baradel
Philippe Weinzaepfel
Grégory Rogez
22
68
0
19 Oct 2022
JukeDrummer: Conditional Beat-aware Audio-domain Drum Accompaniment Generation via Transformer VQ-VAE
Yueh-Kao Wu
Ching-Yu Chiu
Yi-Hsuan Yang
ViT
19
14
0
12 Oct 2022
Compressed Vision for Efficient Video Understanding
Olivia Wiles
João Carreira
Iain Barr
Andrew Zisserman
Mateusz Malinowski
9
7
0
06 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
43
371
0
05 Oct 2022
Temporally Consistent Transformers for Video Generation
Wilson Yan
Danijar Hafner
Stephen James
Pieter Abbeel
DiffM
22
27
0
05 Oct 2022
HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator
Younggyo Seo
Kimin Lee
Fangchen Liu
Stephen James
Pieter Abbeel
VGen
21
28
0
15 Sep 2022
SketchBetween: Video-to-Video Synthesis for Sprite Animation via Sketches
Dagmar Lukka Loftsdóttir
Matthew J. Guzdial
VGen
15
3
0
01 Sep 2022
Symbolic Music Loop Generation with Neural Discrete Representations
Sangjun Han
H. Ihm
Moontae Lee
Woohyung Lim
11
8
0
11 Aug 2022
3D-Aware Video Generation
Sherwin Bahmani
Jeong Joon Park
Despoina Paschalidou
H. Tang
Gordon Wetzstein
Leonidas J. Guibas
Luc Van Gool
Radu Timofte
26
20
0
29 Jun 2022
Forecasting of depth and ego-motion with transformers and self-supervision
Houssem-eddine Boulahbal
A. Voicila
Andrew I. Comport
ViT
MDE
19
3
0
15 Jun 2022
Patch-based Object-centric Transformers for Efficient Video Generation
Wilson Yan
Ryogo Okumura
Stephen James
Pieter Abbeel
DiffM
ViT
23
6
0
08 Jun 2022
Generating Long Videos of Dynamic Scenes
Tim Brooks
Janne Hellsten
M. Aittala
Ting-Chun Wang
Timo Aila
J. Lehtinen
Ming-Yu Liu
Alexei A. Efros
Tero Karras
SyDa
4
101
0
07 Jun 2022
Unsupervised Image Representation Learning with Deep Latent Particles
Tal Daniel
Aviv Tamar
OCL
SSL
11
9
0
31 May 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM
VGen
27
1,503
0
07 Apr 2022
Transframer: Arbitrary Frame Prediction with Generative Models
C. Nash
João Carreira
Jacob Walker
Iain Barr
Andrew Jaegle
Mateusz Malinowski
Peter W. Battaglia
ViT
14
37
0
17 Mar 2022
Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image
Xuanchi Ren
Xiaolong Wang
VGen
19
58
0
17 Mar 2022
Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
Ligong Han
Jian Ren
Hsin-Ying Lee
Francesco Barbieri
Kyle Olszewski
Shervin Minaee
Dimitris N. Metaxas
Sergey Tulyakov
DiffM
VGen
26
41
0
04 Mar 2022
StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
Ivan Skorokhodov
Sergey Tulyakov
Mohamed Elhoseiny
VGen
10
278
0
29 Dec 2021
A model of semantic completion in generative episodic memory
Zahra Fayyaz
Aya Altamimi
Sen Cheng
Laurenz Wiskott
16
21
0
26 Nov 2021
Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models
Phil Chen
Masha Itkina
Ransalu Senanayake
Mykel J. Kochenderfer
20
6
0
27 Oct 2021
KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms
Chien-Feng Liao
Jen-Yu Liu
Yi-Hsuan Yang
11
5
0
08 Oct 2021
Pragmatic Image Compression for Human-in-the-Loop Decision-Making
S. Reddy
Anca Dragan
Sergey Levine
OffRL
31
13
0
07 Jul 2021
Transflower: probabilistic autoregressive dance generation with multimodal attention
Guillermo Valle Pérez
G. Henter
Jonas Beskow
A. Holzapfel
Pierre-Yves Oudeyer
Simon Alexanderson
16
42
0
25 Jun 2021
FitVid: Overfitting in Pixel-Level Video Prediction
Mohammad Babaeizadeh
M. Saffar
Suraj Nair
Sergey Levine
Chelsea Finn
D. Erhan
VLM
34
81
0
24 Jun 2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning
Hao Tan
Jie Lei
Thomas Wolf
Mohit Bansal
31
65
0
21 Jun 2021
NWT: Towards natural audio-to-video generation with representation learning
Rayhane Mama
Marc S. Tyndel
Hashiam Kadhim
Cole Clifford
Ragavan Thurairatnam
VGen
8
12
0
08 Jun 2021
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
Chenfei Wu
Lun Huang
Qianxi Zhang
Binyang Li
Lei Ji
Fan Yang
Guillermo Sapiro
Nan Duan
DiffM
VGen
16
233
0
30 Apr 2021