Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.04283
Cited By
NWT: Towards natural audio-to-video generation with representation learning
8 June 2021
Rayhane Mama
Marc S. Tyndel
Hashiam Kadhim
Cole Clifford
Ragavan Thurairatnam
VGen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"NWT: Towards natural audio-to-video generation with representation learning"
13 / 13 papers shown
Title
MAPLE: Encoding Dexterous Robotic Manipulation Priors Learned From Egocentric Videos
Alexey Gavryushin
Xi Wang
Robert J. S. Malate
Chenyu Yang
X. Jia
Shubh Goel
Davide Liconti
René Zurbrugg
Robert K. Katzschmann
Marc Pollefeys
34
0
0
08 Apr 2025
VQEL: Enabling Self-Developed Symbolic Language in Agents through Vector Quantization in Emergent Language Games
Mohammad Mahdi Samiei Paqaleh
Mahdieh Soleymani Baghshah
54
0
0
06 Mar 2025
Representation Collapsing Problems in Vector Quantization
Wenhao Zhao
Qiran Zou
Rushi Shah
Dianbo Liu
72
1
0
25 Nov 2024
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
Roman Bachmann
Oğuzhan Fatih Kar
David Mizrahi
Ali Garjani
Mingfei Gao
David Griffiths
Jiaming Hu
Afshin Dehghan
Amir Zamir
MoE
VLM
MLLM
36
14
0
13 Jun 2024
SVQ: Sparse Vector Quantization for Spatiotemporal Forecasting
Chao Chen
Tian Zhou
Yanjun Zhao
Hui Liu
Liang Sun
Rong Jin
30
0
0
06 Dec 2023
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
Guy Yariv
Itai Gat
Sagie Benaim
Lior Wolf
Idan Schwartz
Yossi Adi
DiffM
VGen
29
36
0
28 Sep 2023
Towards Neural Variational Monte Carlo That Scales Linearly with System Size
Or Sharir
G. Chan
Anima Anandkumar
6
4
0
21 Dec 2022
StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation
Dong Min
Min-Hwan Song
Eunji Ko
S. Hwang
VGen
22
12
0
23 Aug 2022
Discrete Key-Value Bottleneck
Frederik Trauble
Anirudh Goyal
Nasim Rahaman
Michael C. Mozer
Kenji Kawaguchi
Yoshua Bengio
Bernhard Schölkopf
CLL
13
22
0
22 Jul 2022
Efficient-VDVAE: Less is more
Louay Hazami
Rayhane Mama
Ragavan Thurairatnam
BDL
19
28
0
25 Mar 2022
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
245
484
0
20 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,774
0
24 Feb 2021
AudioViewer: Learning to Visualize Sounds
Chunjin Song
Yuchi Zhang
Willis Peng
Parmis Mohaghegh
Bastian Wandt
Helge Rhodin
22
1
0
22 Dec 2020
1