1711.00937
Cited By
Neural Discrete Representation Learning
2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
Papers citing
"Neural Discrete Representation Learning"
50 / 3,807 papers shown
SSDD: Single-Step Diffusion Decoder for Efficient Image Tokenization
Théophane Vallaeys
Jakob Verbeek
Matthieu Cord
DiffM
239
3
0
06 Oct 2025
CodeFormer++: Blind Face Restoration Using Deformable Registration and Deep Metric Learning
Venkata Bharath Reddy Reddem
Akshay P Sarashetti
Ranjith Merugu
Amit Satish Unde
123
0
0
06 Oct 2025
ReactDiff: Fundamental Multiple Appropriate Facial Reaction Diffusion Model
Luo Cheng
Song Siyang
Yan Siyuan
Yu Zhen
Ge Zongyuan
93
1
0
06 Oct 2025
Bridging Text and Video Generation: A Survey
Nilay Kumar
Priyansh Bhandari
G. Maragatham
VGen
264
0
0
06 Oct 2025
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation
Longxiang Zhang
Ning Yu
Gordon Chen
Haonan Qiu
P. Debevec
Ziwei Liu
VGen
LRM
87
7
0
06 Oct 2025
Randomness from causally independent processes
Martin Sandfuchs
Carla Ferradini
R. Renner
CML
156
0
0
06 Oct 2025
Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers
Juncheng Wang
Chao Xu
Cheng Yu
Zhe Hu
Haoyu Xie
Guoqi Yu
Lei Shang
Shujun Wang
DiffM
170
2
0
06 Oct 2025
REAR: Rethinking Visual Autoregressive Models via Generator-Tokenizer Consistency Regularization
Qiyuan He
Y. Li
Haotian Ye
Jinghao Wang
Xinyao Liao
Pheng-Ann Heng
Stefano Ermon
James Zou
Angela Yao
DiffM
VGen
237
2
0
06 Oct 2025
GenAR: Next-Scale Autoregressive Generation for Spatial Gene Expression Prediction
Jiarui Ouyang
Yihui Wang
Yihang Gao
Yingxue Xu
Shu Yang
Hao Chen
125
0
0
05 Oct 2025
MASC: Boosting Autoregressive Image Generation with a Manifold-Aligned Semantic Clustering
Lixuan He
Shikang Zheng
Linfeng Zhang
162
0
0
05 Oct 2025
MulVuln: Enhancing Pre-trained LMs with Shared and Language-Specific Knowledge for Multilingual Vulnerability Detection
Van Nguyen
Surya Nepal
Xingliang Yuan
Tingmin Wu
Fengchao Chen
Carsten Rudolph
AAML
139
0
0
05 Oct 2025
Domain-Adapted Granger Causality for Real-Time Cross-Slice Attack Attribution in 6G Networks
Minh K. Quan
P. Pathirana
87
1
0
04 Oct 2025
Soft Disentanglement in Frequency Bands for Neural Audio Codecs
Benoit Ginies
Xiaoyu Bie
Olivier Fercoq
Gaël Richard
100
1
0
04 Oct 2025
Désentrelacement Fréquentiel Doux pour les Codecs Audio Neuronaux
Benoît Giniès
Xiaoyu Bie
Olivier Fercoq
Gaël Richard
132
0
0
04 Oct 2025
Product-Quantised Image Representation for High-Quality Image Synthesis
Denis Zavadski
Nikita Philip Tatsch
Carsten Rother
107
0
0
03 Oct 2025
Multi-scale Autoregressive Models are Laplacian, Discrete, and Latent Diffusion Models in Disguise
Steve Hong
Samuel Belkadi
DiffM
113
0
0
03 Oct 2025
TIT-Score: Evaluating Long-Prompt Based Text-to-Image Alignment via Text-to-Image-to-Text Consistency
Juntong Wang
Huiyu Duan
Jiarui Wang
Ziheng Jia
Guangtao Zhai
Xiongkuo Min
EGVM
ALM
LM&MA
VLM
156
2
0
03 Oct 2025
Flip Distribution Alignment VAE for Multi-Phase MRI Synthesis
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Xiaoyan Kui
Qianmu Xiao
Qinsong Li
Zexin Ji
Jielin Zhang
Beiji Zou
OOD
116
0
0
03 Oct 2025
Discrete Facial Encoding: A Framework for Data-driven Facial Display Discovery
Minh Tran
Maksim Siniukov
Zhangyu Jin
Mohammad Soleymani
76
0
0
02 Oct 2025
Test-Time Anchoring for Discrete Diffusion Posterior Sampling
Litu Rout
Andreas Lugmayr
Yasamin Jafarian
Srivatsan Varadharajan
Constantine Caramanis
Sanjay Shakkottai
Ira Kemelmacher-Shlizerman
218
0
0
02 Oct 2025
Growing Visual Generative Capacity for Pre-Trained MLLMs
Hanyu Wang
Jiaming Han
Ziyan Yang
Qi Zhao
Shanchuan Lin
Xiangyu Yue
Abhinav Shrivastava
Zhenheng Yang
Hao Chen
VLM
217
1
0
02 Oct 2025
Input-Aware Sparse Attention for Real-Time Co-Speech Video Generation
Beijia Lu
Ziyi Chen
Jing Xiao
Jun-Yan Zhu
DiffM
VGen
329
0
0
02 Oct 2025
MelTok: 2D Tokenization for Single-Codebook Audio Compression
Jingyi Li
Zhiyuan Zhao
Yunfei Liu
Lijian Lin
Ye Zhu
Jiahao Wu
Qiuqiang Kong
Yu Li
Y. Li
311
0
0
02 Oct 2025
BioBlobs: Differentiable Graph Partitioning for Protein Representation Learning
Xin Wang
Carlos Oliver
132
0
0
02 Oct 2025
SoundReactor: Frame-level Online Video-to-Audio Generation
Koichi Saito
Julian Tanke
Christian Simon
Masato Ishii
Kazuki Shimada
Zachary Novack
Zhi-Wei Zhong
Akio Hayakawa
Takashi Shibuya
Yuki Mitsufuji
DiffM
VGen
248
0
0
02 Oct 2025
Eliciting Chain-of-Thought Reasoning for Time Series Analysis using Reinforcement Learning
Felix Parker
Nimeesha Chan
Chi Zhang
Kimia Ghobadi
AI4TS
OffRL
LRM
143
1
0
01 Oct 2025
Ultra-Efficient Decoding for End-to-End Neural Compression and Reconstruction
Ethan G Rogers
Cheng Wang
131
0
0
01 Oct 2025
Purrception: Variational Flow Matching for Vector-Quantized Image Generation
Răzvan-Andrei Matişan
Vincent Tao Hu
Grigory Bartosh
Bjorn Ommer
Cees G. M. Snoek
Max Welling
Jan-Willem van de Meent
Mohammad Mahdi Derakhshani
Floor Eijkelboom
149
1
0
01 Oct 2025
Baseline Systems For The 2025 Low-Resource Audio Codec Challenge
Yusuf Ziya Isik
Rafał Łaganowski
266
0
0
30 Sep 2025
Flow Autoencoders are Effective Protein Tokenizers
Rohit Dilip
Evan Zhang
Ayush Varshney
David Van Valen
DiffM
126
0
0
30 Sep 2025
PUREVQ-GAN: Defending Data Poisoning Attacks through Vector-Quantized Bottlenecks
Alexander Branch
Omead Brandon Pooladzandi
Radin Khosraviani
Sunay Bhat
Jeffrey Q. Jiang
Gregory Pottie
84
0
0
30 Sep 2025
LieHMR: Autoregressive Human Mesh Recovery with SO(3) Diffusion
Donghwan Kim
Tae-Kyun Kim
DiffM
212
0
0
30 Sep 2025
Go with Your Gut: Scaling Confidence for Autoregressive Image Generation
Harold Haodong Chen
Xianfeng Wu
Wen-Jie Shu
Rongjin Guo
Disen Lan
Harry Yang
Ying-Cong Chen
137
2
0
30 Sep 2025
LaTo: Landmark-tokenized Diffusion Transformer for Fine-grained Human Face Editing
Zhenghao Zhang
Ziying Zhang
Junchao Liao
Xiangyu Meng
Qiang Hu
Siyu Zhu
Xiaoyun Zhang
Long Qin
Weizhi Wang
144
0
0
30 Sep 2025
DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick
Mohammad Hassan Vali
Tom Bäckström
Arno Solin
MQ
146
1
0
30 Sep 2025
Learning Energy-based Variational Latent Prior for VAEs
Debottam Dutta
Chaitanya Amballa
Zhongweiyang Xu
Yu-Lin Wei
Romit Roy Choudhury
DiffM
160
0
0
30 Sep 2025
Hyperspherical Latents Improve Continuous-Token Autoregressive Generation
Guolin Ke
Hui Xue
150
3
0
29 Sep 2025
Cycle Diffusion Model for Counterfactual Image Generation
Fangrui Huang
Alan Wang
Binxu Li
Bailey Trang
Ridvan Yesiloglu
Tianyu Hua
Wei Peng
Ehsan Adeli
DiffM
MedIm
210
1
0
29 Sep 2025
MoReFlow: Motion Retargeting Learning through Unsupervised Flow Matching
Wontaek Kim
Tianyu Li
Sehoon Ha
208
0
0
29 Sep 2025
DyMoDreamer: World Modeling with Dynamic Modulation
Boxuan Zhang
Runqing Wang
Wei Xiao
Weipu Zhang
Jian Sun
Gao Huang
Jie Chen
Gang Wang
164
0
0
29 Sep 2025
SynthCloner: Synthesizer-style Audio Transfer via Factorized Codec with ADSR Envelope Control
Jeng-Yue Liu
Ting-Chao Hsu
Yen-Tung Yeh
Li Su
Yi-Hsuan Yang
123
0
0
29 Sep 2025
Score-based Membership Inference on Diffusion Models
Mingxing Rao
Bowen Qu
Daniel Moyer
DiffM
130
1
0
29 Sep 2025
Aligning Visual Foundation Encoders to Tokenizers for Diffusion Models
Bowei Chen
Sai Bi
Hao Tan
Chentao Song
Tianyuan Zhang
Zhengqi Li
Yuanjun Xiong
Jianming Zhang
Kai Zhang
219
5
0
29 Sep 2025
Beyond Softmax: A Natural Parameterization for Categorical Random Variables
A. Manenti
Cesare Alippi
BDL
133
0
0
29 Sep 2025
Discrete Variational Autoencoding via Policy Search
Michael Drolet
Firas Al-Hafez
Aditya Bhatt
Jan Peters
Oleg Arenz
83
0
0
29 Sep 2025
Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models
Jitai Hao
Hao Liu
Xinyan Xiao
Qiang Huang
Jun Yu
229
0
0
29 Sep 2025
MotionVerse: A Unified Multimodal Framework for Motion Comprehension, Generation and Editing
Ruibing Hou
Mingshuang Luo
Hongyu Pan
Hong Chang
Shiguang Shan
144
2
0
28 Sep 2025
HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation
Cong Chen
Ziyuan Huang
Cheng Zou
Huanyi Zheng
Kaixiang Ji
Jiajia Liu
Jingdong Chen
Hao Chen
Chunhua Shen
161
3
0
28 Sep 2025
AudioMoG: Guiding Audio Generation with Mixture-of-Guidance
Junyou Wang
Zehua Chen
Binjie Yuan
Kaiwen Zheng
Chang Li
Yuxuan Jiang
Jun Zhu
161
0
0
28 Sep 2025
Reinforcement Learning with Inverse Rewards for World Model Post-training
Yang Ye
Tianyu He
Shuo Yang
Jiang Bian
VGen
198
1
0
28 Sep 2025