Neural Discrete Representation Learning
2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL, SSL, OCL

Papers citing "Neural Discrete Representation Learning"

50 / 3,807 papers shown
Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens
Ziran Qin
Youru Lv
Mingbao Lin
Zeren Zhang
Chanfan Gan
Tieyuan Chen
W. Lin
DiffM, VLM
106
1
0
04 Dec 2025
DeRA: Decoupled Representation Alignment for Video Tokenization
Pengbo Guo
Junke Wang
Zhen Xing
Chengxu Liu
Daoguo Dong
Xueming Qian
Zuxuan Wu
AI4TS
103
0
0
04 Dec 2025
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Yanran Zhang
Ziyi Wang
Wenzhao Zheng
Zheng Zhu
Jie Zhou
Jiwen Lu
VGen, 3DV
240
1
0
04 Dec 2025
Controllable Long-term Motion Generation with Extended Joint Targets
Eunjong Lee
Eunhee Kim
Sanghoon Hong
Eunho Jung
Jihoon Kim
74
0
0
04 Dec 2025
Efficient Generative Transformer Operators For Million-Point PDEs
Armand K. Koupai
Lise Le Boudec
Patrick Gallinari
78
0
0
04 Dec 2025
Denoise to Track: Harnessing Video Diffusion Priors for Robust Correspondence
Tianyu Yuan
Yuanbo Yang
Lin Chen
Yao Yao
Zhuzhong Qian
DiffM, VGen
257
0
0
04 Dec 2025
LSRS: Latent Scale Rejection Sampling for Visual Autoregressive Modeling
Hong-Kai Zheng
Piji Li
69
0
0
03 Dec 2025
FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation
Yiyi Cai
Y. Wu
Kunhang Li
You Zhou
Bo Zheng
Haiyang Liu
VGen
116
0
0
03 Dec 2025
Enhancing next token prediction based pre-training for jet foundation models
Joschka Birk
Anna Hallin
Gregor Kasieczka
Nikol Madzharova
Ian Pang
David Shih
104
1
0
03 Dec 2025
Graph VQ-Transformer (GVT): Fast and Accurate Molecular Generation via High-Fidelity Discrete Latents
Haozhuo Zheng
Cheng Wang
Yang Liu
71
0
0
02 Dec 2025
Contrastive Deep Learning for Variant Detection in Wastewater Genomic Sequencing
Adele Chinda
Richmond Azumah
Hemanth Demakethepalli Venkateswara
80
0
0
02 Dec 2025
U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences
Xiang Xu
Ao Liang
Youquan Liu
Linfeng Li
Lingdong Kong
Ziwei Liu
Qingshan Liu
139
1
0
02 Dec 2025
AutoBrep: Autoregressive B-Rep Generation with Unified Topology and Geometry
Xiang Xu
P. Jayaraman
Joseph George Lambourne
Yilin Liu
Durvesh Malpure
Pete Meltzer
AI4CE
180
1
0
02 Dec 2025
Mitigating Intra- and Inter-modal Forgetting in Continual Learning of Unified Multimodal Models
Xiwen Wei
Mustafa Munir
R. Marculescu
CLL
282
0
0
02 Dec 2025
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models
Zhiheng Liu
Weiming Ren
Haozhe Liu
Zijian Zhou
S. Chen
...
Ping Luo
Wei Liu
Tao Xiang
Jonas Schult
Yuren Cong
166
2
0
01 Dec 2025
Q2D2: A Geometry-Aware Audio Codec Leveraging Two-Dimensional Quantization
Tal Shuster
Eliya Nachmani
120
0
0
01 Dec 2025
Deconstructing Generative Diversity: An Information Bottleneck Analysis of Discrete Latent Generative Models
Yudi Wu
Wenhao Zhao
Dianbo Liu
113
0
0
01 Dec 2025
Real-World Robot Control by Deep Active Inference With a Temporally Hierarchical World Model
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Kentaro Fujii
Shingo Murata
68
1
0
01 Dec 2025
Mofasa: A Step Change in Metal-Organic Framework Generation
Vaidotas Šimkus
Anders Christensen
Steven Bennett
Ian Johnson
Mark Neumann
James Gin
Jonathan Godwin
Benjamin Rhodes
AI4CE
142
0
0
01 Dec 2025
Smol-GS: Compact Representations for Abstract 3D Gaussian Splatting
Haishan Wang
Mohammad Hassan Vali
Arno Solin
3DGS
168
0
0
30 Nov 2025
Neural Discrete Representation Learning for Sparse-View CBCT Reconstruction: From Algorithm Design to Prospective Multicenter Clinical Evaluation
H. Wang
Lei Chen
Wei-Hua Zhang
Linxia Wu
Yong Luo
...
Xuhua Duan
Lefei Zhang
Gao-Jun Teng
Bo Du
Huangxuan Zhao
55
0
0
30 Nov 2025
Quantized-Tinyllava: a new multimodal foundation model enables efficient split learning
J. Guo
Xin Luo
Jie Liu
Yiqun Wang
Kai-Wei Chang
Wei Wang
Jie Liu
100
0
0
28 Nov 2025
VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction
Sinan Du
Jiahao Guo
Bo Li
Shuhao Cui
Zhengzhuo Xu
...
Yongxian Wei
Kun Gai
X. Wang
Kai Wu
C. Yuan
228
1
0
28 Nov 2025
ReactionMamba: Generating Short & Long Human Reaction Sequences
Hajra Anwar Beg
Baptiste Chopin
Hao Tang
Mohamed Daoudi
Mamba
187
0
0
28 Nov 2025
REVEAL: Reasoning-enhanced Forensic Evidence Analysis for Explainable AI-generated Image Detection
Huangsen Cao
Qin Mei
Zhiheng Li
Yuxi Li
Ying Zhang
...
Zhimeng Zhang
Xin Ding
Yongwei Wang
Jing Lyu
Fei Wu
138
0
0
28 Nov 2025
Visual Generation Tuning
Jiahao Guo
Sinan Du
J. Yao
Wenyu Liu
Bo Li
Haoxiang Cao
Kun Gai
C. Yuan
Kai Wu
Xinggang Wang
VLM
306
0
0
28 Nov 2025
BrepGPT: Autoregressive B-rep Generation with Voronoi Half-Patch
ACM Transactions on Graphics (TOG), 2025
Pu Li
Wenhao Zhang
Weize Quan
Biao Zhang
Peter Wonka
Dong-ming Yan
99
0
0
27 Nov 2025
PURE Codec: Progressive Unfolding of Residual Entropy for Speech Codec Learning
J. Shi
H. Wang
William Chen
Chenda Li
Wangyou Zhang
Jinchuan Tian
Shinji Watanabe
154
0
0
27 Nov 2025
RAVQ-HoloNet: Rate-Adaptive Vector-Quantized Hologram Compression
Shima Rafiei
Zahra Nabizadeh Shahr Babak
S. Samavi
S. Shirani
136
0
0
26 Nov 2025
DiverseVAR: Balancing Diversity and Quality of Next-Scale Visual Autoregressive Models
Mingue Park
Prin Phunyaphibarn
Phillip Y. Lee
Minhyuk Sung
124
0
0
26 Nov 2025
Progress by Pieces: Test-Time Scaling for Autoregressive Image Generation
Joonhyung Park
Hyeongwon Jang
Joowon Kim
Eunho Yang
VLM
159
0
0
26 Nov 2025
Harmonic-Percussive Disentangled Neural Audio Codec for Bandwidth Extension
Benoît Giniès
Xiaoyu Bie
Olivier Fercoq
Gaël Richard
173
0
0
26 Nov 2025
SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space
Zhenyi Shen
Junru Lu
Lin Gui
Jiazheng Li
Yulan He
D. Yin
Xing Sun
339
0
0
25 Nov 2025
SKEL-CF: Coarse-to-Fine Biomechanical Skeleton and Surface Mesh Recovery
Da Li
Jiping Jin
Xuanlong Yu
Wei Liu
Xiaodong Cun
Kai Chen
Rui Fan
Jiangang Kong
Xi Shen
3DH
494
0
0
25 Nov 2025
Text-guided Controllable Diffusion for Realistic Camouflage Images Generation
Yuhang Qian
Haiyan Chen
Wentong Li
Ningzhong Liu
Jie Qin
DiffM
203
1
0
25 Nov 2025
Operationalizing Quantized Disentanglement
Vitória Barin Pacela
Kartik Ahuja
Simon Lacoste-Julien
Pascal Vincent
88
0
0
25 Nov 2025
PRADA: Probability-Ratio-Based Attribution and Detection of Autoregressive-Generated Images
Simon Damm
Jonas Ricker
Henning Petzka
Asja Fischer
197
0
0
25 Nov 2025
Temporal-Visual Semantic Alignment: A Unified Architecture for Transferring Spatial Priors from Vision Models to Zero-Shot Temporal Tasks
Xiangkai Ma
Han Zhang
Wenzhong Li
Sanglu Lu
AI4TS, VGen
291
0
0
25 Nov 2025
DINO-Tok: Adapting DINO for Visual Tokenizers
Mingkai Jia
Mingxiao Li
Liaoyuan Fan
Tianxing Shi
Jiaxin Guo
...
Xiaoyang Guo
Xiao-Xiao Long
Qian Zhang
P. Tan
Wei Yin
ViT
201
0
0
25 Nov 2025
AI/ML based Joint Source and Channel Coding for HARQ-ACK Payload
Akash S. Doshi
Pinar Sen
Kirill Ivanov
Wei Yang
June Namgoong
Runxin Wang
Rachel Wang
Taesang Yoo
Jing Jiang
Tingfang Ji
54
0
0
25 Nov 2025
Robust Long-term Test-Time Adaptation for 3D Human Pose Estimation through Motion Discretization
Yilin Wen
Kechuan Dong
Yusuke Sugano
3DH, TTA
358
0
0
24 Nov 2025
fMRI-LM: Towards a Universal Foundation Model for Language-Aligned fMRI Understanding
Yuxiang Wei
Y. Zhang
Xi Xiao
Chengxuan Qian
Tianyang Wang
Vince D. Calhoun
195
2
0
24 Nov 2025
Multiscale Vector-Quantized Variational Autoencoder for Endoscopic Image Synthesis
International Symposium on Telecommunications (IST), 2025
Dimitrios E. Diamantis
D. Iakovidis
MedIm
363
0
0
24 Nov 2025
Learning Massively Multitask World Models for Continuous Control
Nicklas Hansen
Hao Su
Xiaolong Wang
OffRL, CLL, LM&Ro
536
0
0
24 Nov 2025
FVAR: Visual Autoregressive Modeling via Next Focus Prediction
Xiaofan Li
Chenming Wu
Yanpeng Sun
Jiaming Zhou
Delin Qu
Yansong Qu
Weihao Bo
Haibao Yu
Dingkang Liang
VGen
171
0
0
24 Nov 2025
TRIDENT: A Trimodal Cascade Generative Framework for Drug and RNA-Conditioned Cellular Morphology Synthesis
Rui Peng
Ziru Liu
Lingyuan Ye
Yuxing Lu
Boxin Shi
Jinzhuo Wang
90
0
0
23 Nov 2025
DELTA: Language Diffusion-based EEG-to-Text Architecture
Mingyu Jeon
Hyobin Kim
DiffM
80
0
0
22 Nov 2025
METIS: Multi-Source Egocentric Training for Integrated Dexterous Vision-Language-Action Model
Y. Fu
Ning Chen
Junkai Zhao
Shaozhe Shan
Guocai Yao
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
226
2
0
21 Nov 2025
Spanning Tree Autoregressive Visual Generation
Sangkyu Lee
Changho Lee
Janghoon Han
Hosung Song
Tackgeun You
Hwasup Lim
Stanley Jungkyu Choi
Honglak Lee
Youngjae Yu
205
0
0
21 Nov 2025
Flow and Depth Assisted Video Prediction with Latent Transformer
Eliyas Suleyman
Paul Henderson
Eksan Firkat
Nicolas Pugeault
159
0
0
20 Nov 2025