arXiv:1711.00937
Neural Discrete Representation Learning

2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
    BDL
    SSL
    OCL

Papers citing "Neural Discrete Representation Learning"

50 / 3,788 papers shown
U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences
Xiang Xu
Ao Liang
Youquan Liu
Linfeng Li
Lingdong Kong
Ziwei Liu
Qingshan Liu
40
0
0
02 Dec 2025
Graph VQ-Transformer (GVT): Fast and Accurate Molecular Generation via High-Fidelity Discrete Latents
Haozhuo Zheng
Cheng Wang
Yang Liu
8
0
0
02 Dec 2025
AutoBrep: Autoregressive B-Rep Generation with Unified Topology and Geometry
Xiang Xu
P. Jayaraman
Joseph George Lambourne
Yilin Liu
Durvesh Malpure
Pete Meltzer
AI4CE
32
0
0
02 Dec 2025
Real-World Robot Control by Deep Active Inference With a Temporally Hierarchical World Model
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Kentaro Fujii
Shingo Murata
8
0
0
01 Dec 2025
Mofasa: A Step Change in Metal-Organic Framework Generation
Vaidotas Šimkus
Anders Christensen
Steven Bennett
Ian Johnson
Mark Neumann
James Gin
Jonathan Godwin
Benjamin Rhodes
AI4CE
60
0
0
01 Dec 2025
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models
Zhiheng Liu
Weiming Ren
Haozhe Liu
Zijian Zhou
S. Chen
...
Ping Luo
Wei Liu
Tao Xiang
Jonas Schult
Yuren Cong
72
0
0
01 Dec 2025
Q2D2: A Geometry-Aware Audio Codec Leveraging Two-Dimensional Quantization
Tal Shuster
Eliya Nachmani
84
0
0
01 Dec 2025
Deconstructing Generative Diversity: An Information Bottleneck Analysis of Discrete Latent Generative Models
Yudi Wu
Wenhao Zhao
Dianbo Liu
28
0
0
01 Dec 2025
Smol-GS: Compact Representations for Abstract 3D Gaussian Splatting
Haishan Wang
Mohammad Hassan Vali
Arno Solin
3DGS
108
0
0
30 Nov 2025
Neural Discrete Representation Learning for Sparse-View CBCT Reconstruction: From Algorithm Design to Prospective Multicenter Clinical Evaluation
H. Wang
Lei Chen
Wei-Hua Zhang
Linxia Wu
Yong Luo
...
Xuhua Duan
Lefei Zhang
Gao-Jun Teng
Bo Du
Huangxuan Zhao
16
0
0
30 Nov 2025
Quantized-Tinyllava: a new multimodal foundation model enables efficient split learning
J. Guo
Xin Luo
Jie Liu
48
0
0
28 Nov 2025
Visual Generation Tuning
Jiahao Guo
Sinan Du
J. Yao
Wenyu Liu
Bo Li
Haoxiang Cao
Kun Gai
C. Yuan
Kai Wu
Xinggang Wang
VLM
181
0
0
28 Nov 2025
VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction
Sinan Du
Jiahao Guo
Bo Li
Shuhao Cui
Zhengzhuo Xu
...
Yongxian Wei
Kun Gai
X. Wang
Kai Wu
C. Yuan
94
0
0
28 Nov 2025
REVEAL: Reasoning-enhanced Forensic Evidence Analysis for Explainable AI-generated Image Detection
Huangsen Cao
Qin Mei
Zhiheng Li
Yuxi Li
Ying Zhang
...
Zhimeng Zhang
Xin Ding
Yongwei Wang
Jing Lyu
Fei Wu
40
0
0
28 Nov 2025
PURE Codec: Progressive Unfolding of Residual Entropy for Speech Codec Learning
J. Shi
H. Wang
William Chen
Chenda Li
Wangyou Zhang
Jinchuan Tian
Shinji Watanabe
104
0
0
27 Nov 2025
BrepGPT: Autoregressive B-rep Generation with Voronoi Half-Patch
ACM Transactions on Graphics (TOG), 2025
Pu Li
Wenhao Zhang
Weize Quan
Biao Zhang
Peter Wonka
Dong-ming Yan
16
0
0
27 Nov 2025
DiverseVAR: Balancing Diversity and Quality of Next-Scale Visual Autoregressive Models
Mingue Park
Prin Phunyaphibarn
Phillip Y. Lee
Minhyuk Sung
100
0
0
26 Nov 2025
Progress by Pieces: Test-Time Scaling for Autoregressive Image Generation
Joonhyung Park
Hyeongwon Jang
Joowon Kim
Eunho Yang
VLM
100
0
0
26 Nov 2025
Harmonic-Percussive Disentangled Neural Audio Codec for Bandwidth Extension
Benoît Giniès
Xiaoyu Bie
Olivier Fercoq
Gaël Richard
140
0
0
26 Nov 2025
RAVQ-HoloNet: Rate-Adaptive Vector-Quantized Hologram Compression
Shima Rafiei
Zahra Nabizadeh Shahr Babak
S. Samavi
S. Shirani
116
0
0
26 Nov 2025
SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space
Zhenyi Shen
Junru Lu
Lin Gui
Jiazheng Li
Yulan He
D. Yin
Xing Sun
156
0
0
25 Nov 2025
Text-guided Controllable Diffusion for Realistic Camouflage Images Generation
Yuhang Qian
Haiyan Chen
Wentong Li
Ningzhong Liu
Jie Qin
DiffM
173
0
0
25 Nov 2025
Temporal-Visual Semantic Alignment: A Unified Architecture for Transferring Spatial Priors from Vision Models to Zero-Shot Temporal Tasks
Xiangkai Ma
Han Zhang
Wenzhong Li
Sanglu Lu
AI4TS
VGen
227
0
0
25 Nov 2025
Operationalizing Quantized Disentanglement
Vitória Barin Pacela
Kartik Ahuja
Simon Lacoste-Julien
Pascal Vincent
69
0
0
25 Nov 2025
AI/ML based Joint Source and Channel Coding for HARQ-ACK Payload
Akash S. Doshi
Pinar Sen
Kirill Ivanov
Wei Yang
June Namgoong
Runxin Wang
Rachel Wang
Taesang Yoo
Jing Jiang
Tingfang Ji
12
0
0
25 Nov 2025
SKEL-CF: Coarse-to-Fine Biomechanical Skeleton and Surface Mesh Recovery
Da Li
Jiping Jin
Xuanlong Yu
Wei Liu
Xiaodong Cun
Kai Chen
Rui Fan
Jiangang Kong
Xi Shen
3DH
358
0
0
25 Nov 2025
PRADA: Probability-Ratio-Based Attribution and Detection of Autoregressive-Generated Images
Simon Damm
Jonas Ricker
Henning Petzka
Asja Fischer
169
0
0
25 Nov 2025
DINO-Tok: Adapting DINO for Visual Tokenizers
Mingkai Jia
Mingxiao Li
Liaoyuan Fan
Tianxing Shi
Jiaxin Guo
...
Xiaoyang Guo
Xiao-Xiao Long
Qian Zhang
P. Tan
Wei Yin
ViT
152
0
0
25 Nov 2025
FVAR: Visual Autoregressive Modeling via Next Focus Prediction
Xiaofan Li
Chenming Wu
Yanpeng Sun
Jiaming Zhou
Delin Qu
Yansong Qu
Weihao Bo
Haibao Yu
Dingkang Liang
VGen
108
0
0
24 Nov 2025
Multiscale Vector-Quantized Variational Autoencoder for Endoscopic Image Synthesis
International Symposium on Telecommunications (IST), 2025
Dimitrios E. Diamantis
D. Iakovidis
MedIm
245
0
0
24 Nov 2025
Learning Massively Multitask World Models for Continuous Control
Nicklas Hansen
Hao Su
Xiaolong Wang
OffRL
CLL
LM&Ro
419
0
0
24 Nov 2025
fMRI-LM: Towards a Universal Foundation Model for Language-Aligned fMRI Understanding
Yuxiang Wei
Y. Zhang
Xi Xiao
Chengxuan Qian
Tianyang Wang
Vince D. Calhoun
64
0
0
24 Nov 2025
Robust Long-term Test-Time Adaptation for 3D Human Pose Estimation through Motion Discretization
Yilin Wen
Kechuan Dong
Yusuke Sugano
3DH
TTA
288
0
0
24 Nov 2025
TRIDENT: A Trimodal Cascade Generative Framework for Drug and RNA-Conditioned Cellular Morphology Synthesis
Rui Peng
Ziru Liu
Lingyuan Ye
Yuxing Lu
Boxin Shi
Jinzhuo Wang
70
0
0
23 Nov 2025
DELTA: Language Diffusion-based EEG-to-Text Architecture
Mingyu Jeon
Hyobin Kim
DiffM
16
0
0
22 Nov 2025
METIS: Multi-Source Egocentric Training for Integrated Dexterous Vision-Language-Action Model
Y. Fu
Ning Chen
Junkai Zhao
Shaozhe Shan
Guocai Yao
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
148
0
0
21 Nov 2025
Spanning Tree Autoregressive Visual Generation
Sangkyu Lee
Changho Lee
Janghoon Han
Hosung Song
Tackgeun You
Hwasup Lim
Stanley Jungkyu Choi
Honglak Lee
Youngjae Yu
180
0
0
21 Nov 2025
AMS-KV: Adaptive KV Caching in Multi-Scale Visual Autoregressive Transformers
Boxun Xu
Yu Wang
Zihu Wang
Peng Li
VLM
237
0
0
20 Nov 2025
LAOF: Robust Latent Action Learning with Optical Flow Constraints
Xizhou Bu
Jiexi Lyu
Fulei Sun
R. G. Yang
Zhiqiang Ma
Wei Li
60
0
0
20 Nov 2025
MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence
Liyuan Deng
Yunpeng Bai
Yongkang Dai
Xiaoshui Huang
Hongping Gan
Dongshuo Huang
Hao jiacheng
Yilei Shi
72
0
0
20 Nov 2025
Flow and Depth Assisted Video Prediction with Latent Transformer
Eliyas Suleyman
Paul Henderson
Eksan Firkat
Nicolas Pugeault
90
0
0
20 Nov 2025
Mem-MLP: Real-Time 3D Human Motion Generation from Sparse Inputs
Sinan Mutlu
Georgios Fotios Angelis
Savas Ozkan
Paul Wisbey
Anastasios Drosou
Mete Ozay
3DH
268
0
0
20 Nov 2025
Progressive Supernet Training for Efficient Visual Autoregressive Modeling
Xiaoyue Chen
Yuling Shi
Kaiyuan Li
Huandong Wang
Yong Li
Xiaodong Gu
Xinlei Chen
Mingbao Lin
52
0
0
20 Nov 2025
LiSTAR: Ray-Centric World Models for 4D LiDAR Sequences in Autonomous Driving
Pei Liu
Songtao Wang
Lang Zhang
Xingyue Peng
Yuandong Lyu
...
Weiliang Ma
Xueyang Zhang
Yifei Zhan
Xianpeng Lang
Jun Ma
SyDa
280
0
0
20 Nov 2025
Learning Human-Like RL Agents Through Trajectory Optimization With Action Quantization
Jian-Ting Guo
Yu-Cheng Chen
Ping-Chun Hsieh
Kuo-Hao Ho
Po-Wei Huang
Ti-Rong Wu
I-Chen Wu
76
0
0
19 Nov 2025
Taming Generative Synthetic Data for X-ray Prohibited Item Detection
Jialong Sun
Hongguang Zhu
Weizhe Liu
Yunda Sun
Renshuai Tao
Y. X. Wei
132
0
0
19 Nov 2025
B-Rep Distance Functions (BR-DF): How to Represent a B-Rep Model by Volumetric Distance Functions?
Fuyang Zhang
P. Jayaraman
Xiang Xu
Yasutaka Furukawa
108
0
0
18 Nov 2025
Towards Deploying VLA without Fine-Tuning: Plug-and-Play Inference-Time VLA Policy Steering via Embodied Evolutionary Diffusion
Zhuo Li
Junjia Liu
Zhipeng Dong
Tao Teng
Quentin Rouxel
D. Caldwell
Fei Chen
76
0
0
18 Nov 2025
StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model
Y. Yang
Zhi Cen
Sida Peng
Xiangwei Chen
Yifu Deng
Xinyu Zhu
Fan Jia
Xiaowei Zhou
Hujun Bao
DiffM
VGen
212
0
0
18 Nov 2025
Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech
Nam-Gyu Kim
45
0
0
18 Nov 2025