
Neural Discrete Representation Learning

2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL, SSL, OCL

Papers citing "Neural Discrete Representation Learning"

50 / 3,803 papers shown
AMS-KV: Adaptive KV Caching in Multi-Scale Visual Autoregressive Transformers
Boxun Xu
Yu Wang
Zihu Wang
Peng Li
VLM
280
0
0
20 Nov 2025
Progressive Supernet Training for Efficient Visual Autoregressive Modeling
Xiaoyue Chen
Yuling Shi
Kaiyuan Li
Huandong Wang
Yong Li
Xiaodong Gu
Xinlei Chen
Mingbao Lin
102
0
0
20 Nov 2025
MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence
Liyuan Deng
Yunpeng Bai
Yongkang Dai
Xiaoshui Huang
Hongping Gan
Dongshuo Huang
Hao jiacheng
Yilei Shi
92
0
0
20 Nov 2025
LiSTAR: Ray-Centric World Models for 4D LiDAR Sequences in Autonomous Driving
Pei Liu
Songtao Wang
Lang Zhang
Xingyue Peng
Yuandong Lyu
...
Weiliang Ma
Xueyang Zhang
Yifei Zhan
Xianpeng Lang
Jun Ma
SyDa
409
0
0
20 Nov 2025
Mem-MLP: Real-Time 3D Human Motion Generation from Sparse Inputs
Sinan Mutlu
Georgios Fotios Angelis
Savas Ozkan
Paul Wisbey
Anastasios Drosou
Mete Ozay
3DH
355
0
0
20 Nov 2025
Learning Human-Like RL Agents Through Trajectory Optimization With Action Quantization
Jian-Ting Guo
Yu-Cheng Chen
Ping-Chun Hsieh
Kuo-Hao Ho
Po-Wei Huang
Ti-Rong Wu
I-Chen Wu
92
0
0
19 Nov 2025
Taming Generative Synthetic Data for X-ray Prohibited Item Detection
Jialong Sun
Hongguang Zhu
Weizhe Liu
Yunda Sun
Renshuai Tao
Y. X. Wei
157
0
0
19 Nov 2025
Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech
Nam-Gyu Kim
81
0
0
18 Nov 2025
StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model
Y. Yang
Zhi Cen
Sida Peng
Xiangwei Chen
Yifu Deng
Xinyu Zhu
Fan Jia
Xiaowei Zhou
Hujun Bao
DiffM, VGen
324
0
0
18 Nov 2025
GeoSceneGraph: Geometric Scene Graph Diffusion Model for Text-guided 3D Indoor Scene Synthesis
Antonio Ruiz
Tao Wu
Andrew Melnik
Qing Cheng
X. Wang
Lu Liu
Yongliang Wang
Y. Zhang
Helge J. Ritter
DiffM, VGen
138
0
0
18 Nov 2025
Towards Deploying VLA without Fine-Tuning: Plug-and-Play Inference-Time VLA Policy Steering via Embodied Evolutionary Diffusion
Zhuo Li
Junjia Liu
Zhipeng Dong
Tao Teng
Quentin Rouxel
D. Caldwell
Fei Chen
88
0
0
18 Nov 2025
B-Rep Distance Functions (BR-DF): How to Represent a B-Rep Model by Volumetric Distance Functions?
Fuyang Zhang
P. Jayaraman
Xiang Xu
Yasutaka Furukawa
132
0
0
18 Nov 2025
Infinite-Story: A Training-Free Consistent Text-to-Image Generation
Jihun Park
Kyoungmin Lee
Jongmin Gim
Hyeonseo Jo
Minseok Oh
Wonhyeok Choi
K. Hwang
Jaeyeul Kim
Minwoo Choi
S. Im
111
0
1
17 Nov 2025
MergeDNA: Context-aware Genome Modeling with Dynamic Tokenization through Token Merging
Siyuan Li
Kai Yu
Anna Wang
Zicheng Liu
Chang Yu
Jingbo Zhou
Qirong Yang
Yucheng Guo
Xiaoming Zhang
Stan Z. Li
99
0
0
17 Nov 2025
CoordAR: One-Reference 6D Pose Estimation of Novel Objects via Autoregressive Coordinate Map Generation
Dexin Zuo
Ang Li
Wei Wang
Wenxian Yu
Danping Zou
3DPC
201
0
0
17 Nov 2025
ActVAR: Activating Mixtures of Weights and Tokens for Efficient Visual Autoregressive Generation
Kaixin Zhang
Ruiqing Yang
Yuan Zhang
Shan You
Tao Huang
VLM
138
0
0
17 Nov 2025
DAP: A Discrete-token Autoregressive Planner for Autonomous Driving
Bowen Ye
Bin Zhang
Hang Zhao
178
0
0
17 Nov 2025
InterMoE: Individual-Specific 3D Human Interaction Generation via Dynamic Temporal-Selective MoE
Lipeng Wang
Hongxing Fan
Haohua Chen
Zehuan Huang
Lu Sheng
94
0
0
17 Nov 2025
Seg-VAR: Image Segmentation with Visual Autoregressive Modeling
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
K. Wang
Hengshuang Zhao
134
0
0
16 Nov 2025
DEMIST: Decoupled Multi-stream latent diffusion for Quantitative Myelin Map Synthesis
Jiacheng Wang
Hao Li
Xing Yao
Ahmad Toubasi
Taegan Vinarsky
...
Chaoyang Jin
Richard Dortch
Junzhong Xu
F. Bagnato
I. Oguz
MedIm
175
0
0
16 Nov 2025
Through-Foliage Surface-Temperature Reconstruction for early Wildfire Detection
Mohamed Youssef
Lukas Brunner
Klaus Rundhammer
Gerald Czech
Oliver Bimber
80
1
0
16 Nov 2025
VLA-R: Vision-Language Action Retrieval toward Open-World End-to-End Autonomous Driving
Hyunki Seong
Seongwoo Moon
Hojin Ahn
Jehun Kang
David Hyunchul Shim
VLM
200
1
0
16 Nov 2025
DINO-Detect: A Simple yet Effective Framework for Blur-Robust AI-Generated Image Detection
Jialiang Shen
Jiyang Zheng
Yunqi Xue
Huajie Chen
Yu Yao
...
Ruiqi Liu
Helin Gong
Yang Yang
Dadong Wang
Tongliang Liu
239
0
0
16 Nov 2025
ReCast: Reliability-aware Codebook Assisted Lightweight Time Series Forecasting
Xiang Ma
Taihua Chen
Pengcheng Wang
Xuemei Li
Caiming Zhang
AI4TS
106
0
0
15 Nov 2025
LiDAR-GS++: Improving LiDAR Gaussian Reconstruction via Diffusion Priors
Qifeng Chen
Jiarun Liu
Rengan Xie
Tao Tang
Sicong Du
Yiru Zhao
Yuchi Huo
Sheng Yang
AI4CE
156
1
0
15 Nov 2025
Improved Masked Image Generation with Knowledge-Augmented Token Representations
Guotao Liang
Baoquan Zhang
Zhiyuan Wen
Zihao Han
Yunming Ye
120
0
0
15 Nov 2025
Point Cloud Quantization through Multimodal Prompting for 3D Understanding
Hongxuan Li
Wencheng Zhu
Huiying Xu
Xinzhong Zhu
Q. Hu
MQ, 3DPC
434
0
0
15 Nov 2025
MixAR: Mixture Autoregressive Image Generation
Jinyuan Hu
Jiayou Zhang
Shaobo Cui
Kun Zhang
Guangyi Chen
DiffM
157
0
0
15 Nov 2025
Towards Leveraging Sequential Structure in Animal Vocalizations
Eklavya Sarkar
Mathew Magimai.-Doss
146
0
0
13 Nov 2025
Optimizing Input of Denoising Score Matching is Biased Towards Higher Score Norm
Tongda Xu
DiffM
156
1
0
13 Nov 2025
Learning Binary Autoencoder-Based Codes with Progressive Training
Vukan Ninkovic
D. Vukobratović
73
0
0
12 Nov 2025
Large Sign Language Models: Toward 3D American Sign Language Translation
S. Zhang
Xiaoxiao He
Di Liu
Zhaoyang Xia
Mingyu Zhao
Chaowei Tan
Vivian Li
Bo Liu
Dimitris N. Metaxas
Mubbasir Kapadia
SLR
309
1
0
11 Nov 2025
From IDs to Semantics: A Generative Framework for Cross-Domain Recommendation with Adaptive Semantic Tokenization
Peiyu Hu
Wayne Lu
Jia Wang
137
2
0
11 Nov 2025
From Classical to Hybrid: A Practical Framework for Quantum-Enhanced Learning
Silvie Illésová
Tomáš Bezděk
Vojtěch Novák
Ivan Zelinka
Stefano Cacciatore
Martin Beseda
202
0
0
11 Nov 2025
Twist and Compute: The Cost of Pose in 3D Generative Diffusion
Kyle Fogarty
Jack Foster
Boqiao Zhang
Jing Yang
Cengiz Öztireli
DiffM
144
0
0
11 Nov 2025
Retrospective motion correction in MRI using disentangled embeddings
Qi Wang
Veronika Ecker
Marcel Früh
S. Gatidis
Thomas Kustner
92
0
0
11 Nov 2025
ViPRA: Video Prediction for Robot Actions
Sandeep Routray
Hengkai Pan
Unnat Jain
Shikhar Bahl
Deepak Pathak
236
2
0
11 Nov 2025
CAST-LUT: Tokenizer-Guided HSV Look-Up Tables for Purple Flare Removal
Pu Wang
ShuNing Sun
Jialang Lu
Chen Wu
Zhihua Zhang
Youshan Zhang
Chenggang Shan
Dianjie Lu
Guijuan Zhang
Zhuoran Zheng
121
0
0
10 Nov 2025
VAEVQ: Enhancing Discrete Visual Tokenization through Variational Modeling
Sicheng Yang
Xing Hu
Qiang Wu
Dawei Yang
196
0
0
10 Nov 2025
Enhancing Multimodal Misinformation Detection by Replaying the Whole Story from Image Modality Perspective
Bing Wang
Ximing Li
Y. Wang
Chaofan Li
Lin Yuanbo Wu
B. Wang
Shengsheng Wang
140
1
0
09 Nov 2025
Seq2Seq Models Reconstruct Visual Jigsaw Puzzles without Seeing Them
Gur Elkn
Ofir Itzhak Shahar
Ohad Ben-Shahar
VLM
92
0
0
09 Nov 2025
PhysCorr: Dual-Reward DPO for Physics-Constrained Text-to-Video Generation with Automated Preference Selection
Peiyao Wang
Weining Wang
Qi Li
EGVM, VGen
403
1
0
06 Nov 2025
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
Jinlai Liu
J. N. Han
B. Yan
Hui Wu
Fengda Zhu
Xing-Hui Wang
Yi Jiang
Bingyue Peng
Zehuan Yuan
VGen
264
3
0
06 Nov 2025
Unified Multimodal Diffusion Forcing for Forceful Manipulation
Zixuan Huang
Huaidian Hou
Dmitry Berenson
DiffM
97
0
0
06 Nov 2025
XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations
Shichao Fan
K. Wu
Zhengping Che
X. Wang
Di Wu
...
M. M. Li
Qingjie Liu
Shanghang Zhang
Min Wan
Yong Dai
250
1
0
04 Nov 2025
Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process
Jiayi Chen
Wenxuan Song
Pengxiang Ding
Ziyang Zhou
Han Zhao
Feilong Tang
Donglin Wang
Haoang Li
140
3
0
03 Nov 2025
ExplicitLM: Decoupling Knowledge from Parameters via Explicit Memory Banks
Chengzhang Yu
Zening Lu
Chenyang Zheng
C. Wang
Yiming Zhang
Zhanpeng Jin
KELM
143
0
0
03 Nov 2025
Wave-Particle (Continuous-Discrete) Dualistic Visual Tokenization for Unified Understanding and Generation
Yizhu Chen
Chen Ju
Z. Wang
Shuai Xiao
X. Chen
Jinsong Lan
Xiaoyong Zhu
Ying Chen
138
0
0
03 Nov 2025
MoSa: Motion Generation with Scalable Autoregressive Modeling
Mengyuan Liu
Sheng Yan
Y. Wang
Yingjie Li
Gui-Bin Bian
Hong Liu
188
2
0
03 Nov 2025
Embodied Cognition Augmented End2End Autonomous Driving
Ling Niu
Xiaoji Zheng
Han Wang
Chen Zheng
Ziyuan Yang
Bokui Chen
Jiangtao Gong
108
0
0
03 Nov 2025