Neural Discrete Representation Learning

2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
    BDL, SSL, OCL

Papers citing "Neural Discrete Representation Learning"

50 / 3,767 papers shown
Progress by Pieces: Test-Time Scaling for Autoregressive Image Generation
Joonhyung Park
Hyeongwon Jang
Joowon Kim
Eunho Yang
VLM
92
0
0
26 Nov 2025
Harmonic-Percussive Disentangled Neural Audio Codec for Bandwidth Extension
Benoît Giniès
Xiaoyu Bie
Olivier Fercoq
Gaël Richard
128
0
0
26 Nov 2025
DiverseVAR: Balancing Diversity and Quality of Next-Scale Visual Autoregressive Models
Mingue Park
Prin Phunyaphibarn
Phillip Y. Lee
Minhyuk Sung
88
0
0
26 Nov 2025
RAVQ-HoloNet: Rate-Adaptive Vector-Quantized Hologram Compression
Shima Rafiei
Zahra Nabizadeh Shahr Babak
S. Samavi
S. Shirani
112
0
0
26 Nov 2025
Operationalizing Quantized Disentanglement
Vitória Barin Pacela
Kartik Ahuja
Simon Lacoste-Julien
Pascal Vincent
69
0
0
25 Nov 2025
Temporal-Visual Semantic Alignment: A Unified Architecture for Transferring Spatial Priors from Vision Models to Zero-Shot Temporal Tasks
Xiangkai Ma
Han Zhang
Wenzhong Li
Sanglu Lu
AI4TS, VGen
211
0
0
25 Nov 2025
AI/ML based Joint Source and Channel Coding for HARQ-ACK Payload
Akash S. Doshi
Pinar Sen
Kirill Ivanov
Wei Yang
June Namgoong
Runxin Wang
Rachel Wang
Taesang Yoo
Jing Jiang
Tingfang Ji
8
0
0
25 Nov 2025
SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space
Zhenyi Shen
Junru Lu
Lin Gui
Jiazheng Li
Yulan He
D. Yin
Xing Sun
112
0
0
25 Nov 2025
Text-guided Controllable Diffusion for Realistic Camouflage Images Generation
Yuhang Qian
Haiyan Chen
Wentong Li
Ningzhong Liu
Jie Qin
DiffM
161
0
0
25 Nov 2025
PRADA: Probability-Ratio-Based Attribution and Detection of Autoregressive-Generated Images
Simon Damm
Jonas Ricker
Henning Petzka
Asja Fischer
165
0
0
25 Nov 2025
DINO-Tok: Adapting DINO for Visual Tokenizers
Mingkai Jia
Mingxiao Li
Liaoyuan Fan
Tianxing Shi
Jiaxin Guo
...
Xiaoyang Guo
Xiao-Xiao Long
Qian Zhang
P. Tan
Wei Yin
ViT
144
0
0
25 Nov 2025
FVAR: Visual Autoregressive Modeling via Next Focus Prediction
Xiaofan Li
Chenming Wu
Yanpeng Sun
Jiaming Zhou
Delin Qu
Yansong Qu
Weihao Bo
Haibao Yu
Dingkang Liang
VGen
104
0
0
24 Nov 2025
Multiscale Vector-Quantized Variational Autoencoder for Endoscopic Image Synthesis
International Symposium on Telecommunications (IST), 2025
Dimitrios E. Diamantis
D. Iakovidis
MedIm
229
0
0
24 Nov 2025
Robust Long-term Test-Time Adaptation for 3D Human Pose Estimation through Motion Discretization
Yilin Wen
Kechuan Dong
Yusuke Sugano
3DH, TTA
284
0
0
24 Nov 2025
TRIDENT: A Trimodal Cascade Generative Framework for Drug and RNA-Conditioned Cellular Morphology Synthesis
Rui Peng
Ziru Liu
Lingyuan Ye
Yuxing Lu
Boxin Shi
Jinzhuo Wang
58
0
0
23 Nov 2025
Spanning Tree Autoregressive Visual Generation
Sangkyu Lee
Changho Lee
Janghoon Han
Hosung Song
Tackgeun You
Hwasup Lim
Stanley Jungkyu Choi
Honglak Lee
Youngjae Yu
176
0
0
21 Nov 2025
METIS: Multi-Source Egocentric Training for Integrated Dexterous Vision-Language-Action Model
Y. Fu
Ning Chen
Junkai Zhao
Shaozhe Shan
Guocai Yao
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
144
0
0
21 Nov 2025
Mem-MLP: Real-Time 3D Human Motion Generation from Sparse Inputs
Sinan Mutlu
Georgios Fotios Angelis
Savas Ozkan
Paul Wisbey
Anastasios Drosou
Mete Ozay
3DH
248
0
0
20 Nov 2025
Flow and Depth Assisted Video Prediction with Latent Transformer
Eliyas Suleyman
Paul Henderson
Eksan Firkat
Nicolas Pugeault
86
0
0
20 Nov 2025
AMS-KV: Adaptive KV Caching in Multi-Scale Visual Autoregressive Transformers
Boxun Xu
Yu Wang
Zihu Wang
Peng Li
VLM
221
0
0
20 Nov 2025
MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence
Liyuan Deng
Yunpeng Bai
Yongkang Dai
Xiaoshui Huang
Hongping Gan
Dongshuo Huang
Hao jiacheng
Yilei Shi
68
0
0
20 Nov 2025
LAOF: Robust Latent Action Learning with Optical Flow Constraints
Xizhou Bu
Jiexi Lyu
Fulei Sun
R. G. Yang
Zhiqiang Ma
Wei Li
56
0
0
20 Nov 2025
LiSTAR: Ray-Centric World Models for 4D LiDAR Sequences in Autonomous Driving
Pei Liu
Songtao Wang
Lang Zhang
Xingyue Peng
Yuandong Lyu
...
Weiliang Ma
Xueyang Zhang
Yifei Zhan
Xianpeng Lang
Jun Ma
SyDa
256
0
0
20 Nov 2025
Progressive Supernet Training for Efficient Visual Autoregressive Modeling
Xiaoyue Chen
Yuling Shi
Kaiyuan Li
Huandong Wang
Yong Li
Xiaodong Gu
Xinlei Chen
Mingbao Lin
52
0
0
20 Nov 2025
Taming Generative Synthetic Data for X-ray Prohibited Item Detection
Jialong Sun
Hongguang Zhu
Weizhe Liu
Yunda Sun
Renshuai Tao
Y. X. Wei
124
0
0
19 Nov 2025
Learning Human-Like RL Agents Through Trajectory Optimization With Action Quantization
Jian-Ting Guo
Yu-Cheng Chen
Ping-Chun Hsieh
Kuo-Hao Ho
Po-Wei Huang
Ti-Rong Wu
I-Chen Wu
76
0
0
19 Nov 2025
Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech
Nam-Gyu Kim
45
0
0
18 Nov 2025
B-Rep Distance Functions (BR-DF): How to Represent a B-Rep Model by Volumetric Distance Functions?
Fuyang Zhang
P. Jayaraman
Xiang Xu
Yasutaka Furukawa
108
0
0
18 Nov 2025
Towards Deploying VLA without Fine-Tuning: Plug-and-Play Inference-Time VLA Policy Steering via Embodied Evolutionary Diffusion
Zhuo Li
Junjia Liu
Zhipeng Dong
Tao Teng
Quentin Rouxel
D. Caldwell
Fei Chen
72
0
0
18 Nov 2025
GeoSceneGraph: Geometric Scene Graph Diffusion Model for Text-guided 3D Indoor Scene Synthesis
Antonio Ruiz
Tao Wu
Andrew Melnik
Qing Cheng
X. Wang
Lu Liu
Yongliang Wang
Y. Zhang
Helge J. Ritter
DiffM, VGen
104
0
0
18 Nov 2025
StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model
Y. Yang
Zhi Cen
Sida Peng
Xiangwei Chen
Yifu Deng
Xinyu Zhu
Fan Jia
Xiaowei Zhou
Hujun Bao
DiffM, VGen
196
0
0
18 Nov 2025
CoordAR: One-Reference 6D Pose Estimation of Novel Objects via Autoregressive Coordinate Map Generation
Dexin Zuo
Ang Li
Wei Wang
Wenxian Yu
Danping Zou
3DPC
148
0
0
17 Nov 2025
Infinite-Story: A Training-Free Consistent Text-to-Image Generation
Jihun Park
Kyoungmin Lee
Jongmin Gim
Hyeonseo Jo
Minseok Oh
Wonhyeok Choi
K. Hwang
Jaeyeul Kim
Minwoo Choi
S. Im
99
0
1
17 Nov 2025
MergeDNA: Context-aware Genome Modeling with Dynamic Tokenization through Token Merging
Siyuan Li
Kai Yu
Anna Wang
Zicheng Liu
Chang Yu
Jingbo Zhou
Qirong Yang
Yucheng Guo
Xiaoming Zhang
Stan Z. Li
64
0
0
17 Nov 2025
DAP: A Discrete-token Autoregressive Planner for Autonomous Driving
Bowen Ye
Bin Zhang
Hang Zhao
154
0
0
17 Nov 2025
InterMoE: Individual-Specific 3D Human Interaction Generation via Dynamic Temporal-Selective MoE
Lipeng Wang
Hongxing Fan
Haohua Chen
Zehuan Huang
Lu Sheng
46
0
0
17 Nov 2025
ActVAR: Activating Mixtures of Weights and Tokens for Efficient Visual Autoregressive Generation
Kaixin Zhang
Ruiqing Yang
Yuan Zhang
Shan You
Tao Huang
VLM
99
0
0
17 Nov 2025
DINO-Detect: A Simple yet Effective Framework for Blur-Robust AI-Generated Image Detection
Jialiang Shen
Jiyang Zheng
Yunqi Xue
Huajie Chen
Yu Yao
...
Ruiqi Liu
Helin Gong
Yang Yang
Dadong Wang
Tongliang Liu
175
0
0
16 Nov 2025
VLA-R: Vision-Language Action Retrieval toward Open-World End-to-End Autonomous Driving
Hyunki Seong
Seongwoo Moon
Hojin Ahn
Jehun Kang
David Hyunchul Shim
VLM
156
0
0
16 Nov 2025
Through-Foliage Surface-Temperature Reconstruction for early Wildfire Detection
Mohamed Youssef
Lukas Brunner
Klaus Rundhammer
Gerald Czech
Oliver Bimber
68
0
0
16 Nov 2025
Seg-VAR: Image Segmentation with Visual Autoregressive Modeling
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
K. Wang
Hengshuang Zhao
96
0
0
16 Nov 2025
DEMIST: Decoupled Multi-stream latent diffusion for Quantitative Myelin Map Synthesis
Jiacheng Wang
Hao Li
Xing Yao
Ahmad Toubasi
Taegan Vinarsky
...
Chaoyang Jin
Richard Dortch
Junzhong Xu
F. Bagnato
I. Oguz
MedIm
139
0
0
16 Nov 2025
LiDAR-GS++:Improving LiDAR Gaussian Reconstruction via Diffusion Priors
Qifeng Chen
Jiarun Liu
Rengan Xie
Tao Tang
Sicong Du
Yiru Zhao
Yuchi Huo
Sheng Yang
AI4CE
122
0
0
15 Nov 2025
MixAR: Mixture Autoregressive Image Generation
Jinyuan Hu
Jiayou Zhang
Shaobo Cui
Kun Zhang
Guangyi Chen
DiffM
116
0
0
15 Nov 2025
ReCast: Reliability-aware Codebook Assisted Lightweight Time Series Forecasting
Xiang Ma
Taihua Chen
Pengcheng Wang
Xuemei Li
Caiming Zhang
AI4TS
50
0
0
15 Nov 2025
Improved Masked Image Generation with Knowledge-Augmented Token Representations
Guotao Liang
Baoquan Zhang
Zhiyuan Wen
Zihao Han
Yunming Ye
92
0
0
15 Nov 2025
Point Cloud Quantization through Multimodal Prompting for 3D Understanding
Hongxuan Li
Wencheng Zhu
Huiying Xu
Xinzhong Zhu
Pengfei Zhu
MQ, 3DPC
349
0
0
15 Nov 2025
Optimizing Input of Denoising Score Matching is Biased Towards Higher Score Norm
Tongda Xu
DiffM
125
1
0
13 Nov 2025
Towards Leveraging Sequential Structure in Animal Vocalizations
Eklavya Sarkar
Mathew Magimai.-Doss
82
0
0
13 Nov 2025
ViPRA: Video Prediction for Robot Actions
Sandeep Routray
Hengkai Pan
Unnat Jain
Shikhar Bahl
Deepak Pathak
190
0
0
11 Nov 2025