Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder

14 October 2019

Papers citing "Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder"

Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations Xue Jiang Xiulian Peng Yuan Zhang Yan-Heng Lu SSL 83 0 0 15 Mar 2025
Learning Source Disentanglement in Neural Audio Codec Xiaoyu Bie Xubo Liu Gaël Richard 29 1 0 17 Sep 2024
OpenACE: An Open Benchmark for Evaluating Audio Coding Performance Jozef Coldenhoff Niclas Granqvist Milos Cernak 30 0 0 12 Sep 2024
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes Zhaohui Li Haitao Wang Xinghua Jiang 40 1 0 14 Aug 2023
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models Peike Li Bo-Yu Chen Yao Yao Yikai Wang Allen Wang Alex Jinpeng Wang MGen VLM DiffM 70 37 0 09 Aug 2023
Native Multi-Band Audio Coding within Hyper-Autoencoded Reconstruction Propagation Networks Darius Petermann Inseon Jang Minje Kim 16 1 0 14 Mar 2023
NESC: Robust Neural End-2-End Speech Coding with GANs N. Pia Kishan Gupta Srikanth Korse M. Multrus Guillaume Fuchs 33 15 0 07 Jul 2022
Cross-Scale Vector Quantization for Scalable Neural Speech Coding Xue Jiang Xiulian Peng Huaying Xue Yuan Zhang Yan Lu MQ 39 9 0 07 Jul 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis Karren D. Yang Dejan Marković Steven Krenn Vasu Agrawal Alexander Richard VGen 16 32 0 31 Mar 2022
Practical cognitive speech compression Reza Lotfidereshgi P. Gournay 32 2 0 08 Mar 2022
End-to-End Neural Speech Coding for Real-Time Communications Xue Jiang Xiulian Peng Chengyu Zheng Huaying Xue Yuan Zhang Yan Lu 26 27 0 24 Jan 2022
Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models Phil Chen Masha Itkina Ransalu Senanayake Mykel J. Kochenderfer 33 6 0 27 Oct 2021
Cognitive Coding of Speech Reza Lotfidereshgi P. Gournay 32 5 0 08 Oct 2021
HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding Darius Petermann Seungkwon Beack Minje Kim 22 14 0 22 Jul 2021
SoundStream: An End-to-End Neural Audio Codec Neil Zeghidour Alejandro Luebs Ahmed Omran Jan Skoglund Marco Tagliasacchi AI4TS 43 722 0 07 Jul 2021
Generative Speech Coding with Predictive Variance Regularization W. Kleijn Andrew Storus Michael Chinen Tom Denton Felicia S. C. Lim Alejandro Luebs Jan Skoglund Hengchin Yeh 21 66 0 18 Feb 2021
Enhancing into the codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders Jonah Casebeer Vinjai Vale Umut Isik J. Valin Ritwik Giri A. Krishnaswamy 54 18 0 12 Feb 2021
Psychoacoustic Calibration of Loss Functions for Efficient End-to-End Neural Audio Coding Kai Zhen Mi Suk Lee Jongmo Sung Seung-Wha Beack Minje Kim 32 21 0 31 Dec 2020
Deep Residual Mixture Models Perttu Hämäläinen Martin Trapp Tuure Saloheimo Arno Solin 28 8 0 22 Jun 2020
Efficient And Scalable Neural Residual Waveform Coding With Collaborative Quantization Kai Zhen Mi Suk Lee Jongmo Sung Seungkwon Beack Minje Kim 30 20 0 13 Feb 2020
Speech bandwidth extension with WaveNet Archit Gupta Brendan Shillingford Yannis Assael Thomas C. Walters 19 28 0 05 Jul 2019
Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding Kai Zhen Jongmo Sung Mi Suk Lee Seungkwon Beack Minje Kim 27 39 0 18 Jun 2019
Improving Opus Low Bit Rate Quality with Neural Speech Synthesis Jan Skoglund J. Valin 37 38 0 12 May 2019
A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet J. Valin Jan Skoglund 18 78 0 28 Mar 2019
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Ye Jia Yu Zhang Ron J. Weiss Quan Wang Jonathan Shen ... Z. Chen Patrick Nguyen Ruoming Pang Ignacio López Moreno Yonghui Wu 207 820 0 12 Jun 2018