ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech

19 July 2018

Wei Ping

Kainan Peng

Jitong Chen

Papers citing "ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech"

35 / 135 papers shown

| Title | Authors | Tag | | | | Date |
|---|---|---|---|---|---|---|
| MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis | Kundan Kumar, Rithesh Kumar, T. Boissière, L. Gestin, Wei Zhen Teoh, Jose M. R. Sotelo, A. D. Brébisson, Yoshua Bengio, Aaron Courville | GAN | 178 | 961 | 0 | 08 Oct 2019 |
| High Fidelity Speech Synthesis with Adversarial Networks | Mikolaj Binkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan | | 309 | 240 | 0 | 25 Sep 2019 |
| DurIAN: Duration Informed Attention Network For Multimodal Synthesis | Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, ..., Deyi Tuo, Shiyin Kang, Guangzhi Lei, Jane Polak Scowcroft, Dong Yu | CVBM | 83 | 118 | 0 | 04 Sep 2019 |
| Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis | Xin Wang, Junichi Yamagishi | | 66 | 32 | 0 | 27 Aug 2019 |
| Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation | Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda | | 49 | 16 | 0 | 01 Jul 2019 |
| End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training | Peng Wu, Zhenhua Ling, Li-Juan Liu, Yuan Jiang, Hong-Chuan Wu, Lirong Dai | | 92 | 72 | 0 | 26 Jun 2019 |
| A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis | Yang Ai, Zhenhua Ling | | 123 | 29 | 0 | 23 Jun 2019 |
| Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models | Wei Fang, Yu-An Chung, James R. Glass | | 61 | 27 | 0 | 17 Jun 2019 |
| MelNet: A Generative Model for Audio in the Frequency Domain | Sean Vasquez, M. Lewis | DiffM | 85 | 132 | 0 | 04 Jun 2019 |
| Discrete Flows: Invertible Generative Models of Discrete Data | Dustin Tran, Keyon Vafa, Kumar Krishna Agrawal, Laurent Dinh, Ben Poole | DRL | 164 | 117 | 0 | 24 May 2019 |
| Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems | Ohsung Kwon, Eunwoo Song, Jae-Min Kim, Hong-Goo Kang | | 48 | 4 | 0 | 21 May 2019 |
| Non-Autoregressive Neural Text-to-Speech | Kainan Peng, Wei Ping, Z. Song, Kexin Zhao | | 101 | 40 | 0 | 21 May 2019 |
| Almost Unsupervised Text to Speech and Automatic Speech Recognition | Yi Ren, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu | | 95 | 102 | 0 | 13 May 2019 |
| Neural source-filter waveform models for statistical parametric speech synthesis | Xin Wang, Shinji Takaki, Junichi Yamagishi | | 97 | 118 | 0 | 27 Apr 2019 |
| A New GAN-based End-to-End TTS Training Algorithm | Haohan Guo, Frank Soong, Lei He, Lei Xie | | 95 | 47 | 0 | 09 Apr 2019 |
| Probability density distillation with generative adversarial networks for high-quality parallel waveform generation | Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim | | 70 | 55 | 0 | 09 Apr 2019 |
| GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram | Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, P. Alku | | 76 | 18 | 0 | 08 Apr 2019 |
| Towards Generalized Speech Enhancement with Generative Adversarial Networks | Santiago Pascual, Joan Serrà, Antonio Bonafonte | GAN | 67 | 33 | 0 | 06 Apr 2019 |
| WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation | Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo | | 80 | 19 | 0 | 05 Apr 2019 |
| Unsupervised Polyglot Text To Speech | Eliya Nachmani, Lior Wolf | | 65 | 42 | 0 | 06 Feb 2019 |
| Feature reinforcement with word embedding and parsing information in neural TTS | Huaiping Ming, Lei He, Haohan Guo, Frank Soong | | 157 | 15 | 0 | 03 Jan 2019 |
| LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis | Min-Jae Hwang, Frank Soong, Fenglong Xie, Xi Wang, Hyeonjoo Kang, Hong-Goo Kang | | 67 | 21 | 0 | 29 Nov 2018 |
| Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective | Zhong-Qiu Wang, Ke Tan, DeLiang Wang | | 119 | 95 | 0 | 22 Nov 2018 |
| Representation Mixing for TTS Synthesis | Kyle Kastner, J. F. Santos, Yoshua Bengio, Aaron Courville | | 55 | 43 | 0 | 17 Nov 2018 |
| Towards achieving robust universal neural vocoding | Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote, Alexis Moinet, Vatsal Aggarwal | DRL | 149 | 19 | 0 | 15 Nov 2018 |
| FloWaveNet : A Generative Flow for Raw Audio | Sungwon Kim, Sang-gil Lee, Jongyoon Song, Jaehyeon Kim, Sungroh Yoon | | 133 | 169 | 0 | 06 Nov 2018 |
| WaveGlow: A Flow-based Generative Network for Speech Synthesis | R. Prenger, Rafael Valle, Bryan Catanzaro | | 174 | 1,036 | 0 | 31 Oct 2018 |
| Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks | Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, P. Alku | | 74 | 23 | 0 | 30 Oct 2018 |
| Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention | Bajibabu Bollepalli, Lauri Juvela, P. Alku | | 51 | 4 | 0 | 29 Oct 2018 |
| Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language | Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi | | 63 | 87 | 0 | 29 Oct 2018 |
| Neural source-filter-based waveform model for statistical parametric speech synthesis | Xin Wang, Shinji Takaki, Junichi Yamagishi | | 121 | 125 | 0 | 29 Oct 2018 |
| STFT spectral loss for training a neural speech waveform model | Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi | | 75 | 21 | 0 | 29 Oct 2018 |
| A Fully Time-domain Neural Model for Subband-based Speech Synthesizer | Azam Rabiee, Geonmin Kim, Tae-Ho Kim, Soo-Young Lee | | 18 | 1 | 0 | 12 Oct 2018 |
| Neural Speech Synthesis with Transformer Network | Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming-Yuan Liu, M. Zhou | | 85 | 102 | 0 | 19 Sep 2018 |
| Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis | Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, RJ Skerry-Ryan | | 87 | 117 | 0 | 30 Aug 2018 |