Title
Introduction to Voice Presentation Attack Detection and Recent Advances Md. Sahidullah Héctor Delgado Massimiliano Todisco Tomi Kinnunen Nicholas W. D. Evans Junichi Yamagishi Kong-Aik Lee AAML 83 75 0 04 Jan 2019
Feature reinforcement with word embedding and parsing information in neural TTS Huaiping Ming Lei He Haohan Guo Frank Soong 153 15 0 03 Jan 2019
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice Yan Deng Lei He Frank Soong 114 29 0 13 Dec 2018
FPETS : Fully Parallel End-to-End Text-to-Speech System Dabiao Ma Zhiba Su Wenxuan Wang Yuhao Lu 58 6 0 12 Dec 2018
Learning latent representations for style control and transfer in end-to-end speech synthesis Ya-Jie Zhang Shifeng Pan Lei He Zhenhua Ling BDL SSL DRL 83 229 0 11 Dec 2018
Generative Adversarial Network based Speaker Adaptation for High Fidelity WaveNet Vocoder Qiao Tian Bing Yang Shan Liu GAN 55 9 0 06 Dec 2018
LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis Min-Jae Hwang Frank Soong Fenglong Xie Xi Wang Hyeonjoo Kang Hong-Goo Kang 51 21 0 29 Nov 2018
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion Wen-Chin Huang Yi-Chiao Wu Hsin-Te Hwang Patrick Lumban Tobing Tomoki Hayashi Kazuhiro Kobayashi Tomoki Toda Yu Tsao H. Wang 61 20 0 27 Nov 2018
Learning pronunciation from a foreign language in speech synthesis networks Younggun Lee Suwon Shon Taesu Kim 58 28 0 23 Nov 2018
TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer Sicong Huang Qiyang Li Cem Anil Xuchan Bao Sageev Oore Roger C. Grosse 92 98 0 22 Nov 2018
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes Yue Liu Yu Zhang Tara N. Sainath Yonghui Wu William Chan AuLLM 79 131 0 22 Nov 2018
The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation Kai Chen Weilin Zhang Shlomo Dubnov Gus Xia Wei Li MGen 39 5 0 20 Nov 2018
Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision Jing-Xuan Zhang Zhenhua Ling Yuan Jiang Li-Juan Liu Chen Liang Lirong Dai 80 30 0 20 Nov 2018
Representation Mixing for TTS Synthesis Kyle Kastner J. F. Santos Yoshua Bengio Aaron Courville 55 43 0 17 Nov 2018
Generating Albums with SampleRNN to Imitate Metal, Rock, and Punk Bands CJ Carr Zack Zukowski MGen 35 20 0 16 Nov 2018
Effect of data reduction on sequence-to-sequence neural TTS Javier Latorre Jakub Lachowicz Jaime Lorenzo-Trueba Thomas Merritt Thomas Drugman S. Ronanki Viacheslav Klimkov 90 59 0 15 Nov 2018
Comprehensive evaluation of statistical speech waveform synthesis Thomas Merritt Bartosz Putrycz Adam Nadolski Tianjun Ye Daniel Korzekwa ... Alexis Moinet A. Breen Rafal Kuklinski N. Strom Roberto Barra-Chicote 51 18 0 15 Nov 2018
Towards achieving robust universal neural vocoding Jaime Lorenzo-Trueba Thomas Drugman Javier Latorre Thomas Merritt Bartosz Putrycz Roberto Barra-Chicote Alexis Moinet Vatsal Aggarwal DRL 135 19 0 15 Nov 2018
PerformanceNet: Score-to-Audio Music Generation with Multi-Band Convolutional Residual Network Bryan Wang Yi-Hsuan Yang 71 38 0 11 Nov 2018
ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems Eunwoo Song Kyungguen Byun Hong-Goo Kang 75 29 0 09 Nov 2018
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms Kou Tanaka Hirokazu Kameoka Takuhiro Kaneko Nobukatsu Hojo 72 112 0 09 Nov 2018
Speaker-adaptive neural vocoders for parametric speech synthesis systems Eunwoo Song Xiang Yu Erik Cambria Jagath Rajapakse 49 3 0 08 Nov 2018
Reconstructing Speech Stimuli From Human Auditory Cortex Activity Using a WaveNet Approach Ran Wang Yao Wang A. Flinker 29 7 0 06 Nov 2018
FloWaveNet : A Generative Flow for Raw Audio Sungwon Kim Sang-gil Lee Jongyoon Song Jaehyeon Kim Sungroh Yoon 116 169 0 06 Nov 2018
Robust and fine-grained prosody control of end-to-end speech synthesis Younggun Lee Jonathan Le Roux 88 147 0 06 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation Ye Jia Melvin Johnson Wolfgang Macherey Ron J. Weiss Yuan Cao Chung-Cheng Chiu Naveen Ari Stella Laurenzo Yonghui Wu 98 163 0 05 Nov 2018
ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion Hirokazu Kameoka Kou Tanaka Damian Kwaśny Takuhiro Kaneko Nobukatsu Hojo 92 64 0 05 Nov 2018
Investigating context features hidden in End-to-End TTS Kohki Mametani T. Kato Seiichi Yamamoto 40 9 0 04 Nov 2018
Cycle-consistency training for end-to-end speech recognition Takaaki Hori Ramón Fernández Astudillo Tomoki Hayashi Yu Zhang Shinji Watanabe Jonathan Le Roux 97 87 0 02 Nov 2018
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation Jason Chun Lok Li R. Gadde Boris Ginsburg Vitaly Lavrukhin 63 55 0 02 Nov 2018
Neural Music Synthesis for Flexible Timbre Control Jong Wook Kim Rachel M. Bittner Aparna Kumar J. P. Bello 65 39 0 01 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis R. Prenger Rafael Valle Bryan Catanzaro 174 1,036 0 31 Oct 2018
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks Lauri Juvela Bajibabu Bollepalli Junichi Yamagishi P. Alku 64 23 0 30 Oct 2018
End-to-end music source separation: is it possible in the waveform domain? Francesc Lluís Jordi Pons Xavier Serra 76 73 0 29 Oct 2018
Audio inpainting of music by means of neural networks Andrés Marafioti Nicki Holighaus P. Majdak Nathanael Perraudin 75 18 0 29 Oct 2018
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention Bajibabu Bollepalli Lauri Juvela P. Alku 51 4 0 29 Oct 2018
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language Yusuke Yasuda Xin Wang Shinji Takaki Junichi Yamagishi 63 87 0 29 Oct 2018
Neural source-filter-based waveform model for statistical parametric speech synthesis Xin Wang Shinji Takaki Junichi Yamagishi 99 125 0 29 Oct 2018
STFT spectral loss for training a neural speech waveform model Shinji Takaki Toru Nakashika Xin Wang Junichi Yamagishi 75 21 0 29 Oct 2018
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction J. Valin Jan Skoglund 74 451 0 28 Oct 2018
Reducing over-smoothness in speech synthesis using Generative Adversarial Networks Leyuan Sheng Evgeny Nikolaevich Pavlovskiy GAN 55 9 0 25 Oct 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis Wei-Ning Hsu Yu Zhang Ron J. Weiss Heiga Zen Yonghui Wu ... Ye Jia Zhiwen Chen Jonathan Shen Patrick Nguyen Ruoming Pang BDL 79 276 0 16 Oct 2018
Sequence-to-Sequence Acoustic Modeling for Voice Conversion Jing-Xuan Zhang Zhenhua Ling Li-Juan Liu Yuan Jiang Lirong Dai 82 130 0 16 Oct 2018
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer Azam Rabiee Geonmin Kim Tae-Ho Kim Soo-Young Lee 18 1 0 12 Oct 2018
Conditional WaveGAN Chae Young Lee Anoop Toffy G. Jung W. Han DiffM 46 21 0 27 Sep 2018
Neural Speech Synthesis with Transformer Network Naihan Li Shujie Liu Yanqing Liu Sheng Zhao Ming-Yuan Liu M. Zhou 72 102 0 19 Sep 2018
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis Yu-An Chung Yuxuan Wang Wei-Ning Hsu Yu Zhang RJ Skerry-Ryan 87 117 0 30 Aug 2018
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks Sercan O. Arik Heewoo Jun G. Diamos 76 108 0 20 Aug 2018
Multimodal speech synthesis architecture for unsupervised speaker adaptation Hieu-Thi Luong Junichi Yamagishi 65 10 0 20 Aug 2018
Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects Hieu-Thi Luong Xin Wang Junichi Yamagishi Nobuyuki Nishizawa 51 16 0 02 Aug 2018