Neural Speech Synthesis with Transformer Network

19 September 2018

Papers citing "Neural Speech Synthesis with Transformer Network"

37 / 37 papers shown

Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT Dongyang Dai Zhiyong Wu Shiyin Kang Xixin Wu Jia Jia Dan Su Dong Yu Helen Meng 95 26 0 03 Jan 2025
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder Yusuke Yasuda Tomoki Toda DiffM 79 8 0 16 Dec 2022
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language Yusuke Yasuda Tomoki Toda 121 10 0 16 Dec 2022
Speech Synthesis with Mixed Emotions Kun Zhou Berrak Sisman R. Rana B.W.Schuller Haizhou Li 87 47 0 11 Aug 2022
ESPnet2-TTS: Extending the Edge of TTS Research Tomoki Hayashi Ryuichi Yamamoto Takenori Yoshimura Peter Wu Jiatong Shi Takaaki Saeki Yooncheol Ju Yusuke Yasuda Shinnosuke Takamichi Shinji Watanabe VLM 85 63 0 15 Oct 2021
Controllable Context-aware Conversational Speech Synthesis Jian Cong Shan Yang Na Hu Guangzhi Li Lei Xie Jane Polak Scowcroft 73 30 0 21 Jun 2021
Phone-Level Prosody Modelling with GMM-Based MDN for Diverse and Controllable Speech Synthesis Chenpeng Du K. Yu 154 20 0 27 May 2021
Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram Shengkui Zhao Hao Wang Trung Hieu Nguyen B. Ma 51 20 0 03 Feb 2021
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans Shinji Watanabe Florian Boyer Xuankai Chang Pengcheng Guo Tomoki Hayashi ... Shigeki Karita Chenda Li Jing Shi Aswin Shanmugam Subramanian Wangyou Zhang VLM 108 38 0 23 Dec 2020
Few Shot Adaptive Normalization Driven Multi-Speaker Speech Synthesis Neeraj Kumar Srishti Goel Ankur Narang Brejesh Lall 68 5 0 14 Dec 2020
EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture Chenfeng Miao Shuang Liang Zhencheng Liu Minchuan Chen Jun Ma Shaojun Wang Jing Xiao 67 38 0 07 Dec 2020
Towards Natural Bilingual and Code-Switched Speech Synthesis Based on Mix of Monolingual Recordings and Cross-Lingual Voice Conversion Shengkui Zhao Trung Hieu Nguyen Hao Wang B. Ma 60 25 0 16 Oct 2020
Exploration of End-to-end Synthesisers for Zero Resource Speech Challenge 2020 Karthik Pandia D.S. Anusha Prakash M. M. H. Murthy 42 4 0 10 Sep 2020
A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems Phan Huy Kinh V. Phung Anh-Tuan Dinh Quoc Bao Nguyen 27 1 0 26 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem Tomoki Hayashi Shinji Watanabe 70 32 0 12 May 2020
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise Shan Yang Yuxuan Wang Lei Xie 66 10 0 28 Apr 2020
Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System V. Phung Phan Huy Kinh Anh-Tuan Dinh Quoc Bao Nguyen 35 5 0 20 Apr 2020
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models Jian Zhu 62 8 0 23 Dec 2019
Emotional Voice Conversion using Multitask Learning with Text-to-speech Tae-Ho Kim Sungjae Cho Shinkook Choi Sejik Park Soo-Young Lee 92 40 0 11 Nov 2019
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment Yusuke Yasuda Xin Wang Junichi Yamagishi 25 2 0 28 Oct 2019
Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks Kazuhiro Nakamura Shinji Takaki Kei Hashimoto Keiichiro Oura Yoshihiko Nankaku K. Tokuda 84 19 0 24 Oct 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit Tomoki Hayashi Ryuichi Yamamoto Katsuki Inoue Takenori Yoshimura Shinji Watanabe Tomoki Toda K. Takeda Yu Zhang Xu Tan VLM 93 205 0 24 Oct 2019
Attention Forcing for Sequence-to-sequence Model Training Qingyun Dou Yiting Lu Joshua Efiong Mark Gales 62 6 0 26 Sep 2019
DurIAN: Duration Informed Attention Network For Multimodal Synthesis Chengzhu Yu Heng Lu Na Hu Meng Yu Chao Weng ... Deyi Tuo Shiyin Kang Guangzhi Lei Jane Polak Scowcroft Dong Yu CVBM 85 118 0 04 Sep 2019
Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments Yusuke Yasuda Xin Wang Junichi Yamagishi 58 8 0 30 Aug 2019
Maximizing Mutual Information for Tacotron Peng Liu Xixin Wu Shiyin Kang Guangzhi Li Jane Polak Scowcroft Dong Yu 86 16 0 30 Aug 2019
Forward-Backward Decoding for Regularizing End-to-End TTS Yibin Zheng Xi Wang Lei He Shifeng Pan Frank Soong Zhengqi Wen J. Tao 48 13 0 18 Jul 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition Yi Ren Xu Tan Tao Qin Sheng Zhao Zhou Zhao Tie-Yan Liu 95 102 0 13 May 2019
The Zero Resource Speech Challenge 2019: TTS without T Ewan Dunbar Robin Algayres Julien Karadayi Mathieu Bernard Juan Benjumea ... Lucas Ondel A. Black Laurent Besacier S. Sakti Emmanuel Dupoux 94 117 0 25 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm Haohan Guo Frank Soong Lei He Lei Xie 95 47 0 09 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech Heiga Zen Viet Dang R. Clark Yu Zhang Ron J. Weiss Ye Jia Zhiwen Chen Yonghui Wu 164 959 0 05 Apr 2019
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data N. Prateek Mateusz Lajszczak Roberto Barra-Chicote Thomas Drugman Jaime Lorenzo-Trueba Thomas Merritt S. Ronanki Trevor Wood 84 30 0 04 Apr 2019
Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis Yanyao Bian Changbin Chen Yongguo Kang Zhenglin Pan 77 46 0 04 Apr 2019
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet Mingyang Zhang Xin Wang Fuming Fang Haizhou Li Junichi Yamagishi 70 50 0 29 Mar 2019
FPETS : Fully Parallel End-to-End Text-to-Speech System Dabiao Ma Zhiba Su Wenxuan Wang Yuhao Lu 58 6 0 12 Dec 2018
Learning latent representations for style control and transfer in end-to-end speech synthesis Ya-Jie Zhang Shifeng Pan Lei He Zhenhua Ling BDL SSL DRL 94 229 0 11 Dec 2018
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language Yusuke Yasuda Xin Wang Shinji Takaki Junichi Yamagishi 63 87 0 29 Oct 2018