Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora



1 April 2019

Xin Wang

Junichi Yamagishi

Nobuyuki Nishizawa


Papers citing "Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora"


A multilingual training strategy for low resource Text to Speech; Asma Amalas, Mounir Ghogho, Mohamed Chetouani, Rachid Oulad Haj Thami; 76 2 0; 02 Sep 2024
Building a Luganda Text-to-Speech Model From Crowdsourced Data; Sulaiman Kagumire, Andrew Katumba, J. Nakatumba‐Nabende, John Quinn; 33 1 0; 16 May 2024
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis; Kenichi Fujita, Atsushi Ando, Yusuke Ijima; 28 2 0; 11 Feb 2024
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion; Anand Kamble, Aniket Tathe, Suyash Kumbharkar, Atharva Bhandare, Anirban C. Mitra; 152 1 0; 24 Nov 2023
Combining speakers of multiple languages to improve quality of neural voices; Javier Latorre, Charlotte Bailleul, Tuuli H. Morrill, Alistair Conkie, Y. Stylianou; 64 8 0; 17 Aug 2021
A Survey on Neural Speech Synthesis; Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu; [AI4TS]; 133 359 0; 29 Jun 2021
AI based Presentation Creator With Customized Audio Content Delivery; Muvazima Mansoor, Srikanth Chandar, Ramamoorthy Srinath; 113 0 0; 27 Jun 2021
Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance; Hieu-Thi Luong, Junichi Yamagishi; 85 0 0; 25 Jun 2021
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS; Wen-Chin Huang, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda; [DRL]; 81 40 0; 06 Oct 2020
Efficient neural speech synthesis for low-resource languages through multilingual modeling; M. D. Korte, Jaebok Kim, E. Klabbers; 59 19 0; 20 Aug 2020
Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS; Rui Liu, Berrak Sisman, F. Bao, Guanglai Gao, Haizhou Li; 41 18 0; 11 Aug 2020
Multi-speaker Text-to-speech Synthesis Using Deep Gaussian Processes; Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari; 48 5 0; 07 Aug 2020
Expressive TTS Training with Frame and Style Reconstruction Loss; Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li; 112 73 0; 04 Aug 2020
Teacher-Student Training for Robust Tacotron-based TTS; Rui Liu, Berrak Sisman, Jingdong Li, F. Bao, Guanglai Gao, Haizhou Li; 109 38 0; 07 Nov 2019