In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data

4 April 2019

Mateusz Lajszczak

Roberto Barra-Chicote

Jaime Lorenzo-Trueba

Trevor Wood

Papers citing "In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data"

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Mateusz Lajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, ..., Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman
12 Feb 2024 [LM&MA]

On granularity of prosodic representations in expressive text-to-speech
Mikolaj Babianski, Kamil Pokora, Raahil Shah, Rafał Sienkiewicz, Daniel Korzekwa, V. Klimkov
26 Jan 2023

Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody
Peter Makarov, Ammar Abbas, Mateusz Lajszczak, Arnaud Joly, S. Karlapati, Alexis Moinet, Thomas Drugman, Penny Karanasou
29 Jun 2022

Prosodic Alignment for off-screen automatic dubbing
Yogesh Virkar, Marcello Federico, Robert Enyedi, Roberto Barra-Chicote
06 Apr 2022

Discrete Acoustic Space for an Efficient Sampling in Neural Text-To-Speech
Mu Li, Jonas Rohnke, Antonio Bonafonte, Mateusz Lajszczak, Trevor Wood
24 Oct 2021 [DRL]

Machine Translation Verbosity Control for Automatic Dubbing
Surafel Melaku Lakew, Marcello Federico, Yue Wang, Cuong Hoang, Yogesh Virkar, Roberto Barra-Chicote, Robert Enyedi
08 Oct 2021

Enhancing audio quality for expressive Neural Text-to-Speech
Abdelhamid Ezzerg, Adam Gabryś, Bartosz Putrycz, Daniel Korzekwa, Daniel Sáez-Trigueros, David McHardy, Kamil Pokora, Jakub Lachowicz, Jaime Lorenzo-Trueba, V. Klimkov
13 Aug 2021

SynthASR: Unlocking Synthetic Data for Speech Recognition
A. Fazel, Wei Yang, Yulan Liu, Roberto Barra-Chicote, Yi Meng, Roland Maas, J. Droppo
14 Jun 2021 [SyDa]

Parallel WaveNet conditioned on VAE latent vectors
Jonas Rohnke, Thomas Merritt, Jaime Lorenzo-Trueba, Adam Gabryś, Vatsal Aggarwal, Alexis Moinet, Roberto Barra-Chicote
17 Dec 2020

Bootstrap an end-to-end ASR system by multilingual training, transfer learning, text-to-text mapping and synthetic audio
Manuel Giollo, Deniz Gunceler, Yulan Liu, D. Willett
25 Nov 2020

BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization
Henry B. Moss, Vatsal Aggarwal, N. Prateek, Javier I. González, Roberto Barra-Chicote
04 Feb 2020 [BDL]

From Speech-to-Speech Translation to Automatic Dubbing
Marcello Federico, Robert Enyedi, Roberto Barra-Chicote, Ritwik Giri, Umut Isik, A. Krishnaswamy, Hassan Sawaf
19 Jan 2020

Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection
Shubhi Tyagi, M. Nicolis, Jonas Rohnke, Thomas Drugman, Jaime Lorenzo-Trueba
02 Dec 2019

Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech
Vatsal Aggarwal, Marius Cotescu, N. Prateek, Jaime Lorenzo-Trueba, Roberto Barra-Chicote
28 Nov 2019

Fine-grained robust prosody transfer for single-speaker neural text-to-speech
V. Klimkov, S. Ronanki, Jonas Rohnke, Thomas Drugman
04 Jul 2019 [AI4TS]