Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

12 May 2020

Papers citing "Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis"

34 / 84 papers shown

Adversarial Audio Synthesis with Complex-valued Polynomial Networks Yongtao Wu Grigorios G. Chrysos Volkan Cevher DiffM 171 4 0 14 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis Yinghao Aaron Li Cong Han N. Mesgarani 148 46 0 30 May 2022
Bridging the prosody GAP: Genetic Algorithm with People to efficiently sample emotional prosody Pol van Rijn Harin Lee Nori Jacoby 71 3 0 10 May 2022
Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss Efthymios Georgiou Kosmas Kritsis Georgios Paraskevopoulos Athanasios Katsamanis Vassilis Katsouros Alexandros Potamianos 156 3 0 28 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis Fan Wang Po-Chun Hsu Da-Rong Liu Hung-yi Lee 73 0 0 01 Apr 2022
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion Edresson Casanova C. Shulby Alexander Korolev Arnaldo Cândido Júnior A. S. Soares S. Aluísio M. Ponti 170 15 0 29 Mar 2022
Text-free non-parallel many-to-many voice conversion using normalising flows Thomas Merritt Abdelhamid Ezzerg Piotr Bilinski Magdalena Proszewska Kamil Pokora Roberto Barra-Chicote Daniel Korzekwa 133 15 0 15 Mar 2022
Distribution augmentation for low-resource expressive text-to-speech Mateusz Lajszczak Animesh Prasad Arent van Korlaar Bajibabu Bollepalli Antonio Bonafonte ... M. Nicolis Alexis Moinet Thomas Drugman Trevor Wood Elena Sokolova 89 8 0 13 Feb 2022
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation Shoule Wu Ziqiang Shi DiffM 557 9 0 29 Jan 2022
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks E. Hortal Rodrigo Brechard Alarcia GAN 53 2 0 06 Oct 2021
Integrated Speech and Gesture Synthesis Siyang Wang Simon Alexanderson Joakim Gustafson Jonas Beskow G. Henter Éva Székely 105 19 0 25 Aug 2021
One TTS Alignment To Rule Them All Rohan Badlani A. Lancucki Kevin J. Shih Rafael Valle Ming-Yu Liu Bryan Catanzaro 100 91 0 23 Aug 2021
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis Julian Zaïdi Hugo Seuté Benjamin van Niekerk M. Carbonneau 65 26 0 04 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing Zhaofeng Shi 77 10 0 01 Aug 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 187 382 0 29 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache René Peinl 52 0 0 11 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim Jungil Kong Juhee Son DRL 192 987 0 11 Jun 2021
Giving Commands to a Self-Driving Car: How to Deal with Uncertain Situations? Thierry Deruyttere Victor Milewski Marie-Francine Moens 90 15 0 08 Jun 2021
Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis Beáta Lőrincz Adriana Stan M. Giurgiu 58 6 0 03 Jun 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation Shoule Wu Ziqiang Shi DiffM 159 11 0 17 May 2021
Exploring emotional prototypes in a high dimensional TTS latent space Pol van Rijn Silvan Mertes Dominik Schiller Peter M. C. Harrison P. Larrouy-Maestri Elisabeth André Nori Jacoby 62 12 0 05 May 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 118 25 0 20 Apr 2021
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech Myeonghun Jeong Hyeongju Kim Sung Jun Cheon Byoung Jin Choi N. Kim DiffM 117 205 0 03 Apr 2021
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model Edresson Casanova C. Shulby Eren Golge Nicolas Müller F. S. Oliveira Arnaldo Cândido Júnior A. S. Soares S. Aluísio M. Ponti 132 112 0 02 Apr 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention Peng Liu Yuewen Cao Songxiang Liu Na Hu Guangzhi Li Chao Weng Jane Polak Scowcroft 121 23 0 12 Feb 2021
EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture Chenfeng Miao Shuang Liang Zhencheng Liu Minchuan Chen Jun Ma Shaojun Wang Jing Xiao 91 40 0 07 Dec 2020
Text-to-speech for the hearing impaired Josef Schlittenlacher T. Baer 41 0 0 03 Dec 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis Ron J. Weiss RJ Skerry-Ryan Eric Battenberg Soroosh Mariooryad Diederik P. Kingma 129 104 0 06 Nov 2020
Speech Synthesis and Control Using Differentiable DSP Giorgio Fabbro Vladimir Golkov Thomas Kemp Zorah Lähner 90 13 0 28 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis Zhifeng Kong Ming-Yu Liu Jiaji Huang Kexin Zhao Bryan Catanzaro DiffM BDL 454 1,574 0 21 Sep 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction Adrian Łańcucki 143 359 0 11 Jun 2020
End-to-End Adversarial Text-to-Speech Jeff Donahue Sander Dieleman Mikolaj Binkowski Erich Elsen Karen Simonyan 216 190 0 05 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search Jaehyeon Kim Sungwon Kim Jungil Kong Sungroh Yoon 180 523 0 22 May 2020
TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese Edresson Casanova A. Júnior C. Shulby F. S. Oliveira João Paulo Teixeira M. Ponti S. Aluísio 96 24 0 11 May 2020