FastPitch: Parallel Text-to-speech with Pitch Prediction

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

11 June 2020

Papers citing "FastPitch: Parallel Text-to-speech with Pitch Prediction"

33 / 183 papers shown

Title
ESPnet2-TTS: Extending the Edge of TTS Research Tomoki Hayashi Ryuichi Yamamoto Takenori Yoshimura Peter Wu Jiatong Shi Takaaki Saeki Yooncheol Ju Yusuke Yasuda Shinnosuke Takamichi Shinji Watanabe VLM 109 68 0 15 Oct 2021
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech Haoyue Zhan Xinyuan Yu Haitong Zhang Yang Zhang Yue Lin 86 5 0 14 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning Paarth Neekhara Jason Chun Lok Li Boris Ginsburg 172 17 0 12 Oct 2021
EdiTTS: Score-based Editing for Controllable Text-to-Speech Jaesung Tae Hyeongju Kim Taesu Kim DiffM 317 42 0 06 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech Yi Ren Jinglin Liu Zhou Zhao 204 86 0 30 Sep 2021
Text-Free Prosody-Aware Generative Spoken Language Modeling Eugene Kharitonov Ann Lee Adam Polyak Yossi Adi Jade Copet ... Tu Nguyen M. Rivière Abdel-rahman Mohamed Emmanuel Dupoux Wei-Ning Hsu 158 133 0 07 Sep 2021
One TTS Alignment To Rule Them All Rohan Badlani A. Lancucki Kevin J. Shih Rafael Valle Ming-Yu Liu Bryan Catanzaro 115 93 0 23 Aug 2021
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis Julian Zaïdi Hugo Seuté Benjamin van Niekerk M. Carbonneau 92 26 0 04 Aug 2021
Creation and Detection of German Voice Deepfakes Vanessa Barnekow Dominik Binder Niclas Kromrey Pascal Munaretto A. Schaad Felix Schmieder 36 3 0 02 Aug 2021
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI Joanna Rownicka Nils Messerschmidt A. Tripiana Volodymyr Gromoglasov Timo P. Kunz 30 0 0 21 Jul 2021
Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm Elijah Gutierrez Pilar Oplustil Gallegos Catherine Lai 85 4 0 06 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 195 389 0 29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis Jinhyeok Yang Jaesung Bae Taejun Bak Young-Ik Kim Hoon-Young Cho 150 40 0 29 Jun 2021
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis Taejun Bak Jaesung Bae Hanbin Bae Young-Ik Kim Hoon-Young Cho 152 18 0 29 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control M. Kang Sungjae Kim Injung Kim 116 4 0 21 Jun 2021
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis D. Mohan Qinmin Hu Tian Huey Teh Alexandra Torresquintero C. Wallis Marlene Staib Lorenzo Foglianti Jiameng Gao Simon King 73 19 0 15 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation Won Jang D. Lim Jaesam Yoon Bongwan Kim Juntae Kim 167 154 0 15 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache René Peinl 68 0 0 11 Jun 2021
Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation Dong Min Dong Bok Lee Eunho Yang Sung Ju Hwang 234 192 0 06 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 126 27 0 20 Apr 2021
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction Stanislav Beliaev Boris Ginsburg 77 9 0 16 Apr 2021
Non-autoregressive sequence-to-sequence voice conversion Tomoki Hayashi Wen-Chin Huang Kazuhiro Kobayashi Tomoki Toda 72 25 0 14 Apr 2021
Estimating articulatory movements in speech production with transformer networks Sathvik Udupa Anwesha Roy Abhayjeet Singh Aravind Illa P. Ghosh 100 16 0 11 Apr 2021
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech Myeonghun Jeong Hyeongju Kim Sung Jun Cheon Byoung Jin Choi N. Kim DiffM 125 208 0 03 Apr 2021
Context-Aware Prosody Correction for Text-Based Speech Editing Max Morrison Lucas Rencker Zeyu Jin Nicholas J. Bryan Juan-Pablo Caceres Bryan Pardo 150 33 0 16 Feb 2021
A Survey on Deep Reinforcement Learning for Audio-Based Applications S. Latif Heriberto Cuayáhuitl Farrukh Pervez Fahad Shamshad Hafiz Shehbaz Ali Johan Sulaeman OffRL 183 79 0 01 Jan 2021
Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains Won Jang D. Lim Jaesam Yoon 131 37 0 19 Nov 2020
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis C. Chien Hung-yi Lee 119 40 0 12 Nov 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025. Ryuichi Yamamoto Eunwoo Song Min-Jae Hwang Jae-Min Kim 99 19 0 27 Oct 2020
Recent Developments on ESPnet Toolkit Boosted by Conformer. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025. Pengcheng Guo Florian Boyer Xuankai Chang Tomoki Hayashi Yosuke Higuchi ... Jing Shi Shinji Watanabe Kun Wei Wangyou Zhang Yuekai Zhang 167 270 0 26 Oct 2020
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025. Min-Jae Hwang Ryuichi Yamamoto Eunwoo Song Jae-Min Kim 66 32 0 26 Oct 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025. Isaac Elias Heiga Zen Jonathan Shen Yu Zhang Ye Jia Ron J. Weiss Yonghui Wu DRL 128 106 0 22 Oct 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech. International Conference on Learning Representations (ICLR), 2025. Yi Ren Chenxu Hu Xu Tan Tao Qin Sheng Zhao Zhou Zhao Tie-Yan Liu 410 1,517 0 08 Jun 2020