Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis

Spoken Language Technology Workshop (SLT), 2020

12 November 2020

C. Chien

Hung-yi Lee

Papers citing "Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis"

ProMode: A Speech Prosody Model Conditioned on Acoustic and Textual Inputs

193

12 Aug 2025

Voice Cloning: Comprehensive Survey

Hussam Azzuni

Abdulmotaleb El Saddik

VLM

450

01 May 2025

CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation

300

17 Oct 2023

CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis

Yi Meng

Xiang Li

Zhiyong Wu

192

30 Aug 2023

KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Yuxi Feng

Xiaoyuan Yi

L. Lakshmanan

Xing Xie

222

17 Jun 2023

ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation

Interspeech, 2023

Ambuj Mehrish

Abhinav Ramesh Kashyap

Yingting Li

Navonil Majumder

Soujanya Poria

244

29 May 2023

FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model

368

06 Mar 2023

Controllable speech synthesis by learning discrete phoneme-level prosodic representations

Speech Communication (Speech Commun.), 2022

Aimilios Chalamandaris

Pirros Tsiakoulis

P. Mastorocostas

186

29 Nov 2022

Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis

Aimilios Chalamandaris

Pirros Tsiakoulis

294

02 Nov 2022

A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Junliang Guo

317

121

20 Apr 2022

Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech

Interspeech, 2022

344

08 Apr 2022

ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Qian Chen

Zhou Zhao

221

16 Feb 2022

Unsupervised word-level prosody tagging for controllable speech synthesis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Yiwei Guo

Chenpeng Du

Kai Yu

255

15 Feb 2022

MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Yinjiao Lei

Shan Yang

Xinsheng Wang

Lei Xie

239

17 Jan 2022

Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control

International Conference on Speech and Computer (SPECOM), 2021

Aimilios Chalamandaris

Pirros Tsiakoulis

202

19 Nov 2021

Improving Prosody for Unseen Texts in Speech Synthesis by Utilizing Linguistic Information and Noisy Data

160

15 Nov 2021

Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS

T. Raitio

Jiangchuan Li

Shreyas Seshadri

249

06 Oct 2021

Multi-Scale Spectrogram Modelling for Neural Text-to-Speech

Speech Synthesis Workshop (SS), 2021

200

29 Jun 2021

A Survey on Neural Speech Synthesis

Xu Tan

453

446

29 Jun 2021

UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control

M. Kang

Sungjae Kim

Injung Kim

364

21 Jun 2021

Phone-Level Prosody Modelling with GMM-Based MDN for Diverse and Controllable Speech Synthesis

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021

Chenpeng Du

K. Yu

378

27 May 2021

Rich Prosody Diversity Modelling with Phone-level Mixture Density Network

Interspeech, 2021

Chenpeng Du

K. Yu

371

01 Feb 2021