Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis
Spoken Language Technology Workshop (SLT), 2020
12 November 2020
C. Chien
Hung-yi Lee

Papers citing "Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis"

22 / 22 papers shown
ProMode: A Speech Prosody Model Conditioned on Acoustic and Textual Inputs
Eray Eren
Qingju Liu
Hyeongwoo Kim
Pablo Garrido
Abeer Alwan
193
0
0
12 Aug 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
450
6
0
01 May 2025
CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation
Zhaojie Chu
K. Guo
Xiaofen Xing
Yilin Lan
Bolun Cai
Xiangmin Xu
300
13
0
17 Oct 2023
CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis
Yi Meng
Xiang Li
Zhiyong Wu
Tingtian Li
Zixun Sun
Xinyu Xiao
Chi Sun
Hui Zhan
Helen Meng
192
1
0
30 Aug 2023
KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yuxi Feng
Xiaoyuan Yi
L. Lakshmanan
Xing Xie
222
1
0
17 Jun 2023
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation
Interspeech (Interspeech), 2023
Ambuj Mehrish
Abhinav Ramesh Kashyap
Yingting Li
Navonil Majumder
Soujanya Poria
244
14
0
29 May 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
368
9
0
06 Mar 2023
Controllable speech synthesis by learning discrete phoneme-level prosodic representations
Speech Communication (Speech Commun.), 2022
Nikolaos Ellinas
Myrsini Christidou
Alexandra Vioni
June Sig Sung
Aimilios Chalamandaris
Pirros Tsiakoulis
P. Mastorocostas
186
10
0
29 Nov 2022
Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis
Konstantinos Klapsas
Karolos Nikitaras
Nikolaos Ellinas
June Sig Sung
Inchul Hwang
S. Raptis
Aimilios Chalamandaris
Pirros Tsiakoulis
294
1
0
02 Nov 2022
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yisheng Xiao
Lijun Wu
Junliang Guo
Juntao Li
Hao Fei
Tao Qin
Tie-Yan Liu
3DV
MedIm
AI4CE
317
121
0
20 Apr 2022
Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech
Interspeech (Interspeech), 2022
Jaesung Bae
Jinhyeok Yang
Taejun Bak
Young-Sun Joo
DiffM
344
6
0
08 Apr 2022
ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yi Ren
Ming Lei
Zhiying Huang
Shi-Rui Zhang
Qian Chen
Zhijie Yan
Zhou Zhao
221
50
0
16 Feb 2022
Unsupervised word-level prosody tagging for controllable speech synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yiwei Guo
Chenpeng Du
Kai Yu
255
16
0
15 Feb 2022
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Yinjiao Lei
Shan Yang
Xinsheng Wang
Lei Xie
239
97
0
17 Jan 2022
Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control
International Conference on Speech and Computer (SPECOM), 2021
Myrsini Christidou
Alexandra Vioni
Nikolaos Ellinas
G. Vamvoukakis
K. Markopoulos
Panos Kakoulidis
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
202
4
0
19 Nov 2021
Improving Prosody for Unseen Texts in Speech Synthesis by Utilizing Linguistic Information and Noisy Data
Zhu Li
Yuqing Zhang
Mengxi Nie
Ming Yan
Mengnan He
Ruixiong Zhang
Caixia Gong
160
3
0
15 Nov 2021
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS
T. Raitio
Jiangchuan Li
Shreyas Seshadri
249
28
0
06 Oct 2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Speech Synthesis Workshop (SS), 2021
Ammar Abbas
Bajibabu Bollepalli
Alexis Moinet
Arnaud Joly
Penny Karanasou
Peter Makarov
Simon Slangens
S. Karlapati
Thomas Drugman
200
0
0
29 Jun 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
453
446
0
29 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control
M. Kang
Sungjae Kim
Injung Kim
364
4
0
21 Jun 2021
Phone-Level Prosody Modelling with GMM-Based MDN for Diverse and Controllable Speech Synthesis
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Chenpeng Du
K. Yu
378
24
0
27 May 2021
Rich Prosody Diversity Modelling with Phone-level Mixture Density Network
Interspeech (Interspeech), 2021
Chenpeng Du
K. Yu
371
18
0
01 Feb 2021