ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2009.06775
  4. Cited By
Controllable neural text-to-speech synthesis using intuitive prosodic
  features

Controllable neural text-to-speech synthesis using intuitive prosodic features

14 September 2020
T. Raitio
Ramya Rasipuram
D. Castellani
ArXivPDFHTML

Papers citing "Controllable neural text-to-speech synthesis using intuitive prosodic features"

32 / 32 papers shown
Title
A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer
A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer
Himanshu Maurya
A. Sigurgeirsson
27
0
0
06 Jun 2024
Building speech corpus with diverse voice characteristics for its
  prompt-based representation
Building speech corpus with diverse voice characteristics for its prompt-based representation
Aya Watanabe
Shinnosuke Takamichi
Yuki Saito
Wataru Nakata
Detai Xin
Hiroshi Saruwatari
27
0
0
20 Mar 2024
Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics
  Description for Prompt-based Control
Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Aya Watanabe
Shinnosuke Takamichi
Yuki Saito
Wataru Nakata
Detai Xin
Hiroshi Saruwatari
16
9
0
24 Sep 2023
FluentEditor: Text-based Speech Editing by Considering Acoustic and
  Prosody Consistency
FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
Rui Liu
Jiatian Xi
Ziyue Jiang
Haizhou Li
17
2
0
21 Sep 2023
MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice
  Conversion by Multi-scale Style Modeling
MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling
Zhichao Wang
Xinsheng Wang
Qicong Xie
Tao Li
Linfu Xie
Qiao Tian
Yuping Wang
16
4
0
03 Sep 2023
Automatic Evaluation of Turn-taking Cues in Conversational Speech
  Synthesis
Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis
Erik Ekstedt
Siyang Wang
Éva Székely
Joakim Gustafson
Gabriel Skantze
11
6
0
29 May 2023
Controllable speech synthesis by learning discrete phoneme-level
  prosodic representations
Controllable speech synthesis by learning discrete phoneme-level prosodic representations
Nikolaos Ellinas
Myrsini Christidou
Alexandra Vioni
June Sig Sung
Aimilios Chalamandaris
Pirros Tsiakoulis
P. Mastorocostas
17
7
0
29 Nov 2022
Prosody-controllable spontaneous TTS with neural HMMs
Prosody-controllable spontaneous TTS with neural HMMs
Harm Lameris
Shivam Mehta
G. Henter
Joakim Gustafson
Éva Székely
33
15
0
24 Nov 2022
Delivering Speaking Style in Low-resource Voice Conversion with
  Multi-factor Constraints
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints
Zhichao Wang
Xinsheng Wang
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
25
5
0
16 Nov 2022
Controllable Data Generation by Deep Learning: A Review
Controllable Data Generation by Deep Learning: A Review
Shiyu Wang
Yuanqi Du
Xiaojie Guo
Bo Pan
Zhaohui Qin
Liang Zhao
29
28
0
19 Jul 2022
BERT, can HE predict contrastive focus? Predicting and controlling
  prominence in neural TTS using a language model
BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model
Brooke Stephenson
Laurent Besacier
Laurent Girin
Thomas Hueber
12
8
0
04 Jul 2022
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for
  Speech Synthesis based on Disentanglement between Prosody and Timbre
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre
Guangyan Zhang
Ying Qin
W. Zhang
Jialun Wu
Mei Li
Yu Gai
Feijun Jiang
Tan Lee
48
26
0
29 Jun 2022
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Florian Lux
Julia Koch
Ngoc Thang Vu
32
19
0
24 Jun 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
18
0
0
08 Apr 2022
SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural
  Text-to-Speech Synthesis
SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis
Georgia Maniati
Alexandra Vioni
Nikolaos Ellinas
Karolos Nikitaras
Konstantinos Klapsas
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
14
26
0
06 Apr 2022
Vocal effort modeling in neural TTS for improving the intelligibility of
  synthetic speech in noise
Vocal effort modeling in neural TTS for improving the intelligibility of synthetic speech in noise
T. Raitio
Petko N. Petkov
Jiangchuan Li
M. Shifas
Andrea Davis
Y. Stylianou
9
2
0
20 Mar 2022
Speaker Adaption with Intuitive Prosodic Features for Statistical
  Parametric Speech Synthesis
Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis
Pengyu Cheng
Zhenhua Ling
14
3
0
02 Mar 2022
Inkorrect: Online Handwriting Spelling Correction
Inkorrect: Online Handwriting Spelling Correction
Andrii Maksai
H. Rowley
Jesse Berent
C. Musat
19
3
0
28 Feb 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis
Shinnosuke Takamichi
Wataru Nakata
Naoko Tanji
Hiroshi Saruwatari
AuLLM
25
6
0
26 Jan 2022
One-shot Voice Conversion For Style Transfer Based On Speaker Adaptation
One-shot Voice Conversion For Style Transfer Based On Speaker Adaptation
Zhichao Wang
Qicong Xie
Tao Li
Hongqiang Du
Lei Xie
Pengcheng Zhu
Mengxiao Bi
19
11
0
24 Nov 2021
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End
  Speech Synthesis
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis
Alexandra Vioni
Myrsini Christidou
Nikolaos Ellinas
G. Vamvoukakis
Panos Kakoulidis
Taehoon Kim
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
11
11
0
19 Nov 2021
Improved Prosodic Clustering for Multispeaker and Speaker-independent
  Phoneme-level Prosody Control
Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control
Myrsini Christidou
Alexandra Vioni
Nikolaos Ellinas
G. Vamvoukakis
K. Markopoulos
Panos Kakoulidis
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
16
4
0
19 Nov 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation
  Learning
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning
Shijun Wang
Dimche Kostadinov
Damian Borth
19
10
0
27 Oct 2021
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual
  Text-to-Speech
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech
Haoyue Zhan
Xinyuan Yu
Haitong Zhang
Yang Zhang
Yue Lin
16
5
0
14 Oct 2021
Emphasis control for parallel neural TTS
Emphasis control for parallel neural TTS
Shreyas Seshadri
T. Raitio
D. Castellani
Jiangchuan Li
50
11
0
06 Oct 2021
Hierarchical prosody modeling and control in non-autoregressive parallel
  neural TTS
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS
T. Raitio
Jiangchuan Li
Shreyas Seshadri
32
22
0
06 Oct 2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Ammar Abbas
Bajibabu Bollepalli
Alexis Moinet
Arnaud Joly
Penny Karanasou
Peter Makarov
Simon Slangens
S. Karlapati
Thomas Drugman
16
0
0
29 Jun 2021
Enriching Source Style Transfer in Recognition-Synthesis based
  Non-Parallel Voice Conversion
Enriching Source Style Transfer in Recognition-Synthesis based Non-Parallel Voice Conversion
Zhichao Wang
Xinyong Zhou
Fengyu Yang
Tao Li
Hongqiang Du
Lei Xie
Wendong Gan
Haitao Chen
Hai Li
11
22
0
16 Jun 2021
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
D. Mohan
Qinmin Hu
Tian Huey Teh
Alexandra Torresquintero
C. Wallis
Marlene Staib
Lorenzo Foglianti
Jiameng Gao
Simon King
20
16
0
15 Jun 2021
Improving multi-speaker TTS prosody variance with a residual encoder and
  normalizing flows
Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows
Iván Vallés-Pérez
Julian Roth
Grzegorz Beringer
Roberto Barra-Chicote
J. Droppo
21
8
0
10 Jun 2021
Analysis and Assessment of Controllability of an Expressive Deep
  Learning-based TTS system
Analysis and Assessment of Controllability of an Expressive Deep Learning-based TTS system
Noé Tits
Kevin El Haddad
Thierry Dutoit
11
5
0
06 Mar 2021
Speech Synthesis and Control Using Differentiable DSP
Speech Synthesis and Control Using Differentiable DSP
Giorgio Fabbro
Vladimir Golkov
Thomas Kemp
Daniel Cremers
13
12
0
28 Oct 2020
1