ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1912.00955
  4. Cited By
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven
  Acoustic Embedding Selection

Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection

2 December 2019
Shubhi Tyagi
M. Nicolis
Jonas Rohnke
Thomas Drugman
Jaime Lorenzo-Trueba
ArXivPDFHTML

Papers citing "Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection"

20 / 20 papers shown
Title
On the Cost and Benefits of Training Context with Utterance or Full Conversation Training: A Comparative Stud
On the Cost and Benefits of Training Context with Utterance or Full Conversation Training: A Comparative Stud
Hyouin Liu
Zhikuan Zhang
24
0
0
12 May 2025
Audio-Visual Neural Syntax Acquisition
Audio-Visual Neural Syntax Acquisition
Cheng-I Jeff Lai
Freda Shi
Puyuan Peng
Yoon Kim
Kevin Gimpel
...
David D. Cox
David F. Harwath
Yang Zhang
Karen Livescu
James R. Glass
CLIP
NAI
45
1
0
11 Oct 2023
Expressive paragraph text-to-speech synthesis with multi-step
  variational autoencoder
Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder
Xuyuan Li
Zengqiang Shang
Peiyang Shi
Hua Hua
Jian Liu
Pengyuan Zhang
21
0
0
25 Aug 2023
Cascading and Direct Approaches to Unsupervised Constituency Parsing on
  Spoken Sentences
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences
Yuan Tseng
Cheng-I Jeff Lai
Hung-yi Lee
SSL
35
4
0
15 Mar 2023
Investigation of Japanese PnG BERT language model in text-to-speech
  synthesis for pitch accent language
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language
Yusuke Yasuda
T. Toda
25
8
0
16 Dec 2022
ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in
  Paragraph-based TTS
ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS
Liumeng Xue
Frank Soong
Shaofei Zhang
Linfu Xie
19
23
0
14 Sep 2022
A Study of Modeling Rising Intonation in Cantonese Neural Speech
  Synthesis
A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis
Qibing Bai
Tom Ko
Yu Zhang
8
4
0
03 Aug 2022
Low-data? No problem: low-resource, language-agnostic conversational
  text-to-speech via F0-conditioned data augmentation
Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation
Giulia Comini
Goeric Huybrechts
M. Ribeiro
Adam Gabry's
Jaime Lorenzo-Trueba
19
5
0
29 Jul 2022
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many
  Fine-Grained Prosody Transfer
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer
S. Karlapati
Penny Karanasou
Mateusz Lajszczak
Ammar Abbas
Alexis Moinet
Peter Makarov
Raymond Li
Arent van Korlaar
Simon Slangen
Thomas Drugman
14
15
0
27 Jun 2022
Discrete Acoustic Space for an Efficient Sampling in Neural
  Text-To-Speech
Discrete Acoustic Space for an Efficient Sampling in Neural Text-To-Speech
Mu-Wei Li
Jonas Rohnke
A. Bonafonte
Mateusz Lajszczak
Trevor Wood
DRL
17
2
0
24 Oct 2021
Using multiple reference audios and style embedding constraints for
  speech synthesis
Using multiple reference audios and style embedding constraints for speech synthesis
Cheng Gong
Longbiao Wang
Zhenhua Ling
Ju Zhang
J. Dang
9
5
0
09 Oct 2021
Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis
  Using the Rapid Prosody Transcription Paradigm
Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm
Elijah Gutierrez
Pilar Oplustil Gallegos
Catherine Lai
13
3
0
06 Jul 2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Ammar Abbas
Bajibabu Bollepalli
Alexis Moinet
Arnaud Joly
Penny Karanasou
Peter Makarov
Simon Slangens
S. Karlapati
Thomas Drugman
16
0
0
29 Jun 2021
A learned conditional prior for the VAE acoustic space of a TTS system
A learned conditional prior for the VAE acoustic space of a TTS system
Panagiota Karanasou
S. Karlapati
Alexis Moinet
Arnaud Joly
Ammar Abbas
Simon Slangen
Jaime Lorenzo-Trueba
Thomas Drugman
12
7
0
14 Jun 2021
EmoCat: Language-agnostic Emotional Voice Conversion
EmoCat: Language-agnostic Emotional Voice Conversion
Bastian Schnell
Goeric Huybrechts
Bartek Perz
Thomas Drugman
Jaime Lorenzo-Trueba
11
10
0
14 Jan 2021
Using previous acoustic context to improve Text-to-Speech synthesis
Using previous acoustic context to improve Text-to-Speech synthesis
Pilar Oplustil Gallegos
Simon King
13
11
0
07 Dec 2020
Low-resource expressive text-to-speech using data augmentation
Low-resource expressive text-to-speech using data augmentation
Goeric Huybrechts
Thomas Merritt
Giulia Comini
Bartek Perz
Raahil Shah
Jaime Lorenzo-Trueba
13
50
0
11 Nov 2020
Prosodic Representation Learning and Contextual Sampling for Neural
  Text-to-Speech
Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech
S. Karlapati
Ammar Abbas
Zack Hodari
Alexis Moinet
Arnaud Joly
Panagiota Karanasou
Thomas Drugman
15
19
0
04 Nov 2020
Perception of prosodic variation for speech synthesis using an
  unsupervised discrete representation of F0
Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0
Zack Hodari
Catherine Lai
Simon King
6
13
0
14 Mar 2020
Probing the phonetic and phonological knowledge of tones in Mandarin TTS
  models
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
Jian Zhu
11
8
0
23 Dec 2019
1