Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis

27 March 2019

Kevin El Haddad

Papers citing "Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis"

Title | Authors | Tags | Counts | Date
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis | Yu Gu, Qiushi Zhu, Guangzhi Lei, Chao Weng, Jane Polak Scowcroft | DiffM | 67 0 0 | 17 Oct 2024
TIPAA-SSL: Text Independent Phone-to-Audio Alignment based on Self-Supervised Learning and Knowledge Transfer | Noé Tits, Prernna Bhatnagar, Thierry Dutoit | | 107 0 0 | 03 May 2024
Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations | Hanglei Zhang, Yiwei Guo, Sen Liu, Xie Chen, Kai Yu | | 55 1 0 | 02 Nov 2023
MUST&P-SRL: Multi-lingual and Unified Syllabification in Text and Phonetic Domains for Speech Representation Learning | Noé Tits | | 114 0 0 | 17 Oct 2023
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis | Yu Gu, Yianrao Bian, Guangzhi Lei, Chao Weng, Jane Polak Scowcroft | DiffM | 55 2 0 | 22 Sep 2023
Flowchase: a Mobile Application for Pronunciation Training | Noé Tits, Zoé Broisson | | 34 2 0 | 05 Jul 2023
PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions | Guanghou Liu, Yongmao Zhang, Yinjiao Lei, Yunlin Chen, Rui Wang, Zhifei Li, Linfu Xie | | 70 42 0 | 31 May 2023
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt | Dongchao Yang, Songxiang Liu, Rongjie Huang, Chao Weng, Helen Meng | DiffM, VLM | 89 102 0 | 31 Jan 2023
A Model You Can Hear: Audio Identification with Playable Prototypes | Romain Loiseau, Baptiste Bouvier, Yann Teytaut, Elliot Vincent, Mathieu Aubry, Loic Landrieu | | 57 6 0 | 05 Aug 2022
Controllable Data Generation by Deep Learning: A Review | Shiyu Wang, Yuanqi Du, Xiaojie Guo, Bo Pan, Zhaohui Qin, Liang Zhao | | 99 28 0 | 19 Jul 2022
Emotional Voice Conversion: Theory, Databases and ESD | Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li | | 127 180 0 | 31 May 2021
Analysis and Assessment of Controllability of an Expressive Deep Learning-based TTS system | Noé Tits, Kevin El Haddad, Thierry Dutoit | | 69 5 0 | 06 Mar 2021
GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis | Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Lingwei Kong, Jing Xiao | | 62 9 0 | 03 Dec 2020
VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech | Kun Zhou, Berrak Sisman, Haizhou Li | DRL | 109 42 0 | 03 Nov 2020
ICE-Talk: an Interface for a Controllable Expressive Talking Machine | Noé Tits, Kevin El Haddad, Thierry Dutoit | LLMAG | 59 3 0 | 25 Aug 2020
Interactive Text-to-Speech System via Joint Style Analysis | Yang Gao, Weiyi Zheng, Zhaojun Yang, Thilo Köhler, Christian Fuegen, Qing He | | 75 11 0 | 17 Feb 2020
The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach | Noé Tits, Kevin El Haddad, Thierry Dutoit | | 57 8 0 | 14 Oct 2019
A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach | Noé Tits | | 40 10 0 | 05 Jul 2019