
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning

20 October 2017

Sharan Narang

Papers citing "Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning"

50 / 170 papers shown

Disentangling Style and Speaker Attributes for TTS Style Transfer. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022 Xiaochun An Frank Soong Lei Xie 267 21 0 24 Jan 2022
MHTTS: Fast multi-head text-to-speech for spontaneous speech with imperfect transcription. IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 2022 Dabiao Ma Yitong Zhang Meng Li Feng Ye 75 1 0 19 Jan 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone. International Conference on Machine Learning (ICML), 2021 Edresson Casanova Julian Weber C. Shulby Arnaldo Cândido Júnior Eren Golge M. Ponti 569 536 0 04 Dec 2021
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech Sung-Feng Huang Chyi-Jiunn Lin Da-Rong Liu Yi-Chen Chen Hung-yi Lee 386 70 0 07 Nov 2021
Emotional Prosody Control for Speech Generation S. Sivaprasad Saiteja Kosgi Vineet Gandhi 146 20 0 07 Nov 2021
Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor Anchit Gupta Faizan Farooq Khan Rudrabha Mukhopadhyay Vinay P. Namboodiri C. V. Jawahar CVBM 173 6 0 16 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts Chenxu Hu Qiao Tian Tingle Li Yuping Wang Yuxuan Wang Hang Zhao DiffM VGen 226 50 0 15 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning Paarth Neekhara Jason Chun Lok Li Boris Ginsburg 192 19 0 12 Oct 2021
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS T. Raitio Jiangchuan Li Shreyas Seshadri 183 26 0 06 Oct 2021
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks E. Hortal Rodrigo Brechard Alarcia GAN 77 2 0 06 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech Yi Ren Jinglin Liu Zhou Zhao 317 90 0 30 Sep 2021
On-device neural speech synthesis Sivanand Achanta Albert Antony L. Golipour Jiangchuan Li T. Raitio ... Francesco Rossi Jennifer Shi Jaimin Upadhyay David Winarsky Hepeng Zhang 222 19 0 17 Sep 2021
Cross-speaker emotion disentangling and transfer for end-to-end speech synthesis Tao Li Xinsheng Wang Qicong Xie Zhichao Wang Linfu Xie 153 62 0 14 Sep 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing Zhaofeng Shi 129 11 0 01 Aug 2021
Facetron: A Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations. European Signal Processing Conference (EUSIPCO), 2021 Seyun Um Jihyun Kim Jihyun Lee Hong-Goo Kang CVBM 282 4 0 26 Jul 2021
Interactive Storytelling for Children: A Case-study of Design and Development Considerations for Ethical Conversational AI J. Chubb S. Missaoui S. Concannon Liam Maloney James Alfred Walker 138 43 0 20 Jul 2021
VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis Hui Lu Zhiyong Wu Xixin Wu Xu Li Shiyin Kang Xunying Liu Helen Meng 93 15 0 07 Jul 2021
AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style Yuzi Yan Xu Tan Bohan Li Guangyan Zhang Tao Qin Sheng Zhao Yuan-Chung Shen Weiqiang Zhang Tie-Yan Liu 116 23 0 06 Jul 2021
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion Daxin Tan Liqun Deng Y. Yeung Xin Jiang Xiao Chen Tan Lee 143 50 0 04 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 287 427 0 29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis. Interspeech (Interspeech), 2021 Jinhyeok Yang Jaesung Bae Taejun Bak Young-Ik Kim Hoon-Young Cho 170 42 0 29 Jun 2021
Distilling the Knowledge from Conditional Normalizing Flows Dmitry Baranchuk Vladimir Aliev Artem Babenko BDL 180 4 0 24 Jun 2021
Improving Performance of Seen and Unseen Speech Style Transfer in End-to-end Neural TTS. Interspeech (Interspeech), 2021 Xiaochun An Frank Soong Lei Xie 278 9 0 18 Jun 2021
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis D. Mohan Qinmin Hu Tian Huey Teh Alexandra Torresquintero C. Wallis Marlene Staib Lorenzo Foglianti Jiameng Gao Simon King 125 20 0 15 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech. International Conference on Machine Learning (ICML), 2021 Jaehyeon Kim Jungil Kong Juhee Son DRL 240 1,124 0 11 Jun 2021
Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation. International Conference on Machine Learning (ICML), 2021 Dong Min Dong Bok Lee Eunho Yang Sung Ju Hwang 282 206 0 06 Jun 2021
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis. International Conference on Knowledge-Based Intelligent Information & Engineering Systems (KES), 2021 Beáta Lőrincz Adriana Stan M. Giurgiu 66 2 0 03 Jun 2021
Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis. European Signal Processing Conference (EUSIPCO), 2021 Beáta Lőrincz Adriana Stan M. Giurgiu 78 6 0 03 Jun 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation Shoule Wu Ziqiang Shi DiffM 206 11 0 17 May 2021
Interpreting intermediate convolutional layers of generative CNNs trained on waveforms. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021 Gašper Beguš Alan Zhou 240 8 0 19 Apr 2021
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction Stanislav Beliaev Boris Ginsburg 169 10 0 16 Apr 2021
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model. Interspeech (Interspeech), 2021 Edresson Casanova C. Shulby Eren Golge Nicolas Müller F. S. Oliveira Arnaldo Cândido Júnior A. S. Soares S. Aluísio M. Ponti 188 113 0 02 Apr 2021
Continual Speaker Adaptation for Text-to-Speech Synthesis Hamed Hemati Damian Borth CLL 154 9 0 26 Mar 2021
AdaSpeech: Adaptive Text to Speech for Custom Voice. International Conference on Learning Representations (ICLR), 2021 Mingjian Chen Xu Tan Bohan Li Yanqing Liu Tao Qin Sheng Zhao Tie-Yan Liu VLM DiffM 198 211 0 01 Mar 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward Momina Masood M. Nawaz K. Malik A. Javed Aun Irtaza AAML 465 397 0 25 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention Peng Liu Yuewen Cao Songxiang Liu Na Hu Guangzhi Li Chao Weng Jane Polak Scowcroft 149 23 0 12 Feb 2021
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning Giuseppe Ruggiero Enrico Zovato Luigi Di Caro V. Pollet DiffM 104 14 0 10 Feb 2021
Expressive Neural Voice Cloning. Asian Conference on Machine Learning (ACML), 2021 Paarth Neekhara Shehzeen Samarah Hussain Shlomo Dubnov F. Koushanfar Julian McAuley DiffM 121 36 0 30 Jan 2021
Whispered and Lombard Neural Speech Synthesis. Spoken Language Technology Workshop (SLT), 2021 Qiong Hu T. Bleisch Petko N. Petkov T. Raitio Erik Marchi V. Lakshminarasimhan 128 15 0 13 Jan 2021
Few Shot Adaptive Normalization Driven Multi-Speaker Speech Synthesis Neeraj Kumar Srishti Goel Ankur Narang Brejesh Lall 113 5 0 14 Dec 2020
EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture Chenfeng Miao Shuang Liang Zhencheng Liu Minchuan Chen Jun Ma Shaojun Wang Jing Xiao 143 43 0 07 Dec 2020
MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution. Spoken Language Technology Workshop (SLT), 2020 Zhen Zeng Jianzong Wang Ning Cheng Jing Xiao 129 8 0 03 Dec 2020
Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech. Spoken Language Technology Workshop (SLT), 2020 Yiling Huang Yutian Chen Jason W. Pelecanos Quan Wang 144 13 0 24 Nov 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS Isaac Elias Heiga Zen Jonathan Shen Yu Zhang Ye Jia Ron J. Weiss Yonghui Wu DRL 157 109 0 22 Oct 2020
Learning Speaker Embedding from Text-to-Speech Jaejin Cho Piotr Żelasko Jesus Villalba Shinji Watanabe Najim Dehak 108 12 0 21 Oct 2020
Neural Speech Synthesis for Estonian Liisa Rätsep Liisi Piits Hille Pajupuu Indrek Hein Mark Fišel 51 2 0 06 Oct 2020
HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis Jiawei Chen Xu Tan Jian Luan Tao Qin Tie-Yan Liu VLM 188 104 0 03 Sep 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit. Interspeech (Interspeech), 2020 Zhen Zeng Jianzong Wang Ning Cheng Jing Xiao 133 9 0 13 Aug 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition. Knowledge Discovery and Data Mining (KDD), 2020 Jin Xu Xu Tan Yi Ren Tao Qin Jian Li Sheng Zhao Tie-Yan Liu VLM 129 98 0 09 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020 Berrak Sisman Junichi Yamagishi Simon King Haizhou Li BDL 391 388 0 09 Aug 2020