LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech

5 April 2019

Papers citing "LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech"

50 / 617 papers shown

AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation Kun Song Heyang Xue Xinsheng Wang Jian Cong Yongmao Zhang Linfu Xie Bing Yang Xiong Zhang Dan Su 93 5 0 01 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis Yinghao Aaron Li Cong Han N. Mesgarani 110 40 0 30 May 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data Sungwon Kim Heeseung Kim Sung-Hoon Yoon DiffM 246 53 0 30 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions Wonjune Kang M. Hasegawa-Johnson D. Roy 70 8 0 19 May 2022
Dictionary-Based Fusion of Contact and Acoustic Microphones for Wind Noise Reduction Marvin Tammen Xilin Li Simon Doclo L. Theverapperuma 19 4 0 18 May 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech Rongjie Huang Yi Ren Jinglin Liu Chenye Cui Zhou Zhao OODD VLM 176 34 0 15 May 2022
The VoicePrivacy 2020 Challenge Evaluation Plan N. Tomashenko B. M. L. Srivastava Xin Wang Emmanuel Vincent A. Nautsch ... Nicholas W. D. Evans J. Patino J. Bonastre Paul-Gauthier Noé Massimiliano Todisco 79 44 0 14 May 2022
Talking Face Generation with Multilingual TTS Hyoung-Kyu Song Sanghyun Woo Junhyeok Lee S. Yang Hyunjae Cho Youseong Lee Dongho Choi Kang-Wook Kim CVBM 77 22 0 13 May 2022
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech Yongqian Li Cheng Yu Guangzhi Sun Hua Jiang Fanglei Sun Weiqin Zu Ying Wen Yang Yang Jun Wang 47 7 0 09 May 2022
SVTS: Scalable Video-to-Speech Synthesis Rodrigo Mira A. Haliassos Stavros Petridis Björn W. Schuller Maja Pantic 67 35 0 04 May 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech Zhenhui Ye Zhou Zhao Yi Ren Leilei Gan 85 28 0 25 Apr 2022
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness Paul Pu Liang 78 4 0 14 Apr 2022
Fine-grained Noise Control for Multispeaker Speech Synthesis Karolos Nikitaras G. Vamvoukakis Nikolaos Ellinas Konstantinos Klapsas K. Markopoulos S. Raptis June Sig Sung Gunu Jho Aimilios Chalamandaris Pirros Tsiakoulis 57 5 0 11 Apr 2022
Karaoker: Alignment-free singing voice synthesis with speech training data Panos Kakoulidis Nikolaos Ellinas G. Vamvoukakis K. Markopoulos June Sig Sung Gunu Jho Pirros Tsiakoulis Aimilios Chalamandaris 96 3 0 08 Apr 2022
Heterogeneous Target Speech Separation Hyunjae Cho Wonbin Jung Junhyeok Lee Paris Smaragdis Sanghyun Woo 90 26 0 07 Apr 2022
A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition Rishabh Jain Andrei Barcovschi Mariam Yiwere Dan Bigioi Peter Corcoran H. Cucu 54 34 0 06 Apr 2022
Residual-guided Personalized Speech Synthesis based on Face Image Jianrong Wang Zixuan Wang Xiaosheng Hu Xuewei Li Qiang Fang Li Liu CVBM 46 17 0 01 Apr 2022
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios Yihan Wu Xu Tan Bohan Li Lei He Sheng Zhao Ruihua Song Tao Qin Tie-Yan Liu VLM DiffM 78 69 0 01 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis Fan Wang Po-Chun Hsu Da-Rong Liu Hung-yi Lee 37 0 0 01 Apr 2022
Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization Evelina Bakhturina Yang Zhang Boris Ginsburg 38 10 0 29 Mar 2022
DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning Takaaki Saeki Kentaro Tachibana Ryuichi Yamamoto 53 11 0 29 Mar 2022
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion Edresson Casanova C. Shulby Alexander Korolev Arnaldo Cândido Júnior A. S. Soares S. Aluísio M. Ponti 117 14 0 29 Mar 2022
Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus Minchan Kim Myeonghun Jeong Byoung Jin Choi Sunghwan Ahn Joun Yeop Lee N. Kim 106 26 0 29 Mar 2022
Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions Xiaoxiao Miao Xin Wang Erica Cooper Junichi Yamagishi N. Tomashenko 62 11 0 28 Mar 2022
The VoicePrivacy 2022 Challenge Evaluation Plan N. Tomashenko Xin Wang Xiaoxiao Miao Hubert Nourtel Pierre Champion Massimiliano Todisco Emmanuel Vincent Nicholas W. D. Evans Junichi Yamagishi J. Bonastre 109 63 0 23 Mar 2022
A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis Rishabh Jain Mariam Yiwere Dan Bigioi Peter Corcoran H. Cucu 67 14 0 22 Mar 2022
ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis Jinlong Xue Yayue Deng Yichen Han Ya Li Jianqing Sun Jiaen Liang 51 8 0 20 Mar 2022
RoSS: Utilizing Robotic Rotation for Audio Source Separation Hyungjoo Seo Sahil Bhandary Karnoor Romit Roy Choudhury 89 0 0 18 Mar 2022
Real time spectrogram inversion on mobile phone Oleg Rybakov Marco Tagliasacchi Yunpeng Li Liyang Jiang Xia Zhang Fadi Biadsy 121 4 0 01 Mar 2022
Learning the Beauty in Songs: Neural Singing Voice Beautifier Jinglin Liu Chengxi Li Yi Ren Zhiying Zhu Zhou Zhao DiffM 90 17 0 27 Feb 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models Xiaoxiao Miao Xin Wang Erica Cooper Junichi Yamagishi N. Tomashenko 158 25 0 26 Feb 2022
Revisiting Over-Smoothness in Text to Speech Yi Ren Xu Tan Tao Qin Zhou Zhao Tie-Yan Liu 143 64 0 26 Feb 2022
Differentially Private Speaker Anonymization Ali Shahin Shamsabadi B. M. L. Srivastava A. Bellet Nathalie Vauquier Emmanuel Vincent Mohamed Maouche Marc Tommasi Nicolas Papernot MIACV 138 35 0 23 Feb 2022
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-shot Multi-speaker Text-to-Speech Bo Zhao Xulong Zhang Jianzong Wang Ning Cheng Jing Xiao DiffM 86 22 0 22 Feb 2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing Tao Wang Jiangyan Yi Ruibo Fu J. Tao Zhengqi Wen KELM 69 20 0 21 Feb 2022
FedEmbed: Personalized Private Federated Learning Andrew Silva Katherine Metcalf N. Apostoloff B. Theobald FedML 63 6 0 18 Feb 2022
VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion Disong Wang Shan Yang Dan Su Xunying Liu Dong Yu Helen Meng 55 11 0 18 Feb 2022
SpeechPainter: Text-conditioned Speech Inpainting Zalan Borsos Matthew Sharifi Marco Tagliasacchi 93 28 0 15 Feb 2022
I'm Hearing (Different) Voices: Anonymous Voices to Protect User Privacy H.C.M. Turner Giulio Lovisotto Simon Eberz Ivan Martinovic 25 1 0 13 Feb 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis Shinnosuke Takamichi Wataru Nakata Naoko Tanji Hiroshi Saruwatari AuLLM 70 7 0 26 Jan 2022
Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention Artem Gorodetskii Ivan Ozhiganov 108 2 0 25 Jan 2022
KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics Saida Mussakhojayeva Yerbolat Khassanov H. A. Varol 67 13 0 15 Jan 2022
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation Ye Jia Michelle Tadmor Ramanovich Quan Wang Heiga Zen SLR 94 70 0 11 Jan 2022
End-to-end speaker diarization with transformer Yongquan Lai Xin Tang Yuanyuan Fu Rui Fang 51 1 0 14 Dec 2021
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features Trung D. Q. Dang Dung T. Tran Peter Chin K. Koishida SSL 69 15 0 08 Dec 2021
VocBench: A Neural Vocoder Benchmark for Speech Synthesis Ehab A. AlBadawy Andrew Gibiansky Qing He Jilong Wu Ming-Ching Chang Siwei Lyu 53 12 0 06 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone Edresson Casanova Julian Weber C. Shulby Arnaldo Cândido Júnior Eren Golge M. Ponti 244 415 0 04 Dec 2021
V2C: Visual Voice Cloning Qi Chen Yuanqing Li Yuankai Qi Jiaqiu Zhou Mingkui Tan Qi Wu VGen 72 27 0 25 Nov 2021
Implicit Acoustic Echo Cancellation for Keyword Spotting and Device-Directed Speech Detection Samuele Cornell T. Balestri Thibaud Sénéchal 39 5 0 20 Nov 2021
More than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech Michael Hassid Michelle Tadmor Ramanovich Brendan Shillingford Miaosen Wang Ye Jia Tal Remez DiffM 65 18 0 19 Nov 2021