LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech

5 April 2019
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
arXiv: 1904.02882

Papers citing "LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech"

50 / 617 papers shown
Handling Background Noise in Neural Speech Generation
Tom Denton
Alejandro Luebs
Felicia S. C. Lim
Andrew Storus
Hengchin Yeh
W. Kleijn
Jan Skoglund
45
2
0
23 Feb 2021
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input
Brooke Stephenson
Thomas Hueber
Laurent Girin
Laurent Besacier
89
10
0
19 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention
Peng Liu
Yuewen Cao
Songxiang Liu
Na Hu
Guangzhi Li
Chao Weng
Dan Su
95
22
0
12 Feb 2021
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning
Giuseppe Ruggiero
Enrico Zovato
Luigi Di Caro
V. Pollet
DiffM
63
10
0
10 Feb 2021
Universal Neural Vocoding with Parallel WaveNet
Yunlong Jiao
Adam Gabryś
Georgi Tinchev
Bartosz Putrycz
Daniel Korzekwa
V. Klimkov
73
42
0
01 Feb 2021
Expressive Neural Voice Cloning
Paarth Neekhara
Shehzeen Samarah Hussain
Shlomo Dubnov
F. Koushanfar
Julian McAuley
DiffM
56
30
0
30 Jan 2021
A Study of F0 Modification for X-Vector Based Speech Pseudonymization Across Gender
Pierre Champion
D. Jouvet
Anthony Larcher
62
24
0
21 Jan 2021
Mispronunciation Detection in Non-native (L2) English with Uncertainty Modeling
Daniel Korzekwa
Jaime Lorenzo-Trueba
Szymon Zaporowski
Shira Calamaro
Thomas Drugman
B. Kostek
25
16
0
16 Jan 2021
MLS: A Large-Scale Multilingual Dataset for Speech Research
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
AuLLM
154
512
0
07 Dec 2020
How Far Are We from Robust Voice Conversion: A Survey
Tzu-hsien Huang
Jheng-hao Lin
Chien-yu Huang
Hung-yi Lee
84
25
0
24 Nov 2020
Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech
Yiling Huang
Yutian Chen
Jason W. Pelecanos
Quan Wang
98
12
0
24 Nov 2020
Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains
Won Jang
D. Lim
Jaesam Yoon
57
34
0
19 Nov 2020
Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model
Haoyu Li
Yang Ai
Junichi Yamagishi
76
2
0
10 Nov 2020
Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis
Erica Cooper
Xin Wang
Yi Zhao
Yusuke Yasuda
Junichi Yamagishi
SyDa
37
3
0
10 Nov 2020
FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments With Attention
Yist Y. Lin
C. Chien
Jheng-hao Lin
Hung-yi Lee
Lin-Shan Lee
56
79
0
27 Oct 2020
Speaker Anonymization with Distribution-Preserving X-Vector Generation for the VoicePrivacy Challenge 2020
H.C.M. Turner
Giulio Lovisotto
Ivan Martinovic
68
21
0
26 Oct 2020
Learning Speaker Embedding from Text-to-Speech
Jaejin Cho
Piotr Żelasko
Jesus Villalba
Shinji Watanabe
Najim Dehak
59
11
0
21 Oct 2020
Replacing Human Audio with Synthetic Audio for On-device Unspoken Punctuation Prediction
Daria Soboleva
Ondrej Skopek
Márius Šajgalík
Victor Cărbune
Felix Weissenberger
...
B. Prisacari
Daniel Valcarce
Justin Lu
Rohit Prabhavalkar
Balint Miklos
98
9
0
20 Oct 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen
Ye Jia
Mike Chrzanowski
Yu Zhang
Isaac Elias
Heiga Zen
Yonghui Wu
87
112
0
08 Oct 2020
Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems
Yinghui Huang
H. Kuo
Samuel Thomas
Zvi Kons
Kartik Audhkhasi
Brian Kingsbury
R. Hoory
M. Picheny
VLM
43
63
0
08 Oct 2020
Latent linguistic embedding for cross-lingual text-to-speech and voice conversion
Hieu-Thi Luong
Junichi Yamagishi
53
5
0
08 Oct 2020
The Academia Sinica Systems of Voice Conversion for VCC2020
Yu-Huai Peng
Cheng-Hung Hu
A. Kang
Hung-Shin Lee
Pin-Yuan Chen
Yu Tsao
Hsin-Min Wang
42
2
0
06 Oct 2020
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS
Wen-Chin Huang
Tomoki Hayashi
Shinji Watanabe
Tomoki Toda
DRL
76
40
0
06 Oct 2020
Transfer Learning from Monolingual ASR to Transcription-free Cross-lingual Voice Conversion
Che-Jui Chang
60
5
0
30 Sep 2020
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
Mingyang Zhang
Yi Zhou
Li Zhao
Haizhou Li
78
53
0
30 Sep 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
Songxiang Liu
Yuewen Cao
Disong Wang
Xixin Wu
Xunying Liu
Helen Meng
BDL
83
92
0
06 Sep 2020
What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS
Brooke Stephenson
Laurent Besacier
Laurent Girin
Thomas Hueber
76
14
0
04 Sep 2020
Textual Echo Cancellation
Shaojin Ding
Ye Jia
Ke Hu
Quan Wang
56
8
0
13 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
133
327
0
09 Aug 2020
Controllable Neural Prosody Synthesis
Max Morrison
Zeyu Jin
Justin Salamon
Nicholas J. Bryan
G. J. Mysore
55
20
0
07 Aug 2020
Xiaomingbot: A Multilingual Robot News Reporter
Runxin Xu
Jun Cao
Mingxuan Wang
Jiaze Chen
Hao Zhou
...
Xiang Yin
Xijin Zhang
Songcheng Jiang
Yuxuan Wang
Lei Li
74
11
0
12 Jul 2020
DeepSinger: Singing Voice Synthesis with Data Mined From the Web
Yi Ren
Xu Tan
Tao Qin
Jian Luan
Zhou Zhao
Tie-Yan Liu
102
73
0
09 Jul 2020
Embodied Self-supervised Learning by Coordinated Sampling and Training
Yifan Sun
Xihong Wu
SSL
54
7
0
20 Jun 2020
MultiSpeech: Multi-Speaker Text to Speech with Transformer
Mingjian Chen
Xu Tan
Yi Ren
Jin Xu
Hao Sun
Sheng Zhao
Tao Qin
Tie-Yan Liu
65
110
0
08 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
110
498
0
22 May 2020
NAUTILUS: a Versatile Voice Cloning System
Hieu-Thi Luong
Junichi Yamagishi
87
53
0
22 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
AI4TS
63
31
0
20 May 2020
Design Choices for X-vector Based Speaker Anonymization
B. M. L. Srivastava
N. Tomashenko
Xin Wang
Emmanuel Vincent
Junichi Yamagishi
Mohamed Maouche
A. Bellet
Marc Tommasi
60
63
0
18 May 2020
Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild
P. S. Nidadavolu
Saurabh Kataria
Leibny Paola García-Perera
Jesús Villalba
Najim Dehak
20
3
0
17 May 2020
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
67
113
0
17 May 2020
ConVoice: Real-Time Zero-Shot Voice Style Transfer with Convolutional Network
Yurii Rebryk
Stanislav Beliaev
60
8
0
15 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle
Kevin J. Shih
R. Prenger
Bryan Catanzaro
94
121
0
12 May 2020
Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data
Seung-won Park
Doo-young Kim
Myun-chul Joe
84
42
0
07 May 2020
Introducing the VoicePrivacy Initiative
N. Tomashenko
B. M. L. Srivastava
Xin Wang
Emmanuel Vincent
A. Nautsch
...
Nicholas W. D. Evans
J. Patino
J. Bonastre
Paul-Gauthier Noé
Massimiliano Todisco
111
132
0
04 May 2020
A Study of Non-autoregressive Model for Sequence Generation
Yi Ren
Jinglin Liu
Xu Tan
Zhou Zhao
Sheng Zhao
Tie-Yan Liu
105
62
0
22 Apr 2020
ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric
Michael Chinen
Felicia S. C. Lim
Jan Skoglund
Nikita Gureev
F. O'Gorman
Andrew Hines
78
143
0
20 Apr 2020
Gender Representation in Open Source Speech Resources
Mahault Garnerin
Solange Rossato
Laurent Besacier
47
6
0
18 Mar 2020
Unsupervised Style and Content Separation by Minimizing Mutual Information for Speech Synthesis
Ting-Yao Hu
A. Shrivastava
Oncel Tuzel
C. Dhir
54
32
0
09 Mar 2020
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis
Jennifer Williams
Joanna Rownicka
P. Oplustil
Simon King
93
25
0
28 Feb 2020
An empirical study of Conv-TasNet
Berkan Kadıoğlu
Michael Horgan
Xiaoyu Liu
Jordi Pons
Dan Darcy
Vivek Kumar
40
44
0
20 Feb 2020