1712.05884
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
16 December 2017
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
Zongheng Yang
Zhifeng Chen
Yu Zhang
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
Papers citing "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions"
26 / 1,276 papers shown
Title
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder
Yi Zhao
Shinji Takaki
Hieu-Thi Luong
Junichi Yamagishi
Daisuke Saito
Nobuaki Minematsu
76
64
0
31 Jul 2018
Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems
Hieu-Thi Luong
Junichi Yamagishi
98
7
0
31 Jul 2018
Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis
G. Henter
Jaime Lorenzo-Trueba
Xin Wang
Junichi Yamagishi
DRL
SSL
88
61
0
30 Jul 2018
Back-Translation-Style Data Augmentation for End-to-End ASR
Tomoki Hayashi
Shinji Watanabe
Yu Zhang
Tomoki Toda
Takaaki Hori
Ramón Fernández Astudillo
K. Takeda
94
103
0
28 Jul 2018
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
Wei Ping
Kainan Peng
Jitong Chen
102
347
0
19 Jul 2018
The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems
Adaeze Adigwe
Noé Tits
Kevin El Haddad
Sarah Ostadabbas
Thierry Dutoit
69
80
0
25 Jun 2018
Sounderfeit: Cloning a Physical Model using a Conditional Adversarial Autoencoder
Stephen Sinclair
GAN
37
1
0
25 Jun 2018
EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System
Hao Li
Yongguo Kang
Zhenyu Wang
51
21
0
25 Jun 2018
A Variational Prosody Model for Mapping the Context-Sensitive Variation of Functional Prosodic Prototypes
B. Gerazov
Gérard Bailly
Omar Mohammed
Yi Xu
Philip N. Garner
44
7
0
22 Jun 2018
Frequency domain variants of velvet noise and their application to speech processing and synthesis: with appendices
Hideki Kawahara
Ken-Ichi Sakakibara
Masanori Morise
Hideki Banno
Tomoki Toda
Toshio Irino
28
10
0
18 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhifeng Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
270
838
0
12 Jun 2018
Voice Imitating Text-to-Speech Neural Networks
Younggun Lee
Taesu Kim
Soo-Young Lee
65
11
0
04 Jun 2018
Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq
Oleksii Kuchaiev
Boris Ginsburg
Igor Gitman
Vitaly Lavrukhin
Jason Chun Lok Li
Huyen Nguyen
Carl Case
Paulius Micikevicius
VLM
72
49
0
25 May 2018
A Universal Music Translation Network
Noam Mor
Lior Wolf
Adam Polyak
Yaniv Taigman
89
110
0
21 May 2018
Collapsed speech segment detection and suppression for WaveNet vocoder
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Hayashi
Patrick Lumban Tobing
Tomoki Toda
72
25
0
30 Apr 2018
Automatic Documentation of ICD Codes with Far-Field Speech Recognition
Albert Haque
Corinna Fukushima
21
0
0
30 Apr 2018
Speaker-independent raw waveform model for glottal excitation
Lauri Juvela
Vassilis Tsiaras
Bajibabu Bollepalli
Manu Airaksinen
Junichi Yamagishi
P. Alku
54
39
0
25 Apr 2018
Conditional End-to-End Audio Transforms
Albert Haque
Michelle Guo
Prateek Verma
114
41
0
30 Mar 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
RJ Skerry-Ryan
Eric Battenberg
Y. Xiao
Yuxuan Wang
Daisy Stanton
Joel Shor
Ron J. Weiss
R. Clark
Rif A. Saurous
56
555
0
24 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
68
827
0
23 Mar 2018
Neural Network Quine
Oscar Chang
Hod Lipson
68
23
0
15 Mar 2018
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data
Jaime Lorenzo-Trueba
Fuming Fang
Xin Wang
Isao Echizen
Junichi Yamagishi
Tomi Kinnunen
66
73
0
02 Mar 2018
Do WaveNets Dream of Acoustic Waves?
Kanru Hua
26
1
0
23 Feb 2018
Fitting New Speakers Based on a Short Untranscribed Sample
Eliya Nachmani
Adam Polyak
Yaniv Taigman
Lior Wolf
53
84
0
20 Feb 2018
Neural Voice Cloning with a Few Samples
Sercan O. Arik
Jitong Chen
Kainan Peng
Wei Ping
Yanqi Zhou
82
388
0
14 Feb 2018
Adversarial Audio Synthesis
Chris Donahue
Julian McAuley
M. Puckette
GAN
147
616
0
12 Feb 2018