1712.05884
Cited By
v1
v2 (latest)
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
16 December 2017
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
Zongheng Yang
Zhifeng Chen
Yu Zhang
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
Papers citing "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions"
50 / 1,276 papers shown
Title
Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
100
99
0
25 Jun 2019
Adversarial Learning for Improved Onsets and Frames Music Transcription
Jong Wook Kim
J. P. Bello
176
49
0
20 Jun 2019
A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation
Hieu-Thi Luong
Junichi Yamagishi
67
10
0
18 Jun 2019
Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models
Wei Fang
Yu-An Chung
James R. Glass
61
27
0
17 Jun 2019
Parametric Resynthesis with neural vocoders
Soumi Maiti
Michael I. Mandel
68
19
0
16 Jun 2019
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
Eric Battenberg
Soroosh Mariooryad
Daisy Stanton
RJ Skerry-Ryan
Matt Shannon
David Kao
Tom Bagby
BDL
86
45
0
08 Jun 2019
Text-based Editing of Talking-head Video
Ohad Fried
A. Tewari
Michael Zollhöfer
Adam Finkelstein
Eli Shechtman
Dan B. Goldman
Kyle Genova
Zeyu Jin
Christian Theobalt
Maneesh Agrawala
VGen
91
262
0
04 Jun 2019
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
85
132
0
04 Jun 2019
Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS
Mutian He
Yan Deng
Lei He
83
81
0
03 Jun 2019
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion
Andy T. Liu
Po-Chun Hsu
Hung-yi Lee
SSL
73
30
0
28 May 2019
Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems
Ohsung Kwon
Eunwoo Song
Jae-Min Kim
Hong-Goo Kang
44
4
0
21 May 2019
Non-Autoregressive Neural Text-to-Speech
Kainan Peng
Wei Ping
Z. Song
Kexin Zhao
101
40
0
21 May 2019
MoGlow: Probabilistic and controllable motion synthesis using normalising flows
G. Henter
Simon Alexanderson
Jonas Beskow
94
98
0
16 May 2019
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Kaizhi Qian
Yang Zhang
Shiyu Chang
Xuesong Yang
M. Hasegawa-Johnson
135
471
0
14 May 2019
Learning to Groove with Inverse Sequence Transformations
Jon Gillick
Adam Roberts
Jesse Engel
Douglas Eck
David Bamman
SLR
BDL
77
81
0
14 May 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
95
102
0
13 May 2019
Machine learning in acoustics: theory and applications
Michael J. Bianco
Peter Gerstoft
James Traer
Emma Ozanich
M. Roch
Sharon Gannot
Charles-Alban Deledalle
AI4CE
89
391
0
11 May 2019
High quality, lightweight and adaptable TTS using LPCNet
Zvi Kons
Slava Shechtman
A. Sorin
Carmel Rabinovitz
R. Hoory
67
54
0
02 May 2019
Deep Learning for Audio Signal Processing
Hendrik Purwins
Yue Liu
Tuomas Virtanen
Jan Schlüter
Shuo-yiin Chang
Tara N. Sainath
VLM
119
598
0
30 Apr 2019
Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text
M. Baskar
Shinji Watanabe
Ramón Fernández Astudillo
Takaaki Hori
L. Burget
J. Černocký
75
40
0
30 Apr 2019
The Zero Resource Speech Challenge 2019: TTS without T
Ewan Dunbar
Robin Algayres
Julien Karadayi
Mathieu Bernard
Juan Benjumea
...
Lucas Ondel
A. Black
Laurent Besacier
S. Sakti
Emmanuel Dupoux
85
117
0
25 Apr 2019
TTS Skins: Speaker Conversion via ASR
Adam Polyak
Lior Wolf
Yaniv Taigman
76
28
0
18 Apr 2019
Expediting TTS Synthesis with Adversarial Vocoding
Paarth Neekhara
Chris Donahue
M. Puckette
Shlomo Dubnov
Julian McAuley
66
20
0
16 Apr 2019
Unsupervised Singing Voice Conversion
Eliya Nachmani
Lior Wolf
82
56
0
13 Apr 2019
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning
Tao Tu
Yuan-Jui Chen
Cheng-chieh Yeh
Hung-yi Lee
82
88
0
13 Apr 2019
Building a mixed-lingual neural TTS system with only monolingual data
Liumeng Xue
Wei Song
Guanghui Xu
Lei Xie
Zhizheng Wu
57
30
0
12 Apr 2019
Direct speech-to-speech translation with a sequence-to-sequence model
Ye Jia
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Zhifeng Chen
Yonghui Wu
101
230
0
12 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm
Haohan Guo
Frank Soong
Lei He
Lei Xie
91
47
0
09 Apr 2019
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS
Haohan Guo
Frank Soong
Lei He
Lei Xie
61
30
0
09 Apr 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
70
55
0
09 Apr 2019
Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation
Fadi Biadsy
Ron J. Weiss
Pedro J. Moreno
D. Kanevsky
Ye Jia
88
115
0
08 Apr 2019
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram
Lauri Juvela
Bajibabu Bollepalli
Junichi Yamagishi
P. Alku
76
18
0
08 Apr 2019
Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data
Roee Levy Leshem
Raja Giryes
61
8
0
06 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhifeng Chen
Yonghui Wu
164
959
0
05 Apr 2019
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data
N. Prateek
Mateusz Lajszczak
Roberto Barra-Chicote
Thomas Drugman
Jaime Lorenzo-Trueba
Thomas Merritt
S. Ronanki
Trevor Wood
74
30
0
04 Apr 2019
Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis
Yanyao Bian
Changbin Chen
Yongguo Kang
Zhenglin Pan
77
46
0
04 Apr 2019
Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora
Hieu-Thi Luong
Xin Wang
Junichi Yamagishi
Nobuyuki Nishizawa
77
23
0
01 Apr 2019
Grammatical Error Correction and Style Transfer via Zero-shot Monolingual Translation
Elizaveta Korotkova
Agnes Luhtaru
Maksym Del
Krista Liin
Daiga Deksne
Mark Fishel
62
11
0
27 Mar 2019
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
Kyubyong Park
Thomas Mulc
70
101
0
27 Mar 2019
WGANSing: A Multi-Voice Singing Voice Synthesizer Based on the Wasserstein-GAN
Pritish Chandna
Merlijn Blaauw
J. Bonada
E. Gómez
91
62
0
26 Mar 2019
Deep Text-to-Speech System with Seq2Seq Model
Gary Wang
AI4TS
28
9
0
11 Mar 2019
Analysing Deep Learning-Spectral Envelope Prediction Methods for Singing Synthesis
F. Bous
A. Röbel
21
3
0
04 Mar 2019
A Unified Neural Architecture for Instrumental Audio Tasks
Steven Spratley
Daniel Beck
Trevor Cohn
56
5
0
01 Mar 2019
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Jonathan Shen
Patrick Nguyen
Yonghui Wu
Zhifeng Chen
Mengzhao Chen
...
William Chan
Shubham Toshniwal
Baohua Liao
M. Nirschl
Pat Rondon
VLM
113
211
0
21 Feb 2019
Audio-Linguistic Embeddings for Spoken Sentences
Albert Haque
Michelle Guo
Prateek Verma
Li Fei-Fei
80
51
0
20 Feb 2019
Data Efficient Voice Cloning for Neural Singing Synthesis
Merlijn Blaauw
J. Bonada
R. Daido
137
33
0
19 Feb 2019
Adversarial Generation of Time-Frequency Features with application in audio synthesis
Andrés Marafioti
Nicki Holighaus
Nathanael Perraudin
P. Majdak
60
68
0
11 Feb 2019
Unsupervised Polyglot Text To Speech
Eliya Nachmani
Lior Wolf
65
42
0
06 Feb 2019
Unsupervised speech representation learning using WaveNet autoencoders
J. Chorowski
Ron J. Weiss
Samy Bengio
Aaron van den Oord
SSL
76
319
0
25 Jan 2019
Exploring Transfer Learning for Low Resource Emotional TTS
Noé Tits
Kevin El Haddad
Thierry Dutoit
64
61
0
14 Jan 2019