ResearchTrend.AI
arXiv: 1712.05884

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

16 December 2017
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
Zongheng Yang
Zhiwen Chen
Yu Zhang
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu

Papers citing "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions"

50 / 1,276 papers shown
Title
Improving Low Resource Code-switched ASR using Augmented Code-switched TTS
Yash Sharma
Basil Abraham
Karan Taneja
Preethi Jyothi
61
21
0
12 Oct 2020
Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN
Patrick Lumban Tobing
Yi-Chiao Wu
Tomoki Toda
DRL
60
14
0
09 Oct 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen
Ye Jia
Mike Chrzanowski
Yu Zhang
Isaac Elias
Heiga Zen
Yonghui Wu
106
112
0
08 Oct 2020
The Academia Sinica Systems of Voice Conversion for VCC2020
Yu-Huai Peng
Cheng-Hung Hu
A. Kang
Hung-Shin Lee
Pin-Yuan Chen
Yu Tsao
Hsin-Min Wang
66
2
0
06 Oct 2020
Neural Speech Synthesis for Estonian
Liisa Rätsep
Liisi Piits
Hille Pajupuu
Indrek Hein
Mark Fišel
15
2
0
06 Oct 2020
Transfer Learning from Monolingual ASR to Transcription-free Cross-lingual Voice Conversion
Che-Jui Chang
65
5
0
30 Sep 2020
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
Mingyang Zhang
Yi Zhou
Li Zhao
Haizhou Li
92
53
0
30 Sep 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM, BDL
222
1,471
0
21 Sep 2020
Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis
Yukiya Hono
Kazuna Tsuboi
Kei Sawada
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
BDL
57
24
0
17 Sep 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features
T. Raitio
Ramya Rasipuram
D. Castellani
78
66
0
14 Sep 2020
Visual-speech Synthesis of Exaggerated Corrective Feedback
Yaohua Bu
Weijun Li
Tianyi Ma
S. Chen
Jia Jia
Kun Li
Xiaobo Lu
35
1
0
12 Sep 2020
RECOApy: Data recording, pre-processing and phonetic transcription for end-to-end speech-based applications
Adriana Stan
75
5
0
11 Sep 2020
Exploration of End-to-end Synthesisers for Zero Resource Speech Challenge 2020
Karthik Pandia D.S.
Anusha Prakash
M. M.
H. Murthy
42
4
0
10 Sep 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
Songxiang Liu
Yuewen Cao
Disong Wang
Xixin Wu
Xunying Liu
Helen Meng
BDL
116
92
0
06 Sep 2020
What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS
Brooke Stephenson
Laurent Besacier
Laurent Girin
Thomas Hueber
76
14
0
04 Sep 2020
HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis
Jiawei Chen
Xu Tan
Jian Luan
Tao Qin
Tie-Yan Liu
VLM
104
93
0
03 Sep 2020
Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer
Jing-Xuan Zhang
Li-Juan Liu
Yan-Nian Chen
Ya-Jun Hu
Yuan Jiang
Zhenhua Ling
Lirong Dai
49
17
0
03 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM, BDL
158
795
0
02 Sep 2020
DrumGAN: Synthesis of Drum Sounds With Timbral Feature Conditioning Using Generative Adversarial Networks
J. Nistal
Stefan Lattner
G. Richard
GAN
78
56
0
27 Aug 2020
Generating Handwriting via Decoupled Style Descriptors
Atsunobu Kotani
Stefanie Tellex
James Tompkin
72
25
0
26 Aug 2020
Efficient neural speech synthesis for low-resource languages through multilingual modeling
M. D. Korte
Jaebok Kim
E. Klabbers
59
19
0
20 Aug 2020
Unsupervised Acoustic Unit Representation Learning for Voice Conversion using WaveNet Auto-encoders
Mingjie Chen
Thomas Hain
SSL, DRL
54
15
0
16 Aug 2020
Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder
Hyun-Wook Yoon
Sang-Hoon Lee
Hyeong-Rae Noh
Seong-Whan Lee
111
11
0
16 Aug 2020
Textual Echo Cancellation
Shaojin Ding
Ye Jia
Ke Hu
Quan Wang
81
8
0
13 Aug 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
61
8
0
13 Aug 2020
Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS
Rui Liu
Berrak Sisman
F. Bao
Guanglai Gao
Haizhou Li
41
18
0
11 Aug 2020
Unsupervised Learning For Sequence-to-sequence Text-to-speech For Low-resource Languages
Haitong Zhang
Yue Lin
53
30
0
11 Aug 2020
Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training
Jian Cong
Shan Yang
Lei Xie
Guoqiao Yu
Guanglu Wan
72
31
0
10 Aug 2020
SpeedySpeech: Efficient Neural Speech Synthesis
Jan Vainer
Ondrej Dusek
66
43
0
09 Aug 2020
Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling
Yeunju Choi
Youngmoon Jung
Hoirin Kim
139
26
0
09 Aug 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition
Jin Xu
Xu Tan
Yi Ren
Tao Qin
Jian Li
Sheng Zhao
Tie-Yan Liu
VLM
70
91
0
09 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
139
329
0
09 Aug 2020
Controllable Neural Prosody Synthesis
Max Morrison
Zeyu Jin
Justin Salamon
Nicholas J. Bryan
G. J. Mysore
57
20
0
07 Aug 2020
A Machine of Few Words -- Interactive Speaker Recognition with Reinforcement Learning
Mathieu Seurin
Florian Strub
Philippe Preux
Olivier Pietquin
49
5
0
07 Aug 2020
Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning
D. Mohan
R. Lenain
Lorenzo Foglianti
Tian Huey Teh
Marlene Staib
Alexandra Torresquintero
Jiameng Gao
AI4TS
53
11
0
07 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
118
40
0
07 Aug 2020
Phonological Features for 0-shot Multilingual Speech Synthesis
Marlene Staib
Tian Huey Teh
Alexandra Torresquintero
D. Mohan
Lorenzo Foglianti
R. Lenain
Jiameng Gao
60
33
0
06 Aug 2020
HooliGAN: Robust, High Quality Neural Vocoding
Ollie McCarthy
Zo Ahmed
95
14
0
06 Aug 2020
PPSpeech: Phrase based Parallel End-to-End TTS System
Yahuan Cong
Ran Zhang
Jian Luan
45
3
0
06 Aug 2020
Recognition-Synthesis Based Non-Parallel Voice Conversion with Adversarial Learning
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
81
6
0
05 Aug 2020
Expressive TTS Training with Frame and Style Reconstruction Loss
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
112
73
0
04 Aug 2020
One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
Tomás Nekvinda
Ondrej Dusek
72
57
0
03 Aug 2020
Audiovisual Speech Synthesis using Tacotron2
Ahmed Hussen Abdelaziz
Anushree Prasanna Kumar
Chloe Seivwright
Gabriele Fanelli
Justin Binder
Y. Stylianou
S. Kajarekar
54
15
0
03 Aug 2020
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis
Fengyu Yang
Shan Yang
Qinghua Wu
Yujun Wang
Lei Xie
73
5
0
03 Aug 2020
Hearing What You Cannot See: Acoustic Vehicle Detection Around Corners
Yannick Schulz
Avinash Kini Mattar
Thomas M. Hehn
Julian F. P. Kooij
20
0
0
30 Jul 2020
Speaking Speed Control of End-to-End Speech Synthesis using Sentence-Level Conditioning
Jaesung Bae
Hanbin Bae
Young-Sun Joo
Junmo Lee
Gyeong-Hoon Lee
Hoon-Young Cho
73
17
0
30 Jul 2020
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Jinhyeok Yang
Junmo Lee
Young-Ik Kim
Hoonyoung Cho
Injung Kim
82
73
0
30 Jul 2020
Detecting and analysing spontaneous oral cancer speech in the wild
B. Halpern
Rob van Son
M. V. D. Brekel
O. Scharenborg
38
9
0
28 Jul 2020
Multi-speaker Emotion Conversion via Latent Variable Regularization and a Chained Encoder-Decoder-Predictor Network
Ravi Shankar
Hsi-Wei Hsieh
N. Charon
A. Venkataraman
113
11
0
25 Jul 2020
Non-parallel Emotion Conversion using a Deep-Generative Hybrid Network and an Adversarial Pair Discriminator
Ravi Shankar
Jacob Sager
A. Venkataraman
GAN
111
20
0
25 Jul 2020