ResearchTrend.AI
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech

19 July 2018
Wei Ping
Kainan Peng
Jitong Chen

Papers citing "ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech"

35 / 135 papers shown
Title
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
178
961
0
08 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
309
240
0
25 Sep 2019
DurIAN: Duration Informed Attention Network For Multimodal Synthesis
Chengzhu Yu
Heng Lu
Na Hu
Meng Yu
Chao Weng
...
Deyi Tuo
Shiyin Kang
Guangzhi Lei
Dan Su
Dong Yu
CVBM
83
118
0
04 Sep 2019
Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis
Xin Wang
Junichi Yamagishi
66
32
0
27 Aug 2019
Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
Patrick Lumban Tobing
Kazuhiro Kobayashi
Tomoki Toda
49
16
0
01 Jul 2019
End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training
Peng Wu
Zhenhua Ling
Li-Juan Liu
Yuan Jiang
Hong-Chuan Wu
Lirong Dai
92
72
0
26 Jun 2019
A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis
Yang Ai
Zhenhua Ling
123
29
0
23 Jun 2019
Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models
Wei Fang
Yu-An Chung
James R. Glass
61
27
0
17 Jun 2019
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
85
132
0
04 Jun 2019
Discrete Flows: Invertible Generative Models of Discrete Data
Dustin Tran
Keyon Vafa
Kumar Krishna Agrawal
Laurent Dinh
Ben Poole
DRL
164
117
0
24 May 2019
Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems
Ohsung Kwon
Eunwoo Song
Jae-Min Kim
Hong-Goo Kang
48
4
0
21 May 2019
Non-Autoregressive Neural Text-to-Speech
Kainan Peng
Wei Ping
Z. Song
Kexin Zhao
101
40
0
21 May 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
95
102
0
13 May 2019
Neural source-filter waveform models for statistical parametric speech synthesis
Xin Wang
Shinji Takaki
Junichi Yamagishi
97
118
0
27 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm
Haohan Guo
Frank Soong
Lei He
Lei Xie
95
47
0
09 Apr 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
70
55
0
09 Apr 2019
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram
Lauri Juvela
Bajibabu Bollepalli
Junichi Yamagishi
P. Alku
76
18
0
08 Apr 2019
Towards Generalized Speech Enhancement with Generative Adversarial Networks
Santiago Pascual
Joan Serrà
Antonio Bonafonte
GAN
67
33
0
06 Apr 2019
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
80
19
0
05 Apr 2019
Unsupervised Polyglot Text To Speech
Eliya Nachmani
Lior Wolf
65
42
0
06 Feb 2019
Feature reinforcement with word embedding and parsing information in neural TTS
Huaiping Ming
Lei He
Haohan Guo
Frank Soong
157
15
0
03 Jan 2019
LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis
Min-Jae Hwang
Frank Soong
Fenglong Xie
Xi Wang
Hyeonjoo Kang
Hong-Goo Kang
67
21
0
29 Nov 2018
Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective
Zhong-Qiu Wang
Ke Tan
DeLiang Wang
119
95
0
22 Nov 2018
Representation Mixing for TTS Synthesis
Kyle Kastner
J. F. Santos
Yoshua Bengio
Aaron Courville
55
43
0
17 Nov 2018
Towards achieving robust universal neural vocoding
Jaime Lorenzo-Trueba
Thomas Drugman
Javier Latorre
Thomas Merritt
Bartosz Putrycz
Roberto Barra-Chicote
Alexis Moinet
Vatsal Aggarwal
DRL
149
19
0
15 Nov 2018
FloWaveNet : A Generative Flow for Raw Audio
Sungwon Kim
Sang-gil Lee
Jongyoon Song
Jaehyeon Kim
Sungroh Yoon
133
169
0
06 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
174
1,036
0
31 Oct 2018
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks
Lauri Juvela
Bajibabu Bollepalli
Junichi Yamagishi
P. Alku
74
23
0
30 Oct 2018
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention
Bajibabu Bollepalli
Lauri Juvela
P. Alku
51
4
0
29 Oct 2018
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language
Yusuke Yasuda
Xin Wang
Shinji Takaki
Junichi Yamagishi
63
87
0
29 Oct 2018
Neural source-filter-based waveform model for statistical parametric speech synthesis
Xin Wang
Shinji Takaki
Junichi Yamagishi
121
125
0
29 Oct 2018
STFT spectral loss for training a neural speech waveform model
Shinji Takaki
Toru Nakashika
Xin Wang
Junichi Yamagishi
75
21
0
29 Oct 2018
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer
Azam Rabiee
Geonmin Kim
Tae-Ho Kim
Soo-Young Lee
18
1
0
12 Oct 2018
Neural Speech Synthesis with Transformer Network
Naihan Li
Shujie Liu
Yanqing Liu
Sheng Zhao
Ming Liu
M. Zhou
85
102
0
19 Sep 2018
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
Yu-An Chung
Yuxuan Wang
Wei-Ning Hsu
Yu Zhang
RJ Skerry-Ryan
87
117
0
30 Aug 2018