Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1807.07281
Cited By
v1
v2
v3 (latest)
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
19 July 2018
Ming-Yu Liu
Kainan Peng
Jitong Chen
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech"
50 / 135 papers shown
Title
Improved parallel WaveGAN vocoder with perceptually weighted spectrogram loss
Eunwoo Song
Ryuichi Yamamoto
Min-Jae Hwang
Jin-Seob Kim
Ohsung Kwon
Jae-Min Kim
68
14
0
19 Jan 2021
Building Multi lingual TTS using Cross Lingual Voice Conversion
Qinghua Sun
Kenji Nagamatsu
13
3
0
28 Dec 2020
I'm Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch
Joseph P. Turian
Max Henry
49
31
0
08 Dec 2020
EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture
Chenfeng Miao
Shuang Liang
Zhencheng Liu
Minchuan Chen
Jun Ma
Shaojun Wang
Jing Xiao
67
38
0
07 Dec 2020
Multi-Instrumentalist Net: Unsupervised Generation of Music from Body Movements
Kun Su
Xiulong Liu
Eli Shlizerman
91
29
0
07 Dec 2020
Text-to-speech for the hearing impaired
Josef Schlittenlacher
T. Baer
32
0
0
03 Dec 2020
MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
44
8
0
03 Dec 2020
Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder
Sam Davis
Giuseppe Coccia
Sam Gooch
Julian Mack
36
0
0
20 Nov 2020
Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis
Erica Cooper
Xin Wang
Yi Zhao
Yusuke Yasuda
Junichi Yamagishi
SyDa
50
3
0
10 Nov 2020
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation
Yang Ai
Haoyu Li
Xin Wang
Junichi Yamagishi
Zhenhua Ling
47
4
0
08 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
99
101
0
06 Nov 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators
Ryuichi Yamamoto
Eunwoo Song
Min-Jae Hwang
Jae-Min Kim
74
18
0
27 Oct 2020
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis
Min-Jae Hwang
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
44
32
0
26 Oct 2020
NU-GAN: High resolution neural upsampling with GAN
Rithesh Kumar
Kundan Kumar
Vicki Anand
Yoshua Bengio
Aaron Courville
65
26
0
22 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
183
1,954
0
12 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Toda
86
8
0
09 Oct 2020
Improving Sequential Latent Variable Models with Autoregressive Flows
Joseph Marino
Lei Chen
Jiawei He
Stephan Mandt
BDL
AI4TS
127
12
0
07 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
Shogo Seki
DiffM
124
21
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
219
1,471
0
21 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM
BDL
155
795
0
02 Sep 2020
Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
99
20
0
27 Aug 2020
Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder
Hyun-Wook Yoon
Sang-Hoon Lee
Hyeong-Rae Noh
Seong-Whan Lee
111
11
0
16 Aug 2020
Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems
Ravichander Vipperla
Sangjun Park
Kihyun Choo
Samin S. Ishtiaq
Kyoungbo Min
S. Bhattacharya
Abhinav Mehrotra
Alberto Gil C. P. Ramos
Nicholas D. Lane
72
26
0
11 Aug 2020
Unsupervised Learning For Sequence-to-sequence Text-to-speech For Low-resource Languages
Haitong Zhang
Yue Lin
53
30
0
11 Aug 2020
SpeedySpeech: Efficient Neural Speech Synthesis
Jan Vainer
Ondrej Dusek
66
43
0
09 Aug 2020
Unsupervised Cross-Domain Singing Voice Conversion
Adam Polyak
Lior Wolf
Yossi Adi
Yaniv Taigman
58
44
0
06 Aug 2020
HooliGAN: Robust, High Quality Neural Vocoding
Ollie McCarthy
Zo Ahmed
95
14
0
06 Aug 2020
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Jinhyeok Yang
Junmo Lee
Young-Ik Kim
Hoonyoung Cho
Injung Kim
82
73
0
30 Jul 2020
Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation
Xiaoyuan Yi
Hyeonseung Lee
Wenhao Li
Hyung Yong Kim
Nam Soo Kim
84
22
0
25 Jul 2020
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network
Yi-Chiao Wu
Tomoki Hayashi
Patrick Lumban Tobing
Kazuhiro Kobayashi
Tomoki Toda
50
18
0
11 Jul 2020
DeepSinger: Singing Voice Synthesis with Data Mined From the Web
Yi Ren
Xu Tan
Tao Qin
Jian Luan
Zhou Zhao
Tie-Yan Liu
112
73
0
09 Jul 2020
Deep generative models for musical audio synthesis
M. Huzaifah
L. Wyse
210
20
0
10 Jun 2020
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis
Hyeongju Kim
Hyeongseung Lee
Woohyun Kang
Sung Jun Cheon
Byoung Jin Choi
N. Kim
67
12
0
08 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
155
1,415
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
85
187
0
05 Jun 2020
NAUTILUS: a Versatile Voice Cloning System
Hieu-Thi Luong
Junichi Yamagishi
100
53
0
22 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
T. Okamoto
Hisashi Kawai
Tomoki Toda
73
4
0
18 May 2020
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding
Seungwoo Choi
Seungju Han
Dongyoung Kim
S. Ha
91
67
0
18 May 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
Tomoki Toda
ViT
83
30
0
18 May 2020
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU
Po-Chun Hsu
Hung-yi Lee
44
16
0
15 May 2020
Reverberation Modeling for Source-Filter-based Neural Vocoder
Yang Ai
Xin Wang
Junichi Yamagishi
Zhenhua Ling
59
3
0
15 May 2020
FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction
Qiao Tian
Zewang Zhang
Heng Lu
Linghui Chen
Shan Liu
69
22
0
12 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
70
32
0
12 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
Geng Yang
Shan Yang
Kai-Chun Liu
Peng Fang
Wei Chen
Lei Xie
153
200
0
11 May 2020
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint
Zexin Cai
Chuxiong Zhang
Ming Li
73
42
0
10 May 2020
Jukebox: A Generative Model for Music
Prafulla Dhariwal
Heewoo Jun
Christine Payne
Jong Wook Kim
Alec Radford
Ilya Sutskever
VLM
171
758
0
30 Apr 2020
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders
Yang Ai
Zhenhua Ling
65
8
0
16 Apr 2020
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment
Zhen Zeng
Jianzong Wang
Ning Cheng
Tian Xia
Jing Xiao
VLM
75
56
0
04 Mar 2020
WaveFlow: A Compact Flow-based Model for Raw Audio
Ming-Yu Liu
Kainan Peng
Kexin Zhao
Z. Song
102
117
0
03 Dec 2019
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
195
821
0
25 Oct 2019
Previous
1
2
3
Next