1909.11646
High Fidelity Speech Synthesis with Adversarial Networks
International Conference on Learning Representations (ICLR), 2019
25 September 2019
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
Papers citing
"High Fidelity Speech Synthesis with Adversarial Networks"
50 / 153 papers shown
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
349
435
0
29 Jun 2021
AI based Presentation Creator With Customized Audio Content Delivery
Muvazima Mansoor
Srikanth Chandar
Ramamoorthy Srinath
176
1
0
27 Jun 2021
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis
Interspeech (Interspeech), 2021
Jian Cong
Shan Yang
Lei Xie
Jane Polak Scowcroft
DRL
166
29
0
21 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Interspeech (Interspeech), 2021
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
213
97
0
17 Jun 2021
Non Gaussian Denoising Diffusion Models
Eliya Nachmani
Robin San Roman
Lior Wolf
VLM
DiffM
163
59
0
14 Jun 2021
Catch-A-Waveform: Learning to Generate Audio from a Single Short Example
Neural Information Processing Systems (NeurIPS), 2021
Gal Greshler
Tamar Rott Shaham
T. Michaeli
197
27
0
11 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache
René Peinl
153
0
0
11 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
International Conference on Machine Learning (ICML), 2021
Jaehyeon Kim
Jungil Kong
Juhee Son
DRL
298
1,151
0
11 Jun 2021
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Interspeech (Interspeech), 2021
Ji-Hoon Kim
Sang-Hoon Lee
Ji-Hyun Lee
Seong-Whan Lee
198
60
0
04 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Bac Nguyen Cong
Fabien Cardinaux
AAML
195
48
0
02 Jun 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation
Shoule Wu
Ziqiang Shi
DiffM
238
11
0
17 May 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
International Conference on Machine Learning (ICML), 2021
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
395
660
0
13 May 2021
VQCPC-GAN: Variable-Length Adversarial Audio Synthesis Using Vector-Quantized Contrastive Predictive Coding
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021
J. Nistal
Cyran Aouameur
Stefan Lattner
G. Richard
235
7
0
04 May 2021
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
632
643
0
20 Apr 2021
Noise Estimation for Generative Diffusion Models
Robin San-Roman
Eliya Nachmani
Lior Wolf
DiffM
288
117
0
06 Apr 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
504
410
0
25 Feb 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
127
71
0
25 Feb 2021
AudioVisual Speech Synthesis: A brief literature review
Efthymios Georgiou
Athanasios Katsamanis
77
0
0
18 Feb 2021
High Fidelity Speech Regeneration with Application to Speech Enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Adam Polyak
Lior Wolf
Yossi Adi
Ori Kabeli
Yaniv Taigman
154
19
0
31 Jan 2021
Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade
Findings (Findings), 2020
Jiatao Gu
X. Kong
246
144
0
31 Dec 2020
MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution
Spoken Language Technology Workshop (SLT), 2020
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
153
8
0
03 Dec 2020
A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions
Shulei Ji
Jing Luo
Xinyu Yang
MGen
268
143
0
13 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
196
106
0
06 Nov 2020
StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization
Ahmed Mustafa
N. Pia
Guillaume Fuchs
180
90
0
03 Nov 2020
Speech Synthesis and Control Using Differentiable DSP
Giorgio Fabbro
Vladimir Golkov
Thomas Kemp
Zorah Lähner
181
13
0
28 Oct 2020
Upsampling artifacts in neural audio synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Jordi Pons
Santiago Pascual
Giulio Cengarle
Joan Serrà
213
69
0
27 Oct 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Ryuichi Yamamoto
Eunwoo Song
Min-Jae Hwang
Jae-Min Kim
175
19
0
27 Oct 2020
CLAR: Contrastive Learning of Auditory Representations
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Haider Al-Tahan
Y. Mohsenzadeh
SSL
380
67
0
19 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
493
2,433
0
12 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Toda
172
9
0
09 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
International Conference on Learning Representations (ICLR), 2020
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
672
1,759
0
21 Sep 2020
HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis
Jiawei Chen
Xu Tan
Jian Luan
Tao Qin
Tie-Yan Liu
VLM
218
106
0
03 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
International Conference on Learning Representations (ICLR), 2020
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM
BDL
410
886
0
02 Sep 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit
Interspeech (Interspeech), 2020
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
138
9
0
13 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
440
389
0
09 Aug 2020
A Spectral Energy Distance for Parallel Speech Synthesis
A. Gritsenko
Tim Salimans
Rianne van den Berg
Jasper Snoek
Nal Kalchbrenner
236
78
0
03 Aug 2020
Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer
Jie Wu
Jian Luan
158
28
0
18 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Adrian Łańcucki
265
388
0
11 Jun 2020
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Interspeech (Interspeech), 2020
Jiaqi Su
Zeyu Jin
Adam Finkelstein
171
155
0
10 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
317
192
0
05 Jun 2020
Speech-to-Singing Conversion based on Boundary Equilibrium GAN
Interspeech (Interspeech), 2020
Da-Yi Wu
Yi-Hsuan Yang
GAN
205
8
0
28 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
T. Okamoto
Hisashi Kawai
Tomoki Toda
152
4
0
18 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle
Kevin J. Shih
R. Prenger
Bryan Catanzaro
271
131
0
12 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
Geng Yang
Shan Yang
Kai-Chun Liu
Peng Fang
Wei Chen
Lei Xie
225
225
0
11 May 2020
GACELA -- A generative adversarial context encoder for long audio inpainting
Andrés Marafioti
P. Majdak
Nicki Holighaus
Nathanael Perraudin
266
51
0
11 May 2020
Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data
Seung-won Park
Doo-young Kim
Myun-chul Joe
155
45
0
07 May 2020
Conditional Spoken Digit Generation with StyleGAN
Interspeech (Interspeech), 2020
Kasperi Palkama
Lauri Juvela
Alexander Ilin
GAN
224
11
0
28 Apr 2020
Transformation-based Adversarial Video Prediction on Large-Scale Data
Pauline Luc
Aidan Clark
Sander Dieleman
Diego de Las Casas
Yotam Doron
Albin Cassirer
Karen Simonyan
VGen
1.0K
91
0
09 Mar 2020
A Limited-Capacity Minimax Theorem for Non-Convex Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets
Gauthier Gidel
David Balduzzi
Wojciech M. Czarnecki
M. Garnelo
Yoram Bachrach
255
7
0
14 Feb 2020
Score and Lyrics-Free Singing Voice Generation
International Conference on Computational Creativity (ICCC), 2019
Jen-Yu Liu
Yu-Hua Chen
Yin-Cheng Yeh
Yi-Hsuan Yang
170
23
0
26 Dec 2019