MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

8 October 2019

Aaron Courville

Papers citing "MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis"

50 / 226 papers shown

GANtron: Emotional Speech Synthesis with Generative Adversarial Networks E. Hortal Rodrigo Brechard Alarcia GAN 26 2 0 06 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis Cheng-I Jeff Lai Erica Cooper Yang Zhang Shiyu Chang Kaizhi Qian ... Yung-Sung Chuang Alexander H. Liu Junichi Yamagishi David D. Cox James R. Glass 28 6 0 04 Oct 2021
Bilateral Denoising Diffusion Models Max W. Y. Lam Jun Wang Rongjie Huang Dan Su Dong Yu DiffM 33 42 0 26 Aug 2021
StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition Shoki Sakamoto Akira Taniguchi T. Taniguchi Hirokazu Kameoka BDL 31 5 0 10 Aug 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs J. Nistal Stefan Lattner G. Richard 28 8 0 03 Aug 2021
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model Cheng-Hung Hu Yu-Huai Peng Junichi Yamagishi Yu Tsao Hsin-Min Wang 29 5 0 20 Jul 2021
FoleyGAN: Visually Guided Generative Adversarial Network-Based Synchronous Sound Generation in Silent Videos Sanchita Ghose John J. Prevost GAN 27 26 0 20 Jul 2021
Neural Waveshaping Synthesis B. Hayes C. Saitis Gyorgy Fazekas 36 28 0 11 Jul 2021
SoundStream: An End-to-End Neural Audio Codec Neil Zeghidour Alejandro Luebs Ahmed Omran Jan Skoglund Marco Tagliasacchi AI4TS 43 744 0 07 Jul 2021
Adversarial Auto-Encoding for Packet Loss Concealment Santiago Pascual Joan Serrà Jordi Pons 31 27 0 07 Jul 2021
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion Daxin Tan Liqun Deng Y. Yeung Xin Jiang Xiao Chen Tan Lee 29 38 0 04 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 23 353 0 29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis Jinhyeok Yang Jaesung Bae Taejun Bak Young-Ik Kim Hoon-Young Cho 34 36 0 29 Jun 2021
AI based Presentation Creator With Customized Audio Content Delivery Muvazima Mansoor Srikanth Chandar Ramamoorthy Srinath 26 0 0 27 Jun 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition Zhengxi Liu Y. Qian DRL 24 10 0 25 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis Nanxin Chen Yu Zhang Heiga Zen Ron J. Weiss Mohammad Norouzi Najim Dehak William Chan DiffM 23 88 0 17 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation Won Jang D. Lim Jaesam Yoon Bongwan Kim Juntae Kim 38 125 0 15 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim Jungil Kong Juhee Son DRL 89 847 0 11 Jun 2021
NWT: Towards natural audio-to-video generation with representation learning Rayhane Mama Marc S. Tyndel Hashiam Kadhim Cole Clifford Ragavan Thurairatnam VGen 34 12 0 08 Jun 2021
Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation Dong Min Dong Bok Lee Eunho Yang Sung Ju Hwang 25 160 0 06 Jun 2021
Style-Restricted GAN: Multi-Modal Translation with Style Restriction Using Generative Adversarial Networks Sho Inoue T. Gonsalves GAN 20 0 0 17 May 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech Vadim Popov Ivan Vovk Vladimir Gogoryan Tasnima Sadekova Mikhail Kudinov DiffM 61 515 0 13 May 2021
End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks Rodrigo Mira Konstantinos Vougioukas Pingchuan Ma Stavros Petridis Björn W. Schuller Maja Pantic 35 43 0 27 Apr 2021
Improving Neural Silent Speech Interface Models by Adversarial Training Amin Honarmandi Shandiz L. Tóth G. Gosztolya Alexandra Markó Tamás Gábor Csapó AAML GAN 24 7 0 23 Apr 2021
Reconstructing Speech from Real-Time Articulatory MRI Using Neural Vocoders Yicong Yu Amin Honarmandi Shandiz L. Tóth 22 18 0 23 Apr 2021
Cyclic Defense GAN Against Speech Adversarial Attacks Mohammad Esmaeilpour P. Cardinal Alessandro Lameiras Koerich AAML 32 7 0 26 Mar 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN Cong Wang Yu Chen Bin Wang Yi Shi 35 1 0 26 Mar 2021
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models Sam Bond-Taylor Adam Leach Yang Long Chris G. Willcocks VLM TPM 48 485 0 08 Mar 2021
A Spectral Enabled GAN for Time Series Data Generation Kaleb E. Smith Anthony O. Smith GAN 30 12 0 02 Mar 2021
AdaSpeech: Adaptive Text to Speech for Custom Voice Mingjian Chen Xu Tan Bohan Li Yanqing Liu Tao Qin Sheng Zhao Tie-Yan Liu VLM DiffM 37 188 0 01 Mar 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Nobukatsu Hojo 38 57 0 25 Feb 2021
Context-Aware Prosody Correction for Text-Based Speech Editing Max Morrison Lucas Rencker Zeyu Jin Nicholas J. Bryan Juan-Pablo Caceres Bryan Pardo 30 28 0 16 Feb 2021
CDPAM: Contrastive learning for perceptual audio similarity Pranay Manocha Zeyu Jin Richard Y. Zhang Adam Finkelstein 27 68 0 09 Feb 2021
Multi-Task Self-Supervised Pre-Training for Music Classification Ho-Hsiang Wu Chieh-Chi Kao Qingming Tang Ming Sun Brian McFee J. P. Bello Chao Wang SSL 39 37 0 05 Feb 2021
Universal Neural Vocoding with Parallel WaveNet Yunlong Jiao Adam Gabryś Georgi Tinchev Bartosz Putrycz Daniel Korzekwa V. Klimkov 36 42 0 01 Feb 2021
Iterative Text-based Editing of Talking-heads Using Neural Retargeting Xinwei Yao Ohad Fried Kayvon Fatahalian Maneesh Agrawala VGen 24 33 0 21 Nov 2020
s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis Xi Wang Huaiping Ming Lei He Frank Soong 19 5 0 17 Nov 2020
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis C. Chien Hung-yi Lee 32 36 0 12 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis Ron J. Weiss RJ Skerry-Ryan Eric Battenberg Soroosh Mariooryad Diederik P. Kingma 24 98 0 06 Nov 2020
Facial Keypoint Sequence Generation from Audio Prateek Manocha Prithwijit Guha 3DH VGen 25 0 0 02 Nov 2020
AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization Yen-Hao Chen Da-Yi Wu Tsung-Han Wu Hung-yi Lee 34 107 0 31 Oct 2020
Upsampling artifacts in neural audio synthesis Jordi Pons Santiago Pascual Giulio Cengarle Joan Serrà 35 62 0 27 Oct 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators Ryuichi Yamamoto Eunwoo Song Min-Jae Hwang Jae-Min Kim 29 18 0 27 Oct 2020
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Nobukatsu Hojo 29 78 0 22 Oct 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines Yao Shi Hui Bu Xin Xu Shaojing Zhang Ming Li 35 219 0 22 Oct 2020
NU-GAN: High resolution neural upsampling with GAN Rithesh Kumar Kundan Kumar Vicki Anand Yoshua Bengio Aaron Courville 27 25 0 22 Oct 2020
Real-time Speech Frequency Bandwidth Extension Yunpeng Li Marco Tagliasacchi Oleg Rybakov Victor Ungureanu Dominik Roblek 25 48 0 21 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong Jaehyeon Kim Jaekyoung Bae 66 1,869 0 12 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis Zhifeng Kong Ming-Yu Liu Jiaji Huang Kexin Zhao Bryan Catanzaro DiffM BDL 36 1,397 0 21 Sep 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features T. Raitio Ramya Rasipuram D. Castellani 42 66 0 14 Sep 2020