Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

25 October 2019

Papers citing "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"

50 / 464 papers shown

Title
N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement Gyeong-Hoon Lee Tae-Woo Kim Hanbin Bae Min-Ji Lee Young-Ik Kim Hoon-Young Cho VLM 79 20 0 29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis Jinhyeok Yang Jaesung Bae Taejun Bak Young-Ik Kim Hoon-Young Cho 134 37 0 29 Jun 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition Zhengxi Liu Y. Qian DRL 49 10 0 25 Jun 2021
Distilling the Knowledge from Conditional Normalizing Flows Dmitry Baranchuk Vladimir Aliev Artem Babenko BDL 85 2 0 24 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control M. Kang Sungjae Kim Injung Kim 77 3 0 21 Jun 2021
Non-native English lexicon creation for bilingual speech synthesis Arun Baby Pranav Jawale Saranya Vinnaitherthan Sumukh Badam Nagaraj Adiga Sharath Adavanne 44 8 0 21 Jun 2021
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis Jian Cong Shan Yang Lei Xie Jane Polak Scowcroft DRL 107 29 0 21 Jun 2021
Improving robustness of one-shot voice conversion with deep discriminative speaker encoder Hongqiang Du Lei Xie 64 6 0 19 Jun 2021
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion Disong Wang Liqun Deng Y. Yeung Xiao Chen Xunying Liu Helen Meng DRL 84 141 0 18 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis Nanxin Chen Yu Zhang Heiga Zen Ron J. Weiss Mohammad Norouzi Najim Dehak William Chan DiffM 97 88 0 17 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model Chenye Cui Yi Ren Jinglin Liu Feiyang Chen Rongjie Huang Ming Lei Zhou Zhao 66 35 0 17 Jun 2021
Improving the expressiveness of neural vocoding with non-affine Normalizing Flows Adam Gabryś Yunlong Jiao V. Klimkov Daniel Korzekwa Roberto Barra-Chicote 45 1 0 16 Jun 2021
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis Rohola Zandie Mohammad H. Mahoor Julia Madsen Eshrat S. Emamian 63 25 0 15 Jun 2021
Pathological voice adaptation with autoencoder-based voice conversion M. Illa B. Halpern Rob van Son Laureano Moro-Velazquez O. Scharenborg 40 13 0 15 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation Won Jang D. Lim Jaesam Yoon Bongwan Kim Juntae Kim 116 132 0 15 Jun 2021
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior Sang-gil Lee Heeseung Kim Chaehun Shin Xu Tan Chang-Shu Liu Qi Meng Tao Qin Wei Chen Sung-Hoon Yoon Tie-Yan Liu DiffM 85 89 0 11 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache René Peinl 45 0 0 11 Jun 2021
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis Ji-Hoon Kim Sang-Hoon Lee Ji-Hyun Lee Seong-Whan Lee 104 54 0 04 Jun 2021
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion Wen-Chin Huang Kazuhiro Kobayashi Yu-Huai Peng Ching-Feng Liu Yu Tsao Hsin-Min Wang Tomoki Toda 65 11 0 02 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion Bac Nguyen Cong Fabien Cardinaux AAML 124 42 0 02 Jun 2021
High-Fidelity and Low-Latency Universal Neural Vocoder based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling Patrick Lumban Tobing Tomoki Toda 60 8 0 20 May 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation Shoule Wu Ziqiang Shi DiffM 157 11 0 17 May 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech Vadim Popov Ivan Vovk Vladimir Gogoryan Tasnima Sadekova Mikhail Kudinov DiffM 117 544 0 13 May 2021
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism Jinglin Liu Chengxi Li Yi Ren Feiyang Chen Zhou Zhao DiffM 183 271 0 06 May 2021
End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks Rodrigo Mira Konstantinos Vougioukas Pingchuan Ma Stavros Petridis Björn W. Schuller Maja Pantic 112 47 0 27 Apr 2021
One Billion Audio Sounds from GPU-enabled Modular Synthesis Joseph P. Turian Jordie Shier George Tzanetakis K. McNally Max Henry 103 22 0 27 Apr 2021
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis Kosuke Futamata Byeong-Cheol Park Ryuichi Yamamoto Kentaro Tachibana 35 14 0 26 Apr 2021
An Adaptive Learning based Generative Adversarial Network for One-To-One Voice Conversion Sandipan Dhar N. D. Jana Swagatam Das 54 18 0 25 Apr 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 94 25 0 20 Apr 2021
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset Saida Mussakhojayeva Aigerim Janaliyeva A. Mirzakhmetov Yerbolat Khassanov H. A. Varol 61 14 0 17 Apr 2021
FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion Hirokazu Kameoka Kou Tanaka Takuhiro Kaneko 81 21 0 14 Apr 2021
Non-autoregressive sequence-to-sequence voice conversion Tomoki Hayashi Wen-Chin Huang Kazuhiro Kobayashi Tomoki Toda 41 24 0 14 Apr 2021
NoiseVC: Towards High Quality Zero-Shot Voice Conversion Shijun Wang Damian Borth DRL 75 6 0 13 Apr 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN Reo Yoneyama Yi-Chiao Wu Tomoki Toda 73 12 0 10 Apr 2021
Flavored Tacotron: Conditional Learning for Prosodic-linguistic Features Mahsa Elyasi Gaurav Bharaj 44 2 0 08 Apr 2021
The AS-NU System for the M2VoC Challenge Cheng-Hung Hu Yi-Chiao Wu Wen-Chin Huang Yu-Huai Peng Yu-Wen Chen Pin-Jui Ku Tomoki Toda Yu Tsao Hsin-Min Wang 54 1 0 07 Apr 2021
Fast DCTTS: Efficient Deep Convolutional Text-to-Speech M. Kang Jihyun Lee Simin Kim Injung Kim 54 6 0 01 Apr 2021
Adversarial Attacks and Defenses for Speech Recognition Systems Piotr Żelasko Sonal Joshi Yiwen Shao Jesus Villalba J. Trmal Najim Dehak Sanjeev Khudanpur AAML 60 29 0 31 Mar 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN Cong Wang Yu Chen Bin Wang Yi Shi 146 1 0 26 Mar 2021
GAN Vocoder: Multi-Resolution Discriminator Is All You Need J. You Dalhyun Kim Gyuhyeon Nam Geumbyeol Hwang Gyeongsu Chae 68 27 0 09 Mar 2021
crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder Kazuhiro Kobayashi Wen-Chin Huang Yi-Chiao Wu Patrick Lumban Tobing Tomoki Hayashi Tomoki Toda BDL DRL 65 19 0 04 Mar 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward Momina Masood M. Nawaz K. Malik A. Javed Aun Irtaza AAML 202 323 0 25 Feb 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Nobukatsu Hojo 73 60 0 25 Feb 2021
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input Brooke Stephenson Thomas Hueber Laurent Girin Laurent Besacier 89 10 0 19 Feb 2021
PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components Yukiya Hono Shinji Takaki Kei Hashimoto Keiichiro Oura Yoshihiko Nankaku K. Tokuda 69 16 0 15 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention Peng Liu Yuewen Cao Songxiang Liu Na Hu Guangzhi Li Chao Weng Jane Polak Scowcroft 95 22 0 12 Feb 2021
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search Renqian Luo Xu Tan Rui Wang Tao Qin Jinzhu Li Sheng Zhao Enhong Chen Tie-Yan Liu 64 61 0 08 Feb 2021
EMA2S: An End-to-End Multimodal Articulatory-to-Speech System Yu-Wen Chen Kuo-Hsuan Hung Shang-Yi Chuang Jonathan Sherman Wen-Chin Huang Xugang Lu Yu Tsao 46 16 0 07 Feb 2021
Universal Neural Vocoding with Parallel WaveNet Yunlong Jiao Adam Gabryś Georgi Tinchev Bartosz Putrycz Daniel Korzekwa V. Klimkov 81 42 0 01 Feb 2021
High Fidelity Speech Regeneration with Application to Speech Enhancement Adam Polyak Lior Wolf Yossi Adi Ori Kabeli Yaniv Taigman 55 19 0 31 Jan 2021