Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

25 October 2019
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
arXiv: 1910.11480

Papers citing "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"

50 / 464 papers shown
Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN
Yin-Ping Cho
Yu Tsao
Hsin-Min Wang
Yi-Wen Liu
DiffM
88
9
0
21 Sep 2022
Exploiting Pre-trained Feature Networks for Generative Adversarial Networks in Audio-domain Loop Generation
Yen-Tung Yeh
Bo-Yu Chen
Yi-Hsuan Yang
77
6
0
05 Sep 2022
Music Separation Enhancement with Generative Modeling
N. Schaffer
Boaz Cogan
Ethan Manilow
Max Morrison
Prem Seetharaman
Bryan Pardo
73
9
0
26 Aug 2022
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio
Xin Yan
Jiangyan Yi
J. Tao
Chenglong Wang
Haoxin Ma
Tao Wang
Shiming Wang
Ruibo Fu
76
34
0
20 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
66
0
0
18 Aug 2022
Generative Data Augmentation Guided by Triplet Loss for Speech Emotion Recognition
Shijun Wang
Hamed Hemati
Jón Guðnason
Damian Borth
53
4
0
09 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
111
24
0
09 Aug 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
118
201
0
13 Jul 2022
Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech
Zhengxi Liu
Qiao Tian
Chenxu Hu
Xudong Liu
Meng-Che Wu
Yuping Wang
Hang Zhao
Yuxuan Wang
87
10
0
13 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
Tomoki Toda
69
0
0
13 Jul 2022
CFAD: A Chinese Dataset for Fake Audio Detection
Haoxin Ma
Jiangyan Yi
Chenglong Wang
Xin Yan
J. Tao
Tao Wang
Shiming Wang
Ruibo Fu
95
30
0
12 Jul 2022
DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders
Yanqing Liu
Rui Xue
Lei He
Xu Tan
Sheng Zhao
87
25
0
11 Jul 2022
NESC: Robust Neural End-2-End Speech Coding with GANs
N. Pia
Kishan Gupta
Srikanth Korse
M. Multrus
Guillaume Fuchs
103
16
0
07 Jul 2022
WeSinger 2: Fully Parallel Singing Voice Synthesis via Multi-Singer Conditional Adversarial Training
Zewang Zhang
Yibin Zheng
Xinhui Li
Li Lu
DiffM
171
11
0
05 Jul 2022
BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model
Brooke Stephenson
Laurent Besacier
Laurent Girin
Thomas Hueber
66
10
0
04 Jul 2022
TMGAN-PLC: Audio Packet Loss Concealment using Temporal Memory Generative Adversarial Network
Yuansheng Guan
Guochen Yu
Andong Li
C. Zheng
Jie Wang
112
9
0
04 Jul 2022
Towards Error-Resilient Neural Speech Coding
Huaying Xue
Xiulian Peng
Xue Jiang
Yan Lu
68
7
0
03 Jul 2022
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers
Liumeng Xue
Shan Yang
Na Hu
Jane Polak Scowcroft
Linfu Xie
51
2
0
02 Jul 2022
Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems
Hyun-Wook Yoon
Ohsung Kwon
Hoyeon Lee
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
Min-Jae Hwang
128
15
0
30 Jun 2022
TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder
Eunwoo Song
Ryuichi Yamamoto
Ohsung Kwon
Chan Song
Min-Jae Hwang
Suhyeon Oh
Hyun-Wook Yoon
Jin-Seob Kim
Jae-Min Kim
78
7
0
30 Jun 2022
A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion
Xu Li
Shansong Liu
Ying Shan
76
13
0
28 Jun 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
107
28
0
27 Jun 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
95
23
0
27 Jun 2022
Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion
Tuan Vu Ho
M. Kobayashi
M. Akagi
27
1
0
27 Jun 2022
Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis
Tae-Woo Kim
Minguk Kang
Gyeong-Hoon Lee
AAML
174
7
0
23 Jun 2022
Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals
Running Zhao
Jiang-Tao Luca Yu
Tingle Li
Hang Zhao
Edith C.H. Ngai
53
4
0
22 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
37
0
0
20 Jun 2022
Coin Flipping Neural Networks
Yuval Sieradzki
Nitzan Hodos
Gal Yehuda
Assaf Schuster
UQCV
85
3
0
18 Jun 2022
EPG2S: Speech Generation and Speech Enhancement based on Electropalatography and Audio Signals using Multimodal Learning
Lichin Chen
Po-Hsun Chen
Richard Tzong-Han Tsai
Yu Tsao
56
8
0
16 Jun 2022
End-to-End Voice Conversion with Information Perturbation
Qicong Xie
Shan Yang
Yinjiao Lei
Linfu Xie
Jane Polak Scowcroft
70
7
0
15 Jun 2022
NatiQ: An End-to-end Text-to-Speech System for Arabic
Ahmed Abdelali
Nadir Durrani
C. Demiroğlu
Fahim Dalvi
Hamdy Mubarak
Kareem Darwish
77
14
0
15 Jun 2022
Streaming non-autoregressive model for any-to-many voice conversion
Ziyi Chen
Haoran Miao
Pengyuan Zhang
76
9
0
15 Jun 2022
Speak Like a Dog: Human to Non-human creature Voice Conversion
Kohei Suzuki
Shoki Sakamoto
T. Taniguchi
Hirokazu Kameoka
55
3
0
09 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
151
255
0
09 Jun 2022
Context-based out-of-vocabulary word recovery for ASR systems in Indian languages
Arun Baby
Saranya Vinnaitherthan
Akhil Kerhalkar
Pranav Jawale
Sharath Adavanne
Nagaraj Adiga
51
1
0
09 Jun 2022
Universal Speech Enhancement with Score-based Diffusion
Joan Serrà
Santiago Pascual
Jordi Pons
R. O. Araz
D. Scaini
DiffM
114
105
0
07 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
74
28
0
20 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
82
8
0
19 May 2022
Macedonian Speech Synthesis for Assistive Technology Applications
B. Sofronievski
Elena Velovska
Martin Velichkovski
Violeta Argirova
Tea Veljkovikj
...
Kristijan Lazarev
Toni Bachvarovski
Z. Ivanovski
Dimitar Tashkovski
B. Gerazov
20
0
0
18 May 2022
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization
Yuhta Takida
Takashi Shibuya
Wei-Hsiang Liao
Chieh-Hsin Lai
Junki Ohmura
Toshimitsu Uesaka
Naoki Murata
Shusuke Takahashi
Toshiyuki Kumakura
Yuki Mitsufuji
BDL
85
67
0
16 May 2022
cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms
Tracy Qian
Jackson Kaunismaa
Tony Chung
MGen, GAN, MedIm
40
6
0
15 May 2022
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
70
14
0
12 May 2022
Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts
Paige Tuttosi
Emma Hughson
Akihiro Matsufuji
Angelica Lim
69
4
0
10 May 2022
Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis
Jiatong Shi
Shuai Guo
Tao Qian
Nan Huo
Tomoki Hayashi
...
Xuankai Chang
Hua-Wei Li
Peter Wu
Shinji Watanabe
Qin Jin
VLM
111
27
0
09 May 2022
SVTS: Scalable Video-to-Speech Synthesis
Rodrigo Mira
A. Haliassos
Stavros Petridis
Björn W. Schuller
Maja Pantic
71
35
0
04 May 2022
Parallel Synthesis for Autoregressive Speech Generation
Po-Chun Hsu
Da-Rong Liu
Andy T. Liu
Hung-yi Lee
74
5
0
25 Apr 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
Zhenhui Ye
Zhou Zhao
Yi Ren
Leilei Gan
88
28
0
25 Apr 2022
Dictionary Attacks on Speaker Verification
Mirko Marras
Pawel Korus
Anubhav Jain
N. Memon
AAML
79
10
0
24 Apr 2022
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
Detai Xin
Shinnosuke Takamichi
T. Okamoto
Hisashi Kawai
Hiroshi Saruwatari
34
0
0
22 Apr 2022
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
Ryo Terashima
Ryuichi Yamamoto
Eunwoo Song
Yuma Shirahata
Hyun-Wook Yoon
Jae-Min Kim
Kentaro Tachibana
52
16
0
21 Apr 2022