Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

25 October 2019
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim

Papers citing "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"

50 / 464 papers shown
Title
Assessing the Generalization Gap of Learning-Based Speech Enhancement Systems in Noisy and Reverberant Environments
Philippe Gonzalez
T. S. Alstrøm
Tobias May
66
14
0
12 Sep 2023
CleanUNet 2: A Hybrid Speech Denoising Model on Waveform and Spectrogram
Zhifeng Kong
Ming-Yu Liu
Ambrish Dantrey
Bryan Catanzaro
51
7
0
12 Sep 2023
A Two-Stage Training Framework for Joint Speech Compression and Enhancement
Jiayi Huang
Zeyu Yan
Wenbin Jiang
Fei Wen
56
1
0
08 Sep 2023
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network
Takashi Shibuya
Yuhta Takida
Yuki Mitsufuji
71
11
0
06 Sep 2023
Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion
Wen-Chin Huang
Tomoki Toda
CVBM
99
5
0
05 Sep 2023
General Purpose Audio Effect Removal
Matthew Rice
C. Steinmetz
Georgy Fazekas
Joshua D. Reiss
73
8
0
30 Aug 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis
B. Hayes
Jordie Shier
Gyorgy Fazekas
Andrew McPherson
C. Saitis
83
25
0
29 Aug 2023
Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
Ji-Hoon Kim
Jaehun Kim
Joon Son Chung
59
7
0
29 Aug 2023
Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder
Xuyuan Li
Zengqiang Shang
Peiyang Shi
Hua Hua
Jian Liu
Pengyuan Zhang
84
0
0
25 Aug 2023
Exploiting Time-Frequency Conformers for Music Audio Enhancement
Yunkee Chae
Junghyun Koo
Sungho Lee
Kyogu Lee
64
3
0
24 Aug 2023
iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Shogo Seki
80
5
0
14 Aug 2023
Audio-visual video-to-speech synthesis with synthesized input audio
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen, DiffM
95
1
0
31 Jul 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Jungil Kong
Jihoon Park
Beomjeong Kim
Jeongmin Kim
Dohee Kong
Sangjin Kim
53
41
0
31 Jul 2023
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer
Daegyeom Kim
Seong-soo Hong
Yong-Hoon Choi
79
2
0
20 Jul 2023
Complexity Matters: Rethinking the Latent Space for Generative Modeling
Tianyang Hu
Fei Chen
Hong Wang
Jiawei Li
Wei Cao
Jiacheng Sun
Zechao Li
DiffM
118
10
0
17 Jul 2023
NoiseBandNet: Controllable Time-Varying Neural Synthesis of Sound Effects Using Filterbanks
Adrián Barahona-Ríos
Tom Collins
44
7
0
16 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
86
32
0
07 Jul 2023
Large-scale unsupervised audio pre-training for video-to-speech synthesis
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen
69
4
0
27 Jun 2023
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
Matt Le
Apoorv Vyas
Bowen Shi
Brian Karrer
Leda Sari
...
Mary Williamson
Vimal Manohar
Yossi Adi
Jay Mahadeokar
Wei-Ning Hsu
AuLLM
123
306
0
23 Jun 2023
Phase Repair for Time-Domain Convolutional Neural Networks in Music Super-Resolution
Yenan Zhang
G. Kolkman
Hiroshi Watanabe
SupR
51
2
0
20 Jun 2023
Multi-Loss Convolutional Network with Time-Frequency Attention for Speech Enhancement
Liang Wan
Hongqing Liu
Yi Zhou
Jie Ji
61
2
0
15 Jun 2023
Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement
Hejung Yang
Hong-Goo Kang
SSL
50
0
0
14 Jun 2023
HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models
Ji-Sang Hwang
Sang-Hoon Lee
Seong-Whan Lee
DiffM
60
9
0
12 Jun 2023
Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features
Hsin-Hao Chen
Yung-Lun Chien
Ming-Chi Yen
S. Tsai
Yu Tsao
T. Chi
Hsin-Min Wang
52
2
0
11 Jun 2023
Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion
Yung-Lun Chien
Hsin-Hao Chen
Ming-Chi Yen
S. Tsai
Hsin-Min Wang
Yu Tsao
T. Chi
58
1
0
11 Jun 2023
High-Fidelity Audio Compression with Improved RVQGAN
Rithesh Kumar
Prem Seetharaman
Alejandro Luebs
I. Kumar
Kundan Kumar
126
338
0
11 Jun 2023
The Age of Synthetic Realities: Challenges and Opportunities
J. P. Cardenuto
Jing Yang
Rafael Padilha
Renjie Wan
Daniel Moreira
Haoliang Li
Shiqi Wang
Fernanda A. Andaló
Sébastien Marcel
Anderson de Rezende Rocha
DeLMO
115
30
0
09 Jun 2023
Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation
Massa Baali
Ibrahim Almakky
Shady Shehata
Fakhri Karray
69
3
0
07 Jun 2023
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Ziyue Jiang
Yi Ren
Zhe Ye
Jinglin Liu
Chen Zhang
...
Rongjie Huang
Chunfeng Wang
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
105
80
0
06 Jun 2023
HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders
Doyeon Kim
Soo-Whan Chung
Hyewon Han
Youna Ji
Hong-Goo Kang
66
7
0
02 Jun 2023
Differentiable Allpass Filters for Phase Response Estimation and Automatic Signal Alignment
A. R. Bargum
Stefania Serafin
Cumhur Erkut
Julian Parker
13
0
0
01 Jun 2023
A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models
Pin-Jui Ku
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Chin-Hui Lee
89
13
0
01 Jun 2023
Text-to-Speech Pipeline for Swiss German -- A comparison
Tobias Bollinger
Jan Deriu
Manfred Vogel
DiffM
60
0
0
31 May 2023
Intelligible Lip-to-Speech Synthesis with Speech Units
J. Choi
Minsu Kim
Y. Ro
91
26
0
31 May 2023
Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models
Weijian Luo
Tianyang Hu
Shifeng Zhang
Jiacheng Sun
Zhenguo Li
Zhihua Zhang
124
137
0
29 May 2023
Room Impulse Response Estimation in a Multiple Source Environment
Kyungyun Lee
Jeonghun Seo
Keunwoo Choi
Sangmoon Lee
Ben Sangbae Chon
68
2
0
25 May 2023
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
Rustem Yeshpanov
Saida Mussakhojayeva
Yerbolat Khassanov
58
3
0
25 May 2023
Towards generalizing deep-audio fake detection networks
Konstantin Gasenzer
Moritz Wolter
73
4
0
22 May 2023
Exploring How Generative Adversarial Networks Learn Phonological Representations
Jing Chen
Micha Elsner
GAN
63
4
0
21 May 2023
Parameter-Efficient Learning for Text-to-Speech Accent Adaptation
Lijie Yang
Chao-Han Huck Yang
Jen-Tzung Chien
91
11
0
18 May 2023
FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs
Won Jang
D. Lim
Heayoung Park
88
1
0
18 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
Yang Ai
Zhenhua Ling
101
14
0
13 May 2023
Enhancing Gappy Speech Audio Signals with Generative Adversarial Networks
Deniss Strods
Alan F. Smeaton
45
2
0
09 May 2023
Accented Text-to-Speech Synthesis with Limited Data
Xuehao Zhou
Mingyang Zhang
Yi Zhou
Zhizheng Wu
Haizhou Li
76
15
0
08 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
105
1
0
26 Apr 2023
AI-Synthesized Voice Detection Using Neural Vocoder Artifacts
Chengzhe Sun
Shan Jia
Shuwei Hou
Siwei Lyu
72
45
0
25 Apr 2023
Affective social anthropomorphic intelligent system
Md. Adyelullahil Mamun
Hasnat Md. Abdullah
Md. Golam Rabiul Alam
Muhammad Mehedi Hassan
Md. Zia Uddin
52
1
0
19 Apr 2023
Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Jiatong Shi
Yun Tang
Ann Lee
Hirofumi Inaguma
Changhan Wang
J. Pino
Shinji Watanabe
77
9
0
10 Apr 2023
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos
Kun Su
Kaizhi Qian
Eli Shlizerman
Antonio Torralba
Chuang Gan
VGen, AI4CE
87
20
0
29 Mar 2023
Time-domain Speech Enhancement Assisted by Multi-resolution Frequency Encoder and Decoder
Hao Shi
Masato Mimura
Longbiao Wang
Jianwu Dang
Tatsuya Kawahara
75
14
0
26 Mar 2023