Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1810.11946
Cited By
v1
v2
v3
v4 (latest)
Neural source-filter-based waveform model for statistical parametric speech synthesis
29 October 2018
Xin Wang
Shinji Takaki
Junichi Yamagishi
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Neural source-filter-based waveform model for statistical parametric speech synthesis"
50 / 79 papers shown
Title
Neurodyne: Neural Pitch Manipulation with Representation Learning and Cycle-Consistency GAN
Yicheng Gu
Chaoren Wang
Zhizheng Wu
Lauri Juvela
108
1
0
21 May 2025
Improving Inference-Time Optimisation for Vocal Effects Style Transfer with a Gaussian Prior
Chin-Yun Yu
Marco A. Martínez-Ramírez
Junghyun Koo
Wei-Hsiang Liao
Yuki Mitsufuji
Gyorgy Fazekas
71
1
0
16 May 2025
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
Xiao-Hang Jiang
Hui-Peng Du
Yang Ai
Ye-Xin Lu
Zhen-Hua Ling
81
0
0
18 Nov 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
Canh Vu
129
5
0
23 Sep 2024
Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations
Wangjin Zhou
Fengrun Zhang
Yiming Liu
Wenhao Guan
Yi Zhao
He Qu
36
2
0
12 Sep 2024
InstructSing: High-Fidelity Singing Voice Generation via Instructing Yourself
Chang Zeng
Chunhui Wang
Xiaoxiao Miao
Jian Zhao
Zhonglin Jiang
Yong Chen
69
0
0
10 Sep 2024
Hear Your Face: Face-based voice conversion with F0 estimation
Jaejun Lee
Yoori Oh
Injune Hwang
Kyogu Lee
CVBM
49
3
0
19 Aug 2024
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Xiaoxiao Miao
Yuxiang Zhang
Xin Wang
N. Tomashenko
D. Soh
Ian Mcloughlin
116
2
0
12 Aug 2024
A Benchmark for Multi-speaker Anonymization
Xiaoxiao Miao
Ruijie Tao
Chang Zeng
Xin Wang
99
1
0
08 Jul 2024
Fine-Grained and Interpretable Neural Speech Editing
Max Morrison
Cameron Churchwell
Nathan Pruyne
Bryan Pardo
84
3
0
07 Jul 2024
Real-time Timbre Remapping with Differentiable DSP
Jordie Shier
C. Saitis
Andrew Robertson
Andrew Mcpherson
74
3
0
05 Jul 2024
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion
Bingsong Bai
Fengping Wang
Yingming Gao
Ya Li
75
1
0
09 Jun 2024
Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis
Chin-Yun Yu
Gyorgy Fazekas
56
1
0
07 Jun 2024
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Haizhou Li
Zhizheng Wu
55
3
0
26 Apr 2024
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
91
35
0
15 Dec 2023
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Zhizheng Wu
72
12
0
25 Nov 2023
The Impact of Silence on Speech Anti-Spoofing
Yuxiang Zhang
Zhuo Li
Jingze Lu
Hua Hua
Wenchao Wang
Pengyuan Zhang
80
21
0
21 Sep 2023
VoicePAT: An Efficient Open-source Evaluation Toolkit for Voice Privacy Research
Sarina Meyer
Xiaoxiao Miao
Ngoc Thang Vu
127
6
0
14 Sep 2023
Differentiable Modelling of Percussive Audio with Transient and Spectral Synthesis
Jordie Shier
Franco Caspe
Andrew Robertson
Mark Sandler
C. Saitis
Andrew Mcpherson
66
3
0
13 Sep 2023
FSD: An Initial Chinese Dataset for Fake Song Detection
Yuankun Xie
Jingjing Zhou
Xiaolin Lu
Zhenghao Jiang
Yuxin Yang
Haonan Cheng
Long Ye
88
15
0
05 Sep 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis
B. Hayes
Jordie Shier
Gyorgy Fazekas
Andrew Mcpherson
C. Saitis
83
25
0
29 Aug 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
86
32
0
07 Jul 2023
Towards single integrated spoofing-aware speaker verification embeddings
Sung Hwan Mun
Hye-jin Shim
Hemlata Tak
Xin Wang
Xuechen Liu
...
Junichi Yamagishi
Nicholas W. D. Evans
Tomi Kinnunen
N. Kim
Jee-weon Jung
152
12
0
30 May 2023
Speaker anonymization using orthogonal Householder neural network
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
BDL
74
21
0
30 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
Yang Ai
Zhenhua Ling
101
14
0
13 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
105
1
0
26 Apr 2023
DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP
Kun Song
Yongmao Zhang
Yinjiao Lei
Jian Cong
Hanzhao Li
Linfu Xie
Gang He
Jinfeng Bai
99
15
0
02 Nov 2022
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis
Yuma Shirahata
Ryuichi Yamamoto
Eunwoo Song
Ryo Terashima
Jae-Min Kim
Kentaro Tachibana
86
11
0
28 Oct 2022
Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
82
27
0
27 Oct 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0
M. S. Al-Radhi
Tamás Gábor Csapó
Csaba Zainkó
Géza Németh
50
1
0
15 Aug 2022
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
70
14
0
12 May 2022
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture
Zhe-ming Lu
Mengnan He
Ruixiong Zhang
Caixia Gong
GAN
25
2
0
12 Apr 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
175
25
0
26 Feb 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals
Haohan Guo
Zhiping Zhou
Fanbo Meng
Kai-Chun Liu
97
16
0
25 Jan 2022
Unsupervised Music Source Separation Using Differentiable Parametric Source Models
Kilian Schulze-Forster
G. Richard
Liam Kelley
Clement S. J. Doire
Roland Badeau
82
21
0
24 Jan 2022
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
109
158
0
27 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
130
72
0
19 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke
Xiaobin Zhuang
Huiran Yu
Weifeng Zhao
Tao Jiang
Peng Hu
90
6
0
18 Oct 2021
Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet
Max Morrison
Zeyu Jin
Nicholas J. Bryan
Juan-Pablo Caceres
Bryan Pardo
73
14
0
05 Oct 2021
Physiological-Physical Feature Fusion for Automatic Voice Spoofing Detection
Junxiao Xue
Hao Zhou
Yabo Wang
31
9
0
01 Sep 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
133
359
0
29 Jun 2021
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
Taejun Bak
Jaesung Bae
Hanbin Bae
Young-Ik Kim
Hoon-Young Cho
120
17
0
29 Jun 2021
High-Fidelity and Low-Latency Universal Neural Vocoder based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling
Patrick Lumban Tobing
Tomoki Toda
62
8
0
20 May 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
73
12
0
10 Apr 2021
Real-time Denoising and Dereverberation with Tiny Recurrent U-Net
Hyeong-Seok Choi
Sungjin Park
Jie Hwan Lee
Hoon Heo
Dongsuk Jeon
Kyogu Lee
95
57
0
05 Feb 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
90
67
0
31 Dec 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Toda
86
8
0
09 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
Shogo Seki
DiffM
124
21
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
219
1,471
0
21 Sep 2020
Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
99
20
0
27 Aug 2020
1
2
Next