Neural source-filter-based waveform model for statistical parametric speech synthesis

29 October 2018
Xin Wang
Shinji Takaki
Junichi Yamagishi

Papers citing "Neural source-filter-based waveform model for statistical parametric speech synthesis"

50 / 79 papers shown
Neurodyne: Neural Pitch Manipulation with Representation Learning and Cycle-Consistency GAN
Yicheng Gu
Chaoren Wang
Zhizheng Wu
Lauri Juvela
108
1
0
21 May 2025
Improving Inference-Time Optimisation for Vocal Effects Style Transfer with a Gaussian Prior
Chin-Yun Yu
Marco A. Martínez-Ramírez
Junghyun Koo
Wei-Hsiang Liao
Yuki Mitsufuji
Gyorgy Fazekas
71
1
0
16 May 2025
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
Xiao-Hang Jiang
Hui-Peng Du
Yang Ai
Ye-Xin Lu
Zhen-Hua Ling
81
0
0
18 Nov 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
Canh Vu
129
5
0
23 Sep 2024
Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations
Wangjin Zhou
Fengrun Zhang
Yiming Liu
Wenhao Guan
Yi Zhao
He Qu
36
2
0
12 Sep 2024
InstructSing: High-Fidelity Singing Voice Generation via Instructing Yourself
Chang Zeng
Chunhui Wang
Xiaoxiao Miao
Jian Zhao
Zhonglin Jiang
Yong Chen
69
0
0
10 Sep 2024
Hear Your Face: Face-based voice conversion with F0 estimation
Jaejun Lee
Yoori Oh
Injune Hwang
Kyogu Lee
CVBM
49
3
0
19 Aug 2024
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Xiaoxiao Miao
Yuxiang Zhang
Xin Wang
N. Tomashenko
D. Soh
Ian Mcloughlin
116
2
0
12 Aug 2024
A Benchmark for Multi-speaker Anonymization
Xiaoxiao Miao
Ruijie Tao
Chang Zeng
Xin Wang
99
1
0
08 Jul 2024
Fine-Grained and Interpretable Neural Speech Editing
Max Morrison
Cameron Churchwell
Nathan Pruyne
Bryan Pardo
84
3
0
07 Jul 2024
Real-time Timbre Remapping with Differentiable DSP
Jordie Shier
C. Saitis
Andrew Robertson
Andrew Mcpherson
74
3
0
05 Jul 2024
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion
Bingsong Bai
Fengping Wang
Yingming Gao
Ya Li
75
1
0
09 Jun 2024
Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis
Chin-Yun Yu
Gyorgy Fazekas
56
1
0
07 Jun 2024
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Haizhou Li
Zhizheng Wu
55
3
0
26 Apr 2024
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
91
35
0
15 Dec 2023
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Zhizheng Wu
72
12
0
25 Nov 2023
The Impact of Silence on Speech Anti-Spoofing
Yuxiang Zhang
Zhuo Li
Jingze Lu
Hua Hua
Wenchao Wang
Pengyuan Zhang
80
21
0
21 Sep 2023
VoicePAT: An Efficient Open-source Evaluation Toolkit for Voice Privacy Research
Sarina Meyer
Xiaoxiao Miao
Ngoc Thang Vu
127
6
0
14 Sep 2023
Differentiable Modelling of Percussive Audio with Transient and Spectral Synthesis
Jordie Shier
Franco Caspe
Andrew Robertson
Mark Sandler
C. Saitis
Andrew Mcpherson
66
3
0
13 Sep 2023
FSD: An Initial Chinese Dataset for Fake Song Detection
Yuankun Xie
Jingjing Zhou
Xiaolin Lu
Zhenghao Jiang
Yuxin Yang
Haonan Cheng
Long Ye
88
15
0
05 Sep 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis
B. Hayes
Jordie Shier
Gyorgy Fazekas
Andrew Mcpherson
C. Saitis
83
25
0
29 Aug 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
86
32
0
07 Jul 2023
Towards single integrated spoofing-aware speaker verification embeddings
Sung Hwan Mun
Hye-jin Shim
Hemlata Tak
Xin Wang
Xuechen Liu
...
Junichi Yamagishi
Nicholas W. D. Evans
Tomi Kinnunen
N. Kim
Jee-weon Jung
152
12
0
30 May 2023
Speaker anonymization using orthogonal Householder neural network
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
BDL
74
21
0
30 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
Yang Ai
Zhenhua Ling
101
14
0
13 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
105
1
0
26 Apr 2023
DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP
Kun Song
Yongmao Zhang
Yinjiao Lei
Jian Cong
Hanzhao Li
Linfu Xie
Gang He
Jinfeng Bai
99
15
0
02 Nov 2022
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis
Yuma Shirahata
Ryuichi Yamamoto
Eunwoo Song
Ryo Terashima
Jae-Min Kim
Kentaro Tachibana
86
11
0
28 Oct 2022
Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
82
27
0
27 Oct 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0
M. S. Al-Radhi
Tamás Gábor Csapó
Csaba Zainkó
Géza Németh
50
1
0
15 Aug 2022
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
70
14
0
12 May 2022
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture
Zhe-ming Lu
Mengnan He
Ruixiong Zhang
Caixia Gong
GAN
25
2
0
12 Apr 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
175
25
0
26 Feb 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals
Haohan Guo
Zhiping Zhou
Fanbo Meng
Kai-Chun Liu
97
16
0
25 Jan 2022
Unsupervised Music Source Separation Using Differentiable Parametric Source Models
Kilian Schulze-Forster
G. Richard
Liam Kelley
Clement S. J. Doire
Roland Badeau
82
21
0
24 Jan 2022
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
109
158
0
27 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
130
72
0
19 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke
Xiaobin Zhuang
Huiran Yu
Weifeng Zhao
Tao Jiang
Peng Hu
90
6
0
18 Oct 2021
Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet
Max Morrison
Zeyu Jin
Nicholas J. Bryan
Juan-Pablo Caceres
Bryan Pardo
73
14
0
05 Oct 2021
Physiological-Physical Feature Fusion for Automatic Voice Spoofing Detection
Junxiao Xue
Hao Zhou
Yabo Wang
31
9
0
01 Sep 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
133
359
0
29 Jun 2021
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
Taejun Bak
Jaesung Bae
Hanbin Bae
Young-Ik Kim
Hoon-Young Cho
120
17
0
29 Jun 2021
High-Fidelity and Low-Latency Universal Neural Vocoder based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling
Patrick Lumban Tobing
Tomoki Toda
62
8
0
20 May 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
73
12
0
10 Apr 2021
Real-time Denoising and Dereverberation with Tiny Recurrent U-Net
Hyeong-Seok Choi
Sungjin Park
Jie Hwan Lee
Hoon Heo
Dongsuk Jeon
Kyogu Lee
95
57
0
05 Feb 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
90
67
0
31 Dec 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Toda
86
8
0
09 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
Shogo Seki
DiffM
124
21
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
219
1,471
0
21 Sep 2020
Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
99
20
0
27 Aug 2020