Neural source-filter-based waveform model for statistical parametric speech synthesis

29 October 2018

Xin Wang

Shinji Takaki

Junichi Yamagishi

Papers citing "Neural source-filter-based waveform model for statistical parametric speech synthesis"

50 / 79 papers shown

Neurodyne: Neural Pitch Manipulation with Representation Learning and Cycle-Consistency GAN Yicheng Gu Chaoren Wang Zhizheng Wu Lauri Juvela 108 1 0 21 May 2025
Improving Inference-Time Optimisation for Vocal Effects Style Transfer with a Gaussian Prior Chin-Yun Yu Marco A. Martínez-Ramírez Junghyun Koo Wei-Hsiang Liao Yuki Mitsufuji Gyorgy Fazekas 71 1 0 16 May 2025
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram Xiao-Hang Jiang Hui-Peng Du Yang Ai Ye-Xin Lu Zhen-Hua Ling 81 0 0 18 Nov 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection Lam Pham Phat Lam Dat Tran Hieu Tang Tin Nguyen Alexander Schindler Canh Vu Alexander Polonsky Canh Vu 129 5 0 23 Sep 2024
Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations Wangjin Zhou Fengrun Zhang Yiming Liu Wenhao Guan Yi Zhao He Qu 36 2 0 12 Sep 2024
InstructSing: High-Fidelity Singing Voice Generation via Instructing Yourself Chang Zeng Chunhui Wang Xiaoxiao Miao Jian Zhao Zhonglin Jiang Yong Chen 69 0 0 10 Sep 2024
Hear Your Face: Face-based voice conversion with F0 estimation Jaejun Lee Yoori Oh Injune Hwang Kyogu Lee CVBM 49 3 0 19 Aug 2024
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation Xiaoxiao Miao Yuxiang Zhang Xin Wang N. Tomashenko D. Soh Ian Mcloughlin 116 2 0 12 Aug 2024
A Benchmark for Multi-speaker Anonymization Xiaoxiao Miao Ruijie Tao Chang Zeng Xin Wang 99 1 0 08 Jul 2024
Fine-Grained and Interpretable Neural Speech Editing Max Morrison Cameron Churchwell Nathan Pruyne Bryan Pardo 84 3 0 07 Jul 2024
Real-time Timbre Remapping with Differentiable DSP Jordie Shier C. Saitis Andrew Robertson Andrew Mcpherson 74 3 0 05 Jul 2024
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion Bingsong Bai Fengping Wang Yingming Gao Ya Li 75 1 0 09 Jun 2024
Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis Chin-Yun Yu Gyorgy Fazekas 56 1 0 07 Jun 2024
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder Yicheng Gu Xueyao Zhang Liumeng Xue Haizhou Li Zhizheng Wu 55 3 0 26 Apr 2024
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit Xueyao Zhang Liumeng Xue Yicheng Gu Yuancheng Wang Haorui He ... Mingxuan Wang Jun Han Kai Chen Haizhou Li Zhizheng Wu 91 35 0 15 Dec 2023
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder Yicheng Gu Xueyao Zhang Liumeng Xue Zhizheng Wu 72 12 0 25 Nov 2023
The Impact of Silence on Speech Anti-Spoofing Yuxiang Zhang Zhuo Li Jingze Lu Hua Hua Wenchao Wang Pengyuan Zhang 80 21 0 21 Sep 2023
VoicePAT: An Efficient Open-source Evaluation Toolkit for Voice Privacy Research Sarina Meyer Xiaoxiao Miao Ngoc Thang Vu 127 6 0 14 Sep 2023
Differentiable Modelling of Percussive Audio with Transient and Spectral Synthesis Jordie Shier Franco Caspe Andrew Robertson Mark Sandler C. Saitis Andrew Mcpherson 66 3 0 13 Sep 2023
FSD: An Initial Chinese Dataset for Fake Song Detection Yuankun Xie Jingjing Zhou Xiaolin Lu Zhenghao Jiang Yuxin Yang Haonan Cheng Long Ye 88 15 0 05 Sep 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis B. Hayes Jordie Shier Gyorgy Fazekas Andrew Mcpherson C. Saitis 83 25 0 29 Aug 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review J. Barnett 86 32 0 07 Jul 2023
Towards single integrated spoofing-aware speaker verification embeddings Sung Hwan Mun Hye-jin Shim Hemlata Tak Xin Wang Xuechen Liu ... Junichi Yamagishi Nicholas W. D. Evans Tomi Kinnunen N. Kim Jee-weon Jung 152 12 0 30 May 2023
Speaker anonymization using orthogonal Householder neural network Xiaoxiao Miao Xin Wang Erica Cooper Junichi Yamagishi N. Tomashenko BDL 74 21 0 30 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra Yang Ai Zhenhua Ling 101 14 0 13 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis Ye-Xin Lu Yang Ai Zhenhua Ling 105 1 0 26 Apr 2023
DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP Kun Song Yongmao Zhang Yinjiao Lei Jian Cong Hanzhao Li Linfu Xie Gang He Jinfeng Bai 99 15 0 02 Nov 2022
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis Yuma Shirahata Ryuichi Yamamoto Eunwoo Song Ryo Terashima Jae-Min Kim Kentaro Tachibana 86 11 0 28 Oct 2022
Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder Reo Yoneyama Yi-Chiao Wu Tomoki Toda 82 27 0 27 Oct 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0 M. S. Al-Radhi Tamás Gábor Csapó Csaba Zainkó Géza Németh 50 1 0 15 Aug 2022
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation Reo Yoneyama Yi-Chiao Wu Tomoki Toda 70 14 0 12 May 2022
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture Zhe-ming Lu Mengnan He Ruixiong Zhang Caixia Gong GAN 25 2 0 12 Apr 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models Xiaoxiao Miao Xin Wang Erica Cooper Junichi Yamagishi N. Tomashenko 175 25 0 26 Feb 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals Haohan Guo Zhiping Zhou Fanbo Meng Kai-Chun Liu 97 16 0 25 Jan 2022
Unsupervised Music Source Separation Using Differentiable Parametric Source Models Kilian Schulze-Forster G. Richard Liam Kelley Clement S. J. Doire Roland Badeau 82 21 0 24 Jan 2022
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations Hyeong-Seok Choi Juheon Lee W. Kim Jie Hwan Lee Hoon Heo Kyogu Lee 109 158 0 27 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis Max Morrison Rithesh Kumar Kundan Kumar Prem Seetharaman Aaron Courville Yoshua Bengio GAN 130 72 0 19 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke Xiaobin Zhuang Huiran Yu Weifeng Zhao Tao Jiang Peng Hu 90 6 0 18 Oct 2021
Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet Max Morrison Zeyu Jin Nicholas J. Bryan Juan-Pablo Caceres Bryan Pardo 73 14 0 05 Oct 2021
Physiological-Physical Feature Fusion for Automatic Voice Spoofing Detection Junxiao Xue Hao Zhou Yabo Wang 31 9 0 01 Sep 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 133 359 0 29 Jun 2021
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis Taejun Bak Jaesung Bae Hanbin Bae Young-Ik Kim Hoon-Young Cho 120 17 0 29 Jun 2021
High-Fidelity and Low-Latency Universal Neural Vocoder based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling Patrick Lumban Tobing Tomoki Toda 62 8 0 20 May 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN Reo Yoneyama Yi-Chiao Wu Tomoki Toda 73 12 0 10 Apr 2021
Real-time Denoising and Dereverberation with Tiny Recurrent U-Net Hyeong-Seok Choi Sungjin Park Jie Hwan Lee Hoon Heo Dongsuk Jeon Kyogu Lee 95 57 0 05 Feb 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units Wei-Ning Hsu David Harwath Christopher Song James R. Glass CLIP 90 67 0 31 Dec 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders Wen-Chin Huang Patrick Lumban Tobing Yi-Chiao Wu Kazuhiro Kobayashi Tomoki Toda 86 8 0 09 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics Hirokazu Kameoka Takuhiro Kaneko Kou Tanaka Nobukatsu Hojo Shogo Seki DiffM 124 21 0 06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis Zhifeng Kong Ming-Yu Liu Jiaji Huang Kexin Zhao Bryan Catanzaro DiffM BDL 219 1,471 0 21 Sep 2020
Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks Hirokazu Kameoka Takuhiro Kaneko Kou Tanaka Nobukatsu Hojo 99 20 0 27 Aug 2020