ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1904.12088
  4. Cited By
Neural source-filter waveform models for statistical parametric speech
  synthesis

Neural source-filter waveform models for statistical parametric speech synthesis

27 April 2019
Xin Wang
Shinji Takaki
Junichi Yamagishi
ArXivPDFHTML

Papers citing "Neural source-filter waveform models for statistical parametric speech synthesis"

50 / 86 papers shown
Title
The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
Bernardo Torres
Geoffroy Peeters
G. Richard
41
0
0
06 May 2025
DOSE : Drum One-Shot Extraction from Music Mixture
DOSE : Drum One-Shot Extraction from Music Mixture
Suntae Hwang
Seonghyeon Kang
Kyungsu Kim
Semin Ahn
K. Lee
36
0
0
25 Apr 2025
Wavehax: Aliasing-Free Neural Waveform Synthesis Based on 2D Convolution
  and Harmonic Prior for Reliable Complex Spectrogram Estimation
Wavehax: Aliasing-Free Neural Waveform Synthesis Based on 2D Convolution and Harmonic Prior for Reliable Complex Spectrogram Estimation
Reo Yoneyama
Atsushi Miyashita
Ryuichi Yamamoto
T. Toda
27
1
0
11 Nov 2024
SiFiSinger: A High-Fidelity End-to-End Singing Voice Synthesizer based
  on Source-filter Model
SiFiSinger: A High-Fidelity End-to-End Singing Voice Synthesizer based on Source-filter Model
Jianwei Cui
Yu Gu
Chao Weng
Jie M. Zhang
Liping Chen
Lirong Dai
62
3
0
16 Oct 2024
HiFi-Glot: Neural Formant Synthesis with Differentiable Resonant Filters
HiFi-Glot: Neural Formant Synthesis with Differentiable Resonant Filters
Lauri Juvela
Pablo Pérez Zarazaga
G. Henter
Zofia Malisz
27
0
0
23 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
Canh Vu
51
3
0
23 Sep 2024
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild
Jee-weon Jung
Yihan Wu
Xin Wang
Ji-Hoon Kim
Soumi Maiti
...
Joon Son Chung
Wangyou Zhang
Seyun Um
Shinnosuke Takamichi
Shinji Watanabe
65
1
0
18 Sep 2024
LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with
  Inference Acceleration via Latent Consistency Distillation
LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with Inference Acceleration via Latent Consistency Distillation
Shihao Chen
Yu Gu
Jianwei Cui
Jie Zhang
Rilin Chen
Lirong Dai
34
2
0
22 Aug 2024
Diff-MST: Differentiable Mixing Style Transfer
Diff-MST: Differentiable Mixing Style Transfer
Soumya Sai Vanka
Christian Steinmetz
Jean-Baptiste Rolland
Joshua Reiss
George Fazekas
23
5
0
11 Jul 2024
Differentiable Modal Synthesis for Physical Modeling of Planar String
  Sound and Motion Simulation
Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation
J. Lee
Jaehyun Park
Min Jun Choi
Kyogu Lee
32
2
0
07 Jul 2024
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts
  for Text-to-Speech and Style Captioning
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
Masaya Kawamura
Ryuichi Yamamoto
Yuma Shirahata
Takuya Hasumi
Kentaro Tachibana
VLM
27
5
0
12 Jun 2024
JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis
JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis
Hyunjae Cho
Junhyeok Lee
Wonbin Jung
18
0
0
10 Jun 2024
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice
  Conversion with Singer Guidance
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance
Shihao Chen
Yu Gu
Jie Zhang
Na Li
Rilin Chen
Liping Chen
Lirong Dai
DiffM
40
6
0
08 Jun 2024
TI-ASU: Toward Robust Automatic Speech Understanding through
  Text-to-speech Imputation Against Missing Speech Modality
TI-ASU: Toward Robust Automatic Speech Understanding through Text-to-speech Imputation Against Missing Speech Modality
Tiantian Feng
Xuan Shi
Rahul Gupta
Shrikanth S. Narayanan
41
0
0
27 Apr 2024
Unsupervised Harmonic Parameter Estimation Using Differentiable DSP and
  Spectral Optimal Transport
Unsupervised Harmonic Parameter Estimation Using Differentiable DSP and Spectral Optimal Transport
Bernardo Torres
Geoffroy Peeters
Gaël Richard
35
4
0
22 Dec 2023
Neural Concatenative Singing Voice Conversion: Rethinking
  Concatenation-Based Approach for One-Shot Singing Voice Conversion
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
Binzhu Sha
Xu Li
Zhiyong Wu
Yin Shan
Helen M. Meng
21
7
0
08 Dec 2023
Learning to Solve Inverse Problems for Perceptual Sound Matching
Learning to Solve Inverse Problems for Perceptual Sound Matching
Han Han
Vincent Lostanlen
Mathieu Lagrange
21
3
0
23 Nov 2023
VITS-based Singing Voice Conversion System with DSPGAN post-processing
  for SVCC2023
VITS-based Singing Voice Conversion System with DSPGAN post-processing for SVCC2023
Yi-Hua Zhou
Meng Chen
Yi Lei
Jihua Zhu
Weifeng Zhao
16
5
0
08 Oct 2023
CrossSinger: A Cross-Lingual Multi-Singer High-Fidelity Singing Voice
  Synthesizer Trained on Monolingual Singers
CrossSinger: A Cross-Lingual Multi-Singer High-Fidelity Singing Voice Synthesizer Trained on Monolingual Singers
Xintong Wang
Chang Zeng
Jun Chen
Chunhui Wang
19
6
0
22 Sep 2023
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise
  Filter and Inverse Short Time Fourier Transform
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform
Yinghao Aaron Li
Cong Han
Xilin Jiang
N. Mesgarani
30
4
0
18 Sep 2023
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech
  Using Natural Language Descriptions
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions
Reo Shimizu
Ryuichi Yamamoto
Masaya Kawamura
Yuma Shirahata
Hironori Doi
Tatsuya Komatsu
Kentaro Tachibana
DiffM
16
19
0
15 Sep 2023
Echotune: A Modular Extractor Leveraging the Variable-Length Nature of
  Speech in ASR Tasks
Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks
Sizhou Chen
Songyang Gao
Sen Fang
19
0
0
14 Sep 2023
DDSP-based Neural Waveform Synthesis of Polyphonic Guitar Performance
  from String-wise MIDI Input
DDSP-based Neural Waveform Synthesis of Polyphonic Guitar Performance from String-wise MIDI Input
Nicolas Jonason
Xin Eric Wang
Erica Cooper
Lauri Juvela
Bob L. T. Sturm
Junichi Yamagishi
36
1
0
14 Sep 2023
Can large-scale vocoded spoofed data improve speech spoofing
  countermeasure with a self-supervised front end?
Can large-scale vocoded spoofed data improve speech spoofing countermeasure with a self-supervised front end?
Xin Wang
Junichi Yamagishi
SyDa
50
23
0
12 Sep 2023
A Review of Differentiable Digital Signal Processing for Music & Speech
  Synthesis
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis
B. Hayes
Jordie Shier
Gyorgy Fazekas
Andrew Mcpherson
C. Saitis
27
21
0
29 Aug 2023
The Ethical Implications of Generative Audio Models: A Systematic
  Literature Review
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
16
25
0
07 Jul 2023
The Singing Voice Conversion Challenge 2023
The Singing Voice Conversion Challenge 2023
Wen-Chin Huang
Lester Phillip Violeta
Songxiang Liu
Jiatong Shi
T. Toda
16
46
0
26 Jun 2023
Puffin: pitch-synchronous neural waveform generation for fullband speech
  on modest devices
Puffin: pitch-synchronous neural waveform generation for fullband speech on modest devices
O. Watts
Lovisa Wihlborg
Cassia Valentini-Botinhao
14
3
0
25 Nov 2022
Can Knowledge of End-to-End Text-to-Speech Models Improve Neural
  MIDI-to-Audio Synthesis Systems?
Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems?
Xuan Shi
Erica Cooper
Xin Wang
Junichi Yamagishi
Shrikanth Narayanan
25
1
0
25 Nov 2022
Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural
  Speech Synthesis System
Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System
Takenori Yoshimura
Shinji Takaki
Kazuhiro Nakamura
Keiichiro Oura
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
K. Tokuda
13
7
0
21 Nov 2022
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
Hyeong-Seok Choi
Jinhyeok Yang
Juheon Lee
Hyeongju Kim
18
46
0
17 Nov 2022
Autovocoder: Fast Waveform Generation from a Learned Speech
  Representation using Differentiable Digital Signal Processing
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
J. Webber
Cassia Valentini-Botinhao
Evelyn Williams
G. Henter
Simon King
11
9
0
13 Nov 2022
NNSVS: A Neural Network-Based Singing Voice Synthesis Toolkit
NNSVS: A Neural Network-Based Singing Voice Synthesis Toolkit
Ryuichi Yamamoto
Reo Yoneyama
T. Toda
134
11
0
28 Oct 2022
Period VITS: Variational Inference with Explicit Pitch Modeling for
  End-to-end Emotional Speech Synthesis
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis
Yuma Shirahata
Ryuichi Yamamoto
Eunwoo Song
Ryo Terashima
Jae-Min Kim
Kentaro Tachibana
23
10
0
28 Oct 2022
Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural
  Vocoder
Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder
Reo Yoneyama
Yi-Chiao Wu
T. Toda
41
26
0
27 Oct 2022
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary
  Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation
Chunhui Wang
Chang Zeng
Jun Chen
Xingji He
44
7
0
23 Oct 2022
Spoofed training data for speech spoofing countermeasure can be
  efficiently created using neural vocoders
Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders
Xin Wang
Junichi Yamagishi
16
36
0
19 Oct 2022
Music Separation Enhancement with Generative Modeling
Music Separation Enhancement with Generative Modeling
N. Schaffer
Boaz Cogan
Ethan Manilow
Max Morrison
Prem Seetharaman
Bryan Pardo
20
9
0
26 Aug 2022
Are disentangled representations all you need to build speaker
  anonymization systems?
Are disentangled representations all you need to build speaker anonymization systems?
Pierre Champion
D. Jouvet
Anthony Larcher
22
20
0
22 Aug 2022
DDX7: Differentiable FM Synthesis of Musical Instrument Sounds
DDX7: Differentiable FM Synthesis of Musical Instrument Sounds
Franco Caspe
Andrew Mcpherson
Mark Sandler
33
30
0
12 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A
  Comprehensive Evaluation
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
26
24
0
09 Aug 2022
Style Transfer of Audio Effects with Differentiable Signal Processing
Style Transfer of Audio Effects with Differentiable Signal Processing
C. Steinmetz
Nicholas J. Bryan
Joshua D. Reiss
16
39
0
18 Jul 2022
Learning and controlling the source-filter representation of speech with
  a variational autoencoder
Learning and controlling the source-filter representation of speech with a variational autoencoder
Samir Sadok
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
Renaud Séguier
SSL
DRL
BDL
30
14
0
14 Apr 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with
  Adaptive Noise Spectral Shaping
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
27
45
0
31 Mar 2022
Differentially Private Speaker Anonymization
Differentially Private Speaker Anonymization
Ali Shahin Shamsabadi
B. M. L. Srivastava
A. Bellet
Nathalie Vauquier
Emmanuel Vincent
Mohamed Maouche
Marc Tommasi
Nicolas Papernot
MIACV
38
32
0
23 Feb 2022
Audio representations for deep learning in sound synthesis: A review
Audio representations for deep learning in sound synthesis: A review
Anastasia Natsiou
Seán O'Leary
AI4TS
19
18
0
07 Jan 2022
MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical
  Modeling
MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling
Yusong Wu
Ethan Manilow
Yi Deng
Rigel Swavely
Kyle Kastner
Tim Cooijmans
Aaron Courville
Cheng-Zhi Anna Huang
Jesse Engel
26
44
0
17 Dec 2021
RAVE: A variational autoencoder for fast and high-quality neural audio
  synthesis
RAVE: A variational autoencoder for fast and high-quality neural audio synthesis
Antoine Caillon
P. Esling
DRL
17
109
0
09 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schonherr
DiffM
129
123
0
04 Nov 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice
  Generation
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation
Rongjie Huang
Chenye Cui
Feiyang Chen
Yi Ren
Jinglin Liu
Zhou Zhao
Baoxing Huai
N. Yuan
GAN
102
62
0
14 Oct 2021
12
Next