Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks

23 September 2017

Yuki Saito

Shinnosuke Takamichi

Hiroshi Saruwatari

Papers citing "Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks"

50 / 59 papers shown

Vocoder-Projected Feature Discriminator

141

25 Aug 2025

Generative Data Imputation for Sparse Learner Performance Data Using Generative Adversarial Imputation Networks

316

23 Mar 2025

Evaluating Synthetic Command Attacks on Smart Voice Assistants

256

13 Nov 2024

Sok: Comprehensive Security Overview, Challenges, and Future Directions of Voice-Controlled Systems

Cong Wu

277

27 May 2024

A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

...

206

22 May 2024

Llama-VITS: Enhancing TTS Synthesis with Semantic Awareness

International Conference on Language Resources and Evaluation (LREC), 2024

Xincan Feng

A. Yoshimoto

264

10 Apr 2024

HumanDiffusion: diffusion model using perceptual gradients

Interspeech (Interspeech), 2023

Yuki Saito

Hiroshi Saruwatari

152

21 Jun 2023

Accented Text-to-Speech Synthesis with Limited Data

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Haizhou Li

195

08 May 2023

Improving novelty detection with generative adversarial networks on hand gesture data

M. Simão

Pedro Neto

O. Gibaru

145

13 Apr 2023

Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion

Speech Synthesis Workshop (SSW), 2022

Yuta Matsunaga

Takaaki Saeki

Shinnosuke Takamichi

Hiroshi Saruwatari

249

18 Oct 2022

Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022

Yusuke Nakai

Yuki Saito

K. Udagawa

Hiroshi Saruwatari

AAML

200

26 Sep 2022

A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS

Interspeech (Interspeech), 2022

163

22 Sep 2022

Generative models and Bayesian inversion using Laplace approximation

Computational statistics (Zeitschrift) (CSZ), 2022

248

15 Mar 2022

A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS

Interspeech (Interspeech), 2022

822

02 Mar 2022

J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis

Interspeech (Interspeech), 2022

Hiroshi Saruwatari

128

26 Jan 2022

Audio representations for deep learning in sound synthesis: A review

ACS/IEEE International Conference on Computer Systems and Applications (AICCSA), 2021

Anastasia Natsiou

Seán O'Leary

AI4TS

156

07 Jan 2022

Automated Side Channel Analysis of Media Software with Manifold Learning

237

09 Dec 2021

Provably Valid and Diverse Mutations of Real-World Media Data for DNN Testing

262

03 Dec 2021

How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey

Zahra Khanjani

Gabrielle Watson

V. P. Janeja

166

28 Nov 2021

DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs

J. Nistal

Stefan Lattner

G. Richard

203

03 Aug 2021

Adversarial Data Augmentation for Disordered Speech Recognition

Zengrui Jin

119

02 Aug 2021

Review of end-to-end speech synthesis technology based on deep learning

205

20 Apr 2021

PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

166

15 Feb 2021

HumanACGAN: conditional generative adversarial network with human-based auxiliary classifier and its evaluation in phoneme perception

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Yuki Saito

Hiroshi Saruwatari

08 Feb 2021

JSSS: free Japanese speech corpus for summarization and simplification

Shinnosuke Takamichi

Mamoru Komachi

Naoko Tanji

Hiroshi Saruwatari

159

05 Oct 2020

DrumGAN: Synthesis of Drum Sounds With Timbral Feature Conditioning Using Generative Adversarial Networks

International Society for Music Information Retrieval Conference (ISMIR), 2020

239

27 Aug 2020

Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020

344

27 Aug 2020

Incorporating Reinforced Adversarial Learning in Autoregressive Image Generation

208

20 Jul 2020

Recent Advances in Network-based Methods for Disease Gene Prediction

255

19 Jul 2020

Estimation with Uncertainty via Conditional Generative Adversarial Networks

Minhyeok Lee

Junhee Seok

MedIm

154

01 Jul 2020

Cumulant GAN

IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2020

Markos A. Katsoulakis

GAN

344

11 Jun 2020

A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems

International Conference on Knowledge and Systems Engineering (KSE), 2020

120

26 May 2020

Conditional Spoken Digit Generation with StyleGAN

Interspeech (Interspeech), 2020

227

28 Apr 2020

The Attacker's Perspective on Automatic Speaker Verification: An Overview

Interspeech (Interspeech), 2020

Rohan Kumar Das

Xiaohai Tian

Tomi Kinnunen

Haizhou Li

AAML

154

19 Apr 2020

A Novel Framework for Selection of GANs for an Application

Tanya Motwani

Manojkumar Somabhai Parmar

325

20 Feb 2020

A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2020

316

1,025

20 Jan 2020

SeismoGen: Seismic Waveform Synthesis Using Generative Adversarial Networks

127

10 Nov 2019

Spoofing Speaker Verification Systems with Deep Multi-speaker Text-to-speech Synthesis

Mingrui Yuan

Z. Duan

29 Oct 2019

High Fidelity Speech Synthesis with Adversarial Networks

International Conference on Learning Representations (ICLR), 2019

632

260

25 Sep 2019

HumanGAN: generative adversarial network with human-based discriminator and its evaluation in speech perception modeling

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019

Kazuki Fujii

Yuki Saito

Shinnosuke Takamichi

Yukino Baba

Hiroshi Saruwatari

119

25 Sep 2019

JVS corpus: free Japanese multi-speaker voice corpus

Yuki Saito

Hiroshi Saruwatari

148

17 Aug 2019

V2S attack: building DNN-based voice conversion from automatic speaker verification

Speech Synthesis Workshop (SSW), 2019

Taiki Nakamura

Yuki Saito

Shinnosuke Takamichi

Yusuke Ijima

Hiroshi Saruwatari

151

05 Aug 2019

DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis

Speech Synthesis Workshop (SSW), 2019

Yuki Saito

Shinnosuke Takamichi

Hiroshi Saruwatari

104

19 Jul 2019

A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation

Hieu-Thi Luong

Junichi Yamagishi

184

18 Jun 2019

VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019

Interspeech (Interspeech), 2019

Haizhou Li

180

27 May 2019

A New GAN-based End-to-End TTS Training Algorithm

Haohan Guo

Frank Soong

Lei He

Lei Xie

177

09 Apr 2019

Probability density distillation with generative adversarial networks for high-quality parallel waveform generation

Ryuichi Yamamoto

Eunwoo Song

Jae-Min Kim

255

09 Apr 2019

Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data

Roee Levy Leshem

Raja Giryes

288

06 Apr 2019

WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation

238

05 Apr 2019

An Interaction Framework for Studying Co-Creative AI

Matthew J. Guzdial

Mark O. Riedl

165

22 Mar 2019