1709.08041
Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks
23 September 2017
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
Papers citing
"Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks"
50 / 59 papers shown
Vocoder-Projected Feature Discriminator
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Yuto Kondo
DiffM
141
0
0
25 Aug 2025
Generative Data Imputation for Sparse Learner Performance Data Using Generative Adversarial Imputation Networks
Liang Zhang
Jionghao Lin
John Sabatini
Diego Zapata-Rivera
Carol Forsyth
Yang Jiang
John Hollander
Xiangen Hu
Arthur C. Graesser
316
0
0
23 Mar 2025
Evaluating Synthetic Command Attacks on Smart Voice Assistants
Zhengxian He
Ashish Kundu
M. Ahamad
ELM
AAML
256
0
0
13 Nov 2024
SoK: Comprehensive Security Overview, Challenges, and Future Directions of Voice-Controlled Systems
Haozhe Xu
Cong Wu
Yangyang Gu
Xingcan Shang
Jing Chen
Kun He
Ruiying Du
277
4
0
27 May 2024
A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation
Gwanghyun Kim
Alonso Martinez
Yu-Chuan Su
Brendan Jou
José Lezama
...
Lijun Yu
Lu Jiang
A. Jansen
Jacob Walker
Krishna Somandepalli
206
17
0
22 May 2024
Llama-VITS: Enhancing TTS Synthesis with Semantic Awareness
International Conference on Language Resources and Evaluation (LREC), 2024
Xincan Feng
A. Yoshimoto
264
4
0
10 Apr 2024
HumanDiffusion: diffusion model using perceptual gradients
Interspeech (Interspeech), 2023
Yota Ueda
Shinnosuke Takamichi
Yuki Saito
Norihiro Takamune
Hiroshi Saruwatari
DiffM
152
0
0
21 Jun 2023
Accented Text-to-Speech Synthesis with Limited Data
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Xuehao Zhou
Mingyang Zhang
Yi Zhou
Zhizheng Wu
Haizhou Li
195
23
0
08 May 2023
Improving novelty detection with generative adversarial networks on hand gesture data
M. Simão
Pedro Neto
O. Gibaru
145
12
0
13 Apr 2023
Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion
Speech Synthesis Workshop (SSW), 2022
Yuta Matsunaga
Takaaki Saeki
Shinnosuke Takamichi
Hiroshi Saruwatari
249
2
0
18 Oct 2022
Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022
Yusuke Nakai
Yuki Saito
K. Udagawa
Hiroshi Saruwatari
AAML
200
2
0
26 Sep 2022
A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS
Interspeech (Interspeech), 2022
Haohan Guo
Fenglong Xie
Frank Soong
Xixin Wu
Helen M. Meng
163
15
0
22 Sep 2022
Generative models and Bayesian inversion using Laplace approximation
Computational statistics (Zeitschrift) (CSZ), 2022
M. Marschall
G. Wübbeler
F. Schmähling
Clemens Elster
248
2
0
15 Mar 2022
A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS
Interspeech (Interspeech), 2022
Haohan Guo
Hui Lu
Xixin Wu
Helen Meng
822
9
0
02 Mar 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis
Interspeech (Interspeech), 2022
Shinnosuke Takamichi
Wataru Nakata
Naoko Tanji
Hiroshi Saruwatari
AuLLM
128
8
0
26 Jan 2022
Audio representations for deep learning in sound synthesis: A review
ACS/IEEE International Conference on Computer Systems and Applications (AICCSA), 2021
Anastasia Natsiou
Seán O'Leary
AI4TS
156
26
0
07 Jan 2022
Automated Side Channel Analysis of Media Software with Manifold Learning
Yuanyuan Yuan
Qi Pang
Shuai Wang
AAML
237
20
0
09 Dec 2021
Provably Valid and Diverse Mutations of Real-World Media Data for DNN Testing
Yuanyuan Yuan
Qi Pang
Shuai Wang
DiffM
AAML
MedIm
262
7
0
03 Dec 2021
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey
Zahra Khanjani
Gabrielle Watson
V. P. Janeja
166
34
0
28 Nov 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs
J. Nistal
Stefan Lattner
G. Richard
203
10
0
03 Aug 2021
Adversarial Data Augmentation for Disordered Speech Recognition
Zengrui Jin
Mengzhe Geng
Xurong Xie
Jianwei Yu
Shansong Liu
Xunying Liu
Helen Meng
119
46
0
02 Aug 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
205
31
0
20 Apr 2021
PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Yukiya Hono
Shinji Takaki
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
166
16
0
15 Feb 2021
HumanACGAN: conditional generative adversarial network with human-based auxiliary classifier and its evaluation in phoneme perception
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Yota Ueda
Kazuki Fujii
Yuki Saito
Shinnosuke Takamichi
Yukino Baba
Hiroshi Saruwatari
GAN
88
1
0
08 Feb 2021
JSSS: free Japanese speech corpus for summarization and simplification
Shinnosuke Takamichi
Mamoru Komachi
Naoko Tanji
Hiroshi Saruwatari
159
2
0
05 Oct 2020
DrumGAN: Synthesis of Drum Sounds With Timbral Feature Conditioning Using Generative Adversarial Networks
International Society for Music Information Retrieval Conference (ISMIR), 2020
J. Nistal
Stefan Lattner
G. Richard
GAN
239
65
0
27 Aug 2020
Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
344
22
0
27 Aug 2020
Incorporating Reinforced Adversarial Learning in Autoregressive Image Generation
Kenan E. Ak
N. Xu
Zhe Lin
Yilin Wang
208
13
0
20 Jul 2020
Recent Advances in Network-based Methods for Disease Gene Prediction
S. Ata
Ruibing Jin
Yuan Fang
Le Ou-Yang
C. Kwoh
Xiaoli Li
255
65
0
19 Jul 2020
Estimation with Uncertainty via Conditional Generative Adversarial Networks
Minhyeok Lee
Junhee Seok
MedIm
154
20
0
01 Jul 2020
Cumulant GAN
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2020
Yannis Pantazis
D. Paul
M. Fasoulakis
Y. Stylianou
Markos A. Katsoulakis
GAN
344
21
0
11 Jun 2020
A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems
International Conference on Knowledge and Systems Engineering (KSE), 2020
Phan Huy Kinh
V. Phung
Anh-Tuan Dinh
Quoc Bao Nguyen
120
1
0
26 May 2020
Conditional Spoken Digit Generation with StyleGAN
Interspeech (Interspeech), 2020
Kasperi Palkama
Lauri Juvela
Alexander Ilin
GAN
227
11
0
28 Apr 2020
The Attacker's Perspective on Automatic Speaker Verification: An Overview
Interspeech (Interspeech), 2020
Rohan Kumar Das
Xiaohai Tian
Tomi Kinnunen
Haizhou Li
AAML
154
87
0
19 Apr 2020
A Novel Framework for Selection of GANs for an Application
Tanya Motwani
Manojkumar Somabhai Parmar
325
11
0
20 Feb 2020
A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2020
Jie Gui
Zhenan Sun
Yonggang Wen
Dacheng Tao
Jieping Ye
EGVM
316
1,025
0
20 Jan 2020
SeismoGen: Seismic Waveform Synthesis Using Generative Adversarial Networks
Tiantong Wang
D. Trugman
Youzuo Lin
GAN
127
3
0
10 Nov 2019
Spoofing Speaker Verification Systems with Deep Multi-speaker Text-to-speech Synthesis
Mingrui Yuan
Z. Duan
84
2
0
29 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks
International Conference on Learning Representations (ICLR), 2019
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
632
260
0
25 Sep 2019
HumanGAN: generative adversarial network with human-based discriminator and its evaluation in speech perception modeling
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Kazuki Fujii
Yuki Saito
Shinnosuke Takamichi
Yukino Baba
Hiroshi Saruwatari
119
7
0
25 Sep 2019
JVS corpus: free Japanese multi-speaker voice corpus
Shinnosuke Takamichi
Kentaro Mitsui
Yuki Saito
Tomoki Koriyama
Naoko Tanji
Hiroshi Saruwatari
148
96
0
17 Aug 2019
V2S attack: building DNN-based voice conversion from automatic speaker verification
Speech Synthesis Workshop (SSW), 2019
Taiki Nakamura
Yuki Saito
Shinnosuke Takamichi
Yusuke Ijima
Hiroshi Saruwatari
151
7
0
05 Aug 2019
DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis
Speech Synthesis Workshop (SSW), 2019
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
104
10
0
19 Jul 2019
A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation
Hieu-Thi Luong
Junichi Yamagishi
184
10
0
18 Jun 2019
VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019
Interspeech (Interspeech), 2019
Andros Tjandra
Berrak Sisman
Mingyang Zhang
S. Sakti
Haizhou Li
Satoshi Nakamura
180
76
0
27 May 2019
A New GAN-based End-to-End TTS Training Algorithm
Haohan Guo
Frank Soong
Lei He
Lei Xie
177
49
0
09 Apr 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
255
58
0
09 Apr 2019
Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data
Roee Levy Leshem
Raja Giryes
288
8
0
06 Apr 2019
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
238
20
0
05 Apr 2019
An Interaction Framework for Studying Co-Creative AI
Matthew J. Guzdial
Mark O. Riedl
165
44
0
22 Mar 2019