1909.11646
High Fidelity Speech Synthesis with Adversarial Networks
International Conference on Learning Representations (ICLR), 2019
25 September 2019
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
Papers citing
"High Fidelity Speech Synthesis with Adversarial Networks"
50 / 153 papers shown
Title
MARS: Audio Generation via Multi-Channel Autoregression on Spectrograms
Eleonora Ristori
Luca Bindini
Paolo Frasconi
104
0
0
30 Sep 2025
Scaling to Multimodal and Multichannel Heart Sound Classification with Synthetic and Augmented Biosignals
Milan Marocchi
Matthew Fynn
Kayapanda Mandana
Yue Rong
151
0
0
15 Sep 2025
MEAN-RIR: Multi-Modal Environment-Aware Network for Robust Room Impulse Response Estimation
Jiajian Chen
Jiakang Chen
Hang Chen
Qing Wang
Yu Gao
Jun Du
76
1
0
05 Sep 2025
Layer-wise Analysis for Quality of Multilingual Synthesized Speech
Erica Cooper
T. Okamoto
Yamato Ohtani
Tomoki Toda
Hisashi Kawai
116
0
0
05 Sep 2025
WaveLLDM: Design and Development of a Lightweight Latent Diffusion Model for Speech Enhancement and Restoration
Kevin Putra Santoso
Rizka Wakhidatus Sholikah
Raden Venantius Hari Ginardi
147
0
0
28 Aug 2025
DARAS: Dynamic Audio-Room Acoustic Synthesis for Blind Room Impulse Response Estimation
Chunxi Wang
Maoshen Jia
Wenyu Jin
112
0
0
10 Jul 2025
DPN-GAN: Inducing Periodic Activations in Generative Adversarial Networks for High-Fidelity Audio Synthesis
IEEE Access, 2025
Zeeshan Ahmad
Shudi Bao
Meng Chen
215
1
0
14 May 2025
Hierarchical Conditional Tabular GAN for Multi-Tabular Synthetic Data Generation
Wilhelm Ågren
Victorio Úbeda Sosa
235
2
0
11 Nov 2024
Generative Deep Learning and Signal Processing for Data Augmentation of Cardiac Auscultation Signals: Improving Model Robustness Using Synthetic Audio
Biomedical Signal Processing and Control (BSPC), 2024
Leigh Abbott
Milan Marocchi
Matthew Fynn
Yue Rong
Sven Nordholm
MedIm
203
2
0
14 Oct 2024
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
International Conference on Learning Representations (ICLR), 2024
Sang-Hoon Lee
Ha-Yeong Choi
Seong-Whan Lee
OOD
DiffM
AI4TS
301
13
0
14 Aug 2024
Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization
Junyan Wu
Wei Lu
Xiangyang Luo
Rui Yang
Qian Wang
Xiaochun Cao
237
11
0
23 Jul 2024
A Survey of Deep Learning Audio Generation Methods
Matej Bozic
Marko Horvat
VLM
MedIm
299
9
0
31 May 2024
SoK: Comprehensive Security Overview, Challenges, and Future Directions of Voice-Controlled Systems
Haozhe Xu
Cong Wu
Yangyang Gu
Xingcan Shang
Jing Chen
Kun He
Ruiying Du
260
4
0
27 May 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model
Xiangyu Zhang
Daijiao Liu
Hexin Liu
Qiquan Zhang
Hanyu Meng
Leibny Paola García
Chng Eng Siong
Lina Yao
DiffM
214
13
0
16 Feb 2024
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2023
Weijian Mai
Jian Zhang
Pengfei Fang
Zhijun Zhang
457
14
0
31 Dec 2023
The Effects of Signal-to-Noise Ratio on Generative Adversarial Networks Applied to Marine Bioacoustic Data
Georgia Atkinson
Nick Wright
A. McGough
Per Berggren
GAN
182
0
0
22 Dec 2023
A Representative Study on Human Detection of Artificially Generated Media Across Countries
Joel Frank
Franziska Herbert
Jonas Ricker
Lea Schönherr
Thorsten Eisenhofer
Asja Fischer
Markus Dürmuth
Thorsten Holz
249
28
0
10 Dec 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
International Conference on Learning Representations (ICLR), 2023
Daniel Y. Fu
Hermann Kumbong
Eric N. D. Nguyen
Christopher Ré
VLM
252
38
0
10 Nov 2023
Enabling Acoustic Audience Feedback in Large Virtual Events
Tamay Aykut
M. Hofbauer
Christopher B. Kuhn
Eckehard Steinbach
Bernd Girod
166
0
0
27 Oct 2023
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech
Dareen Alharthi
Roshan S. Sharma
Hira Dhamyal
Soumi Maiti
Bhiksha Raj
Rita Singh
142
7
0
01 Oct 2023
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yiwei Guo
Chenpeng Du
Ziyang Ma
Xie Chen
K. Yu
DiffM
264
61
0
10 Sep 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Lin Geng Foo
Hossein Rahmani
Jing Liu
708
45
0
27 Aug 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Interspeech, 2023
Jungil Kong
Jihoon Park
Beomjeong Kim
Jeongmin Kim
Dohee Kong
Sangjin Kim
205
66
0
31 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023
J. Barnett
251
47
0
07 Jul 2023
LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading
International Conference on Learning Representations (ICLR), 2023
Yochai Yemini
Aviv Shamsian
Lior Bracha
Sharon Gannot
Ethan Fetaya
DiffM
326
22
0
05 Jun 2023
UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model
Interspeech, 2023
A. Iashchenko
Pavel Andreev
Ivan Shchekotov
Nicholas Babaev
Dmitry Vetrov
DiffM
324
7
0
01 Jun 2023
U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech
Xin Jing
Yi Chang
Zijiang Yang
Jiang-jian Xie
Andreas Triantafyllopoulos
Bjoern W. Schuller
206
11
0
22 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Yang Ai
Zhenhua Ling
170
25
0
13 May 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings
Wei Xue
Yiwen Wang
Qi-fei Liu
Yi-Ting Guo
175
1
0
09 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
175
1
0
26 Apr 2023
ArmanTTS single-speaker Persian dataset
Mohammd Hasan Shamgholi
Vahid Saeedi
J. Peymanfard
Leila Alhabib
Hossein Zeinali
93
3
0
07 Apr 2023
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI
Chenshuang Zhang
Chaoning Zhang
Sheng Zheng
Mengchun Zhang
Maryam Qamar
Sung-Ho Bae
In So Kweon
DiffM
MedIm
252
106
0
23 Mar 2023
Speech Modeling with a Hierarchical Transformer Dynamical VAE
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Xiaoyu Lin
Xiaoyu Bie
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
BDL
178
3
0
07 Mar 2023
Contrast-PLC: Contrastive Learning for Packet Loss Concealment
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Huaying Xue
Xiulian Peng
Yan Lu
173
7
0
26 Feb 2023
Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation
Shahar Lutati
Eliya Nachmani
Lior Wolf
DiffM
205
19
0
25 Jan 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
Speech Synthesis Workshop (SSW), 2023
Ondřej Plátek
Ondřej Dušek
176
2
0
17 Jan 2023
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech
Ze Chen
Yihan Wu
Yichong Leng
Jiawei Chen
Haohe Liu
...
Ke Wang
Lei He
Sheng Zhao
Jiang Bian
Danilo Mandic
DiffM
204
26
0
30 Dec 2022
Semantics-Empowered Communication: A Tutorial-cum-Survey
Zhilin Lu
Rongpeng Li
Kun Lu
Xianfu Chen
Ekram Hossain
Zhifeng Zhao
Honggang Zhang
503
23
0
16 Dec 2022
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mingda Chen
Paul-Ambroise Duquenne
Pierre Yves Andrews
Justine T. Kao
Alexandre Mourachko
Holger Schwenk
Marta R. Costa-jussá
246
23
0
16 Dec 2022
Evaluating and reducing the distance between synthetic and real speech distributions
Interspeech, 2022
Christoph Minixhofer
Ondřej Klejch
P. Bell
217
9
0
29 Nov 2022
Deep Fake Detection, Deterrence and Response: Challenges and Opportunities
Amin Azmoodeh
Ali Dehghantanha
173
3
0
26 Nov 2022
Towards Building Text-To-Speech Systems for the Next Billion Users
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Gokul Karthik Kumar
Praveen S. V.
Pratyush Kumar
Mitesh M. Khapra
Karthik Nandakumar
271
28
0
17 Nov 2022
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
International Conference on Learning Representations (ICLR), 2022
Hyeong-Seok Choi
Jinhyeok Yang
Juheon Lee
Hyeongju Kim
228
54
0
17 Nov 2022
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS
Ziqi Liang
178
0
0
24 Oct 2022
Adversarial Permutation Invariant Training for Universal Sound Separation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Emilian Postolache
Jordi Pons
Santiago Pascual
Joan Serrà
VLM
269
10
0
21 Oct 2022
Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion
Speech Synthesis Workshop (SSW), 2022
Yuta Matsunaga
Takaaki Saeki
Shinnosuke Takamichi
Hiroshi Saruwatari
212
2
0
18 Oct 2022
Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules
International Conference on Learning Representations (ICLR), 2022
Kazuki Irie
Jürgen Schmidhuber
229
9
0
07 Oct 2022
Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022
Yin-Ping Cho
Yu Tsao
Hsin-Min Wang
Yi-Wen Liu
DiffM
221
9
0
21 Sep 2022
Lightweight Long-Range Generative Adversarial Networks
Bowen Li
Thomas Lukasiewicz
GAN
178
4
0
08 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
392
813
0
07 Sep 2022