1909.11646
High Fidelity Speech Synthesis with Adversarial Networks
International Conference on Learning Representations (ICLR), 2019
25 September 2019
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
Papers citing
"High Fidelity Speech Synthesis with Adversarial Networks"
50 / 153 papers shown
Exploiting Pre-trained Feature Networks for Generative Adversarial Networks in Audio-domain Loop Generation
International Society for Music Information Retrieval Conference (ISMIR), 2022
Yen-Tung Yeh
Bo-Yu Chen
Yi-Hsuan Yang
248
7
0
05 Sep 2022
Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild
ACM Multimedia (ACM MM), 2022
Sindhu B. Hegde
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
215
16
0
01 Sep 2022
Music Separation Enhancement with Generative Modeling
International Society for Music Information Retrieval Conference (ISMIR), 2022
N. Schaffer
Boaz Cogan
Ethan Manilow
Max Morrison
Prem Seetharaman
Bryan Pardo
213
11
0
26 Aug 2022
Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review
Enes ALTUNCU
V. N. Franqueira
Shujun Li
323
25
0
21 Aug 2022
Generative Extraction of Audio Classifiers for Speaker Identification
Tejumade Afonja
Lucas Bourtoule
Varun Chandrasekaran
Sageev Oore
Nicolas Papernot
AAML
147
1
0
26 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System
APSIPA Transactions on Signal and Information Processing (TASIP), 2022
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
Tomoki Toda
157
0
0
13 Jul 2022
Towards Error-Resilient Neural Speech Coding
Interspeech (Interspeech), 2022
Huaying Xue
Xiulian Peng
Xue Jiang
Yan Lu
161
9
0
03 Jul 2022
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers
Interspeech (Interspeech), 2022
Liumeng Xue
Shan Yang
Na Hu
Dan Su
Linfu Xie
119
4
0
02 Jul 2022
Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models
International Conference on Machine Learning (ICML), 2022
Fan Bao
Chongxuan Li
Jiacheng Sun
Jun Zhu
Bo Zhang
DiffM
200
84
0
15 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
International Conference on Learning Representations (ICLR), 2022
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
307
379
0
09 Jun 2022
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis
IEEE Transactions on Wireless Communications (TWC), 2022
Zhenzi Weng
Zhijin Qin
Xiaoming Tao
Chengkang Pan
Guangyi Liu
Geoffrey Ye Li
208
205
0
09 May 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Zhenhui Ye
Zhou Zhao
Yi Ren
Leilei Gan
126
30
0
25 Apr 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
170
0
0
08 Apr 2022
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Interspeech (Interspeech), 2022
Hyungchan Yoon
Seyun Um
Changwhan Kim
Hong-Goo Kang
160
0
0
05 Apr 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Interspeech (Interspeech), 2022
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
348
37
0
31 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
International Conference on Learning Representations (ICLR), 2022
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
DiffM
222
103
0
25 Mar 2022
Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data
Interspeech (Interspeech), 2022
Gašper Beguš
Alan Zhou
SSL
255
6
0
22 Mar 2022
Reproducible Subjective Evaluation
Max Morrison
Brian Tang
Gefei Tan
Bryan Pardo
128
7
0
08 Mar 2022
Practical cognitive speech compression
Reza Lotfidereshgi
P. Gournay
183
2
0
08 Mar 2022
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Takuhiro Kaneko
Kou Tanaka
Hirokazu Kameoka
Shogo Seki
185
87
0
04 Mar 2022
Revisiting Over-Smoothness in Text to Speech
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
199
70
0
26 Feb 2022
It's Raw! Audio Generation with State-Space Models
International Conference on Machine Learning (ICML), 2022
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
261
233
0
20 Feb 2022
Attributable-Watermarking of Speech Generative Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yongbaek Cho
Changhoon Kim
Yezhou Yang
Yi Ren
185
10
0
17 Feb 2022
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Adam Gabryś
Goeric Huybrechts
M. Ribeiro
C. Chien
Julian Roth
Giulia Comini
Roberto Barra-Chicote
Bartek Perz
Jaime Lorenzo-Trueba
185
27
0
16 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
181
34
0
08 Feb 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis
Interspeech (Interspeech), 2022
Shinnosuke Takamichi
Wataru Nakata
Naoko Tanji
Hiroshi Saruwatari
AuLLM
125
8
0
26 Jan 2022
Improved Input Reprogramming for GAN Conditioning
Tuan Dinh
Daewon Seo
Zhixu Du
Liang Shang
Kangwook Lee
AI4CE
250
8
0
07 Jan 2022
Audio representations for deep learning in sound synthesis: A review
ACS/IEEE International Conference on Computer Systems and Applications (AICCSA), 2021
Anastasia Natsiou
Seán O'Leary
AI4TS
150
25
0
07 Jan 2022
Semantic Communications: Principles and Challenges
Zhijin Qin
Xiaoming Tao
Jianhua Lu
Wen Tong
Geoffrey Ye Li
502
415
0
30 Dec 2021
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
ACM Multimedia (MM), 2021
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
214
124
0
20 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
International Conference on Machine Learning (ICML), 2021
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
673
547
0
04 Dec 2021
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Heeseung Kim
Sungwon Kim
Sungroh Yoon
DiffM
BDL
317
125
0
23 Nov 2021
High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency
Nikolaos Ellinas
G. Vamvoukakis
K. Markopoulos
Aimilios Chalamandaris
Georgia Maniati
Panos Kakoulidis
S. Raptis
June Sig Sung
Hyoungmin Park
Pirros Tsiakoulis
202
39
0
17 Nov 2021
Generating Diverse Realistic Laughter for Interactive Art
Mehdi Afsar
Eric Park
Étienne Paquette
Gauthier Gidel
Kory W. Mathewson
Eilif B. Muller
124
8
0
04 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schönherr
DiffM
329
176
0
04 Nov 2021
Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units
Anurag Katakkar
A. Black
AuLLM
98
1
0
31 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
193
85
0
19 Oct 2021
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection
Zhenyu Zhang
Yewei Gu
Xiaowei Yi
Xianfeng Zhao
139
33
0
18 Oct 2021
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
314
171
0
17 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
166
71
0
15 Oct 2021
LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example
Hieu-Thi Luong
Junichi Yamagishi
158
10
0
11 Oct 2021
Denoising Diffusion Gamma Models
Eliya Nachmani
S. Robin
Lior Wolf
DiffM
VLM
209
33
0
10 Oct 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis
Interspeech (Interspeech), 2021
Manh Luong
Viet-Anh Tran
103
3
0
27 Sep 2021
Bilateral Denoising Diffusion Models
Max W. Y. Lam
Jun Wang
Rongjie Huang
Dan Su
Dong Yu
DiffM
204
44
0
26 Aug 2021
A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021
Ahmed Mustafa
Jan Büthe
Srikanth Korse
Kishan Gupta
Guillaume Fuchs
N. Pia
204
23
0
09 Aug 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs
J. Nistal
Stefan Lattner
G. Richard
203
10
0
03 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
Zhaofeng Shi
150
11
0
01 Aug 2021
Generative Models for Security: Attacks, Defenses, and Opportunities
L. A. Bauer
Vincent Bindschaedler
225
5
0
21 Jul 2021
Filtered Noise Shaping for Time Domain Room Impulse Response Estimation From Reverberant Speech
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021
C. Steinmetz
V. Ithapu
P. Calamia
135
52
0
15 Jul 2021
Adversarial Auto-Encoding for Packet Loss Concealment
Santiago Pascual
Joan Serrà
Jordi Pons
288
33
0
07 Jul 2021