
High Fidelity Speech Synthesis with Adversarial Networks

International Conference on Learning Representations (ICLR), 2019

25 September 2019

Papers citing "High Fidelity Speech Synthesis with Adversarial Networks"

50 / 153 papers shown

Exploiting Pre-trained Feature Networks for Generative Adversarial Networks in Audio-domain Loop Generation

International Society for Music Information Retrieval Conference (ISMIR), 2022

Yen-Tung Yeh

Bo-Yu Chen

Yi-Hsuan Yang

248

05 Sep 2022

Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild

ACM Multimedia (ACM MM), 2022

Sindhu B. Hegde

Prajwal K R

Rudrabha Mukhopadhyay

Vinay P. Namboodiri

C. V. Jawahar

215

01 Sep 2022

Music Separation Enhancement with Generative Modeling

International Society for Music Information Retrieval Conference (ISMIR), 2022

213

26 Aug 2022

Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review

Enes ALTUNCU

V. N. Franqueira

Shujun Li

323

21 Aug 2022

Generative Extraction of Audio Classifiers for Speaker Identification

147

26 Jul 2022

A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System

APSIPA Transactions on Signal and Information Processing (TASIP), 2022

Yi-Chiao Wu

Patrick Lumban Tobing

157

13 Jul 2022

Towards Error-Resilient Neural Speech Coding

Interspeech (Interspeech), 2022

161

03 Jul 2022

Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers

Interspeech (Interspeech), 2022

Shan Yang

119

02 Jul 2022

Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models

International Conference on Machine Learning (ICML), 2022

Jun Zhu

200

15 Jun 2022

BigVGAN: A Universal Neural Vocoder with Large-Scale Training

International Conference on Learning Representations (ICLR), 2022

Boris Ginsburg

307

379

09 Jun 2022

Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis

IEEE Transactions on Wireless Communications (TWC), 2022

208

205

09 May 2022

SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech

International Joint Conference on Artificial Intelligence (IJCAI), 2022

Zhenhui Ye

Zhou Zhao

Yi Ren

Leilei Gan

126

25 Apr 2022

The Sillwood Technologies System for the VoiceMOS Challenge 2022

Jiameng Gao

170

08 Apr 2022

Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech

Interspeech (Interspeech), 2022

160

05 Apr 2022

WavThruVec: Latent speech representation as intermediate features for neural speech synthesis

Interspeech (Interspeech), 2022

348

31 Mar 2022

BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

International Conference on Learning Representations (ICLR), 2022

222

103

25 Mar 2022

Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data

Interspeech (Interspeech), 2022

Gašper Beguš

Alan Zhou

SSL

255

22 Mar 2022

Reproducible Subjective Evaluation

128

08 Mar 2022

Practical cognitive speech compression

Reza Lotfidereshgi

P. Gournay

183

08 Mar 2022

iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

185

04 Mar 2022

Revisiting Over-Smoothness in Text to Speech

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Yi Ren

Xu Tan

Tao Qin

Zhou Zhao

Tie-Yan Liu

199

26 Feb 2022

It's Raw! Audio Generation with State-Space Models

International Conference on Machine Learning (ICML), 2022

261

233

20 Feb 2022

Attributable-Watermarking of Speech Generative Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Yongbaek Cho

Changhoon Kim

Yezhou Yang

Yi Ren

185

17 Feb 2022

Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Roberto Barra-Chicote

Bartek Perz

Jaime Lorenzo-Trueba

185

16 Feb 2022

InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Xu Tan

Shifeng Pan

181

08 Feb 2022

J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis

Interspeech (Interspeech), 2022

Hiroshi Saruwatari

125

26 Jan 2022

Improved Input Reprogramming for GAN Conditioning

Liang Shang

250

07 Jan 2022

Audio representations for deep learning in sound synthesis: A review

ACS/IEEE International Conference on Computer Systems and Applications (AICCSA), 2021

Anastasia Natsiou

Seán O'Leary

AI4TS

150

07 Jan 2022

Semantic Communications: Principles and Challenges

502

415

30 Dec 2021

Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus

ACM Multimedia (MM), 2021

Rongjie Huang

Zhou Zhao

214

124

20 Dec 2021

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

International Conference on Machine Learning (ICML), 2021

Edresson Casanova

Julian Weber

C. Shulby

Arnaldo Cândido Júnior

Eren Golge

M. Ponti

673

547

04 Dec 2021

Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance

317

125

23 Nov 2021

High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency

Nikolaos Ellinas

G. Vamvoukakis

K. Markopoulos

Aimilios Chalamandaris

202

17 Nov 2021

Generating Diverse Realistic Laughter for Interactive Art

Mehdi Park Eric Paquette Étienne Gidel Gauthier Mathewso Afsar

124

04 Nov 2021

WaveFake: A Data Set to Facilitate Audio Deepfake Detection

Joel Frank

Lea Schönherr

DiffM

329

176

04 Nov 2021

Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units

Anurag Katakkar

A. Black

AuLLM

31 Oct 2021

Chunked Autoregressive GAN for Conditional Waveform Synthesis

Aaron Courville

193

19 Oct 2021

FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection

139

18 Oct 2021

Taming Visually Guided Sound Generation

Vladimir E. Iashin

Esa Rahtu

VLM

314

171

17 Oct 2021

ESPnet2-TTS: Extending the Edge of TTS Research

Tomoki Hayashi

Ryuichi Yamamoto

Takenori Yoshimura

Peter Wu

Jiatong Shi

166

15 Oct 2021

LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example

Hieu-Thi Luong

Junichi Yamagishi

158

11 Oct 2021

Denoising Diffusion Gamma Models

Eliya Nachmani

S. Robin

Lior Wolf

DiffM VLM

209

10 Oct 2021

FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis

Interspeech (Interspeech), 2021

Manh Luong

Viet-Anh Tran

103

27 Sep 2021

Bilateral Denoising Diffusion Models

Rongjie Huang

204

26 Aug 2021

A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021

204

09 Aug 2021

DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs

J. Nistal

Stefan Lattner

G. Richard

203

03 Aug 2021

A Survey on Audio Synthesis and Audio-Visual Multimodal Processing

Zhaofeng Shi

150

01 Aug 2021

Generative Models for Security: Attacks, Defenses, and Opportunities

L. A. Bauer

Vincent Bindschaedler

225

21 Jul 2021

Filtered Noise Shaping for Time Domain Room Impulse Response Estimation From Reverberant Speech

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021

C. Steinmetz

V. Ithapu

P. Calamia

135

15 Jul 2021

Adversarial Auto-Encoding for Packet Loss Concealment

Santiago Pascual

Joan Serrà

Jordi Pons

288

07 Jul 2021