The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS

6 October 2020

Papers citing "The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS"

Enhancing Polyglot Voices by Leveraging Cross-Lingual Fine-Tuning in Any-to-One Voice Conversion

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

227

25 Sep 2024

End-to-end Streaming model for Low-Latency Speech Anonymization

Waris Quamer

Ricardo Gutierrez-Osuna

274

13 Jun 2024

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition

Wangyou Zhang

Shinji Watanabe

299

31 Jan 2024

Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation

Interspeech, 2023

220

21 Jun 2023

ALO-VC: Any-to-any Low-latency One-shot Voice Conversion

Interspeech, 2023

387

01 Jun 2023

WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions

International Conference on Human Factors in Computing Systems (CHI), 2023

Jun Rekimoto

321

03 Mar 2023

Subband-based Generative Adversarial Network for Non-parallel Many-to-many Voice Conversion

Hao Fei

233

13 Jul 2022

A Comparative Study of Self-supervised Speech Representation Based Voice Conversion

IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022

230

10 Jul 2022

Deep Learning and Synthetic Media

Raphaël Millière

315

11 May 2022

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

...

249

29 Nov 2021

A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

476

163

03 Nov 2021

S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

206

12 Oct 2021

On Prosody Modeling for ASR+TTS based Voice Conversion

Automatic Speech Recognition & Understanding (ASRU), 2021

288

20 Jul 2021

Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021

Peter Wu

Paul Pu Liang

Jiatong Shi

Ruslan Salakhutdinov

Shinji Watanabe

Louis-Philippe Morency

322

22 Jan 2021

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans

...

Aswin Shanmugam Subramanian

Wangyou Zhang

253

23 Dec 2020

The NeteaseGames System for Voice Conversion Challenge 2020 with Vector-quantization Variational Autoencoder and WaveNet

Haitong Zhang

151

15 Oct 2020

The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders

Wen-Chin Huang

Patrick Lumban Tobing

Yi-Chiao Wu

Kazuhiro Kobayashi

Tomoki Toda

233

09 Oct 2020

Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN

Patrick Lumban Tobing

Yi-Chiao Wu

Tomoki Toda

191

09 Oct 2020

Latent linguistic embedding for cross-lingual text-to-speech and voice conversion

Hieu-Thi Luong

Junichi Yamagishi

206

08 Oct 2020

The Academia Sinica Systems of Voice Conversion for VCC2020

Hung-Shin Lee

186

06 Oct 2020

Transfer Learning from Monolingual ASR to Transcription-free Cross-lingual Voice Conversion

Che-Jui Chang

212

30 Sep 2020