The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS
arXiv:2010.02434

6 October 2020
Wen-Chin Huang
Tomoki Hayashi
Shinji Watanabe
Tomoki Toda
    DRL

Papers citing "The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS"

21 / 21 papers shown
Enhancing Polyglot Voices by Leveraging Cross-Lingual Fine-Tuning in Any-to-One Voice Conversion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Giuseppe Ruggiero
Matteo Testa
Jurgen Van de Walle
Luigi Di Caro
227
1
0
25 Sep 2024
End-to-end Streaming model for Low-Latency Speech Anonymization
Waris Quamer
Ricardo Gutierrez-Osuna
274
10
0
13 Jun 2024
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition
Yihan Wu
Soumi Maiti
Yifan Peng
Wangyou Zhang
Chenda Li
Yuyue Wang
Xihua Wang
Shinji Watanabe
Ruihua Song
299
7
0
31 Jan 2024
Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation
Interspeech, 2023
Zhonghua Liu
Shijun Wang
Ning Chen
DRL
220
3
0
21 Jun 2023
ALO-VC: Any-to-any Low-latency One-shot Voice Conversion
Interspeech, 2023
Bo Wang
Damien Ronssin
Milos Cernak
BDL
387
5
0
01 Jun 2023
WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions
International Conference on Human Factors in Computing Systems (CHI), 2023
Jun Rekimoto
321
38
0
03 Mar 2023
Subband-based Generative Adversarial Network for Non-parallel Many-to-many Voice Conversion
Jianchun Ma
Zhedong Zheng
Hao Fei
Feng Zheng
Tat-Seng Chua
Yi Yang
GAN
233
0
0
13 Jul 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
IEEE Journal of Selected Topics in Signal Processing (IEEE JSTSP), 2022
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Tomoki Toda
230
24
0
10 Jul 2022
Deep Learning and Synthetic Media
Raphaël Millière
315
29
0
11 May 2022
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Siddhant Arora
Siddharth Dalmia
Pavel Denisov
Xuankai Chang
Yushi Ueda
...
Karthik Ganesan
Brian Yan
Ngoc Thang Vu
A. Black
Shinji Watanabe
VLM
249
82
0
29 Nov 2021
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Benjamin van Niekerk
M. Carbonneau
Julian Zaïdi
Matthew Baas
Hugo Seuté
Herman Kamper
DRL
476
163
0
03 Nov 2021
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Hung-yi Lee
Shinji Watanabe
Tomoki Toda
206
45
0
12 Oct 2021
On Prosody Modeling for ASR+TTS based Voice Conversion
Automatic Speech Recognition & Understanding (ASRU), 2021
Wen-Chin Huang
Tomoki Hayashi
Xinjian Li
Shinji Watanabe
Tomoki Toda
288
11
0
20 Jul 2021
Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021
Peter Wu
Paul Pu Liang
Jiatong Shi
Ruslan Salakhutdinov
Shinji Watanabe
Louis-Philippe Morency
322
10
0
22 Jan 2021
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans
Shinji Watanabe
Florian Boyer
Xuankai Chang
Pengcheng Guo
Tomoki Hayashi
...
Shigeki Karita
Chenda Li
Jing Shi
Aswin Shanmugam Subramanian
Wangyou Zhang
VLM
253
39
0
23 Dec 2020
The NeteaseGames System for Voice Conversion Challenge 2020 with Vector-quantization Variational Autoencoder and WaveNet
Haitong Zhang
DRL
151
4
0
15 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Toda
233
9
0
09 Oct 2020
Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN
Patrick Lumban Tobing
Yi-Chiao Wu
Tomoki Toda
DRL
191
15
0
09 Oct 2020
Latent linguistic embedding for cross-lingual text-to-speech and voice conversion
Hieu-Thi Luong
Junichi Yamagishi
206
5
0
08 Oct 2020
The Academia Sinica Systems of Voice Conversion for VCC2020
Yu-Huai Peng
Cheng-Hung Hu
A. Kang
Hung-Shin Lee
Pin-Yuan Chen
Yu Tsao
Hsin-Min Wang
186
2
0
06 Oct 2020
Transfer Learning from Monolingual ASR to Transcription-free Cross-lingual Voice Conversion
Che-Jui Chang
212
5
0
30 Sep 2020