How Far Are We from Robust Voice Conversion: A Survey

Spoken Language Technology Workshop (SLT), 2020
24 November 2020
Tzu-hsien Huang
Jheng-hao Lin
Chien-yu Huang
Hung-yi Lee

Papers citing "How Far Are We from Robust Voice Conversion: A Survey"

16 / 16 papers shown
Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models
Sandipana Dowerah
Atharva Kulkarni
Ajinkya Kulkarni
Hoan My Tran
Joonas Kalda
Artem Fedorchenko
Benoit Fauve
Damien Lolive
Tanel Alumae
Matthew Magimai Doss
ELM
122
9
0
02 Sep 2025
RoVo: Robust Voice Protection Against Unauthorized Speech Synthesis with Embedding-Level Perturbations
Seungmin Kim
Sohee Park
Donghyun Kim
Jisu Lee
Daeseon Choi
AAML
297
0
0
19 May 2025
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
Haorui He
Zengqiang Shang
Chaoren Wang
Xuyuan Li
Yicheng Gu
...
Peiyang Shi
Longji Xu
Kai Chen
Pengyuan Zhang
Zhizheng Wu
AuLLM
457
23
0
27 Jan 2025
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation
Haorui He
Zengqiang Shang
Chaoren Wang
Xuyuan Li
Yicheng Gu
...
Peiyang Shi
Yuancheng Wang
Kai Chen
Pengyuan Zhang
Zhizheng Wu
283
210
0
07 Jul 2024
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment
Takuto Igarashi
Yuki Saito
Kentaro Seki
Shinnosuke Takamichi
Ryuichi Yamamoto
Kentaro Tachibana
Hiroshi Saruwatari
183
3
0
11 Jun 2024
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
Yuki Saito
Takuto Igarashi
Kentaro Seki
Shinnosuke Takamichi
Ryuichi Yamamoto
Kentaro Tachibana
Hiroshi Saruwatari
188
1
0
11 Jun 2024
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
Jee-weon Jung
Wangyou Zhang
Jiatong Shi
Zakaria Aldeneh
Takuya Higuchi
B. Theobald
Ahmed Hussen Abdelaziz
Shinji Watanabe
499
47
0
30 Jan 2024
Low-latency Real-time Voice Conversion on CPU
Konstantine Sadov
Matthew Hutter
Asara Near
VLM
636
3
0
01 Nov 2023
Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer
Neural Information Processing Systems (NeurIPS), 2023
Jianwei Zhang
Suren Jayasuriya
Visar Berisha
SSL
273
3
0
25 Oct 2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Dongchao Yang
Jinchuan Tian
Xuejiao Tan
Rongjie Huang
Songxiang Liu
...
Jiang Bian
Xixin Wu
Zhou Zhao
Shinji Watanabe
Helen M. Meng
CVBM, AuLLM
643
193
0
01 Oct 2023
Noise-robust voice conversion with domain adversarial training
Neural Networks (NN), 2022
Hongqiang Du
Lei Xie
Haizhou Li
223
20
0
26 Jan 2022
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
226
18
0
08 Dec 2021
Toward Degradation-Robust Voice Conversion
Chien-yu Huang
Kai-Wei Chang
Hung-yi Lee
388
14
0
14 Oct 2021
Improving robustness of one-shot voice conversion with deep discriminative speaker encoder
Interspeech, 2021
Hongqiang Du
Lei Xie
143
7
0
19 Jun 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Interspeech, 2021
Jheng-hao Lin
Yist Y. Lin
C. Chien
Hung-yi Lee
473
63
0
07 Apr 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
552
441
0
25 Feb 2021