ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2026 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2011.12063
  4. Cited By
How Far Are We from Robust Voice Conversion: A Survey
v1v2v3 (latest)

How Far Are We from Robust Voice Conversion: A Survey

Spoken Language Technology Workshop (SLT), 2020
24 November 2020
Tzu-hsien Huang
Jheng-hao Lin
Chien-yu Huang
Hung-yi Lee
ArXiv (abs)PDFHTML

Papers citing "How Far Are We from Robust Voice Conversion: A Survey"

16 / 16 papers shown
Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models
Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models
Sandipana Dowerah
Atharva Kulkarni
Ajinkya Kulkarni
Hoan My Tran
Joonas Kalda
Artem Fedorchenko
Benoit Fauve
Damien Lolive
Tanel Alumae
Matthew Magimai Doss
ELM
123
9
0
02 Sep 2025
RoVo: Robust Voice Protection Against Unauthorized Speech Synthesis with Embedding-Level Perturbations
RoVo: Robust Voice Protection Against Unauthorized Speech Synthesis with Embedding-Level Perturbations
Seungmin Kim
Sohee Park
Donghyun Kim
Jisu Lee
Daeseon Choi
AAML
301
0
0
19 May 2025
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech GenerationIEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
Haorui He
Zengqiang Shang
Chaoren Wang
Xuyuan Li
Yicheng Gu
...
Peiyang Shi
Longji Xu
Kai Chen
Pengyuan Zhang
Zhikai Wu
AuLLM
458
23
0
27 Jan 2025
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for
  Large-Scale Speech Generation
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation
Haorui He
Zengqiang Shang
Chaoren Wang
Xuyuan Li
Yicheng Gu
...
Peiyang Shi
Yuancheng Wang
Kai Chen
Pengyuan Zhang
Zhizheng Wu
287
215
0
07 Jul 2024
Noise-Robust Voice Conversion by Conditional Denoising Training Using
  Latent Variables of Recording Quality and Environment
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment
Takuto Igarashi
Yuki Saito
Kentaro Seki
Shinnosuke Takamichi
Ryuichi Yamamoto
Kentaro Tachibana
Hiroshi Saruwatari
183
3
0
11 Jun 2024
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
Yuki Saito
Takuto Igarashi
Kentaro Seki
Shinnosuke Takamichi
Ryuichi Yamamoto
Kentaro Tachibana
Hiroshi Saruwatari
195
1
0
11 Jun 2024
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible
  recipes, self-supervised front-ends, and off-the-shelf models
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
Jee-weon Jung
Wangyou Zhang
Jiatong Shi
Zakaria Aldeneh
Takuya Higuchi
B. Theobald
Ahmed Hussen Abdelaziz
Shinji Watanabe
499
49
0
30 Jan 2024
Low-latency Real-time Voice Conversion on CPU
Low-latency Real-time Voice Conversion on CPU
Konstantine Sadov
Matthew Hutter
Asara Near
VLM
640
3
0
01 Nov 2023
Learning Repeatable Speech Embeddings Using An Intra-class Correlation
  Regularizer
Learning Repeatable Speech Embeddings Using An Intra-class Correlation RegularizerNeural Information Processing Systems (NeurIPS), 2023
Jianwei Zhang
Suren Jayasuriya
Visar Berisha
SSL
289
4
0
25 Oct 2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Dongchao Yang
Jinchuan Tian
Xuejiao Tan
Rongjie Huang
Songxiang Liu
...
Jiang Bian
Xixin Wu
Zhou Zhao
Shinji Watanabe
Helen M. Meng
CVBMAuLLM
646
193
0
01 Oct 2023
Noise-robust voice conversion with domain adversarial training
Noise-robust voice conversion with domain adversarial trainingNeural Networks (NN), 2022
Hongqiang Du
Lei Xie
Haizhou Li
233
20
0
26 Jan 2022
Training Robust Zero-Shot Voice Conversion Models with Self-supervised
  Features
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
227
18
0
08 Dec 2021
Toward Degradation-Robust Voice Conversion
Toward Degradation-Robust Voice Conversion
Chien-yu Huang
Kai-Wei Chang
Hung-yi Lee
408
14
0
14 Oct 2021
Improving robustness of one-shot voice conversion with deep
  discriminative speaker encoder
Improving robustness of one-shot voice conversion with deep discriminative speaker encoderInterspeech (Interspeech), 2021
Hongqiang Du
Lei Xie
150
7
0
19 Jun 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised
  Pretrained Representations
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained RepresentationsInterspeech (Interspeech), 2021
Jheng-hao Lin
Yist Y. Lin
C. Chien
Hung-yi Lee
477
63
0
07 Apr 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges,
  countermeasures, and way forward
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
558
452
0
25 Feb 2021
1
Page 1 of 1