2011.12063
How Far Are We from Robust Voice Conversion: A Survey
Spoken Language Technology Workshop (SLT), 2020
24 November 2020
Tzu-hsien Huang
Jheng-hao Lin
Chien-yu Huang
Hung-yi Lee
Papers citing "How Far Are We from Robust Voice Conversion: A Survey"
16 / 16 papers shown
Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models
Sandipana Dowerah
Atharva Kulkarni
Ajinkya Kulkarni
Hoan My Tran
Joonas Kalda
Artem Fedorchenko
Benoit Fauve
Damien Lolive
Tanel Alumae
Matthew Magimai Doss
ELM
122
9
0
02 Sep 2025
RoVo: Robust Voice Protection Against Unauthorized Speech Synthesis with Embedding-Level Perturbations
Seungmin Kim
Sohee Park
Donghyun Kim
Jisu Lee
Daeseon Choi
AAML
297
0
0
19 May 2025
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
Haorui He
Zengqiang Shang
Chaoren Wang
Xuyuan Li
Yicheng Gu
...
Peiyang Shi
Longji Xu
Kai Chen
Pengyuan Zhang
Zhizheng Wu
AuLLM
457
23
0
27 Jan 2025
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation
Haorui He
Zengqiang Shang
Chaoren Wang
Xuyuan Li
Yicheng Gu
...
Peiyang Shi
Yuancheng Wang
Kai Chen
Pengyuan Zhang
Zhizheng Wu
283
210
0
07 Jul 2024
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment
Takuto Igarashi
Yuki Saito
Kentaro Seki
Shinnosuke Takamichi
Ryuichi Yamamoto
Kentaro Tachibana
Hiroshi Saruwatari
183
3
0
11 Jun 2024
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
Yuki Saito
Takuto Igarashi
Kentaro Seki
Shinnosuke Takamichi
Ryuichi Yamamoto
Kentaro Tachibana
Hiroshi Saruwatari
188
1
0
11 Jun 2024
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
Jee-weon Jung
Wangyou Zhang
Jiatong Shi
Zakaria Aldeneh
Takuya Higuchi
B. Theobald
Ahmed Hussen Abdelaziz
Shinji Watanabe
499
47
0
30 Jan 2024
Low-latency Real-time Voice Conversion on CPU
Konstantine Sadov
Matthew Hutter
Asara Near
VLM
636
3
0
01 Nov 2023
Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer
Neural Information Processing Systems (NeurIPS), 2023
Jianwei Zhang
Suren Jayasuriya
Visar Berisha
SSL
273
3
0
25 Oct 2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Dongchao Yang
Jinchuan Tian
Xuejiao Tan
Rongjie Huang
Songxiang Liu
...
Jiang Bian
Xixin Wu
Zhou Zhao
Shinji Watanabe
Helen M. Meng
CVBM
AuLLM
643
193
0
01 Oct 2023
Noise-robust voice conversion with domain adversarial training
Neural Networks (NN), 2022
Hongqiang Du
Lei Xie
Haizhou Li
223
20
0
26 Jan 2022
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
226
18
0
08 Dec 2021
Toward Degradation-Robust Voice Conversion
Chien-yu Huang
Kai-Wei Chang
Hung-yi Lee
388
14
0
14 Oct 2021
Improving robustness of one-shot voice conversion with deep discriminative speaker encoder
Interspeech (Interspeech), 2021
Hongqiang Du
Lei Xie
143
7
0
19 Jun 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Interspeech (Interspeech), 2021
Jheng-hao Lin
Yist Y. Lin
C. Chien
Hung-yi Lee
473
63
0
07 Apr 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
552
441
0
25 Feb 2021