ResearchTrend.AI
SpeechSplit 2.0: Unsupervised speech disentanglement for voice conversion Without tuning autoencoder Bottlenecks (arXiv:2203.14156)

26 March 2022
Chak Ho Chan
Kaizhi Qian
Yang Zhang
M. Hasegawa-Johnson
    DRL
ArXiv (abs) | PDF | HTML | GitHub (132★)

Papers citing "SpeechSplit 2.0: Unsupervised speech disentanglement for voice conversion Without tuning autoencoder Bottlenecks"

31 / 31 papers shown
Pureformer-VC: Non-parallel Voice Conversion with Pure Stylized Transformer Blocks and Triplet Discriminative Training
Wenhan Yao
Fen Xiao
Xiarun Chen
Jia Liu
Yongqiang He
Weiping Wen
25
0
0
10 Jun 2025
PseudoVC: Improving One-shot Voice Conversion with Pseudo Paired Data
Songjun Cao
Qinghua Wu
Jie Chen
Jin Li
Long Ma
52
0
0
01 Jun 2025
Discl-VC: Disentangled Discrete Tokens and In-Context Learning for Controllable Zero-Shot Voice Conversion
Kaidi Wang
Wenhao Guan
Ziyue Jiang
Hukai Huang
Peijie Chen
Weijie Wu
Q. Hong
Lin Li
30
0
0
30 May 2025
Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems
Weifei Jin
Yuxin Cao
Junjie Su
Derui Wang
Yedi Zhang
Minhui Xue
Jie Hao
Jin Song Dong
Yixian Yang
AAML
83
0
0
01 Apr 2025
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training
Xinfa Zhu
Lei He
Yujia Xiao
Xi Wang
Xu Tan
Sheng Zhao
Lei Xie
DiffM
102
2
0
08 Jan 2025
Representation Purification for End-to-End Speech Translation
Chengwei Zhang
Yue Zhou
Rui Zhao
Yidong Chen
Xiaodong Shi
92
0
0
05 Dec 2024
LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec
Yiwei Guo
Zhihan Li
Chenpeng Du
Hankun Wang
Xie Chen
Kai Yu
102
3
0
21 Oct 2024
Pureformer-VC: Non-parallel One-Shot Voice Conversion with Pure Transformer Blocks and Triplet Discriminative Training
Wenhan Yao
Zedong Xing
Xiarun Chen
Jia Liu
Yongqiang He
Weiping Wen
68
0
0
03 Sep 2024
vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders
Yiwei Guo
Zhihan Li
Junjie Li
Chenpeng Du
Hankun Wang
Shuai Wang
Xie Chen
Kai Yu
105
0
0
03 Sep 2024
Progressive Residual Extraction based Pre-training for Speech Representation Learning
Tianrui Wang
Jin Li
Ziyang Ma
Rui Cao
Xie Chen
...
Meng Ge
Xiaobao Wang
Yuguang Wang
Jianwu Dang
Nyima Tashi
SSL
112
0
0
31 Aug 2024
An Unsupervised Domain Adaptation Method for Locating Manipulated Region in partially fake Audio
Siding Zeng
Jiangyan Yi
Jianhua Tao
Yujie Chen
Shan Liang
Yong Ren
Xiaohui Zhang
98
0
0
11 Jul 2024
Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition
Wenhan Yao
Jiangkun Yang
Yongqiang He
Jia Liu
Weiping Wen
93
3
0
16 Jun 2024
End-to-end Streaming model for Low-Latency Speech Anonymization
Waris Quamer
Ricardo Gutierrez-Osuna
98
0
0
13 Jun 2024
Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy
Yuankun Xie
Ruibo Fu
Zhengqi Wen
Zhiyong Wang
Xiaopeng Wang
Haonan Cheng
Long Ye
Jianhua Tao
95
7
0
05 Jun 2024
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
Ruiqi Li
Rongjie Huang
Yongqi Wang
Zhiqing Hong
Zhou Zhao
69
1
0
04 Jun 2024
Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer
Weifei Jin
Yuxin Cao
Junjie Su
Qi Shen
Kai Ye
Derui Wang
Jie Hao
Ziyao Liu
AAML
132
2
0
15 May 2024
MAIN-VC: Lightweight Speech Representation Disentanglement for One-shot Voice Conversion
Pengcheng Li
Jianzong Wang
Xulong Zhang
Yong Zhang
Jing Xiao
Ning Cheng
DRL
77
2
0
02 May 2024
Learning Expressive Disentangled Speech Representations with Soft Speech Units and Adversarial Style Augmentation
Yimin Deng
Jianzong Wang
Xulong Zhang
Ning Cheng
Jing Xiao
98
0
0
01 May 2024
EAD-VC: Enhancing Speech Auto-Disentanglement for Voice Conversion with IFUB Estimator and Joint Text-Guided Consistent Learning
Ziqi Liang
Jianzong Wang
Xulong Zhang
Yong Zhang
Ning Cheng
Jing Xiao
61
1
0
30 Apr 2024
Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval
Yimin Deng
Huaizhen Tang
Xulong Zhang
Ning Cheng
Jing Xiao
Jianzong Wang
DRL
82
1
0
16 Jan 2024
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention
Junjie Li
Yiwei Guo
Xie Chen
Kai Yu
113
18
0
14 Dec 2023
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation
Yimin Deng
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
79
3
0
15 Nov 2023
An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection
Yuankun Xie
Haonan Cheng
Yutian Wang
Long Ye
99
9
0
06 Sep 2023
Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation
Zhonghua Liu
Shijun Wang
Ning Chen
DRL
68
2
0
21 Jun 2023
Who is Speaking Actually? Robust and Versatile Speaker Traceability for Voice Conversion
Yanzhen Ren
Hongcheng Zhu
Liming Zhai
Zongkun Sun
Rubing Shen
Lina Wang
71
9
0
09 May 2023
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
Ruiqi Li
Rongjie Huang
Lichao Zhang
Jinglin Liu
Zhou Zhao
85
4
0
08 May 2023
Disentangling Prosody Representations with Unsupervised Speech Reconstruction
Leyuan Qu
Taihao Li
C. Weber
Theresa Pekarek-Rosin
F. Ren
S. Wermter
87
10
0
14 Dec 2022
Improved disentangled speech representations using contrastive learning in factorized hierarchical variational autoencoder
Yuying Xie
Thomas Arildsen
Zheng-Hua Tan
54
2
0
15 Nov 2022
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
77
7
0
12 Nov 2022
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Jingyi Li
Weiping Tu
Li Xiao
134
113
0
27 Oct 2022
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion
Sicheng Yang
Methawee Tantrawenith
Hao-Wen Zhuang
Zhiyong Wu
Aolan Sun
...
Ning Cheng
Huaizhen Tang
Xintao Zhao
Jie Wang
Helen Meng
DRL
56
39
0
18 Aug 2022