1905.05879
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
14 May 2019
Kaizhi Qian
Yang Zhang
Shiyu Chang
Xuesong Yang
M. Hasegawa-Johnson
Papers citing
"AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss"
50 / 273 papers shown
Title
DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
80
35
0
25 May 2023
Iteratively Improving Speech Recognition and Voice Conversion
Mayank Singh
Naoya Takahashi
Ono Naoyuki
51
4
0
24 May 2023
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding
Ziqian Ning
Yuepeng Jiang
Pengcheng Zhu
Jixun Yao
Shuai Wang
Linfu Xie
Mengxiao Bi
76
10
0
21 May 2023
Data Augmentation for Diverse Voice Conversion in Noisy Environments
Avani Tanna
Michael Stephen Saxon
A. El Abbadi
William Yang Wang
17
0
0
18 May 2023
Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion
Xintao Zhao
Shuai Wang
Yang Chao
Zhiyong Wu
Helen Meng
71
3
0
16 May 2023
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
Wei-wei Lin
Chenhang He
Man-Wai Mak
Youzhi Tu
58
5
0
14 May 2023
Multi-level Temporal-channel Speaker Retrieval for Zero-shot Voice Conversion
Zhichao Wang
Liumeng Xue
Qiuqiang Kong
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
BDL
100
3
0
12 May 2023
VSMask: Defending Against Voice Synthesis Attack via Real-Time Predictive Perturbation
Yuanda Wang
Hanqing Guo
Guangjing Wang
Bocheng Chen
Qiben Yan
AAML
60
18
0
09 May 2023
Who is Speaking Actually? Robust and Versatile Speaker Traceability for Voice Conversion
Yanzhen Ren
Hongcheng Zhu
Liming Zhai
Zongkun Sun
Rubing Shen
Lina Wang
71
9
0
09 May 2023
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
Ruiqi Li
Rongjie Huang
Lichao Zhang
Jinglin Liu
Zhou Zhao
85
4
0
08 May 2023
Breaching FedMD: Image Recovery via Paired-Logits Inversion Attack
Hideaki Takahashi
Jingjing Liu
Yang Liu
FedML
104
11
0
22 Apr 2023
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Shogo Seki
52
9
0
24 Mar 2023
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
Hyun Joon Park
Seok Woo Yang
Jin Sob Kim
Wooseok Shin
S. W. Han
68
20
0
16 Mar 2023
Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation
Qi Chen
Ziyang Ma
Tao Liu
Xuejiao Tan
Qu Lu
Xie Chen
K. Yu
CVBM
69
5
0
09 Mar 2023
WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions
Jun Rekimoto
96
20
0
03 Mar 2023
A Comparative Analysis Of Latent Regressor Losses For Singing Voice Conversion
Brendan O'Connor
S. Dixon
43
0
0
27 Feb 2023
QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion
Houjian Guo
Chaoran Liu
C. Ishi
H. Ishiguro
BDL
100
13
0
16 Feb 2023
ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations
Shehzeen Samarah Hussain
Paarth Neekhara
Jocelyn Huang
Jason Chun Lok Li
Boris Ginsburg
73
25
0
16 Feb 2023
Autodecompose: A generative self-supervised model for semantic decomposition
M. Bonyadi
SSL
41
0
0
06 Feb 2023
SPADE: Self-supervised Pretraining for Acoustic DisEntanglement
John Harvill
Jarred Barber
Arun Nair
Ramin Pishehvar
SSL
DRL
47
0
0
03 Feb 2023
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models
Yinghao Aaron Li
Cong Han
N. Mesgarani
80
19
0
29 Dec 2022
Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units
Gallil Maimon
Yossi Adi
104
14
0
19 Dec 2022
Disentangling Prosody Representations with Unsupervised Speech Reconstruction
Leyuan Qu
Taiha Li
C. Weber
Theresa Pekarek-Rosin
F. Ren
S. Wermter
85
10
0
14 Dec 2022
VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement
Chenye Cui
Yi Ren
Jinglin Liu
Rongjie Huang
Zhou Zhao
VGen
86
14
0
19 Nov 2022
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints
Zhichao Wang
Xinsheng Wang
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
79
5
0
16 Nov 2022
Improved disentangled speech representations using contrastive learning in factorized hierarchical variational autoencoder
Yuying Xie
Thomas Arildsen
Zheng-Hua Tan
47
2
0
15 Nov 2022
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
77
7
0
12 Nov 2022
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features
Ziqian Ning
Qicong Xie
Pengcheng Zhu
Zhichao Wang
Liumeng Xue
Jixun Yao
Linfu Xie
Mengxiao Bi
73
18
0
09 Nov 2022
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Jingyi Li
Weiping Tu
Li Xiao
134
113
0
27 Oct 2022
Streaming Voice Conversion Via Intermediate Bottleneck Features And Non-streaming Teacher Guidance
Yuan-Jui Chen
Ming Tu
Tang-Chun Li
Xin Li
Qiuqiang Kong
Jiaxin Li
Zhichao Wang
Qiao Tian
Yuping Wang
Yuxuan Wang
80
11
0
27 Oct 2022
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
41
1
0
25 Oct 2022
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE
Hui Lu
Disong Wang
Xixin Wu
Zhiyong Wu
Xunying Liu
Helen M. Meng
DRL
115
10
0
25 Oct 2022
DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion
Chihiro Watanabe
Hirokazu Kameoka
DRL
112
0
0
20 Oct 2022
Semi-Supervised Domain Adaptation with Auto-Encoder via Simultaneous Learning
Md. Mahmudur Rahman
Yikang Shen
Mohammad Arif Ul Alam
37
5
0
18 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi
Mayank Kumar Singh
Yuki Mitsufuji
DiffM
72
16
0
14 Oct 2022
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
Mei-Shuo Chen
Z. Duan
105
11
0
23 Sep 2022
Non-Parallel Voice Conversion for ASR Augmentation
Gary Wang
Andrew Rosenberg
Bhuvana Ramabhadran
Fadi Biadsy
Yinghui Huang
Jesse Emond
P. M. Mengibar
104
2
0
15 Sep 2022
DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion
Ruibin Yuan
Yuxuan Wu
Jacob Li
Jaxter Kim
112
5
0
09 Sep 2022
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion
Sicheng Yang
Methawee Tantrawenith
Hao-Wen Zhuang
Zhiyong Wu
Aolan Sun
...
Ning Cheng
Huaizhen Tang
Xintao Zhao
Jie Wang
Helen Meng
DRL
56
39
0
18 Aug 2022
TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training
Huaizhen Tang
Xulong Zhang
Jianzong Wang
Ning Cheng
Zhen Zeng
Edward Xiao
Jing Xiao
92
20
0
08 Aug 2022
Subband-based Generative Adversarial Network for Non-parallel Many-to-many Voice Conversion
Jianchun Ma
Zhedong Zheng
Hao Fei
Feng Zheng
Tat-Seng Chua
Yi Yang
GAN
61
0
0
13 Jul 2022
GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion
Magdalena Proszewska
Grzegorz Beringer
Daniel Sáez-Trigueros
Thomas Merritt
Abdelhamid Ezzerg
Roberto Barra-Chicote
70
6
0
04 Jul 2022
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers
Liumeng Xue
Shan Yang
Na Hu
Jane Polak Scowcroft
Linfu Xie
51
2
0
02 Jul 2022
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre
Guangyan Zhang
Ying Qin
Weinan Zhang
Jialun Wu
Mei Li
Yu Gai
Feijun Jiang
Tan Lee
108
27
0
29 Jun 2022
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion
Dacheng Yin
Chuanxin Tang
Yanqing Liu
Xiaoqiang Wang
Zhiyuan Zhao
Yucheng Zhao
Zhiwei Xiong
Sheng Zhao
Chong Luo
83
12
0
28 Jun 2022
Data Augmentation for Dementia Detection in Spoken Language
Anna Hlédiková
Dominika Woszczyk
Alican Acman
Soteris Demetriou
Björn Schuller
70
13
0
26 Jun 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
Danwei Cai
Zexin Cai
Ming Li
93
10
0
18 Jun 2022
End-to-End Voice Conversion with Information Perturbation
Qicong Xie
Shan Yang
Yinjiao Lei
Linfu Xie
Jane Polak Scowcroft
70
7
0
15 Jun 2022
Streaming non-autoregressive model for any-to-many voice conversion
Ziyi Chen
Haoran Miao
Pengyuan Zhang
76
9
0
15 Jun 2022
Speak Like a Dog: Human to Non-human creature Voice Conversion
Kohei Suzuki
Shoki Sakamoto
T. Taniguchi
Hirokazu Kameoka
55
3
0
09 Jun 2022