1905.05879
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
14 May 2019
Kaizhi Qian
Yang Zhang
Shiyu Chang
Xuesong Yang
M. Hasegawa-Johnson
Papers citing "AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss"
50 / 273 papers shown
Title
Timbre Transfer with Variational Auto Encoding and Cycle-Consistent Adversarial Networks
Russell Sammut Bonnici
C. Saitis
Martin Benning
GAN
93
15
0
05 Sep 2021
StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition
Shoki Sakamoto
Akira Taniguchi
T. Taniguchi
Hirokazu Kameoka
BDL
63
5
0
10 Aug 2021
Information Sieve: Content Leakage Reduction in End-to-End Prosody For Expressive Speech Synthesis
Xudong Dai
Cheng Gong
Longbiao Wang
Kaili Zhang
34
2
0
04 Aug 2021
Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations
L. Benaroya
Nicolas Obin
Axel Roebel
42
5
0
26 Jul 2021
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Yinghao Aaron Li
A. Zare
N. Mesgarani
97
101
0
21 Jul 2021
An Improved StarGAN for Emotional Voice Conversion: Enhancing Voice Quality and Data Augmentation
Xiangheng He
Junjie Chen
Georgios Rizos
Björn W. Schuller
47
14
0
18 Jul 2021
Many-to-Many Voice Conversion based Feature Disentanglement using Variational Autoencoder
Manh Luong
Viet-Anh Tran
DRL
46
16
0
11 Jul 2021
Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance
Hieu-Thi Luong
Junichi Yamagishi
85
0
0
25 Jun 2021
Improving robustness of one-shot voice conversion with deep discriminative speaker encoder
Hongqiang Du
Lei Xie
64
6
0
19 Jun 2021
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
Disong Wang
Liqun Deng
Y. Yeung
Xiao Chen
Xunying Liu
Helen Meng
DRL
84
141
0
18 Jun 2021
Voicy: Zero-Shot Non-Parallel Voice Conversion in Noisy Reverberant Environments
Alejandro Mottini
Jaime Lorenzo-Trueba
S. Karlapati
Thomas Drugman
29
8
0
16 Jun 2021
Enriching Source Style Transfer in Recognition-Synthesis based Non-Parallel Voice Conversion
Zhichao Wang
Xinyong Zhou
Fengyu Yang
Tao Li
Hongqiang Du
Lei Xie
Wendong Gan
Haitao Chen
Hai Li
65
22
0
16 Jun 2021
Global Rhythm Style Transfer Without Text Transcriptions
Kaizhi Qian
Yang Zhang
Shiyu Chang
Jinjun Xiong
Chuang Gan
David D. Cox
M. Hasegawa-Johnson
78
20
0
16 Jun 2021
MONCAE: Multi-Objective Neuroevolution of Convolutional Autoencoders
Daniel Dimanov
E. Balaguer-Ballester
Colin Singleton
Shahin Rostami
42
7
0
07 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion
Bac Nguyen Cong
Fabien Cardinaux
AAML
126
42
0
02 Jun 2021
StarGAN-ZSVC: Towards Zero-Shot Voice Conversion in Low-Resource Contexts
Matthew Baas
Herman Kamper
53
6
0
31 May 2021
SpeechNet: A Universal Modularized Model for Speech Processing Tasks
Yi-Chen Chen
Po-Han Chi
Shu-Wen Yang
Kai-Wei Chang
Jheng-hao Lin
Sung-Feng Huang
Da-Rong Liu
Chi-Liang Liu
Cheng-Kuang Lee
Hung-yi Lee
MoE
64
17
0
07 May 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
94
25
0
20 Apr 2021
FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion
Hirokazu Kameoka
Kou Tanaka
Takuhiro Kaneko
81
21
0
14 Apr 2021
NoiseVC: Towards High Quality Zero-Shot Voice Conversion
Shijun Wang
Damian Borth
DRL
75
6
0
13 Apr 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Jheng-hao Lin
Yist Y. Lin
C. Chien
Hung-yi Lee
147
56
0
07 Apr 2021
Assem-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques
Kang-Wook Kim
Seung-won Park
Junhyeok Lee
Myun-chul Joe
76
28
0
02 Apr 2021
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech
Keon Lee
Kyumin Park
Daeyoung Kim
69
32
0
17 Mar 2021
Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning
Siyang Yuan
Pengyu Cheng
Ruiyi Zhang
Weituo Hao
Zhe Gan
Lawrence Carin
DRL
66
61
0
17 Mar 2021
Lightweight and interpretable neural modeling of an audio distortion effect using hyperconditioned differentiable biquads
S. Nercessian
Andy M. Sarroff
K. Werner
40
29
0
15 Mar 2021
MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network
Yichong Leng
Xu Tan
Sheng Zhao
Frank Soong
Xiang-Yang Li
Tao Qin
88
96
0
27 Feb 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
202
323
0
25 Feb 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
73
60
0
25 Feb 2021
AudioVisual Speech Synthesis: A brief literature review
Efthymios Georgiou
Athanasios Katsamanis
25
0
0
18 Feb 2021
Adversarially learning disentangled speech representations for robust multi-factor voice conversion
Jie Wang
Jingbei Li
Xintao Zhao
Zhiyong Wu
Shiyin Kang
Helen Meng
DRL
123
29
0
30 Jan 2021
EmoCat: Language-agnostic Emotional Voice Conversion
Bastian Schnell
Goeric Huybrechts
Bartek Perz
Thomas Drugman
Jaime Lorenzo-Trueba
89
11
0
14 Jan 2021
AudioViewer: Learning to Visualize Sounds
Chunjin Song
Yuchi Zhang
Willis Peng
Parmis Mohaghegh
Bastian Wandt
Helge Rhodin
105
1
0
22 Dec 2020
How Far Are We from Robust Voice Conversion: A Survey
Tzu-hsien Huang
Jheng-hao Lin
Chien-yu Huang
Hung-yi Lee
96
25
0
24 Nov 2020
Accent and Speaker Disentanglement in Many-to-many Voice Conversion
Zhichao Wang
Wenshuo Ge
Xiong Wang
Shan Yang
Wendong Gan
Haitao Chen
Hai Li
Lei Xie
Xiulin Li
CVBM
98
33
0
17 Nov 2020
Optimizing voice conversion network with cycle consistency loss of speaker identity
Hongqiang Du
Xiaohai Tian
Lei Xie
Haizhou Li
57
18
0
17 Nov 2020
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement
Daxin Tan
Tan Lee
116
21
0
08 Nov 2020
Semi-supervised Learning for Singing Synthesis Timbre
J. Bonada
Merlijn Blaauw
51
4
0
05 Nov 2020
CVC: Contrastive Learning for Non-parallel Voice Conversion
Tingle Li
Yichen Liu
Chenxu Hu
Hang Zhao
DRL
100
13
0
02 Nov 2020
AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization
Yen-Hao Chen
Da-Yi Wu
Tsung-Han Wu
Hung-yi Lee
111
108
0
31 Oct 2020
PPG-based singing voice conversion with adversarial representation learning
Zhonghao Li
Benlai Tang
Xiang Yin
Yuan Wan
Linjia Xu
Chen Shen
Zejun Ma
59
37
0
28 Oct 2020
FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments With Attention
Yist Y. Lin
C. Chien
Jheng-hao Lin
Hung-yi Lee
Lin-Shan Lee
60
79
0
27 Oct 2020
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
87
82
0
22 Oct 2020
The NeteaseGames System for Voice Conversion Challenge 2020 with Vector-quantization Variational Autoencoder and WaveNet
Haitong Zhang
DRL
28
4
0
15 Oct 2020
FastVC: Fast Voice Conversion with non-parallel data
Oriol Barbany
Milos Cernak
43
7
0
08 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
Shogo Seki
DiffM
124
21
0
06 Oct 2020
The Academia Sinica Systems of Voice Conversion for VCC2020
Yu-Huai Peng
Cheng-Hung Hu
A. Kang
Hung-Shin Lee
Pin-Yuan Chen
Yu Tsao
Hsin-Min Wang
64
2
0
06 Oct 2020
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS
Wen-Chin Huang
Tomoki Hayashi
Shinji Watanabe
Tomoki Toda
DRL
81
40
0
06 Oct 2020
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
Mingyang Zhang
Yi Zhou
Li Zhao
Haizhou Li
92
53
0
30 Sep 2020
A Deep Learning Based Analysis-Synthesis Framework For Unison Singing
Pritish Chandna
Helena Cuesta
Emilia Gómez
53
4
0
21 Sep 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
Songxiang Liu
Yuewen Cao
Disong Wang
Xixin Wu
Xunying Liu
Helen Meng
BDL
116
92
0
06 Sep 2020