Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder

13 October 2016
Chin-Cheng Hsu
Hsin-Te Hwang
Yi-Chiao Wu
Yu Tsao
H. Wang

Papers citing "Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder"

47 / 47 papers shown
Title
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
61
3
0
23 Sep 2024
Hear Your Face: Face-based voice conversion with F0 estimation
Jaejun Lee
Yoori Oh
Injune Hwang
Kyogu Lee
CVBM
29
2
0
19 Aug 2024
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy
Linhan Ma
Xinfa Zhu
Yuanjun Lv
Zhichao Wang
Ziqian Wang
Wendi He
Hongbin Zhou
Lei Xie
42
2
0
14 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
40
20
0
15 Apr 2024
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention
Junjie Li
Yiwei Guo
Xie Chen
Kai Yu
45
13
0
14 Dec 2023
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
Ruiqi Li
Rongjie Huang
Lichao Zhang
Jinglin Liu
Zhou Zhao
33
4
0
08 May 2023
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints
Zhichao Wang
Xinsheng Wang
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
30
5
0
16 Nov 2022
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
24
1
0
25 Oct 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
20
53
0
06 Oct 2022
Non-Parallel Voice Conversion for ASR Augmentation
Gary Wang
Andrew Rosenberg
Bhuvana Ramabhadran
Fadi Biadsy
Yinghui Huang
Jesse Emond
P. M. Mengibar
26
2
0
15 Sep 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
30
110
0
20 Apr 2022
Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion
Weida Liang
Lantian Li
Wenqiang Du
Dong Wang
56
0
0
08 Apr 2022
Noise-robust voice conversion with domain adversarial training
Hongqiang Du
Lei Xie
Haizhou Li
19
11
0
26 Jan 2022
IQDUBBING: Prosody modeling based on discrete self-supervised speech representation for expressive voice conversion
Wendong Gan
Bolong Wen
Yin Yan
Haitao Chen
Zhichao Wang
Hongqiang Du
Lei Xie
Kaixuan Guo
Hai Li
15
14
0
02 Jan 2022
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
19
15
0
08 Dec 2021
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
18
24
0
20 Oct 2021
StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition
Shoki Sakamoto
Akira Taniguchi
T. Taniguchi
Hirokazu Kameoka
BDL
31
5
0
10 Aug 2021
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model
Cheng-Hung Hu
Yu-Huai Peng
Junichi Yamagishi
Yu Tsao
Hsin-Min Wang
29
5
0
20 Jul 2021
A learned conditional prior for the VAE acoustic space of a TTS system
Panagiota Karanasou
S. Karlapati
Alexis Moinet
Arnaud Joly
Ammar Abbas
Simon Slangen
Jaime Lorenzo-Trueba
Thomas Drugman
35
7
0
14 Jun 2021
A Benchmark of Dynamical Variational Autoencoders applied to Speech Spectrogram Modeling
Xiaoyu Bie
Laurent Girin
Simon Leglaive
Thomas Hueber
Xavier Alameda-Pineda
26
12
0
11 Jun 2021
Emotional Voice Conversion: Theory, Databases and ESD
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
33
168
0
31 May 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
38
57
0
25 Feb 2021
Optimizing voice conversion network with cycle consistency loss of speaker identity
Hongqiang Du
Xiaohai Tian
Lei Xie
Haizhou Li
21
17
0
17 Nov 2020
VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech
Kun Zhou
Berrak Sisman
Haizhou Li
DRL
34
40
0
03 Nov 2020
AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization
Yen-Hao Chen
Da-Yi Wu
Tsung-Han Wu
Hung-yi Lee
34
107
0
31 Oct 2020
GAZEV: GAN-Based Zero-Shot Voice Conversion over Non-parallel Speech Corpus
Zining Zhang
Bingsheng He
Zhenjie Zhang
16
19
0
24 Oct 2020
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
26
78
0
22 Oct 2020
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS
Wen-Chin Huang
Tomoki Hayashi
Shinji Watanabe
T. Toda
DRL
13
39
0
06 Oct 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
41
318
0
09 Aug 2020
VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net architecture
Da-Yi Wu
Yen-Hao Chen
Hung-yi Lee
8
99
0
07 Jun 2020
Contrastive Predictive Coding Supported Factorized Variational Autoencoder for Unsupervised Learning of Disentangled Speech Representations
Janek Ebbers
Michael Kuhlmann
Tobias Cord-Landwehr
Reinhold Haeb-Umbach
DRL
CoGe
SSL
31
4
0
26 May 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
T. Toda
ViT
30
30
0
18 May 2020
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder
Kaizhi Qian
Zeyu Jin
M. Hasegawa-Johnson
G. J. Mysore
29
107
0
15 Apr 2020
Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks
Shindong Lee
Bonggu Ko
Keonnyeong Lee
In-Chul Yoo
Dongsuk Yook
GAN
30
33
0
15 Feb 2020
Content Based Singing Voice Extraction From a Musical Mixture
Pritish Chandna
Merlijn Blaauw
J. Bonada
E. Gómez
28
14
0
12 Feb 2020
Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data
Kun Zhou
Berrak Sisman
Haizhou Li
27
66
0
01 Feb 2020
Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion
Wen-Chin Huang
Hao Luo
Hsin-Te Hwang
Chen-Chou Lo
Yu-Huai Peng
Yu Tsao
Hsin-Min Wang
DRL
17
42
0
22 Jan 2020
DNN-based cross-lingual voice conversion using Bottleneck Features
M. K. Reddy
K. S. Rao
26
4
0
09 Sep 2019
Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
22
99
0
25 Jun 2019
TTS Skins: Speaker Conversion via ASR
Adam Polyak
Lior Wolf
Yaniv Taigman
18
27
0
18 Apr 2019
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
17
111
0
09 Nov 2018
ACVAE-VC: Non-parallel many-to-many voice conversion with auxiliary classifier variational autoencoder
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
DRL
16
59
0
13 Aug 2018
Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences
Cheng-chieh Yeh
Po-Chun Hsu
Ju-Chieh Chou
Hung-yi Lee
Lin-Shan Lee
33
23
0
09 Aug 2018
StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
34
370
0
06 Jun 2018
Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks
Takuhiro Kaneko
Hirokazu Kameoka
24
202
0
30 Nov 2017
Learning Latent Representations for Speech Generation and Transformation
Wei-Ning Hsu
Yu Zhang
James R. Glass
DRL
BDL
SSL
26
145
0
13 Apr 2017
Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks
Chin-Cheng Hsu
Hsin-Te Hwang
Yi-Chiao Wu
Yu Tsao
H. Wang
DRL
37
314
0
04 Apr 2017