An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning

9 August 2020
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
    BDL

Papers citing "An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning"

50 / 145 papers shown
Title
Emo-StarGAN: A Semi-Supervised Any-to-Many Non-Parallel Emotion-Preserving Voice Conversion
Suhita Ghosh
Arnab Das
Yamini Sinha
Ingo Siegert
Tim Polzehl
Sebastian Stober
9
4
0
14 Sep 2023
Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning
Mohamadreza Jafaryani
H. Sheikhzadeh
V. Pourahmadi
14
4
0
08 Sep 2023
Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion
Wen-Chin Huang
T. Toda
CVBM
21
5
0
05 Sep 2023
Timbre-reserved Adversarial Attack in Speaker Identification
Qing Wang
Jixun Yao
Li Lyna Zhang
Pengcheng Guo
Linfu Xie
AAML
19
4
0
02 Sep 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis
B. Hayes
Jordie Shier
Gyorgy Fazekas
Andrew Mcpherson
C. Saitis
21
21
0
29 Aug 2023
Audio Deepfake Detection: A Survey
Jiangyan Yi
Chenglong Wang
J. Tao
Xiaohui Zhang
Chu Yuan Zhang
Yan Zhao
29
41
0
29 Aug 2023
Evaluation of the Speech Resynthesis Capabilities of the VoicePrivacy Challenge Baseline B1
Ünal Ege Gaznepoglu
Nils Peters
22
0
0
22 Aug 2023
Effects of Convolutional Autoencoder Bottleneck Width on StarGAN-based Singing Technique Conversion
Tung-Cheng Su
Yung-Chuan Chang
Yi-Wen Liu
11
0
0
19 Aug 2023
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
Yinghao Aaron Li
Cong Han
N. Mesgarani
20
5
0
18 Jul 2023
Single and Multi-Speaker Cloned Voice Detection: From Perceptual to Learned Features
Sarah Barrington
Romit Barua
Gautham Koorma
Hany Farid
19
14
0
15 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
14
25
0
07 Jul 2023
High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units
Junchen Lu
Berrak Sisman
Mingyang Zhang
Haizhou Li
24
4
0
29 Jun 2023
Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion
Zhe Ye
Terui Mao
Li Dong
Diqun Yan
AAML
14
7
0
28 Jun 2023
The Singing Voice Conversion Challenge 2023
Wen-Chin Huang
Lester Phillip Violeta
Songxiang Liu
Jiatong Shi
T. Toda
16
46
0
26 Jun 2023
MFCCGAN: A Novel MFCC-Based Speech Synthesizer Using Adversarial Learning
Mohammad Reza Hasanabadi
11
3
0
22 Jun 2023
Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation
Zhonghua Liu
Shijun Wang
Ning Chen
DRL
14
2
0
21 Jun 2023
Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features
Hsin-Hao Chen
Yung-Lun Chien
Ming-Chi Yen
S. Tsai
Yu Tsao
T. Chi
Hsin-Min Wang
17
2
0
11 Jun 2023
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
Dominik Wagner
Ilja Baumann
Tobias Bocklet
20
1
0
10 Jun 2023
Phase perturbation improves channel robustness for speech spoofing countermeasures
Yongyi Zang
You Zhang
Z. Duan
19
2
0
06 Jun 2023
LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading
Yochai Yemini
Aviv Shamsian
Lior Bracha
Sharon Gannot
Ethan Fetaya
DiffM
6
9
0
05 Jun 2023
Voice Conversion With Just Nearest Neighbors
Matthew Baas
Benjamin van Niekerk
Herman Kamper
SSL
30
48
0
30 May 2023
ADD 2023: the Second Audio Deepfake Detection Challenge
Jiangyan Yi
Jianhua Tao
Ruibo Fu
Xinrui Yan
Chenglong Wang
...
Zhengqi Wen
Shan Liang
Zheng Lian
Shuai Nie
Haizhou Li
80
93
0
23 May 2023
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding
Ziqian Ning
Yuepeng Jiang
Pengcheng Zhu
Jixun Yao
Shuai Wang
Linfu Xie
Mengxiao Bi
21
10
0
21 May 2023
Advancing Stuttering Detection via Data Augmentation, Class-Balanced Loss and Multi-Contextual Deep Learning
S. A. Sheikh
Md. Sahidullah
F. Hirsch
Slim Ouni
16
16
0
21 Feb 2023
On granularity of prosodic representations in expressive text-to-speech
Mikolaj Babianski
Kamil Pokora
Raahil Shah
Rafał Sienkiewicz
Daniel Korzekwa
V. Klimkov
17
5
0
26 Jan 2023
UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion
Hao Liu
Tao Wang
Ruibo Fu
Jiangyan Yi
Zhengqi Wen
J. Tao
13
3
0
10 Jan 2023
Voice conversion with limited data and limitless data augmentations
Olga Slizovskaia
Jordi Janer
Pritish Chandna
Oscar Mayor
9
1
0
27 Dec 2022
Exploring the Optimized Value of Each Hyperparameter in Various Gradient Descent Algorithms
Abel C. H. Chen
28
2
0
23 Dec 2022
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features
Ziqian Ning
Qicong Xie
Pengcheng Zhu
Zhichao Wang
Liumeng Xue
Jixun Yao
Linfu Xie
Mengxiao Bi
16
16
0
09 Nov 2022
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder
J. Melechovský
Ambuj Mehrish
Berrak Sisman
Dorien Herremans
21
6
0
07 Nov 2022
Preserving background sound in noise-robust voice conversion via multi-task learning
J.-H. Yao
Yi Lei
Qing Wang
Pengcheng Guo
Ziqian Ning
Linfu Xie
Hai Li
Junhui Liu
Danming Xie
23
10
0
06 Nov 2022
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE
Hui Lu
Disong Wang
Xixin Wu
Zhiyong Wu
Xunying Liu
Helen M. Meng
DRL
17
9
0
25 Oct 2022
Two-stage training method for Japanese electrolaryngeal speech enhancement based on sequence-to-sequence voice conversion
D. Ma
Lester Phillip Violeta
Kazuhiro Kobayashi
T. Toda
18
6
0
19 Oct 2022
Synthetic Voice Detection and Audio Splicing Detection using SE-Res2Net-Conformer Architecture
Lei Wang
Benedict Yeoh
Jun Wah Ng
32
7
0
07 Oct 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
15
53
0
06 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
42
29
0
03 Oct 2022
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
Mei-Shuo Chen
Z. Duan
22
10
0
23 Sep 2022
Deep Speech Synthesis from Articulatory Representations
Peter Wu
Shinji Watanabe
L. Goldstein
A. Black
Gopala K. Anumanchipalli
31
24
0
13 Sep 2022
Dispersed Pixel Perturbation-based Imperceptible Backdoor Trigger for Image Classifier Models
Yulong Wang
Minghui Zhao
Shenghong Li
Xinnan Yuan
W. Ni
16
15
0
19 Aug 2022
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion
Sicheng Yang
Methawee Tantrawenith
Hao-Wen Zhuang
Zhiyong Wu
Aolan Sun
...
Ning Cheng
Huaizhen Tang
Xintao Zhao
Jie Wang
H. Meng
DRL
9
37
0
18 Aug 2022
Speech Synthesis with Mixed Emotions
Kun Zhou
Berrak Sisman
R. Rana
B. W. Schuller
Haizhou Li
14
43
0
11 Aug 2022
Subband-based Generative Adversarial Network for Non-parallel Many-to-many Voice Conversion
Jianchun Ma
Zhedong Zheng
Hao Fei
Feng Zheng
Tat-Seng Chua
Yi Yang
GAN
19
0
0
13 Jul 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
T. Toda
10
15
0
10 Jul 2022
Automatic Evaluation of Speaker Similarity
Kamil Deja
Ariadna Sánchez
Julian Roth
Marius Cotescu
17
6
0
01 Jul 2022
An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions
Yeonjong Choi
Chao Xie
T. Toda
DiffM
17
2
0
30 Jun 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
Danwei Cai
Zexin Cai
Ming Li
13
10
0
18 Jun 2022
Streaming non-autoregressive model for any-to-many voice conversion
Ziyi Chen
Haoran Miao
Pengyuan Zhang
6
8
0
15 Jun 2022
Speak Like a Dog: Human to Non-human creature Voice Conversion
Kohei Suzuki
Shoki Sakamoto
T. Taniguchi
Hirokazu Kameoka
17
2
0
09 Jun 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
24
8
0
19 May 2022
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE
Jiachen Lian
Chunlei Zhang
Gopala Krishna Anumanchipalli
Dong Yu
6
23
0
11 May 2022