arXiv: 2008.03648
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning

9 August 2020
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
    BDL

Papers citing "An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning"

45 / 145 papers shown
Title
Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts
Paige Tuttosi
Emma Hughson
Akihiro Matsufuji
Angelica Lim
15
4
0
10 May 2022
Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion
Weida Liang
Lantian Li
Wenqiang Du
Dong Wang
43
0
0
08 Apr 2022
HiFi-VC: High Quality ASR-Based Voice Conversion
A. Kashkin
I. Karpukhin
S. Shishkin
21
5
0
31 Mar 2022
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion
Jiachen Lian
Chunlei Zhang
Dong Yu
DRL
9
50
0
30 Mar 2022
DGC-vector: A new speaker embedding for zero-shot voice conversion
Ruitong Xiao
Haitong Zhang
Yue Lin
10
11
0
18 Mar 2022
Text-free non-parallel many-to-many voice conversion using normalising flows
Thomas Merritt
Abdelhamid Ezzerg
Piotr Bilinski
Magdalena Proszewska
Kamil Pokora
Roberto Barra-Chicote
Daniel Korzekwa
20
14
0
15 Mar 2022
The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge
Juan M. Martín-Donas
Aitor Álvarez
30
98
0
03 Mar 2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing
Tao Wang
Jiangyan Yi
Ruibo Fu
J. Tao
Zhengqi Wen
KELM
12
18
0
21 Feb 2022
The HCCL-DKU system for fake audio generation task of the 2022 ICASSP ADD Challenge
Ziyi Chen
Hua Hua
Yuxiang Zhang
Ming Li
Pengyuan Zhang
6
0
0
29 Jan 2022
Invertible Voice Conversion
Zexin Cai
Ming Li
BDL
23
1
0
26 Jan 2022
Emotion Intensity and its Control for Emotional Voice Conversion
Kun Zhou
Berrak Sisman
R. Rana
Björn W. Schuller
Haizhou Li
41
54
0
10 Jan 2022
IQDUBBING: Prosody modeling based on discrete self-supervised speech representation for expressive voice conversion
Wendong Gan
Bolong Wen
Yin Yan
Haitao Chen
Zhichao Wang
Hongqiang Du
Lei Xie
Kaixuan Guo
Hai Li
8
14
0
02 Jan 2022
Contrastive Fine-grained Class Clustering via Generative Adversarial Networks
Yunji Kim
Jung-Woo Ha
GAN
19
13
0
30 Dec 2021
More than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech
Michael Hassid
Michelle Tadmor Ramanovich
Brendan Shillingford
Miaosen Wang
Ye Jia
Tal Remez
DiffM
17
16
0
19 Nov 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning
Shijun Wang
Dimche Kostadinov
Damian Borth
14
10
0
27 Oct 2021
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
11
24
0
20 Oct 2021
DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding
Sergey Nikonorov
Berrak Sisman
Mingyang Zhang
Haizhou Li
15
2
0
13 Oct 2021
LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example
Hieu-Thi Luong
Junichi Yamagishi
44
9
0
11 Oct 2021
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over
Junchen Lu
Berrak Sisman
Rui Liu
Mingyang Zhang
Haizhou Li
DiffM
30
19
0
07 Oct 2021
A Tandem Framework Balancing Privacy and Security for Voice User Interfaces
Ranya Aloufi
Hamed Haddadi
David E. Boyle
20
2
0
21 Jul 2021
Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
14
20
0
08 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
351
0
29 Jun 2021
Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows
Iván Vallés-Pérez
Julian Roth
Grzegorz Beringer
Roberto Barra-Chicote
J. Droppo
13
8
0
10 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion
Bac Nguyen Cong
Fabien Cardinaux
AAML
29
41
0
02 Jun 2021
StarGAN-ZSVC: Towards Zero-Shot Voice Conversion in Low-Resource Contexts
Matthew Baas
Herman Kamper
17
6
0
31 May 2021
Emotional Voice Conversion: Theory, Databases and ESD
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
15
167
0
31 May 2021
An Adaptive Learning based Generative Adversarial Network for One-To-One Voice Conversion
Sandipan Dhar
N. D. Jana
Swagatam Das
17
17
0
25 Apr 2021
FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion
Hirokazu Kameoka
Kou Tanaka
Takuhiro Kaneko
29
21
0
14 Apr 2021
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
Rui Liu
Berrak Sisman
Haizhou Li
21
32
0
03 Apr 2021
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training
Kun Zhou
Berrak Sisman
Haizhou Li
10
27
0
31 Mar 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
112
296
0
25 Feb 2021
Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks
Peter Wu
Paul Pu Liang
Jiatong Shi
Ruslan Salakhutdinov
Shinji Watanabe
Louis-Philippe Morency
18
8
0
22 Jan 2021
Technology-driven Alteration of Nonverbal Cues and its Effects on Negotiation
Raiyan Abdul Baten
E. Hoque
6
7
0
08 Dec 2020
VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech
Kun Zhou
Berrak Sisman
Haizhou Li
DRL
6
40
0
03 Nov 2020
Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
8
185
0
28 Oct 2020
GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech Synthesis
Rui Liu
Berrak Sisman
Haizhou Li
13
24
0
23 Oct 2020
FastVC: Fast Voice Conversion with non-parallel data
Oriol Barbany
Milos Cernak
6
7
0
08 Oct 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Yi Zhao
Wen-Chin Huang
Xiaohai Tian
Junichi Yamagishi
Rohan Kumar Das
Tomi Kinnunen
Zhenhua Ling
T. Toda
11
204
0
28 Aug 2020
Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN
Zongyang Du
Kun Zhou
Berrak Sisman
Haizhou Li
19
8
0
11 Aug 2020
VAW-GAN for Singing Voice Conversion with Non-parallel Training Data
Junchen Lu
Kun Zhou
Berrak Sisman
Haizhou Li
DRL
6
19
0
10 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
T. Toda
12
38
0
07 Aug 2020
Expressive TTS Training with Frame and Style Reconstruction Loss
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
9
73
0
04 Aug 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
T. Toda
ViT
12
30
0
18 May 2020
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
217
239
0
25 Sep 2019
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
214
7,923
0
17 Aug 2015