2008.03648
Cited By
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
9 August 2020
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
Papers citing
"An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning"
50 / 145 papers shown
Title
Introducing voice timbre attribute detection
Jinghao He
Zhengyan Sheng
Liping Chen
Kong Aik Lee
Zhen-Hua Ling
12
1
0
14 May 2025
Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements
Sandipan Dhar
N. D. Jana
Swagatam Das
43
0
0
27 Apr 2025
Collective Learning Mechanism based Optimal Transport Generative Adversarial Network for Non-parallel Voice Conversion
Sandipan Dhar
Md. Tousin Akhter
N. D. Jana
Swagatam Das
27
1
0
18 Apr 2025
kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness Optimization
Keren Shao
K. Chen
Matthew Baas
Shlomo Dubnov
20
0
0
08 Apr 2025
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
59
0
0
11 Mar 2025
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
Tian-Hao Zhang
Jiawei Zhang
J. Wang
Xinyuan Qian
Xu-cheng Yin
CVBM
45
0
0
02 Jan 2025
A Review of Human Emotion Synthesis Based on Generative Technology
Fei Ma
Y. Li
Yifan Xie
Y. He
Y. Zhang
...
Z. Liu
Wei Yao
Fuji Ren
Fei Richard Yu
Shiguang Ni
76
1
0
10 Dec 2024
CoDiff-VC: A Codec-Assisted Diffusion Model for Zero-shot Voice Conversion
Yuke Li
Xinfa Zhu
Hanzhao Li
J.-H. Yao
WenJie Tian
XiPeng Yang
Yunlin Chen
Zhifei Li
Lei Xie
DiffM
61
0
0
28 Nov 2024
Anonymising Elderly and Pathological Speech: Voice Conversion Using DDSP and Query-by-Example
Suhita Ghosh
Melanie Jouaiti
Arnab Das
Yamini Sinha
Tim Polzehl
Ingo Siegert
Sebastian Stober
23
2
0
20 Oct 2024
Improving Voice Quality in Speech Anonymization With Just Perception-Informed Losses
Suhita Ghosh
Tim Thiele
Frederic Lorbeer
Frank Dreyer
Sebastian Stober
30
0
0
20 Oct 2024
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech
J. Melechovský
Ambuj Mehrish
Berrak Sisman
Dorien Herremans
16
1
0
17 Oct 2024
Audio-based Kinship Verification Using Age Domain Conversion
Qiyang Sun
Alican Akman
Xin Jing
M. Milling
Björn Schuller
18
1
0
14 Oct 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Alexander Polonsky
Canh Vu
46
3
0
23 Sep 2024
Discrete Unit based Masking for Improving Disentanglement in Voice Conversion
Philip H. Lee
Ismail Rasim Ulgen
Berrak Sisman
23
0
0
17 Sep 2024
SafeEar: Content Privacy-Preserving Audio Deepfake Detection
Xinfeng Li
Kai Li
Yifan Zheng
Chen Yan
Xiaoyu Ji
Wenyuan Xu
23
13
0
14 Sep 2024
LHQ-SVC: Lightweight and High Quality Singing Voice Conversion Modeling
Yubo Huang
Xin Lai
Muyang Ye
Anran Zhu
Zixi Wang
Jingzehua Xu
Shuai Zhang
Zhiyuan Zhou
Weijie Niu
47
1
0
13 Sep 2024
D-CAPTCHA++: A Study of Resilience of Deepfake CAPTCHA under Transferable Imperceptible Adversarial Attack
Hong-Hanh Nguyen-Le
Van-Tuan Tran
Dinh-Thuc Nguyen
Nhien-An Le-Khac
AAML
30
2
0
11 Sep 2024
SelectTTS: Synthesizing Anyone's Voice via Discrete Unit-Based Frame Selection
Ismail Rasim Ulgen
Shreeram Suresh Chandra
Junchen Lu
Berrak Sisman
107
0
0
30 Aug 2024
Anonymization of Voices in Spaces for Civic Dialogue: Measuring Impact on Empathy, Trust, and Feeling Heard
Wonjune Kang
Margaret Hughes
Deb Roy
23
1
0
26 Aug 2024
Disentangling segmental and prosodic factors to non-native speech comprehensibility
Waris Quamer
Ricardo Gutierrez-Osuna
16
1
0
20 Aug 2024
ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild
Jiangyan Yi
Chu Yuan Zhang
Jianhua Tao
Chenglong Wang
Xinrui Yan
Yong Ren
Hao Gu
Junzuo Zhou
50
1
0
09 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
30
4
0
21 Jul 2024
The Tug-of-War Between Deepfake Generation and Detection
Hannah Lee
Changyeon Lee
Kevin Farhat
Lin Qiu
Steve Geluso
Aerin Kim
O. Etzioni
34
1
0
08 Jul 2024
Two-Path GMM-ResNet and GMM-SENet for ASV Spoofing Detection
Zhenchun Lei
Hui Yan
Changhong Liu
Minglei Ma
Yingen Yang
19
11
0
08 Jul 2024
We Need Variations in Speech Synthesis: Sub-center Modelling for Speaker Embeddings
Ismail Rasim Ulgen
Carlos Busso
John H. L. Hansen
Berrak Sisman
21
1
0
05 Jul 2024
Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals
Kentaro Seki
Shinnosuke Takamichi
Norihiro Takamune
Yuki Saito
Kanami Imamura
Hiroshi Saruwatari
18
0
0
25 Jun 2024
A Multi-Stream Fusion Approach with One-Class Learning for Audio-Visual Deepfake Detection
Kyungbok Lee
You Zhang
Zhiyao Duan
25
0
0
20 Jun 2024
Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding
Rui Wang
Liping Chen
Kong Aik Lee
Zhen-Hua Ling
21
2
0
12 Jun 2024
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
Yuki Saito
Takuto Igarashi
Kentaro Seki
Shinnosuke Takamichi
Ryuichi Yamamoto
Kentaro Tachibana
Hiroshi Saruwatari
18
0
0
11 Jun 2024
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
Zongyang Du
Junchen Lu
Kun Zhou
Lakshmish Kaushik
Berrak Sisman
36
1
0
02 May 2024
MAIN-VC: Lightweight Speech Representation Disentanglement for One-shot Voice Conversion
Pengcheng Li
Jianzong Wang
Xulong Zhang
Yong Zhang
Jing Xiao
Ning Cheng
DRL
33
1
0
02 May 2024
Interactive tools for making temporally variable, multiple-attributes, and multiple-instances morphing accessible: Flexible manipulation of divergent speech instances for explorational research and education
Hideki Kawahara
Masanori Morise
31
1
0
20 Apr 2024
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
Liumeng Xue
Chaoren Wang
Mingxuan Wang
Xueyao Zhang
Jun Han
Zhizheng Wu
DiffM
24
5
0
20 Feb 2024
Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations
Álvaro Martín-Cortinas
Daniel Sáez-Trigueros
Iván Vallés-Pérez
Biel Tura Vecino
Piotr Bilinski
Mateusz Lajszczak
Grzegorz Beringer
Roberto Barra-Chicote
Jaime Lorenzo-Trueba
16
5
0
05 Feb 2024
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis
Teysir Baoueb
Haocheng Liu
Mathieu Fontaine
Jonathan Le Roux
Gaël Richard
DiffM
19
5
0
30 Jan 2024
Adversarial speech for voice privacy protection from Personalized Speech generation
Shihao Chen
Liping Chen
Jie Zhang
Kong Aik Lee
Zhenhua Ling
Lirong Dai
AAML
11
1
0
22 Jan 2024
Zero Shot Audio to Audio Emotion Transfer With Speaker Disentanglement
Soumya Dutta
Sriram Ganapathy
15
1
0
09 Jan 2024
AE-Flow: AutoEncoder Normalizing Flow
Jakub Mosiński
Piotr Bilinski
Thomas Merritt
Abdelhamid Ezzerg
Daniel Korzekwa
24
4
0
27 Dec 2023
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
27
26
0
15 Dec 2023
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion
A. R. Bargum
Stefania Serafin
Cumhur Erkut
21
3
0
14 Nov 2023
Can Authorship Attribution Models Distinguish Speakers in Speech Transcripts?
Cristina Aggazzotti
Nicholas Andrews
Elizabeth Allyn Smith
13
2
0
13 Nov 2023
Voice Conversion for Stuttered Speech, Instruments, Unseen Languages and Textually Described Voices
Matthew Baas
Herman Kamper
13
3
0
12 Oct 2023
DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion
Ziqian Ning
Yuepeng Jiang
Pengcheng Zhu
Shuai Wang
Jixun Yao
Linfu Xie
Mengxiao Bi
11
5
0
27 Sep 2023
Towards General-Purpose Text-Instruction-Guided Voice Conversion
Chun-Yi Kuan
Chen An Li
Tsung-Yuan Hsu
T. Lin
Ho-Lam Chung
Kai-Wei Chang
Shuo-yiin Chang
Hung-yi Lee
18
5
0
25 Sep 2023
Electrolaryngeal Speech Intelligibility Enhancement Through Robust Linguistic Encoders
Lester Phillip Violeta
Wen-Chin Huang
D. Ma
Ryuichi Yamamoto
Kazuhiro Kobayashi
T. Toda
14
3
0
18 Sep 2023
Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment
Zheng-Yan Sheng
Yang Ai
Yan-Nian Chen
Zhenhua Ling
CVBM
11
4
0
18 Sep 2023
PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts
Jixun Yao
Yuguang Yang
Yinjiao Lei
Ziqian Ning
Yanni Hu
Y. Pan
Jingjing Yin
Hongbin Zhou
Heng Lu
Linfu Xie
DiffM
25
19
0
17 Sep 2023
Improving Voice Conversion for Dissimilar Speakers Using Perceptual Losses
Suhita Ghosh
Yamini Sinha
Ingo Siegert
Sebastian Stober
6
1
0
15 Sep 2023
AAS-VC: On the Generalization Ability of Automatic Alignment Search based Non-autoregressive Sequence-to-sequence Voice Conversion
Wen-Chin Huang
Kazuhiro Kobayashi
T. Toda
14
2
0
14 Sep 2023
StarGAN-VC++: Towards Emotion Preserving Voice Conversion Using Deep Embeddings
Arnab Das
Suhita Ghosh
Tim Polzehl
Sebastian Stober
22
4
0
14 Sep 2023