arXiv:2011.00316
AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization
31 October 2020
Yen-Hao Chen
Da-Yi Wu
Tsung-Han Wu
Hung-yi Lee
Papers citing
"AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization"
50 / 56 papers shown
Title
AVENet: Disentangling Features by Approximating Average Features for Voice Conversion
Wenyu Wang
Yiquan Zhou
Jihua Zhu
Hongwu Ding
Jiacheng Xu
Shihao Li
DRL
32
0
0
08 Apr 2025
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model
Jialong Zuo
Shengpeng Ji
Minghui Fang
Ziyue Jiang
Xize Cheng
...
Wenrui Liu
Guangyan Zhang
Zehai Tu
Yiwen Guo
Zhou Zhao
49
0
0
08 Feb 2025
SKQVC: One-Shot Voice Conversion by K-Means Quantization with Self-Supervised Speech Representations
Youngjun Sim
Jinsung Yoon
Young-Joo Suh
79
0
0
25 Nov 2024
Discrete Unit based Masking for Improving Disentanglement in Voice Conversion
Philip H. Lee
Ismail Rasim Ulgen
Berrak Sisman
30
0
0
17 Sep 2024
Speaker Contrastive Learning for Source Speaker Tracing
Qing Wang
Hongmei Guo
Jian Kang
Mengjie Du
Jie Li
Xiao-Lei Zhang
Lei Xie
25
0
0
16 Sep 2024
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Yuto Kondo
DiffM
40
0
0
03 Sep 2024
RAVE for Speech: Efficient Voice Conversion at High Sampling Rates
A. R. Bargum
Simon Lajboschitz
Cumhur Erkut
27
1
0
29 Aug 2024
Disentangling segmental and prosodic factors to non-native speech comprehensibility
Waris Quamer
Ricardo Gutierrez-Osuna
32
1
0
20 Aug 2024
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing
Chunyu Qiang
Wang Geng
Yi Zhao
Ruibo Fu
Tao Wang
...
Chen Zhang
Hao Che
Longbiao Wang
Jianwu Dang
Jianhua Tao
AI4TS
36
0
0
11 Aug 2024
End-to-end Streaming model for Low-Latency Speech Anonymization
Waris Quamer
Ricardo Gutierrez-Osuna
26
0
0
13 Jun 2024
Improving child speech recognition with augmented child-like speech
Yuanyuan Zhang
Zhengjun Yue
T. Patel
O. Scharenborg
30
5
0
12 Jun 2024
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
Zongyang Du
Junchen Lu
Kun Zhou
Lakshmish Kaushik
Berrak Sisman
42
1
0
02 May 2024
MAIN-VC: Lightweight Speech Representation Disentanglement for One-shot Voice Conversion
Pengcheng Li
Jianzong Wang
Xulong Zhang
Yong Zhang
Jing Xiao
Ning Cheng
DRL
33
1
0
02 May 2024
Who is Authentic Speaker
Qiang Huang
16
0
0
30 Apr 2024
Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction
Zhaoxi Mu
Xinyu Yang
Sining Sun
Qing Yang
SSL
18
8
0
16 Dec 2023
Low-latency Real-time Voice Conversion on CPU
Konstantine Sadov
Matthew Hutter
Asara Near
VLM
23
1
0
01 Nov 2023
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Paarth Neekhara
Shehzeen Samarah Hussain
Rafael Valle
Boris Ginsburg
Rishabh Ranjan
Shlomo Dubnov
F. Koushanfar
Julian McAuley
16
3
0
14 Oct 2023
AutoCycle-VC: Towards Bottleneck-Independent Zero-Shot Cross-Lingual Voice Conversion
Haeyun Choi
Jio Gim
Yuho Lee
Youngin Kim
Young-Joo Suh
BDL
13
1
0
10 Oct 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
Tao Li
Zhichao Wang
Xinfa Zhu
Jian Cong
Qiao Tian
Yuping Wang
Lei Xie
DiffM
31
3
0
06 Oct 2023
An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection
Yuankun Xie
Haonan Cheng
Yutian Wang
Long Ye
27
6
0
06 Sep 2023
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
Yinghao Aaron Li
Cong Han
N. Mesgarani
20
5
0
18 Jul 2023
LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models
Zhichao Wang
Yuan-Jui Chen
Linfu Xie
Qiao Tian
Yuping Wang
72
30
0
18 Jun 2023
Iteratively Improving Speech Recognition and Voice Conversion
Mayank Singh
Naoya Takahashi
Ono Naoyuki
13
4
0
24 May 2023
Multi-level Temporal-channel Speaker Retrieval for Zero-shot Voice Conversion
Zhichao Wang
Liumeng Xue
Qiuqiang Kong
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
BDL
9
3
0
12 May 2023
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
Hyun Joon Park
Seok Woo Yang
Jin Sob Kim
Wooseok Shin
S. W. Han
22
17
0
16 Mar 2023
Cross-modal Face- and Voice-style Transfer
Naoya Takahashi
M. Singh
Yuki Mitsufuji
CVBM
56
2
0
27 Feb 2023
Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion
Jiangyi Deng
Yanjiao Chen
Yinan Zhong
Qianhao Miao
Xueluan Gong
Wenyuan Xu
13
8
0
24 Feb 2023
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models
Yinghao Aaron Li
Cong Han
N. Mesgarani
17
18
0
29 Dec 2022
Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units
Gallil Maimon
Yossi Adi
21
13
0
19 Dec 2022
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling
Xinfa Zhu
Yinjiao Lei
Kun Song
Yongmao Zhang
Tao Li
Linfu Xie
13
16
0
19 Nov 2022
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features
Ziqian Ning
Qicong Xie
Pengcheng Zhu
Zhichao Wang
Liumeng Xue
Jixun Yao
Linfu Xie
Mengxiao Bi
19
16
0
09 Nov 2022
Preserving background sound in noise-robust voice conversion via multi-task learning
J.-H. Yao
Yi Lei
Qing Wang
Pengcheng Guo
Ziqian Ning
Linfu Xie
Hai Li
Junhui Liu
Danming Xie
31
10
0
06 Nov 2022
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Jingyi Li
Weiping Tu
Li Xiao
46
96
0
27 Oct 2022
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
19
1
0
25 Oct 2022
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE
Hui Lu
Disong Wang
Xixin Wu
Zhiyong Wu
Xunying Liu
Helen M. Meng
DRL
17
9
0
25 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
15
8
0
20 Oct 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
Danwei Cai
Zexin Cai
Ming Li
13
10
0
18 Jun 2022
End-to-End Voice Conversion with Information Perturbation
Qicong Xie
Shan Yang
Yinjiao Lei
Linfu Xie
Dan Su
15
7
0
15 Jun 2022
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection
Joanna Hong
Minsu Kim
Y. Ro
CVBM
DiffM
30
8
0
15 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
33
38
0
30 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
30
8
0
19 May 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
21
110
0
20 Apr 2022
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning
Qiqi Wang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DRL
17
23
0
22 Feb 2022
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
11
15
0
08 Dec 2021
One-shot Voice Conversion For Style Transfer Based On Speaker Adaptation
Zhichao Wang
Qicong Xie
Tao Li
Hongqiang Du
Lei Xie
Pengcheng Zhu
Mengxiao Bi
17
11
0
24 Nov 2021
SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines
Haozhe Zhang
Zexin Cai
Xiaoyi Qin
Ming Li
52
15
0
06 Nov 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning
Shijun Wang
Dimche Kostadinov
Damian Borth
19
10
0
27 Oct 2021
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
11
24
0
20 Oct 2021
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
Jiansheng Wei
DiffM
BDL
13
121
0
28 Sep 2021
Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning
Rui Li
Dong Pu
Minnie Huang
Bill Huang
47
14
0
23 Sep 2021