AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
31 October 2020
Yen-Hao Chen
Da-Yi Wu
Tsung-Han Wu
Hung-yi Lee

Papers citing "AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization"

50 / 62 papers shown
O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion
Huu Tuong Tu
Huan Vu
Cuong Tien Nguyen
Dien Hy Ngo
Nguyen Thi Thu Trang
132
0
0
10 Oct 2025
LatentVoiceGrad: Nonparallel Voice Conversion with Latent Diffusion/Flow-Matching Models
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Yuto Kondo
DiffM
207
1
0
10 Sep 2025
NE-PADD: Leveraging Named Entity Knowledge for Robust Partial Audio Deepfake Detection via Attention Aggregation
Huhong Xian
Rui Liu
Berrak Sisman
Haizhou Li
144
1
0
04 Sep 2025
FreeTalk: A plug-and-play and black-box defense against speech synthesis attacks
Yuwen Pu
Zhou Feng
Chunyi Zhou
Jiahao Chen
Chunqiang Hu
Haibo Hu
S. Ji
AAML
135
0
0
30 Aug 2025
ClearMask: Noise-Free and Naturalness-Preserving Protection Against Voice Deepfake Attacks
ACM Asia Conference on Computer and Communications Security (AsiaCCS), 2025
Yuanda Wang
Bocheng Chen
Hanqing Guo
Guangjing Wang
Weikang Ding
Qiben Yan
AAML
162
0
0
25 Aug 2025
FasterVoiceGrad: Faster One-step Diffusion-Based Voice Conversion with Adversarial Diffusion Conversion Distillation
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Yuto Kondo
124
1
0
25 Aug 2025
ReFlow-VC: Zero-shot Voice Conversion Based on Rectified Flow and Speaker Feature Optimization
Pengyu Ren
Wenhao Guan
Kaidi Wang
Peijie Chen
Q. Hong
Lin Li
163
2
0
01 Jun 2025
AVENet: Disentangling Features by Approximating Average Features for Voice Conversion
Wenyu Wang
Yiquan Zhou
Jihua Zhu
Hongwu Ding
Jiacheng Xu
Shihao Li
DRL
201
0
0
08 Apr 2025
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Jialong Zuo
Shengpeng Ji
Minghui Fang
Ziyue Jiang
Xize Cheng
...
Wenrui Liu
Guangyan Zhang
Zehai Tu
Yiwen Guo
Zhou Zhao
488
9
0
08 Feb 2025
Discrete Unit based Masking for Improving Disentanglement in Voice Conversion
Spoken Language Technology Workshop (SLT), 2024
Philip H. Lee
Ismail Rasim Ulgen
Berrak Sisman
239
2
0
17 Sep 2024
Speaker Contrastive Learning for Source Speaker Tracing
Spoken Language Technology Workshop (SLT), 2024
Qing Wang
Hongmei Guo
Jian Kang
Mengjie Du
Jie Li
Xiao-Lei Zhang
Lei Xie
367
2
0
16 Sep 2024
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation
Interspeech, 2024
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Yuto Kondo
DiffM
303
7
0
03 Sep 2024
RAVE for Speech: Efficient Voice Conversion at High Sampling Rates
A. R. Bargum
Simon Lajboschitz
Cumhur Erkut
235
2
0
29 Aug 2024
Disentangling segmental and prosodic factors to non-native speech comprehensibility
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Waris Quamer
Ricardo Gutierrez-Osuna
289
3
0
20 Aug 2024
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Chunyu Qiang
Wang Geng
Yi Zhao
Ruibo Fu
Tao Wang
...
Chen Zhang
Hao Che
L. Wang
Jianwu Dang
Jianhua Tao
AI4TS
420
8
0
11 Aug 2024
End-to-end Streaming model for Low-Latency Speech Anonymization
Waris Quamer
Ricardo Gutierrez-Osuna
260
10
0
13 Jun 2024
Improving child speech recognition with augmented child-like speech
Yuanyuan Zhang
Zhengjun Yue
T. Patel
O. Scharenborg
203
13
0
12 Jun 2024
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
The Speaker and Language Recognition Workshop (Odyssey), 2024
Zongyang Du
Junchen Lu
Kun Zhou
Lakshmish Kaushik
Berrak Sisman
290
7
0
02 May 2024
MAIN-VC: Lightweight Speech Representation Disentanglement for One-shot Voice Conversion
Pengcheng Li
Jianzong Wang
Xulong Zhang
Yong Zhang
Jing Xiao
Ning Cheng
DRL
258
3
0
02 May 2024
Who is Authentic Speaker
Qiang Huang
208
0
0
30 Apr 2024
Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zhaoxi Mu
Xinyu Yang
Sining Sun
Qing Yang
SSL
326
13
0
16 Dec 2023
Low-latency Real-time Voice Conversion on CPU
Konstantine Sadov
Matthew Hutter
Asara Near
VLM
622
3
0
01 Nov 2023
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Paarth Neekhara
Shehzeen Samarah Hussain
Rafael Valle
Boris Ginsburg
Rishabh Ranjan
Shlomo Dubnov
F. Koushanfar
Julian McAuley
239
7
0
14 Oct 2023
AutoCycle-VC: Towards Bottleneck-Independent Zero-Shot Cross-Lingual Voice Conversion
Haeyun Choi
Jio Gim
Yuho Lee
Youngin Kim
Young-Joo Suh
BDL
206
2
0
10 Oct 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Tao Li
Zhichao Wang
Xinfa Zhu
Jian Cong
Qiao Tian
Yuping Wang
Lei Xie
DiffM
205
9
0
06 Oct 2023
An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yuankun Xie
Haonan Cheng
Yutian Wang
Long Ye
245
25
0
06 Sep 2023
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
Yinghao Aaron Li
Cong Han
N. Mesgarani
275
5
0
18 Jul 2023
LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models
IEEE Signal Processing Letters (IEEE SPL), 2023
Zhichao Wang
Yuan-Jui Chen
Linfu Xie
Qiao Tian
Yuping Wang
361
43
0
18 Jun 2023
Iteratively Improving Speech Recognition and Voice Conversion
Interspeech, 2023
Mayank Singh
Naoya Takahashi
Ono Naoyuki
250
5
0
24 May 2023
Multi-level Temporal-channel Speaker Retrieval for Zero-shot Voice Conversion
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Zhichao Wang
Liumeng Xue
Qiuqiang Kong
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
BDL
375
5
0
12 May 2023
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Hyun Joon Park
Seok Woo Yang
Jin Sob Kim
Wooseok Shin
S. W. Han
258
29
0
16 Mar 2023
Cross-modal Face- and Voice-style Transfer
Naoya Takahashi
M. Singh
Yuki Mitsufuji
CVBM
295
2
0
27 Feb 2023
Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion
USENIX Security Symposium (USENIX Security), 2023
Jiangyi Deng
Yanjiao Chen
Yinan Zhong
Qianhao Miao
Xueluan Gong
Wenyuan Xu
301
15
0
24 Feb 2023
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models
Spoken Language Technology Workshop (SLT), 2022
Yinghao Aaron Li
Cong Han
N. Mesgarani
203
23
0
29 Dec 2022
Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Gallil Maimon
Yossi Adi
361
21
0
19 Dec 2022
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xinfa Zhu
Yinjiao Lei
Kun Song
Yongmao Zhang
Tao Li
Linfu Xie
255
24
0
19 Nov 2022
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Ziqian Ning
Qicong Xie
Pengcheng Zhu
Zhichao Wang
Liumeng Xue
Jixun Yao
Linfu Xie
Mengxiao Bi
182
27
0
09 Nov 2022
Preserving background sound in noise-robust voice conversion via multi-task learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jixun Yao
Yi Lei
Qing Wang
Pengcheng Guo
Ziqian Ning
Linfu Xie
Hai Li
Junhui Liu
Danming Xie
243
16
0
06 Nov 2022
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jingyi Li
Weiping Tu
Li Xiao
403
206
0
27 Oct 2022
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse
International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
155
2
0
25 Oct 2022
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE
Spoken Language Technology Workshop (SLT), 2022
Hui Lu
Disong Wang
Xixin Wu
Zhiyong Wu
Xunying Liu
Helen M. Meng
DRL
252
13
0
25 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
306
9
0
20 Oct 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Danwei Cai
Zexin Cai
Ming Li
267
15
0
18 Jun 2022
End-to-End Voice Conversion with Information Perturbation
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Qicong Xie
Shan Yang
Yinjiao Lei
Linfu Xie
Jane Polak Scowcroft
173
8
0
15 Jun 2022
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection
European Conference on Computer Vision (ECCV), 2022
Joanna Hong
Minsu Kim
Y. Ro
CVBM
DiffM
289
8
0
15 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Yinghao Aaron Li
Cong Han
N. Mesgarani
396
72
0
30 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Interspeech, 2022
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
284
10
0
19 May 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
International Conference on Machine Learning (ICML), 2022
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
247
153
0
20 Apr 2022
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qiqi Wang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DRL
264
25
0
22 Feb 2022
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
226
18
0
08 Dec 2021