2005.11004
NAUTILUS: a Versatile Voice Cloning System
22 May 2020
Hieu-Thi Luong
Junichi Yamagishi
Papers citing
"NAUTILUS: a Versatile Voice Cloning System"
25 / 25 papers shown
AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models
Wenyu Li
Xiaoqi Jiao
Yi Chang
Guangyan Zhang
Yiwen Guo
AuLLM
VGen
181
0
0
27 Sep 2025
Dataset of News Articles with Provenance Metadata for Media Relevance Assessment
Tomas Peterka
Matyas Bohacek
247
0
0
11 Jun 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
450
6
0
01 May 2025
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Hieu-Thi Luong
Haoyang Li
Lin Zhang
Kong Aik Lee
Eng Siong Chng
378
17
0
23 Sep 2024
Intelli-Z: Toward Intelligible Zero-Shot TTS
Sunghee Jung
Won Jang
Jaesam Yoon
Bongwan Kim
269
1
0
25 Jan 2024
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis
R. Vinotha
D. Hepsiba
L. D. V. Anand
Deepak John Reji
92
4
0
22 Jan 2024
TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection
Jiexi Liu
Zhiba Su
Hui Huang
CaiYan Wan
Quanxiu Wang
Jiangli Hong
Benlai Tang
Feng Zhu
179
13
0
27 Jun 2023
Speech Synthesis with Mixed Emotions
IEEE Transactions on Affective Computing (IEEE TAC), 2022
Kun Zhou
Berrak Sisman
R. Rana
B. W. Schuller
Haizhou Li
372
67
0
11 Aug 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Interspeech (Interspeech), 2022
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Jane Polak Scowcroft
DiffM
235
14
0
05 Jul 2022
GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion
Interspeech (Interspeech), 2022
Magdalena Proszewska
Grzegorz Beringer
Daniel Sáez-Trigueros
Thomas Merritt
Abdelhamid Ezzerg
Roberto Barra-Chicote
187
6
0
04 Jul 2022
Self-supervised learning for robust voice cloning
Interspeech (Interspeech), 2022
Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
G. Vamvoukakis
Panos Kakoulidis
...
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
SSL
272
7
0
07 Apr 2022
Improve few-shot voice cloning using multi-modal learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Haitong Zhang
Yue Lin
176
11
0
18 Mar 2022
Human Detection of Political Speech Deepfakes across Transcripts, Audio, and Video
Nature Communications (Nat Commun), 2022
Matthew Groh
Aruna Sankaranarayanan
Nikhil Singh
Dong Young Kim
A. Lippman
Rosalind W. Picard
417
49
0
25 Feb 2022
MHTTS: Fast multi-head text-to-speech for spontaneous speech with imperfect transcription
IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 2022
Dabiao Ma
Yitong Zhang
Meng Li
Feng Ye
123
1
0
19 Jan 2022
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
470
451
0
29 Jun 2021
Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance
Hieu-Thi Luong
Junichi Yamagishi
263
0
0
25 Jun 2021
MASS: Multi-task Anthropomorphic Speech Synthesis Framework
Computer Speech and Language (CSL), 2021
Jinyin Chen
Linhui Ye
Zhaoyan Ming
154
7
0
10 May 2021
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Yuzi Yan
Xu Tan
Bohan Li
Tao Qin
Sheng Zhao
Yuan-Chung Shen
Tie-Yan Liu
138
50
0
20 Apr 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
567
452
0
25 Feb 2021
Optimizing voice conversion network with cycle consistency loss of speaker identity
Spoken Language Technology Workshop (SLT), 2020
Hongqiang Du
Xiaohai Tian
Lei Xie
Haizhou Li
237
21
0
17 Nov 2020
Latent linguistic embedding for cross-lingual text-to-speech and voice conversion
Hieu-Thi Luong
Junichi Yamagishi
206
5
0
08 Oct 2020
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
Mingyang Zhang
Yi Zhou
Li Zhao
Haizhou Li
276
61
0
30 Sep 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Yi Zhao
Wen-Chin Huang
Xiaohai Tian
Junichi Yamagishi
Rohan Kumar Das
Tomi Kinnunen
Zhenhua Ling
Tomoki Toda
254
236
0
28 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
643
413
0
09 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
392
48
0
07 Aug 2020