NAUTILUS: a Versatile Voice Cloning System

22 May 2020
Hieu-Thi Luong
Junichi Yamagishi

Papers citing "NAUTILUS: a Versatile Voice Cloning System"

25 / 25 papers shown
AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models
Wenyu Li
Xiaoqi Jiao
Yi Chang
Guangyan Zhang
Yiwen Guo
AuLLMVGen
181
0
0
27 Sep 2025
Dataset of News Articles with Provenance Metadata for Media Relevance Assessment
Tomas Peterka
Matyas Bohacek
247
0
0
11 Jun 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
450
6
0
01 May 2025
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Hieu-Thi Luong
Haoyang Li
Lin Zhang
Kong Aik Lee
Eng Siong Chng
378
17
0
23 Sep 2024
Intelli-Z: Toward Intelligible Zero-Shot TTS
Sunghee Jung
Won Jang
Jaesam Yoon
Bongwan Kim
269
1
0
25 Jan 2024
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis
R. Vinotha
D. Hepsiba
L. D. V. Anand
Deepak John Reji
92
4
0
22 Jan 2024
TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection
Jiexi Liu
Zhiba Su
Hui Huang
CaiYan Wan
Quanxiu Wang
Jiangli Hong
Benlai Tang
Feng Zhu
179
13
0
27 Jun 2023
Speech Synthesis with Mixed Emotions
IEEE Transactions on Affective Computing (IEEE TAC), 2022
Kun Zhou
Berrak Sisman
R. Rana
B. W. Schuller
Haizhou Li
372
67
0
11 Aug 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Interspeech, 2022
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Jane Polak Scowcroft
DiffM
235
14
0
05 Jul 2022
GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion
Interspeech, 2022
Magdalena Proszewska
Grzegorz Beringer
Daniel Sáez-Trigueros
Thomas Merritt
Abdelhamid Ezzerg
Roberto Barra-Chicote
187
6
0
04 Jul 2022
Self-supervised learning for robust voice cloning
Interspeech, 2022
Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
G. Vamvoukakis
Panos Kakoulidis
...
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
SSL
272
7
0
07 Apr 2022
Improve few-shot voice cloning using multi-modal learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Haitong Zhang
Yue Lin
176
11
0
18 Mar 2022
Human Detection of Political Speech Deepfakes across Transcripts, Audio, and Video
Nature Communications (Nat Commun), 2022
Matthew Groh
Aruna Sankaranarayanan
Nikhil Singh
Dong Young Kim
A. Lippman
Rosalind W. Picard
417
49
0
25 Feb 2022
MHTTS: Fast multi-head text-to-speech for spontaneous speech with imperfect transcription
IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 2022
Dabiao Ma
Yitong Zhang
Meng Li
Feng Ye
123
1
0
19 Jan 2022
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
470
451
0
29 Jun 2021
Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance
Hieu-Thi Luong
Junichi Yamagishi
263
0
0
25 Jun 2021
MASS: Multi-task Anthropomorphic Speech Synthesis Framework
Computer Speech and Language (CSL), 2021
Jinyin Chen
Linhui Ye
Zhaoyan Ming
154
7
0
10 May 2021
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Yuzi Yan
Xu Tan
Bohan Li
Tao Qin
Sheng Zhao
Yuan-Chung Shen
Tie-Yan Liu
138
50
0
20 Apr 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
567
452
0
25 Feb 2021
Optimizing voice conversion network with cycle consistency loss of speaker identity
Spoken Language Technology Workshop (SLT), 2020
Hongqiang Du
Xiaohai Tian
Lei Xie
Haizhou Li
237
21
0
17 Nov 2020
Latent linguistic embedding for cross-lingual text-to-speech and voice conversion
Hieu-Thi Luong
Junichi Yamagishi
206
5
0
08 Oct 2020
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
Mingyang Zhang
Yi Zhou
Li Zhao
Haizhou Li
276
61
0
30 Sep 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Yi Zhao
Wen-Chin Huang
Xiaohai Tian
Junichi Yamagishi
Rohan Kumar Das
Tomi Kinnunen
Zhenhua Ling
Tomoki Toda
254
236
0
28 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
643
413
0
09 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
392
48
0
07 Aug 2020