2005.08484
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding
18 May 2020
Seungwoo Choi
Seungju Han
Dongyoung Kim
S. Ha
Papers citing "Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding"
38 / 38 papers shown
TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yu Zhang
Wenxiang Guo
Changhao Pan
Dongyu Yao
Zhiyuan Zhu
Ziyue Jiang
Yuhan Wang
Tao Jin
Zhou Zhao
VLM
666
11
0
20 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
450
6
0
01 May 2025
ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting
Yanzhe Zhang
Wenxiang Guo
Changhao Pan
Zehan Zhu
Tao Jin
Zhou Zhao
VGen
750
9
0
29 Apr 2025
Towards Zero-Shot Text-To-Speech for Arabic Dialects
Khai Duy Doan
Abdul Waheed
Muhammad Abdul-Mageed
429
5
0
24 Jun 2024
XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model
Interspeech (Interspeech), 2024
Edresson Casanova
Kelly Davis
Eren Golge
Görkem Göknar
Iulian Gulea
...
Aya Aljafari
Joshua Meyer
Reuben Morais
Samuel Olayemi
Julian Weber
VLM
394
254
0
07 Jun 2024
Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes
Ammarah Hashmi
Sahibzada Adil Shahzad
Chia-Wen Lin
Yu Tsao
Hsin-Min Wang
251
12
0
07 May 2024
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yu Zhang
Rongjie Huang
Ruiqi Li
Jinzheng He
Yan Xia
Feiyang Chen
Xinyu Duan
Baoxing Huai
Zhou Zhao
VLM
564
43
0
17 Dec 2023
Detecting Voice Cloning Attacks via Timbre Watermarking
Chang-rui Liu
Jie Zhang
Tianwei Zhang
Xi Yang
Weiming Zhang
Neng H. Yu
319
69
0
06 Dec 2023
AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset
ACM Multimedia (ACM MM), 2023
Zhixi Cai
Shreya Ghosh
Aman Pankaj Adatia
Munawar Hayat
Abhinav Dhall
Kalin Stefanov
268
95
0
26 Nov 2023
Controllable Generation of Artificial Speaker Embeddings through Discovery of Principal Directions
Interspeech (Interspeech), 2023
Florian Lux
Pascal Tilli
Sarina Meyer
Ngoc Thang Vu
211
3
0
26 Oct 2023
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Shunwei Lei
Yixuan Zhou
Liyang Chen
Dan Luo
Zhiyong Wu
...
Shiyin Kang
Tao Jiang
Yahui Zhou
Yuxing Han
Helen M. Meng
VLM
213
4
0
21 Sep 2023
Stylebook: Content-Dependent Speaking Style Modeling for Any-to-Any Voice Conversion using Only Speech Data
Hyungseob Lim
Kyungguen Byun
Sunkuk Moon
Erik Visser
DiffM
326
2
0
06 Sep 2023
Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations
Interspeech (Interspeech), 2023
Wen Wang
Yang Song
S. Jha
227
15
0
24 Aug 2023
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
International Conference on Learning Representations (ICLR), 2023
Ziyue Jiang
Jinglin Liu
Yi Ren
Jinzheng He
Zhe Ye
...
Pengfei Wei
Chunfeng Wang
Xiang Yin
Zejun Ma
Zhou Zhao
367
74
0
14 Jul 2023
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis
Interspeech (Interspeech), 2023
Seong-Hyun Park
Bohyung Kim
Tae-Hyun Oh
233
1
0
26 May 2023
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Sung-Feng Huang
Chia-Ping Chen
Zhi-Sheng Chen
Yu-Pao Tsai
Hung-yi Lee
265
7
0
21 Mar 2023
Warning: Humans Cannot Reliably Detect Speech Deepfakes
PLoS ONE (PLoS ONE), 2023
Kimberly T. Mai
Sergi D. Bray
Toby O. Davies
Lewis D. Griffin
362
74
0
19 Jan 2023
Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders
Jason Fong
Yun Wang
Prabhav Agrawal
Vimal Manohar
Jilong Wu
Thilo Kohler
Qing He
202
0
0
28 Oct 2022
Semi-Supervised Learning Based on Reference Model for Low-resource TTS
International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
AI4TS
302
6
0
25 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
238
27
0
21 Oct 2022
Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Aya Watanabe
Shinnosuke Takamichi
Yuki Saito
Detai Xin
Hiroshi Saruwatari
178
4
0
18 Oct 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Interspeech (Interspeech), 2022
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Jane Polak Scowcroft
DiffM
224
14
0
05 Jul 2022
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Spoken Language Technology Workshop (SLT), 2022
Florian Lux
Julia Koch
Ngoc Thang Vu
259
25
0
24 Jun 2022
Fine-grained Noise Control for Multispeaker Speech Synthesis
Interspeech (Interspeech), 2022
Karolos Nikitaras
G. Vamvoukakis
Nikolaos Ellinas
Konstantinos Klapsas
K. Markopoulos
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
232
5
0
11 Apr 2022
Self-supervised learning for robust voice cloning
Interspeech (Interspeech), 2022
Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
G. Vamvoukakis
Panos Kakoulidis
...
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
SSL
270
7
0
07 Apr 2022
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis
Interspeech (Interspeech), 2022
Yixuan Zhou
Changhe Song
Xiang Li
Lu Zhang
Zhiyong Wu
Yanyao Bian
Jane Polak Scowcroft
Helen Meng
389
28
0
03 Apr 2022
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion
Interspeech (Interspeech), 2022
Edresson Casanova
C. Shulby
Alexander Korolev
Arnaldo Cândido Júnior
A. S. Soares
S. Aluísio
M. Ponti
387
19
0
29 Mar 2022
Attacker Attribution of Audio Deepfakes
Interspeech (Interspeech), 2022
Nicolas Müller
Franziska Dieckmann
Jennifer Williams
152
24
0
28 Mar 2022
Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis
International Conference on Digital Signal Processing (DSP), 2022
Pengyu Cheng
Zhenhua Ling
229
4
0
02 Mar 2022
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Adam Gabryś
Goeric Huybrechts
M. Ribeiro
C. Chien
Julian Roth
Giulia Comini
Roberto Barra-Chicote
Bartek Perz
Jaime Lorenzo-Trueba
241
29
0
16 Feb 2022
MR-SVS: Singing Voice Synthesis with Multi-Reference Encoder
Shoutong Wang
Jinglin Liu
Yi Ren
Zhen Wang
Changliang Xu
Zhou Zhao
126
7
0
11 Jan 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
International Conference on Machine Learning (ICML), 2021
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
838
585
0
04 Dec 2021
Speaker Generation
Daisy Stanton
Matt Shannon
Soroosh Mariooryad
RJ Skerry-Ryan
Eric Battenberg
Tom Bagby
David Kao
293
39
0
07 Nov 2021
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Sung-Feng Huang
Chyi-Jiunn Lin
Da-Rong Liu
Yi-Chen Chen
Hung-yi Lee
661
75
0
07 Nov 2021
Msdtron: a high-capability multi-speaker speech synthesis system for diverse data using characteristic information
Qinghua Wu
Quanbo Shen
Jian Luan
YuJun Wang
274
4
0
07 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
466
446
0
29 Jun 2021
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model
Interspeech (Interspeech), 2021
Edresson Casanova
C. Shulby
Eren Golge
Nicolas Müller
F. S. Oliveira
Arnaldo Cândido Júnior
A. S. Soares
S. Aluísio
M. Ponti
296
113
0
02 Apr 2021
A Survey on Machine Learning from Few Samples
Pattern Recognition (Pattern Recognit.), 2020
Jiang Lu
Pinghua Gong
Jieping Ye
Jianwei Zhang
Changshu Zhang
377
81
0
06 Sep 2020