Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.00990
Cited By
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis
3 April 2022
Yixuan Zhou
Changhe Song
Xiang Li
Lu Zhang
Zhiyong Wu
Yanyao Bian
Dan Su
Helen Meng
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis"
16 / 16 papers shown
Title
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
44
0
0
01 May 2025
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
Zhedong Zhang
Liang-Sheng Li
C. Yan
Chunshan Liu
Anton Van Den Hengel
Yuankai Qi
91
2
0
15 Mar 2025
Clip-TTS: Contrastive Text-content and Mel-spectrogram, A High-Quality Text-to-Speech Method based on Contextual Semantic Understanding
Tianyun Liu
CLIP
VLM
68
0
0
26 Feb 2025
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
Gaoxiang Cong
Jiadong Pan
Liang-Sheng Li
Yuankai Qi
Yuxin Peng
Anton Van Den Hengel
Jian Yang
Qingming Huang
92
6
0
12 Dec 2024
CoDiff-VC: A Codec-Assisted Diffusion Model for Zero-shot Voice Conversion
Yuke Li
Xinfa Zhu
Hanzhao Li
J.-H. Yao
WenJie Tian
XiPeng Yang
Yunlin Chen
Zhifei Li
Lei Xie
DiffM
66
0
0
28 Nov 2024
The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge
Dake Guo
J.-H. Yao
Xinfa Zhu
Kangxiang Xia
Zhao Guo
Ziyu Zhang
Yishuo Wang
Jie Liu
Lei Xie
34
1
0
31 Oct 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
39
4
0
21 Jul 2024
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation
Ruibo Fu
Xin Qi
Zhengqi Wen
Jianhua Tao
Tao Wang
...
Xiaopeng Wang
Shuchen Shi
Yukun Liu
Xuefei Liu
Shuai Zhang
49
0
0
07 Jul 2024
Intelli-Z: Toward Intelligible Zero-Shot TTS
Sunghee Jung
Won Jang
Jaesam Yoon
Bongwan Kim
30
0
0
25 Jan 2024
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
Jungil Kong
Junmo Lee
Jeongmin Kim
Beomjeong Kim
Jihoon Park
Dohee Kong
Changheon Lee
Sangjin Kim
25
1
0
20 Nov 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
Tao Li
Zhichao Wang
Xinfa Zhu
Jian Cong
Qiao Tian
Yuping Wang
Lei Xie
DiffM
31
3
0
06 Oct 2023
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts
Shunwei Lei
Yixuan Zhou
Liyang Chen
Dan Luo
Zhiyong Wu
...
Shiyin Kang
Tao Jiang
Yahui Zhou
Yuxing Han
Helen M. Meng
VLM
33
2
0
21 Sep 2023
Stylebook: Content-Dependent Speaking Style Modeling for Any-to-Any Voice Conversion using Only Speech Data
Hyungseob Lim
Kyungguen Byun
Sunkuk Moon
Erik Visser
DiffM
28
2
0
06 Sep 2023
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
Ziyue Jiang
Jinglin Liu
Yi Ren
Jinzheng He
Zhe Ye
...
Pengfei Wei
Chunfeng Wang
Xiang Yin
Zejun Ma
Zhou Zhao
35
44
0
14 Jul 2023
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios
Yuyue Wang
Huanhou Xiao
Yihan Wu
Ruihua Song
21
0
0
20 May 2023
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
207
820
0
12 Jun 2018
1