2309.08140
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
15 September 2023
Reo Shimizu
Ryuichi Yamamoto
Masaya Kawamura
Yuma Shirahata
Hironori Doi
Tatsuya Komatsu
Kentaro Tachibana
DiffM
Papers citing
"PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions"
11 / 11 papers shown
Title
HuLA: Prosody-Aware Anti-Spoofing with Multi-Task Learning for Expressive and Emotional Synthetic Speech
Aurosweta Mahapatra
Ismail Rasim Ulgen
Berrak Sisman
103
0
0
25 Sep 2025
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
Wonjune Kang
D. Roy
68
0
0
15 Aug 2025
MoE-TTS: Enhancing Out-of-Domain Text Understanding for Description-based TTS via Mixture-of-Experts
Heyang Xue
Xuchen Song
Yu Tang
J. Chen
Yanru Chen
Yang Li
Yahui Zhou
MoE
76
1
0
15 Aug 2025
ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation
Jiatong Shi
Yifan Cheng
Bo-Hao Su
Hye-jin Shim
Jinchuan Tian
Samuele Cornell
Yiwen Zhao
Siddhant Arora
Shinji Watanabe
193
0
0
30 May 2025
Can Emotion Fool Anti-spoofing?
Aurosweta Mahapatra
Ismail Rasim Ulgen
Abinay Reddy Naini
John H. L. Hansen
Berrak Sisman
125
4
0
29 May 2025
RA-CLAP: Relation-Augmented Emotional Speaking Style Contrastive Language-Audio Pretraining For Speech Retrieval
Haoqin Sun
Jingguang Tian
Jiaming Zhou
Hui Wang
Jiabei He
...
Xiangyu Kong
Desheng Hu
Xinkang Xu
Xinhui Hu
Yong Qin
199
2
0
26 May 2025
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Shigeki Karita
Yuma Koizumi
Heiga Zen
Haruko Ishikawa
Robin Scheibler
M. Bacchiani
VLM
963
3
0
07 May 2025
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Prabhat Pandey
Rupak Vignesh Swaminathan
K V Vijay Girish
Arunasish Sen
Jian Xie
Grant P. Strimel
Andreas Schwarz
877
8
0
12 Apr 2025
VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
ACM Multimedia (MM), 2024
Yixuan Zhou
Xiaoyu Qin
Zeyu Jin
Shuoyi Zhou
Shun Lei
Songtao Zhou
Zhiyong Wu
Jia Jia
AuLLM
250
17
0
28 Aug 2024
Voice Attribute Editing with Text Prompt
Zheng-Yan Sheng
Yang Ai
Li-Juan Liu
Jia Pan
Zhenhua Ling
181
10
0
13 Apr 2024
Natural language guidance of high-fidelity text-to-speech with synthetic annotations
Daniel Lyth
Simon King
258
90
0
02 Feb 2024