PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
15 September 2023
Reo Shimizu
Ryuichi Yamamoto
Masaya Kawamura
Yuma Shirahata
Hironori Doi
Tatsuya Komatsu
Kentaro Tachibana
    DiffM

Papers citing "PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions"

11 / 11 papers shown
HuLA: Prosody-Aware Anti-Spoofing with Multi-Task Learning for Expressive and Emotional Synthetic Speech
Aurosweta Mahapatra
Ismail Rasim Ulgen
Berrak Sisman
103
0
0
25 Sep 2025
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
Wonjune Kang
D. Roy
68
0
0
15 Aug 2025
MoE-TTS: Enhancing Out-of-Domain Text Understanding for Description-based TTS via Mixture-of-Experts
Heyang Xue
Xuchen Song
Yu Tang
J. Chen
Yanru Chen
Yang Li
Yahui Zhou
MoE
76
1
0
15 Aug 2025
ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation
Jiatong Shi
Yifan Cheng
Bo-Hao Su
Hye-jin Shim
Jinchuan Tian
Samuele Cornell
Yiwen Zhao
Siddhant Arora
Shinji Watanabe
193
0
0
30 May 2025
Can Emotion Fool Anti-spoofing?
Aurosweta Mahapatra
Ismail Rasim Ulgen
Abinay Reddy Naini
John H. L. Hansen
Berrak Sisman
125
4
0
29 May 2025
RA-CLAP: Relation-Augmented Emotional Speaking Style Contrastive Language-Audio Pretraining For Speech Retrieval
Haoqin Sun
Jingguang Tian
Jiaming Zhou
Hui Wang
Jiabei He
...
Xiangyu Kong
Desheng Hu
Xinkang Xu
Xinhui Hu
Yong Qin
199
2
0
26 May 2025
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Shigeki Karita
Yuma Koizumi
Heiga Zen
Haruko Ishikawa
Robin Scheibler
M. Bacchiani
VLM
963
3
0
07 May 2025
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Prabhat Pandey
Rupak Vignesh Swaminathan
K V Vijay Girish
Arunasish Sen
Jian Xie
Grant P. Strimel
Andreas Schwarz
877
8
0
12 Apr 2025
VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
ACM Multimedia (MM), 2024
Yixuan Zhou
Xiaoyu Qin
Zeyu Jin
Shuoyi Zhou
Shun Lei
Songtao Zhou
Zhiyong Wu
Jia Jia
AuLLM
250
17
0
28 Aug 2024
Voice Attribute Editing with Text Prompt
Zheng-Yan Sheng
Yang Ai
Li-Juan Liu
Jia Pan
Zhenhua Ling
181
10
0
13 Apr 2024
Natural language guidance of high-fidelity text-to-speech with synthetic annotations
Daniel Lyth
Simon King
258
90
0
02 Feb 2024