2503.04713
Scaling Rich Style-Prompted Text-to-Speech Datasets

6 March 2025

Papers citing "Scaling Rich Style-Prompted Text-to-Speech Datasets"

9 / 9 papers shown

Revisiting Audio-language Pretraining for Learning General-purpose Audio Representation

Wei-Cheng Tseng

205

1

0

20 Nov 2025

Auden-Voice: General-Purpose Voice Encoder for Speech and Language Understanding

Wei-Cheng Tseng

535

3

0

19 Nov 2025

VoxGuard: Evaluating User and Attribute Privacy in Speech via Membership Inference Attacks

Efthymios Tsaprazlis

Thanathai Lertpetchpun

Sai Praneeth Karimireddy

271

0

0

22 Sep 2025

Vevo2: A Unified and Controllable Framework for Speech and Singing Voice Generation

329

6

0

22 Aug 2025

MoE-TTS: Enhancing Out-of-Domain Text Understanding for Description-based TTS via Mixture-of-Experts

172

2

0

15 Aug 2025

Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe

Thanathai Lertpetchpun

Shrikanth Narayanan

201

9

0

03 Aug 2025

InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems

329

11

0

19 Jun 2025

VoiceStar: Robust Zero-Shot Autoregressive TTS with Duration Control and Extrapolation

Abdelrahman Mohamed

286

0

0

26 May 2025

Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits

Thanathai Lertpetchpun

...

Laureano Moro-Velazquez

337

18

0

20 May 2025
