IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech

23 June 2025

Papers citing "IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech"

UltraVoice: Scaling Fine-Grained Style-Controlled Speech Conversations for Spoken Dialogue Models

156

26 Oct 2025

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

136

29 Sep 2025

Evaluating Bias in Spoken Dialogue LLMs for Real-World Decisions and Recommendations

158

27 Sep 2025

Bridging the gap between training and inference in LM-based TTS models

145

21 Sep 2025

Vevo2: A Unified and Controllable Framework for Speech and Singing Voice Generation

240

22 Aug 2025

Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis

357

14 Apr 2025