Continuous Speech Tokenizer in Text To Speech

22 October 2024
Yixing Li
Ruobing Xie
Xingwu Sun
Yu Cheng
Zhanhui Kang
Abstract

The fusion of speech and language in the era of large language models has garnered significant attention. Discrete speech tokens are often used in text-to-speech tasks for speech compression and portability: they are convenient for joint training with text and offer good compression efficiency. However, we found that the discrete speech tokenizer still suffers from information loss. Therefore, we propose a simple yet effective continuous speech tokenizer named Cont-SPT, and a text-to-speech model based on continuous speech tokens. Our results show that the speech language model based on the continuous speech tokenizer has better continuity and higher estimated Mean Opinion Scores (MOS). This enhancement is attributed to the continuous speech tokenizer's higher information preservation rate across both low and high frequencies in the frequency domain. The code and resources for Cont-SPT can be found at this https URL.
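
The abstract contrasts discrete (quantized) speech tokens with continuous ones. As a rough illustration of why quantization discards information, below is a minimal NumPy sketch comparing a toy vector-quantized tokenizer against a continuous linear projection. The codebook size, feature dimensions, and function names are invented for this example; this is not the paper's Cont-SPT implementation, only a sketch of the general discrete-vs-continuous distinction.

import numpy as np

rng = np.random.default_rng(0)

def discrete_tokenize(features, codebook):
    # Map each frame to its nearest codebook entry; reconstruction
    # snaps to a finite set of vectors, so fine detail is rounded away.
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=-1)
    ids = d.argmin(axis=1)
    return codebook[ids]

def continuous_tokenize(features, proj):
    # Keep real-valued tokens: an invertible-style linear projection
    # introduces no quantization step, hence no rounding loss.
    return features @ proj

frames = rng.normal(size=(100, 80))    # e.g. 100 mel-spectrogram frames (assumed shape)
codebook = rng.normal(size=(256, 80))  # 256-entry codebook (assumed size)
proj = rng.normal(size=(80, 80)) / np.sqrt(80)

disc = discrete_tokenize(frames, codebook)
cont = continuous_tokenize(frames, proj)

# The discrete path has nonzero reconstruction error by construction,
# which mirrors the information loss the abstract attributes to
# discrete speech tokenizers.
print("discrete reconstruction error:", np.mean((frames - disc) ** 2))

Running the sketch shows a strictly positive quantization error for the discrete path, while the continuous tokens retain the full real-valued features; the paper's frequency-domain analysis makes the analogous point for low- and high-frequency speech content.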

@article{li2025_2410.17081,
  title={Continuous Speech Tokenizer in Text To Speech},
  author={Yixing Li and Ruobing Xie and Xingwu Sun and Yu Cheng and Zhanhui Kang},
  journal={arXiv preprint arXiv:2410.17081},
  year={2025}
}