XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech

Interspeech, 2023
31 May 2023
L. T. Nguyen
Thinh-Le-Gia Pham
Dat Quoc Nguyen

Papers citing "XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech"

Perturbation Self-Supervised Representations for Cross-Lingual Emotion TTS: Stage-Wise Modeling of Emotion and Speaker
Cheng Gong
Chunyu Qiang
Tianrui Wang
Yu Jiang
Yuheng Lu
Ruihao Jing
Xiaoxiao Miao
Xiaolei Zhang
Longbiao Wang
Jianwu Dang
67
0
0
13 Oct 2025
Multi-task Pretraining for Enhancing Interpretable L2 Pronunciation Assessment
Jiun-Ting Li
Bi-Cheng Yan
Yi-Cheng Wang
Berlin Chen
68
0
0
21 Sep 2025
Whisper based Cross-Lingual Phoneme Recognition between Vietnamese and English
Nguyen Huu Nhat Minh
Tran Nguyen Anh
Truong Dinh Dung
Vo Van Nam
Le Pham Tuyen
72
1
0
22 Aug 2025
LAPS-Diff: A Diffusion-Based Framework for Singing Voice Synthesis With Language Aware Prosody-Style Guided Learning
Sandipan Dhar
Mayank Gupta
Preeti Rao
DiffM
43
0
0
07 Jul 2025
PMF-CEC: Phoneme-augmented Multimodal Fusion for Context-aware ASR Error Correction with Error-specific Selective Decoding
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
Jiajun He
Tomoki Toda
126
3
0
31 May 2025
Cross-Lingual IPA Contrastive Learning for Zero-Shot NER
Jimin Sohn
David R. Mortensen
207
0
0
10 Mar 2025
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Siyang Song
Mohammed Irfan Kurpath
Sahal Shaji Mullappilly
Jean Lahoud
Fahad A Khan
Rao Muhammad Anwer
Salman Khan
Hisham Cholakkal
AuLLM
625
5
0
06 Mar 2025
Zero-Shot Cross-Lingual NER Using Phonemic Representations for Low-Resource Languages
Jimin Sohn
Haeji Jung
Alex Cheng
Jooeon Kang
Yilin Du
David R. Mortensen
139
2
0
23 Jun 2024
An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios
Cheng Gong
Erica Cooper
Xin Wang
Chunyu Qiang
Mengzhe Geng
...
Jianwu Dang
Marc Tessier
Aidan Pine
Korin Richmond
Junichi Yamagishi
145
5
0
13 Jun 2024
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech
International Conference on Learning Representations (ICLR), 2024
Jaehyeon Kim
Keon Lee
Seungjun Chung
Jaewoong Cho
199
61
0
03 Apr 2024
Mitigating the Linguistic Gap with Phonemic Representations for Robust Multilingual Language Understanding
Haeji Jung
Changdae Oh
Jooeon Kang
Jimin Sohn
Kyungwoo Song
Jinkyu Kim
David R. Mortensen
158
0
0
22 Feb 2024
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
Cheng Gong
Xin Wang
Erica Cooper
Dan Wells
Longbiao Wang
Jianwu Dang
Korin Richmond
Junichi Yamagishi
244
35
0
22 Dec 2023
PhoBERT: Pre-trained language models for Vietnamese
Findings, 2020
Dat Quoc Nguyen
A. Nguyen
605
403
0
02 Mar 2020