Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages

26 March 2025
Yangyang Meng
Jinpeng Li
Guodong Lin
Yu Pu
Guanbo Wang
Hu Du
Zhiming Shao
Yukai Huang
Ke Li
Wei-Qiang Zhang
Abstract

This report introduces Dolphin, a large-scale multilingual automatic speech recognition (ASR) model that extends the Whisper architecture to support a wider range of languages. Our approach integrates in-house proprietary and open-source datasets to refine and optimize Dolphin's performance. The model is specifically designed to achieve notable recognition accuracy for 40 Eastern languages across East Asia, South Asia, Southeast Asia, and the Middle East, while also supporting 22 Chinese dialects. Experimental evaluations show that Dolphin significantly outperforms current state-of-the-art open-source models across various languages. To promote reproducibility and community-driven innovation, we are making our trained models and inference source code publicly available.

@article{meng2025_2503.20212,
  title={Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages},
  author={Yangyang Meng and Jinpeng Li and Guodong Lin and Yu Pu and Guanbo Wang and Hu Du and Zhiming Shao and Yukai Huang and Ke Li and Wei-Qiang Zhang},
  journal={arXiv preprint arXiv:2503.20212},
  year={2025}
}