
Exploring Speaker Diarization with Mixture of Experts

17 June 2025
Gaobin Yang
Maokui He
Shutong Niu
Ruoyu Wang
Hang Chen
Jun Du
    MoE
Main: 10 pages
12 figures
Bibliography: 1 page
7 tables
Abstract

In this paper, we propose NSD-MS2S (neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence architecture), a novel system that integrates a memory-aware multi-speaker embedding module with a sequence-to-sequence (Seq2Seq) architecture. The system uses the memory module to enhance speaker embeddings and the Seq2Seq framework to efficiently map acoustic features to speaker labels. Additionally, we explore the application of mixture of experts (MoE) to speaker diarization and introduce a Shared and Soft Mixture of Experts (SS-MoE) module to further mitigate model bias and enhance performance. Incorporating SS-MoE yields the extended model NSD-MS2S-SSMoE. Experiments on multiple complex acoustic datasets, including the CHiME-6, DiPCo, Mixer 6, and DIHARD-III evaluation sets, demonstrate meaningful improvements in robustness and generalization. The proposed methods achieve state-of-the-art results, showcasing their effectiveness in challenging real-world scenarios.
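
The abstract names the SS-MoE module but does not give its equations, so the following is only a minimal sketch of what a layer combining a shared expert with softly routed experts might look like. It assumes a soft-MoE style routing in which every frame contributes to every expert slot and every slot output contributes to every frame, plus one shared expert applied to all frames. The function name `shared_soft_moe`, the parameter names, the single slot per expert, and the purely linear experts are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def shared_soft_moe(x, phi, expert_weights, shared_weight):
    """Hypothetical shared + soft mixture-of-experts layer.

    x:              (T, D) frame-level features
    phi:            (D, E) routing parameters, one slot per expert (assumption)
    expert_weights: (E, D, D) one linear expert per slot (assumption)
    shared_weight:  (D, D) shared expert applied to every frame (assumption)
    """
    logits = x @ phi                     # (T, E) frame-to-slot affinities
    dispatch = softmax(logits, axis=0)   # normalise over frames: each slot is a convex mix of frames
    combine = softmax(logits, axis=1)    # normalise over slots: each frame is a convex mix of slot outputs

    slots = dispatch.T @ x               # (E, D) soft-assigned expert inputs
    slot_out = np.einsum("ed,edk->ek", slots, expert_weights)  # (E, D) per-expert outputs
    routed = combine @ slot_out          # (T, D) soft-combined expert outputs

    shared = x @ shared_weight           # (T, D) shared-expert path
    return shared + routed


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))              # 5 frames, 4-dim features (toy sizes)
phi = rng.normal(size=(4, 3))            # 3 experts
expert_weights = rng.normal(size=(3, 4, 4))
shared_weight = rng.normal(size=(4, 4))
out = shared_soft_moe(x, phi, expert_weights, shared_weight)
print(out.shape)
```

Because the routing is soft, no frame is dropped and no hard top-k selection is needed. Whether this matches the routing actually used in NSD-MS2S-SSMoE should be checked against the full paper.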

@article{yang2025_2506.14750,
  title={Exploring Speaker Diarization with Mixture of Experts},
  author={Gaobin Yang and Maokui He and Shutong Niu and Ruoyu Wang and Hang Chen and Jun Du},
  journal={arXiv preprint arXiv:2506.14750},
  year={2025}
}