Mel-McNet: A Mel-Scale Framework for Online Multichannel Speech Enhancement

26 May 2025
Yujie Yang
Bing Yang
Xiaofei Li
ArXiv (abs) · PDF · HTML
Main: 4 pages · 1 figure · 2 tables · Bibliography: 1 page
Abstract

Online multichannel speech enhancement has been intensively studied in recent years. Although the Mel frequency scale is better matched to human auditory perception and more computationally efficient than the linear frequency scale, few speech enhancement methods operate in the Mel-frequency domain. To this end, this work proposes a Mel-scale framework, namely Mel-McNet. It processes spectral and spatial information with two key components: an effective STFT-to-Mel module that compresses multichannel STFT features into Mel-frequency representations, and a modified McNet backbone that operates directly in the Mel domain to generate enhanced LogMel spectra. The enhanced spectra can be fed directly to vocoders for waveform reconstruction or to ASR systems for transcription. Experiments on CHiME-3 show that Mel-McNet reduces computational complexity by 60% while maintaining enhancement and ASR performance comparable to the original McNet. Mel-McNet also outperforms other SOTA methods, verifying the potential of Mel-scale speech enhancement.
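The abstract does not specify how the STFT-to-Mel module compresses the linear-frequency bins; the paper's module may well be learned. As a point of reference, the classical fixed alternative is a triangular Mel filterbank applied to each channel's STFT magnitude. The sketch below illustrates that baseline; the shapes (6 channels, 512-point FFT, 60 Mel bands) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def hz_to_mel(f):
    # O'Shaughnessy Mel-scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr=16000, n_fft=512, n_mels=60):
    """Triangular filters, evenly spaced on the Mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):               # rising edge
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):              # falling edge
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    return fb

# Hypothetical input: C channels, F = n_fft//2 + 1 linear bins, T frames.
C, F, T = 6, 257, 100
stft_mag = np.abs(np.random.randn(C, F, T))

fb = mel_filterbank(n_fft=512, n_mels=60)           # (60, 257)
mel_spec = np.einsum('mf,cft->cmt', fb, stft_mag)   # (6, 60, 100)
log_mel = np.log(mel_spec + 1e-8)                   # LogMel features
```

Compressing 257 linear bins to 60 Mel bands before the backbone is where the complexity saving comes from: the network processes roughly a quarter of the frequency dimension per frame.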

@article{yang2025_2505.19576,
  title={Mel-McNet: A Mel-Scale Framework for Online Multichannel Speech Enhancement},
  author={Yujie Yang and Bing Yang and Xiaofei Li},
  journal={arXiv preprint arXiv:2505.19576},
  year={2025}
}