

End-to-End Multi-Microphone Speaker Extraction Using Relative Transfer Functions

10 February 2025
Aviad Eisenberg
Sharon Gannot
Shlomo E. Chazan
Abstract

This paper introduces a multi-microphone method for extracting a desired speaker from a mixture of multiple speakers and directional noise in a reverberant environment. We propose leveraging the instantaneous relative transfer function (RTF), estimated from a reference utterance recorded at the same position as the desired source. The RTF-based spatial cue is compared with a direction-of-arrival (DOA)-based spatial cue and with the conventional spectral embedding. Experimental results in challenging acoustic scenarios demonstrate that spatial cues yield better performance than the spectral cue, and that the instantaneous RTF outperforms the DOA-based spatial cue.
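
The abstract does not spell out how the instantaneous RTF is computed, so the following is only a minimal sketch under a common assumption: that the instantaneous RTF is the per-frame, per-frequency ratio between each microphone's STFT and that of a reference microphone. The function names (`dft_frame`, `stft`, `instantaneous_rtf`), the frame length, the hop size, and the regularizer `eps` are all illustrative choices, not taken from the paper.

```python
import cmath
import math


def dft_frame(frame):
    """Naive DFT of one real-valued frame, returning bins 0..N/2."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def stft(signal, frame_len=64, hop=32):
    """Hann-windowed STFT: a list of frames, each a list of complex bins."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len)
              for t in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + t] * window[t] for t in range(frame_len)]
        frames.append(dft_frame(segment))
    return frames


def instantaneous_rtf(mics, ref=0, eps=1e-8):
    """Ratio of each microphone's STFT to the reference microphone's STFT.

    Assumed definition (not confirmed by the abstract):
        rtf[m][l][k] = X_m(l, k) * conj(X_ref(l, k)) / (|X_ref(l, k)|^2 + eps)
    where eps only guards against division by zero in near-silent bins.
    The result is indexed as [microphone][frame][frequency bin].
    """
    specs = [stft(x) for x in mics]
    ref_spec = specs[ref]
    return [
        [
            [x * r.conjugate() / (abs(r) ** 2 + eps)
             for x, r in zip(frame, ref_frame)]
            for frame, ref_frame in zip(spec, ref_spec)
        ]
        for spec in specs
    ]
```

In this toy formulation the reference channel's RTF is approximately 1 in every bin that carries energy, and a second microphone that receives a scaled copy of the reference signal has an RTF approximately equal to that scale factor. A practical implementation would use an FFT-based STFT and real multichannel recordings of the reference utterance.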

@article{eisenberg2025_2502.06285,
  title={End-to-End Multi-Microphone Speaker Extraction Using Relative Transfer Functions},
  author={Aviad Eisenberg and Sharon Gannot and Shlomo E. Chazan},
  journal={arXiv preprint arXiv:2502.06285},
  year={2025}
}