Perceptual Noise-Masking with Music through Deep Spectral Envelope Shaping

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Main: 4 pages · 4 figures · Bibliography: 1 page
Abstract

People often listen to music in noisy environments, seeking to isolate themselves from ambient sounds. Indeed, a music signal can mask some of the noise's frequency components through the effect of simultaneous masking. In this article, we propose a neural network based on a psychoacoustic masking model, designed to enhance the music's ability to mask ambient noise by reshaping its spectral envelope with predicted filter frequency responses. The model is trained with a perceptual loss function that balances two constraints: masking the noise effectively while preserving the original music mix and the user's chosen listening level. We evaluate our approach on simulated data replicating a user's experience of listening to music with headphones in a noisy environment. The results, based on defined objective metrics, demonstrate that our system improves on the state of the art.
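The core operation the abstract describes, reshaping the music's spectral envelope with a predicted filter frequency response, can be sketched as STFT-domain filtering. The sketch below is illustrative only: the per-band gain curve `gains` stands in for the network's predicted filter response (the paper's actual model, loss, and filter parameterization are not shown), and the function name and parameters are assumptions, not the authors' API.

```python
import numpy as np

def reshape_spectral_envelope(music, gains, n_fft=1024, hop=256):
    """Apply a per-band gain curve (standing in for a network-predicted
    filter frequency response) to a music signal via windowed STFT
    processing with overlap-add reconstruction.

    music: 1-D float array (mono music signal)
    gains: array of length n_fft // 2 + 1, one gain per rFFT bin
    """
    window = np.hanning(n_fft)
    out = np.zeros(len(music))
    norm = np.zeros(len(music))  # window-power compensation for overlap-add
    for start in range(0, len(music) - n_fft + 1, hop):
        frame = music[start:start + n_fft] * window
        spec = np.fft.rfft(frame)
        spec *= gains  # reshape the spectral envelope bin by bin
        frame_out = np.fft.irfft(spec, n_fft) * window
        out[start:start + n_fft] += frame_out
        norm[start:start + n_fft] += window ** 2
    norm[norm < 1e-8] = 1.0  # avoid division by zero at the edges
    return out / norm
```

In the paper's setting, `gains` would be produced per frame by the trained network from the music and noise inputs, so that bands where the ambient noise is poorly masked get boosted while the overall mix and listening level are preserved by the perceptual loss. With identity gains, the analysis/synthesis chain above reconstructs the interior of the signal exactly.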
