Spiking Structured State Space Model for Monaural Speech Enhancement

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

7 September 2023

Yu Du

Xu Liu

Yansong Chua

Abstract

Speech enhancement seeks to extract clean speech from noisy signals. Traditional deep learning methods face two challenges: using the information in long speech sequences efficiently and keeping computational costs low. To address both, we introduce the Spiking Structured State Space Model (Spiking-S4), which combines the energy efficiency of Spiking Neural Networks (SNNs) with the long-range sequence modeling capabilities of Structured State Space Models (S4). Evaluation on the DNS Challenge and VoiceBank+Demand datasets shows that Spiking-S4 rivals existing Artificial Neural Network (ANN) methods while using fewer computational resources, as evidenced by its lower parameter count and fewer floating-point operations (FLOPs).
