Auscultatory analysis using an electronic stethoscope has attracted increasing attention in the clinical diagnosis of respiratory diseases. Recently, neural networks have been applied to assist in respiratory sound classification with achievements. However, it remains challenging due to the scarcity of abnormal respiratory sound. In this paper, we propose a novel architecture, namely Waveform-Logmel audio neural networks (WLANN), which uses both waveform and log-mel spectrogram as the input features and uses Bidirectional Gated Recurrent Units (Bi-GRU) to context model the fused features. Experimental results of our WLANN applied to SPRSound respiratory dataset show that the proposed framework can effectively distinguish pathological respiratory sound classes, outperforming the previous studies, with 90.3% in sensitivity and 93.6% in total score. Our study demonstrates the high effectiveness of the WLANN in the diagnosis of respiratory diseases.
View on arXiv@article{xie2025_2504.17156, title={ Waveform-Logmel Audio Neural Networks for Respiratory Sound Classification }, author={ Jiadong Xie and Yunlian Zhou and Mingsheng Xu }, journal={arXiv preprint arXiv:2504.17156}, year={ 2025 } }