A dictionary learning and source recovery based approach to classify diverse audio sources

Abstract
A dictionary learning based audio source classification algorithm is proposed to classify a sample audio signal as one of a finite set of audio sources. The cosine similarity measure is used to select atoms during dictionary learning. Using three proposed objective measures, namely the signal-to-distortion ratio (SDR), the number of non-zero weights, and the sum of weights, a frame-wise source classification accuracy of 98.2% is obtained for twelve different sources. With a moving SDR accumulated over six successive frames, 100% accuracy is obtained for ten of the audio sources tested, while the other two sources require accumulation over 10 and 14 frames.
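
To illustrate how accumulating SDR over successive frames can stabilise a frame-wise decision, here is a minimal Python sketch. It is not the authors' implementation. It assumes each frame has already been reconstructed once per candidate source dictionary, and that the decision rule simply picks the source with the largest accumulated SDR. The abstract does not state the rule, and the function names `sdr_db` and `classify_by_moving_sdr` are hypothetical.

```python
import numpy as np


def sdr_db(x, x_hat, eps=1e-12):
    """Signal-to-distortion ratio (dB) between a frame and its reconstruction."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    signal_energy = np.sum(x ** 2)
    distortion_energy = np.sum((x - x_hat) ** 2)
    # eps guards against division by zero for a perfect reconstruction.
    return 10.0 * np.log10((signal_energy + eps) / (distortion_energy + eps))


def classify_by_moving_sdr(sdr, window=6):
    """Pick the source with the largest SDR summed over a sliding window.

    sdr: array of shape (n_sources, n_frames), one SDR value per candidate
         source dictionary and frame.
    Returns one label per window position (n_frames - window + 1 labels).
    Assumption: argmax over accumulated SDR is the decision rule.
    """
    sdr = np.asarray(sdr, dtype=float)
    kernel = np.ones(window)
    # A moving sum along the frame axis, computed separately for each source.
    accumulated = np.array(
        [np.convolve(row, kernel, mode="valid") for row in sdr]
    )
    return np.argmax(accumulated, axis=0)
```

In this sketch, a single frame whose SDR favours the wrong source can be outvoted by its neighbours once the window is long enough, which matches the abstract's observation that some sources need longer accumulation than others.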