75

TsetlinKWS: A 65nm 16.58uW, 0.63mm2 State-Driven Convolutional Tsetlin Machine-Based Accelerator For Keyword Spotting

Main:10 Pages
21 Figures
Bibliography:1 Pages
Appendix:1 Pages
Abstract

The Tsetlin Machine (TM) has recently attracted attention as a low-power alternative to neural networks due to its simple and interpretable inference mechanisms. However, its performance on speech-related tasks remains limited. This paper proposes TsetlinKWS, the first algorithm-hardware co-design framework for the Convolutional Tsetlin Machine (CTM) on the 12-keyword spotting task. Firstly, we introduce a novel Mel-Frequency Spectral Coefficient and Spectral Flux (MFSC-SF) feature extraction scheme together with spectral convolution, enabling the CTM to reach its first-ever competitive accuracy of 87.35% on the 12-keyword spotting task. Secondly, we develop an Optimized Grouped Block-Compressed Sparse Row (OG-BCSR) algorithm that achieves a remarkable 9.84×\times reduction in model size, significantly improving the storage efficiency on CTMs. Finally, we propose a state-driven architecture tailored for the CTM, which simultaneously exploits data reuse and sparsity to achieve high energy efficiency. The full system is evaluated in 65 nm process technology, consuming 16.58 μ\muW at 0.7 V with a compact 0.63 mm2^2 core area. TsetlinKWS requires only 907k logic operations per inference, representing a 10×\times reduction compared to the state-of-the-art KWS accelerators, positioning the CTM as a highly-efficient candidate for ultra-low-power speech applications.

View on arXiv
Comments on this paper