
SplashNet: Split-and-Share Encoders for Accurate and Efficient Typing with Surface Electromyography

Main: 8 pages | 6 figures | Bibliography: 3 pages | 11 tables | Appendix: 14 pages

Abstract

Surface electromyography (sEMG) at the wrists could enable natural, keyboard-free text entry, yet the state-of-the-art emg2qwerty baseline still misrecognizes 51.8% of characters in the zero-shot setting on unseen users and 7.0% after user-specific fine-tuning. We trace many of these errors to mismatched cross-user signal statistics, fragile reliance on high-order feature dependencies, and the absence of architectural inductive biases aligned with the bilateral nature of typing. To address these issues, we introduce three simple modifications: (i) Rolling Time Normalization, which adaptively aligns input distributions across users; (ii) Aggressive Channel Masking, which encourages reliance on low-order feature combinations more likely to generalize across users; and (iii) a Split-and-Share encoder that processes each hand independently with weight-shared streams to reflect the bilateral symmetry of the neuromuscular system. Combined with a five-fold reduction in spectral resolution (33 → 6 frequency bands), these components yield a compact Split-and-Share model, SplashNet-mini, which uses only 1/4 the parameters and 0.6× the FLOPs of the baseline while reducing character-error rate (CER) to 36.4% zero-shot and 5.9% after fine-tuning. An upscaled variant, SplashNet (1/2 the parameters, 1.15× the FLOPs of the baseline), further lowers error to 35.7% and 5.5%, representing relative improvements of 31% and 21% in the zero-shot and fine-tuned settings, respectively. SplashNet therefore establishes a new state of the art without requiring additional data.
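
The three modifications can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify window lengths, masking rates, or encoder layers, so the function names, the trailing-window statistics, the per-channel zeroing, and the single tanh layer below are all assumptions chosen only to show the general idea. The last helper reproduces the relative-improvement arithmetic from the reported CERs.

```python
import numpy as np

def rolling_time_normalize(x, window, eps=1e-6):
    """Standardize each channel with the mean/std of a trailing window.

    x has shape (time, channels). The window length is a hypothetical
    parameter; the abstract does not state how statistics are computed.
    """
    out = np.empty_like(x, dtype=float)
    for t in range(x.shape[0]):
        seg = x[max(0, t - window + 1): t + 1]  # samples up to and including t
        out[t] = (x[t] - seg.mean(axis=0)) / (seg.std(axis=0) + eps)
    return out

def mask_channels(x, drop_prob, rng):
    """Zero out whole channels at random (assumed training-time augmentation)."""
    keep = rng.random(x.shape[1]) >= drop_prob  # one decision per channel
    return x * keep

def split_and_share(left, right, weights):
    """Encode each hand separately with the SAME weights, then concatenate.

    A single tanh layer stands in for the real encoder, which is not
    described in the abstract.
    """
    def encode(hand):
        return np.tanh(hand @ weights)
    return np.concatenate([encode(left), encode(right)], axis=-1)

def relative_improvement(baseline_cer, new_cer):
    """Fractional CER reduction relative to the baseline."""
    return (baseline_cer - new_cer) / baseline_cer

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.normal(size=(200, 16))   # hypothetical: 200 frames, 16 channels
    right = rng.normal(size=(200, 16))
    weights = rng.normal(size=(16, 8))  # shared by both hands
    left = mask_channels(rolling_time_normalize(left, window=50), 0.5, rng)
    right = mask_channels(rolling_time_normalize(right, window=50), 0.5, rng)
    features = split_and_share(left, right, weights)
    print(features.shape)
    print(round(100 * relative_improvement(51.8, 35.7)))  # zero-shot
    print(round(100 * relative_improvement(7.0, 5.5)))    # fine-tuned
```

Sharing one weight matrix across the two streams means the encoder's parameter count does not grow with the number of hands, which is one plausible reading of how a split-and-share design can shrink the model.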
