Toward Interpretable Sleep Stage Classification Using Cross-Modal Transformers

Accurate sleep stage classification is important for sleep health assessment. In recent years, several machine-learning-based sleep staging algorithms have been developed, and deep-learning-based algorithms in particular have achieved performance on par with human annotation. Despite this improved performance, most deep-learning-based algorithms behave as black boxes, which has limited their use in clinical settings. Here, we propose a cross-modal transformer, a transformer-based method for sleep stage classification. It consists of a novel cross-modal transformer encoder architecture together with a multi-scale one-dimensional convolutional neural network for automatic representation learning. Our method outperforms state-of-the-art methods and eliminates the black-box behavior of deep-learning models by exploiting the interpretability of the attention modules. Furthermore, our method considerably reduces the number of parameters and the training time compared to state-of-the-art methods. Our code is available at this https URL. A demo of our work can be found at this https URL.
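
To illustrate why attention modules lend themselves to interpretation, the following is a minimal sketch of scaled dot-product cross-attention between two feature sequences. It is not the authors' architecture: it omits learned projections, multiple heads, and the multi-scale convolutional front end, and the names `cross_attention`, `modality_a`, and `modality_b`, as well as the toy feature values, are assumptions made for this example only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from one modality
    and keys/values from another. Returns (outputs, weights)."""
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query step to every key step.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Hypothetical toy features: 2 steps of one modality, 3 steps of another.
modality_a = [[1.0, 0.0], [0.0, 1.0]]
modality_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

fused, attn = cross_attention(modality_a, modality_b, modality_b)
for step, row in enumerate(attn):
    print(step, [round(w, 3) for w in row])
```

Each row of `attn` is a probability distribution over the steps of the second sequence, so it can be read directly as how strongly each step contributed to the fused output. This is the general property that attention-based interpretability relies on.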
@article{pradeepkumar2025_2208.06991,
  title={Toward Interpretable Sleep Stage Classification Using Cross-Modal Transformers},
  author={Jathurshan Pradeepkumar and Mithunjha Anandakumar and Vinith Kugathasan and Dhinesh Suntharalingham and Simon L. Kappel and Anjula C. De Silva and Chamira U. S. Edussooriya},
  journal={arXiv preprint arXiv:2208.06991},
  year={2025}
}