Beyond Random Missingness: Clinically Rethinking for Healthcare Time Series Imputation

This study investigates the impact of masking strategies on time series imputation models in healthcare settings. Current approaches predominantly rely on random masking for model evaluation, a practice that fails to capture the structured nature of missing patterns in clinical data. Using the PhysioNet Challenge 2012 dataset, we analyse how different masking implementations affect both imputation accuracy and downstream clinical predictions across eleven imputation methods. Our results demonstrate that masking choices significantly influence model performance, with recurrent architectures performing more consistently across strategies. Analysis of downstream mortality prediction reveals that higher imputation accuracy does not necessarily translate into better clinical prediction. Our findings emphasise the need for clinically informed masking strategies that better reflect real-world missing patterns in healthcare data, and suggest that current evaluation frameworks may need reconsideration before reliable clinical deployment.
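To make the contrast between random and structured masking concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 48-step series with a handful of features and compares independent per-cell masking with contiguous block masking, which is one simple stand-in for a structured pattern such as a sensor being disconnected for several hours. All function names and parameters (`random_point_mask`, `block_mask`, `longest_gap`, `masked_mae`, `block_len`, `n_blocks`) are hypothetical and chosen for illustration only.

```python
import random


def random_point_mask(n_steps, n_features, rate, seed=0):
    """Hide each cell independently with probability `rate` (True = hidden)."""
    rng = random.Random(seed)
    return [[rng.random() < rate for _ in range(n_features)]
            for _ in range(n_steps)]


def block_mask(n_steps, n_features, block_len, n_blocks, seed=0):
    """Hide `n_blocks` contiguous runs of `block_len` time steps.

    Each run is placed in one randomly chosen feature; runs may overlap.
    This is an illustrative structured pattern, not a clinical model.
    """
    rng = random.Random(seed)
    mask = [[False] * n_features for _ in range(n_steps)]
    for _ in range(n_blocks):
        feature = rng.randrange(n_features)
        start = rng.randrange(n_steps - block_len + 1)
        for t in range(start, start + block_len):
            mask[t][feature] = True
    return mask


def longest_gap(mask):
    """Longest run of consecutive hidden time steps within any one feature."""
    best = 0
    for f in range(len(mask[0])):
        run = 0
        for row in mask:
            run = run + 1 if row[f] else 0
            best = max(best, run)
    return best


def masked_mae(truth, imputed, mask):
    """Mean absolute error computed over hidden cells only."""
    errors = [abs(truth[t][f] - imputed[t][f])
              for t in range(len(mask))
              for f in range(len(mask[0]))
              if mask[t][f]]
    return sum(errors) / len(errors) if errors else float("nan")


if __name__ == "__main__":
    point = random_point_mask(n_steps=48, n_features=5, rate=0.1)
    block = block_mask(n_steps=48, n_features=5, block_len=6, n_blocks=4)
    print("longest gap, point masking:", longest_gap(point))
    print("longest gap, block masking:", longest_gap(block))
```

Two masks can hide a similar number of cells yet have very different gap lengths. An imputation model scored with `masked_mae` under point masking can usually lean on immediate neighbours in time, whereas block masking forces it to bridge longer gaps, which is one plausible reason the choice of mask could change model rankings.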
@article{qian2025_2405.17508,
  title={Beyond Random Missingness: Clinically Rethinking for Healthcare Time Series Imputation},
  author={Linglong Qian and Yiyuan Yang and Wenjie Du and Jun Wang and Richard Dobson and Zina Ibrahim},
  journal={arXiv preprint arXiv:2405.17508},
  year={2025}
}