Missing Data Imputation Based on Structural Equation Modeling Enhanced with Self-Attention

Ou Deng

Abstract

Addressing missing data in complex datasets such as Electronic Health Records (EHR) is critical for accurate analysis and decision-making in healthcare. This paper proposes Structural Equation Modeling (SEM) enhanced with Self-Attention (SESA), an approach to data imputation in EHR. SESA extends traditional SEM-based methods by incorporating self-attention mechanisms, which improve the model's adaptability and accuracy across diverse EHR datasets. This enhancement allows SESA to adjust and optimize the imputation process dynamically, overcoming the limitations of static SEM frameworks. Our experimental analyses demonstrate that SESA achieves robust predictive performance and handles missing data in EHR effectively. Moreover, SESA's architecture not only rectifies potential mis-specifications in SEM but also works together with causal discovery algorithms to refine its imputation logic based on the underlying data structure. These features highlight SESA's capabilities and its potential for broader application in EHR data analysis and beyond, marking a significant step forward in the field of data imputation.
