
Statistical Jump Model for Mixed-Type Data with Missing Data Imputation

Advances in Data Analysis and Classification (ADAC), 2024
Main: 22 pages, 5 figures, 10 tables; bibliography: 3 pages
Abstract

In this paper, we address the challenge of clustering mixed-type data with temporal evolution by introducing the statistical jump model for mixed-type data. This novel framework incorporates regime persistence, enhancing interpretability and reducing the frequency of state switches, and efficiently handles missing data. The model is easily interpretable through its state-conditional means and modes, making it accessible to practitioners and policymakers. We validate our approach through extensive simulation studies and an empirical application to air quality data, demonstrating its superiority in inferring persistent air quality regimes compared to the traditional air quality index. Our contributions include a robust method for mixed-type temporal clustering, effective missing data management, and practical insights for environmental monitoring.
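To make the role of regime persistence concrete, the following is a minimal sketch of the state-assignment step of a generic jump model, not the authors' implementation. It assumes fixed state prototypes (one mean and one mode per state), a simple mixed-type loss (absolute difference for the continuous feature, mismatch indicator for the categorical one), and a hypothetical rule that missing entries (`None`) are skipped in the loss. The names `mixed_loss`, `fit_states`, and `jump_penalty` are illustrative only.

```python
from typing import List, Optional, Sequence, Tuple

# One observation: (continuous value, categorical label); None marks a missing entry.
Row = Tuple[Optional[float], Optional[str]]
# One state prototype: (state-conditional mean, state-conditional mode).
Proto = Tuple[float, str]


def mixed_loss(row: Row, proto: Proto) -> float:
    """Dissimilarity between an observation and a prototype (assumed form).

    Missing entries contribute nothing to the loss.
    """
    x, c = row
    mean, mode = proto
    loss = 0.0
    if x is not None:
        loss += abs(x - mean)
    if c is not None:
        loss += 0.0 if c == mode else 1.0
    return loss


def fit_states(rows: Sequence[Row], protos: Sequence[Proto],
               jump_penalty: float) -> List[int]:
    """Find the state sequence minimizing

        sum_t loss(row_t, proto_{s_t}) + jump_penalty * [s_t != s_{t-1}]

    by dynamic programming over time.
    """
    K = len(protos)
    cost = [mixed_loss(rows[0], p) for p in protos]
    back: List[List[int]] = []
    for row in rows[1:]:
        new_cost, pointers = [], []
        for k in range(K):
            # Cheapest previous state, charging the penalty for a switch.
            best_j = min(
                range(K),
                key=lambda j: cost[j] + (jump_penalty if j != k else 0.0),
            )
            switch = jump_penalty if best_j != k else 0.0
            pointers.append(best_j)
            new_cost.append(cost[best_j] + switch + mixed_loss(row, protos[k]))
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final state.
    state = min(range(K), key=lambda k: cost[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy data: two states, one outlying reading at t=2, and two missing entries.
protos = [(0.0, "low"), (1.0, "high")]
rows = [(0.1, "low"), (0.2, "low"), (0.9, None), (0.1, "low"),
        (None, "high"), (0.9, "high"), (1.0, None)]

greedy = fit_states(rows, protos, jump_penalty=0.0)      # [0, 0, 1, 0, 1, 1, 1]
persistent = fit_states(rows, protos, jump_penalty=1.0)  # [0, 0, 0, 0, 1, 1, 1]
```

With no penalty, each time point is assigned to its nearest prototype and the isolated reading at t=2 triggers two spurious switches. With a positive penalty, the sequence keeps a single switch. A full jump model would alternate this step with re-estimating the prototypes, and how it treats missing values may differ from the skipping rule assumed here.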
