Interactive Diabetes Risk Prediction Using Explainable Machine Learning: A Dash-Based Approach with SHAP, LIME, and Comorbidity Insights

Abstract
This study presents a web-based, interactive tool that predicts diabetes risk with machine learning models. Using the 2015 CDC BRFSS dataset, the study evaluates models including Logistic Regression, Random Forest, XGBoost, LightGBM, K-Nearest Neighbors (KNN), and Neural Networks, each trained under three sampling strategies: the original class distribution, SMOTE oversampling, and undersampling. LightGBM with undersampling achieved the best recall, which makes it well suited to risk detection, where missing a true case is the costlier error. The tool integrates SHAP and LIME to explain individual predictions and uses Pearson correlation analysis to highlight relationships among comorbidities. A Dash-based UI lets users interact with model predictions, personalized suggestions, and feature insights, supporting data-driven health awareness.
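As a rough illustration of three ideas named in the abstract (undersampling, recall, and Pearson correlation), the following is a minimal sketch in plain Python. It is not the author's implementation: the function names `undersample`, `recall`, and `pearson` are hypothetical, the data is a toy example, and a real pipeline would more likely rely on libraries such as imbalanced-learn, scikit-learn, and pandas.

```python
import math
import random


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def recall(y_true, y_pred):
    """Fraction of actual positives that the model flagged as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


if __name__ == "__main__":
    # Toy data only: 2 positive and 8 negative cases.
    rows = [[i] for i in range(10)]
    labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    balanced_rows, balanced_labels = undersample(rows, labels)
    print(len(balanced_labels), sum(balanced_labels))
    print(recall([1, 1, 0, 1], [1, 0, 0, 1]))
    print(pearson([1, 2, 3], [2, 4, 6]))
```

Recall is the natural metric here because it counts how many true cases are caught, and undersampling is one way to keep a classifier from being dominated by the majority (non-diabetic) class.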
@article{allani2025_2505.05683,
  title={Interactive Diabetes Risk Prediction Using Explainable Machine Learning: A Dash-Based Approach with SHAP, LIME, and Comorbidity Insights},
  author={Udaya Allani},
  journal={arXiv preprint arXiv:2505.05683},
  year={2025}
}