A Comprehensive Machine Learning Framework for Heart Disease Prediction: Performance Evaluation and Future Perspectives

This study presents a machine learning-based framework for heart disease prediction using the heart-disease dataset, which comprises 303 samples with 14 features. The methodology involves data preprocessing, model training, and evaluation of three classifiers: Logistic Regression, K-Nearest Neighbors (KNN), and Random Forest. Hyperparameter tuning with GridSearchCV and RandomizedSearchCV was employed to improve model performance. The Random Forest classifier outperformed the other two models, achieving an accuracy of 91% and an F1-score of 0.89. Further evaluation using precision, recall, and the confusion matrix showed balanced performance across classes. The proposed model shows strong potential for aiding clinical decision-making by predicting heart disease effectively. Limitations such as the small dataset size and limited generalizability underscore the need for future studies using larger and more diverse datasets. This work highlights the utility of machine learning in healthcare and offers insights for further advances in predictive diagnostics.
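For readers less familiar with the metrics named above, the following minimal sketch shows how accuracy, precision, recall, and F1-score follow from a binary confusion matrix. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the confusion matrix or the results reported in this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive standard binary-classification metrics from confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    total = tp + fp + fn + tn
    # Accuracy: share of all predictions that were correct.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that were truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1-score: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = classification_metrics(tp=40, fp=5, fn=5, tn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it can sit below accuracy when the positive class is harder to predict than the negative class, which is one reason to report both figures rather than accuracy alone.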
@article{lamir2025_2505.09969,
  title={A Comprehensive Machine Learning Framework for Heart Disease Prediction: Performance Evaluation and Future Perspectives},
  author={Ali Azimi Lamir and Shiva Razzagzadeh and Zeynab Rezaei},
  journal={arXiv preprint arXiv:2505.09969},
  year={2025}
}