An Insight Into Viable Machine Learning Models for Early Diagnosis of Cardiovascular Disease

Authors

  • Mukkoti Maruthi Venkata Chalapathi, School of Computer Science and Engineering, VIT-AP University, Amaravati 522237, Andhra Pradesh, India
  • Dudekula Khasim Vali, School of Computer Science and Engineering, VIT-AP University, Amaravati 522237, Andhra Pradesh, India
  • Yellapragada Venkata Pavan Kumar, School of Electronics Engineering, VIT-AP University, Amaravati 522237, Andhra Pradesh, India
  • Challa Pradeep Reddy, School of Computer Science and Engineering, VIT-AP University, Amaravati 522237, Andhra Pradesh, India
  • Purna Prakash Kasaraneni, School of Computer Science and Engineering, VIT-AP University, Amaravati 522237, Andhra Pradesh, India

DOI:

https://doi.org/10.12694/scpe.v25i1.2326

Keywords:

Average Classification Accuracy, Cardiovascular Disease, Computer Aided Diagnostics, Machine Learning Algorithms, Test Data Split

Abstract

Cardiovascular diseases (CVD) are a leading cause of death across the globe, and many of these deaths occur in low- to middle-income nations. CVD prevention is therefore a pressing issue that has already been the subject of extensive research. Innovative methodologies in machine learning (ML) can have a greater impact on the diagnosis of CVD, yet research in this area remains challenging and continues to attract attention. In this paper, we compare four distinct machine learning models, namely support vector machine (SVM), logistic regression, decision trees (DT), and artificial neural networks (ANN), in terms of their classification accuracy and their practicality for CVD classification. Techniques such as ensemble learning and other model-specific optimizations are not part of this study; instead, basic implementations of the models were used. To implement the abovementioned ML models, a subset of 14 features from the original heart disease dataset is considered; these features are deemed relevant for classification, and none of them has missing values.

The results show no clear winner among the models: there is no significant difference in their average accuracy. The highest average hit rates are observed for SVM and ANN, with ANN slightly lower than SVM. Although the DT had lower accuracy, the fully trained model can be easily visualized and interpreted by humans. Hence, the DT is possibly the most practical model to use as a complement to doctors' current methods of diagnosis.
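
The evaluation idea named in the keywords, average classification accuracy over repeated test data splits, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the split fraction and number of repeats are arbitrary, and a depth-1 decision tree (a "stump") stands in for the full models compared in the paper. All function names (make_synthetic, split, fit_stump, predict_stump, average_accuracy) are hypothetical and chosen for illustration only.

```python
import random
from statistics import mean


def make_synthetic(n_rows, n_features=3, seed=0):
    """Synthetic stand-in data (an assumption, not the heart disease dataset).
    The label depends only on feature 0; the other features are noise."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_features)] for _ in range(n_rows)]
    y = [1 if row[0] > 0.5 else 0 for row in X]
    return X, y


def split(X, y, test_fraction, rng):
    """Shuffle the row indices and hold out a fraction of rows for testing."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = min(len(X) - 1, max(1, round(len(X) * test_fraction)))
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def fit_stump(X, y):
    """Depth-1 decision tree: choose the (feature, threshold, label) rule
    with the most correct predictions on the training rows."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for hi_label in (0, 1):
                correct = sum(
                    (hi_label if row[f] > t else 1 - hi_label) == label
                    for row, label in zip(X, y)
                )
                if best is None or correct > best[0]:
                    best = (correct, f, t, hi_label)
    _, f, t, hi_label = best
    return f, t, hi_label


def predict_stump(model, row):
    """Apply the learned rule: rows above the threshold get hi_label."""
    f, t, hi_label = model
    return hi_label if row[f] > t else 1 - hi_label


def average_accuracy(X, y, test_fraction, repeats, seed=0):
    """Mean test accuracy (hit rate) over several random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        X_tr, y_tr, X_te, y_te = split(X, y, test_fraction, rng)
        model = fit_stump(X_tr, y_tr)
        hits = sum(predict_stump(model, r) == l for r, l in zip(X_te, y_te))
        scores.append(hits / len(y_te))
    return mean(scores)


X, y = make_synthetic(200, seed=1)
print(average_accuracy(X, y, test_fraction=0.3, repeats=10))
```

Averaging over several random splits reduces the dependence of the reported accuracy on any single partition of the data. The learned stump is a single human-readable rule (feature index, threshold, predicted label), which hints at why a decision tree is easy to inspect.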

 

Published

2024-01-04

Section

Special Issue - Scalable Machine Learning for Health Care: Innovations and Applications