Comparative Evaluation of Feature Selection Methods for Heart Disease Classification with Support Vector Machine

Bidul, Winarsi J. and Surono, Sugiyarto and Kurniawan, Tri Basuki (2024) Comparative Evaluation of Feature Selection Methods for Heart Disease Classification with Support Vector Machine. Jurnal Ilmiah Teknik Elektro Komputer dan Informatika (JITEKI), 10 (4). pp. 265-278.

Abstract

This study compares the effectiveness of several feature selection techniques for improving the performance of Support Vector Machine (SVM) models in classifying heart disease data, particularly in the context of big data. The main challenge lies in managing large datasets, which calls for feature selection to streamline the analysis. Four methods were therefore explored to identify the most efficient approach: Logistic Regression-Recursive Feature Elimination (LR-RFE), Logistic Regression-Sequential Forward Selection (LR-SFS), Correlation-based Feature Selection (CFS), and Variance Threshold. Existing research indicates that these methods can substantially improve classification accuracy. In this study, combining the SVM model with LR-RFE, LR-SFS, and Variance Threshold produced the best evaluation results, achieving the highest accuracy of 89%. On the other evaluation metrics (precision, recall, and F1-score), model performance varied with the feature selection method chosen and the distribution of data used for training and testing. In general, however, LR-RFE-SVM and Variance Threshold-SVM tended to yield better evaluation values than LR-SFS-SVM and CFS-SVM. In terms of computation time, SVM classification with Variance Threshold feature selection was the fastest at 118.1540 seconds, retaining 23 important features. Choosing a suitable feature selection technique is therefore important, and the choice should take into account both the number of retained features and the computation time. This research underscores the significance of feature selection in addressing big data challenges, particularly in heart disease classification.
In addition, the study highlights practical implications for healthcare practitioners and researchers by recommending methods that can be integrated into real-world healthcare settings or existing clinical decision support systems.
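
To make the simplest of the four methods concrete, the following is a minimal sketch of variance-threshold feature selection using only the Python standard library. It is an illustration, not the authors' implementation: the abstract does not state which software, threshold, or dataset layout was used, so the helper names (`variance_threshold`, `select_columns`), the threshold of 0.0, and the toy values below are all assumptions. The idea is to drop every feature whose variance across samples does not exceed a chosen threshold, on the reasoning that a nearly constant feature cannot help a classifier separate the classes.

```python
from statistics import pvariance


def variance_threshold(rows, threshold=0.0):
    """Return the indices of columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def select_columns(rows, keep):
    """Keep only the columns listed in `keep`, preserving row order."""
    return [[row[i] for i in keep] for row in rows]


# Toy data (hypothetical, NOT from the study): 4 samples, 3 features.
# The middle feature is constant, so its variance is zero.
X = [
    [63.0, 1.0, 145.0],
    [37.0, 1.0, 130.0],
    [41.0, 1.0, 130.0],
    [56.0, 1.0, 120.0],
]

kept = variance_threshold(X, threshold=0.0)  # assumed threshold
X_reduced = select_columns(X, kept)

print("kept feature indices:", kept)
print("first reduced row:", X_reduced[0])
```

In this sketch the reduced matrix would then be passed to the classifier in place of the full one. Because the filter looks only at each feature's variance and never trains a model, it is cheap to compute, which fits the abstract's observation that Variance Threshold had the shortest computation time.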

Item Type: General Article
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering
Division / Study Program: Faculty of Industrial Technology (Fakultas Teknologi Industri) > S1-Electrical Engineering (S1-Teknik Elektro)
Depositing User: M.Eng. Alfian Ma'arif
Date Deposited: 24 Aug 2024 02:21
Last Modified: 24 Aug 2024 02:21
URI: http://eprints.uad.ac.id/id/eprint/68623