Arifah, Dina and Hamonangan Saragih, Triando and Kartini, Dwi and Muliadi, Muliadi and Mazdadi, Muhammad Itqan (2023) Application of SMOTE to Handle Imbalance Class in Deposit Classification Using the Extreme Gradient Boosting Algorithm. Jurnal Ilmiah Teknik Elektro Komputer dan Informatika (JITEKI), 9 (2). pp. 396-410.
Abstract
Deposits are one of the main products and funding sources for banks, so increasing deposit marketing is very important. However, telemarketing as a form of deposit marketing is less effective and efficient when it requires calling every customer with a deposit offer. Identifying potential deposit customers is therefore necessary so that telemarketing can target the right customers, improving bank marketing performance with the ultimate goal of increasing sources of funding for banks. To identify these customers, data mining is applied to the UCI Bank Marketing Dataset from a Portuguese banking institution, which consists of 45,211 records with 17 attributes. The classification algorithm used is Extreme Gradient Boosting (XGBoost), which is suitable for large datasets. The data is highly imbalanced, with the "yes" and "no" classes making up 11.7% and 88.3% of the records, respectively. To address the class imbalance in the Bank Marketing dataset, this research therefore combines the Synthetic Minority Over-sampling Technique (SMOTE) with XGBoost. XGBoost alone achieved an accuracy of 0.91016, precision of 0.79476, recall of 0.72928, F1-Score of 0.56198, ROC Area of 0.93831, and AUCPR of 0.63886. After SMOTE was applied, the accuracy was 0.91072, the precision was 0.78883, the recall was 0.75588, the F1-Score was 0.59153, the ROC Area was 0.93723, and the AUCPR was 0.63733. The results showed that XGBoost with SMOTE could outperform other algorithms such as K-Nearest Neighbor, Random Forest, Logistic Regression, Artificial Neural Network, Naïve Bayes, and Support Vector Machine in terms of accuracy. This study contributes to the development of effective machine learning models that can serve as a support system for information technology experts in the finance and banking industries, helping them identify potential customers interested in subscribing to deposits and thereby increase bank funding sources.
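The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority point is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy data are hypothetical and chosen only for illustration; in practice one would more likely use a maintained library such as imbalanced-learn together with XGBoost, and resample only the training split.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data (hypothetical, NOT the Bank Marketing dataset): 88 majority
# points and 6 minority points with two numeric features each.
majority = [(float(i), float(i % 7)) for i in range(88)]
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8), (1.2, 1.9), (1.8, 1.6), (2.2, 1.4)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class and are not exact duplicates of existing records.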
Item Type: | General Article |
---|---|
Subjects: | T Technology > TK Electrical engineering. Electronics Nuclear engineering |
Division / Study Program: | Faculty of Industrial Technology (Fakultas Teknologi Industri) > S1-Electrical Engineering (S1-Teknik Elektro) |
Depositing User: | M.Eng. Alfian Ma'arif |
Date Deposited: | 13 Jun 2023 01:24 |
Last Modified: | 13 Jun 2023 01:24 |
URI: | http://eprints.uad.ac.id/id/eprint/43356 |