In recent years, credit scoring has become a challenging problem for financial institutions. Many researchers have applied machine learning to credit scoring, and results show that machine learning algorithms perform well in this domain. Decision trees have been used for data sets with high dimensionality and complex correlations, and the benefits of feature combination and feature selection have encouraged their use in classification. However, decision trees are prone to overfitting, which motivated extreme gradient boosting (XGBoost), an approach that overcomes this shortcoming by integrating tree models. Employing an optimization method helps in tuning the hyperparameters of the model. In this paper, a modified XGBoost model is developed that incorporates an inflation parameter. In addition to the proposed model, the study uses adaptive particle swarm optimization, which is less prone to falling into local optima: its swarm-split algorithm uses clustering and two learning strategies to promote subswarm diversity and avoid local optima. The modified XGBoost model was compared with five benchmark machine learning algorithms: the standard XGBoost model, logistic regression, k-nearest neighbors (KNN), support vector machine, and decision tree. The study used one credit scoring data set, and the evaluation measures were accuracy, precision, recall, and F1-score. Results demonstrate that the proposed model outperforms the other models.
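To illustrate the tuning idea described in the abstract, the following is a minimal particle swarm optimization sketch in the style of Kennedy and Eberhart's original algorithm. It is not the paper's adaptive, swarm-splitting variant: the objective below is a hypothetical smooth surrogate standing in for cross-validated model loss over two assumed hyperparameters (e.g. learning rate and tree depth), and all coefficient values are illustrative defaults.

```python
import numpy as np

# Hypothetical surrogate for cross-validated loss over two hyperparameters;
# in the paper's setting this would be the (modified) XGBoost validation
# error. Minimum placed at (0.1, 6.0) purely for illustration.
def objective(x):
    return (x[0] - 0.1) ** 2 + 0.01 * (x[1] - 6.0) ** 2

rng = np.random.default_rng(0)
n_particles, dim, iters = 20, 2, 100
lo, hi = np.array([0.01, 2.0]), np.array([0.5, 12.0])  # search bounds

pos = rng.uniform(lo, hi, size=(n_particles, dim))  # particle positions
vel = np.zeros_like(pos)                            # particle velocities
pbest = pos.copy()                                  # per-particle best
pbest_val = np.array([objective(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()            # swarm-wide best

w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients
for _ in range(iters):
    r1 = rng.random((n_particles, dim))
    r2 = rng.random((n_particles, dim))
    # Velocity update: inertia + pull toward personal and global bests.
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([objective(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved] = pos[improved]
    pbest_val[improved] = vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print(gbest)  # converges toward the surrogate minimum near [0.1, 6.0]
```

The adaptive variant used in the paper additionally clusters the swarm into subswarms with distinct learning strategies to preserve diversity; this sketch shows only the basic update that such variants build on.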
Published in | Machine Learning Research (Volume 9, Issue 2) |
DOI | 10.11648/j.mlr.20240902.15 |
Page(s) | 64-74 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2024. Published by Science Publishing Group |
Keywords | Modified XGBoost, Credit Scoring, Optimization |
APA Style
Langat, K. K., Waititu, A. G., Ngare, P. O. (2024). Modified XGBoost Hyper-Parameter Tuning Using Adaptive Particle Swarm Optimization for Credit Score Classification. Machine Learning Research, 9(2), 64-74. https://doi.org/10.11648/j.mlr.20240902.15
ACS Style
Langat, K. K.; Waititu, A. G.; Ngare, P. O. Modified XGBoost Hyper-Parameter Tuning Using Adaptive Particle Swarm Optimization for Credit Score Classification. Mach. Learn. Res. 2024, 9(2), 64-74. doi: 10.11648/j.mlr.20240902.15
AMA Style
Langat KK, Waititu AG, Ngare PO. Modified XGBoost Hyper-Parameter Tuning Using Adaptive Particle Swarm Optimization for Credit Score Classification. Mach Learn Res. 2024;9(2):64-74. doi: 10.11648/j.mlr.20240902.15
@article{10.11648/j.mlr.20240902.15, author = {Kenneth Kiprotich Langat and Anthony Gichuhi Waititu and Philip Odhiambo Ngare}, title = {Modified XGBoost Hyper-Parameter Tuning Using Adaptive Particle Swarm Optimization for Credit Score Classification}, journal = {Machine Learning Research}, volume = {9}, number = {2}, pages = {64-74}, doi = {10.11648/j.mlr.20240902.15}, url = {https://doi.org/10.11648/j.mlr.20240902.15}, eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.mlr.20240902.15}, abstract = {In recent years credit scoring has become a challenging issue among financial institutions. Several researchers have dedicated efforts in machine learning in the areas of credit scoring and results have shown that machine learning algorithms have had a satisfactory performance in the sector of credit scoring. Decision trees have been used for data sets that have high dimension and have a complex correlation and the benefits of feature combination and feature selection has led to the usage of decision trees in classification. The disadvantage of decision tree which is overfitting has led to the introduction of extreme gradient boosting that overcomes the shortcoming by integrating tree models. Employing optimization method helps in tuning the hyperparameters of the model. In this paper, a modified XGBoost model is developed that incorporates inflation parameter. In addition to the proposed model, the study uses adaptive particle swarm optimization since it does not fall into local optima. The swarm split algorithm uses clustering and two learning strategies to promote subswarm diversity and avoid local optimums. In this study the modified XGBoost model was compared to five traditional machine learning algorithms namely, the standard XGBoost model, logistic regression, KNN, support vector machine and decision tree. The study used one data set in credit scoring and the evaluation measures used were accuracy, precision, recall and F1-score. Results demonstrate that the proposed model outperforms other models.}, year = {2024} }
TY - JOUR T1 - Modified XGBoost Hyper-Parameter Tuning Using Adaptive Particle Swarm Optimization for Credit Score Classification AU - Kenneth Kiprotich Langat AU - Anthony Gichuhi Waititu AU - Philip Odhiambo Ngare Y1 - 2024/10/31 PY - 2024 N1 - https://doi.org/10.11648/j.mlr.20240902.15 DO - 10.11648/j.mlr.20240902.15 T2 - Machine Learning Research JF - Machine Learning Research JO - Machine Learning Research SP - 64 EP - 74 PB - Science Publishing Group SN - 2637-5680 UR - https://doi.org/10.11648/j.mlr.20240902.15 AB - In recent years credit scoring has become a challenging issue among financial institutions. Several researchers have dedicated efforts in machine learning in the areas of credit scoring and results have shown that machine learning algorithms have had a satisfactory performance in the sector of credit scoring. Decision trees have been used for data sets that have high dimension and have a complex correlation and the benefits of feature combination and feature selection has led to the usage of decision trees in classification. The disadvantage of decision tree which is overfitting has led to the introduction of extreme gradient boosting that overcomes the shortcoming by integrating tree models. Employing optimization method helps in tuning the hyperparameters of the model. In this paper, a modified XGBoost model is developed that incorporates inflation parameter. In addition to the proposed model, the study uses adaptive particle swarm optimization since it does not fall into local optima. The swarm split algorithm uses clustering and two learning strategies to promote subswarm diversity and avoid local optimums. In this study the modified XGBoost model was compared to five traditional machine learning algorithms namely, the standard XGBoost model, logistic regression, KNN, support vector machine and decision tree. The study used one data set in credit scoring and the evaluation measures used were accuracy, precision, recall and F1-score. Results demonstrate that the proposed model outperforms other models. VL - 9 IS - 2 ER -