Bank Customer Churn Prediction Based on Machine Learning

Deng Yingfan (邓颖凡)

Subject classification: Public Finance and Finance; Computing Technology and Computer Technology

Recommended citation: Deng Yingfan (邓颖凡). Bank Customer Churn Prediction Based on Machine Learning [EB/OL]. (2020-04-23) [2025-11-05]. http://www.paper.edu.cn/releasepaper/content/202004-240.

Abstract: Extracting valuable knowledge from large, hard-to-read data sets is undoubtedly the greatest strength of computer technology in applied fields. Understanding customer loyalty is an important part of any business, and predicting likely churners in advance lets a company start early intervention to prevent and limit churn. In areas such as churn prediction, fraud identification, disease detection, and accident monitoring, imbalanced data are common, and most machine learning algorithms cannot learn correctly from them. This paper proposes G_R_L_D, a heterogeneous ensemble algorithm based on Stacking, and applies it to a customer churn prediction model built on imbalanced data. The model reaches 87% accuracy and 79% precision, outperforming its base classifiers and most other machine learning algorithms, which helps counter data imbalance. In addition, SMOTE-ENN sampling effectively handles the imbalance in the bank customer churn data: it raises recall while maintaining high precision, and further improves the accuracy of G_R_L_D and its base classifiers by more than 3%.

Keywords: Computer Application Technology; Machine Learning; Churn Prediction; Imbalanced Data
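The abstract reports accuracy, precision, and recall side by side. The sketch below shows, under stated assumptions, why accuracy alone can mislead on imbalanced churn data. The class `ConfusionCounts` and all counts are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with churn as the positive class."""
    tp: int  # churners correctly flagged
    fp: int  # loyal customers wrongly flagged
    fn: int  # churners missed
    tn: int  # loyal customers correctly passed

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total

    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self) -> float:
        actual = self.tp + self.fn
        return self.tp / actual if actual else 0.0


# Hypothetical test set: 1000 customers, 200 of whom churn (20% positives).
# A trivial model that predicts "stays" for everyone:
always_stay = ConfusionCounts(tp=0, fp=0, fn=200, tn=800)
# A hypothetical classifier that actually finds some churners:
some_model = ConfusionCounts(tp=120, fp=30, fn=80, tn=770)

for name, c in [("always_stay", always_stay), ("some_model", some_model)]:
    print(f"{name}: accuracy={c.accuracy():.2f} "
          f"precision={c.precision():.2f} recall={c.recall():.2f}")
```

With these made-up counts the trivial model still scores 0.80 accuracy while finding no churners at all, which is why recall and precision are worth tracking alongside accuracy when the classes are imbalanced.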
