National Preprint Platform

Comparative Evaluation and Comprehensive Analysis of Machine Learning Models for Regression Problems

Abstract

Artificial intelligence and machine learning applications are of significant importance in almost every field of human life, whether to solve problems or to support human experts. However, determining which machine learning model will achieve superior results for a particular problem across the wide range of real-life application areas remains a challenging task for researchers. The success of a model can be affected by several factors, such as dataset characteristics, training strategy, and model responses. Therefore, a comprehensive analysis is required to determine model ability and the efficiency of the considered strategies. This study implemented ten benchmark machine learning models on seventeen varied datasets. Experiments were performed using four different training strategies: 60:40, 70:30, and 80:20 hold-out splits and five-fold cross-validation. We used three evaluation metrics to evaluate the experimental results: mean squared error, mean absolute error, and the coefficient of determination (R² score). The considered models are analyzed, and each model's advantages, disadvantages, and data dependencies are indicated. Across the large number of experiments performed, the deep Long Short-Term Memory (LSTM) neural network outperformed the other considered models, namely decision tree, linear regression, support vector regression with linear and radial basis function kernels, random forest, gradient boosting, extreme gradient boosting, shallow neural network, and deep neural network. It has also been shown that cross-validation has a tremendous impact on the results of the experiments and should be considered for model evaluation in regression studies where data mining or selection is not performed.
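The evaluation protocol named in the abstract (three hold-out ratios, five-fold cross-validation, and the three metrics) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic dataset, the one-feature least-squares line used as a stand-in model, and all function names (`hold_out`, `k_fold`, `evaluate`, `fit_line`) are assumptions made purely for illustration, and none of the ten models from the study is reproduced here.

```python
import random
from statistics import mean

def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (hypothetical stand-in model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar

def evaluate(train, test):
    """Fit on the training rows, then score on the held-out rows."""
    a, b = fit_line([x for x, _ in train], [y for _, y in train])
    y_true = [y for _, y in test]
    y_pred = [a * x + b for x, _ in test]
    return {"mse": mse(y_true, y_pred),
            "mae": mae(y_true, y_pred),
            "r2": r2(y_true, y_pred)}

def hold_out(data, train_fraction, rng):
    """Single shuffled split, e.g. train_fraction=0.6 for a 60:40 split."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return evaluate(shuffled[:cut], shuffled[cut:])

def k_fold(data, k, rng):
    """k-fold cross-validation: each fold is the test set once; scores are averaged."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(evaluate(train, folds[i]))
    return {name: mean(s[name] for s in scores) for name in scores[0]}

# Synthetic data (assumption): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
data = [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.5))
        for x in (i / 10 for i in range(200))]

results = {
    "hold-out 60:40": hold_out(data, 0.6, rng),
    "hold-out 70:30": hold_out(data, 0.7, rng),
    "hold-out 80:20": hold_out(data, 0.8, rng),
    "5-fold CV": k_fold(data, 5, rng),
}
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

A hold-out score depends on a single random split, whereas the cross-validated score averages over five disjoint test folds, which is one way to see why the choice of training strategy can change the reported results.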

Kamil Dimililer, Fadi Al-Turjman, Boran Sekeroglu, Yoney Kirsal Ever

10.12074/202211.00424V1

Computing technology, computer technology

Machine learning; Regression; Comparative evaluation; Analysis; Validation

Kamil Dimililer, Fadi Al-Turjman, Boran Sekeroglu, Yoney Kirsal Ever. Comparative Evaluation and Comprehensive Analysis of Machine Learning Models for Regression Problems [EB/OL]. (2022-11-28) [2025-08-03]. https://chinaxiv.org/abs/202211.00424.
