
Test Set Sizing for the Ridge Regression

Source: arXiv
Abstract

We derive the ideal train/test split for ridge regression to high accuracy in the limit where the number of training rows, m, becomes large. The split must depend on the ridge tuning parameter, alpha, but we find that the dependence is weak and can be ignored asymptotically: every parameter drops out except m and the number of features, n. This is the first time such a split has been calculated mathematically for a machine learning model in the large-data limit. The goal of the calculation is to maximize "integrity," so that the measured error of the trained model is as close as possible to what it theoretically should be. The resulting ridge regression split agrees with prior art for the plain vanilla linear regression split to the first two terms asymptotically, and in practice there appears to be no difference.
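
As a rough illustration of the quantity the abstract calls "integrity," the sketch below fits a closed-form ridge regression on part of a synthetic data set and compares the error measured on the held-out rows with the error the fitted model should have in expectation. Everything here is an assumption made for illustration: the helper names `ridge_fit` and `split_and_score`, the synthetic Gaussian data, the value alpha = 1.0, and the candidate test sizes. It does not reproduce the paper's derivation or its split formula, which this abstract does not state.

```python
import numpy as np


def ridge_fit(X, y, alpha):
    """Closed-form ridge solution w = (X^T X + alpha*I)^{-1} X^T y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n), X.T @ y)


def split_and_score(X, y, test_rows, alpha):
    """Hold out the last `test_rows` rows, fit on the rest, return (w, test MSE)."""
    X_train, y_train = X[:-test_rows], y[:-test_rows]
    X_test, y_test = X[-test_rows:], y[-test_rows:]
    w = ridge_fit(X_train, y_train, alpha)
    measured = float(np.mean((X_test @ w - y_test) ** 2))
    return w, measured


# Synthetic data (an assumption for illustration, not the paper's setup).
rng = np.random.default_rng(0)
total_rows, n, noise_sd = 2000, 5, 0.5
w_true = rng.normal(size=n)
X = rng.normal(size=(total_rows, n))
y = X @ w_true + rng.normal(scale=noise_sd, size=total_rows)

for test_rows in (20, 200, 1000):
    w, measured = split_and_score(X, y, test_rows, alpha=1.0)
    # With standard-normal features and independent noise, the expected
    # squared error of a fixed w is ||w - w_true||^2 + noise_sd^2.
    expected = float(np.sum((w - w_true) ** 2) + noise_sd ** 2)
    print(f"test_rows={test_rows:5d}  measured={measured:.4f}  "
          f"expected={expected:.4f}  gap={abs(measured - expected):.4f}")
```

A small test set tends to give a noisy measurement, while a large one leaves fewer rows for training; the gap column is one informal way to see that trade-off on a single random draw.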

Subject: Computing technology, computer technology

.Test Set Sizing for the Ridge Regression[EB/OL].(2025-04-27)[2025-05-15].https://arxiv.org/abs/2504.19231.
