首页|When do Random Forests work?

When do Random Forests work?

来源：

英文摘要

We study the effectiveness of randomizing split-directions in random forests. Prior literature has shown that, on the one hand, randomization can reduce variance through decorrelation, and, on the other hand, randomization regularizes and works in low signal-to-noise ratio (SNR) environments. First, we bring together and revisit decorrelation and regularization by presenting a systematic analysis of out-of-sample mean-squared error (MSE) for different SNR scenarios based on commonly-used data-generating processes. We find that variance reduction tends to increase with the SNR and forests outperform bagging when the SNR is low because, in low SNR cases, variance dominates bias for both methods. Second, we show that the effectiveness of randomization is a question that goes beyond the SNR. We present a simulation study with fixed and moderate SNR, in which we examine the effectiveness of randomization for other data characteristics. In particular, we find that (i) randomization can increase bias in the presence of fat tails in the distribution of covariates; (ii) in the presence of irrelevant covariates randomization is ineffective because bias dominates variance; and (iii) when covariates are mutually correlated randomization tends to be effective because variance dominates bias. Beyond randomization, we find that, for both bagging and random forests, bias can be significantly reduced in the presence of correlated covariates. This last finding goes beyond the prevailing view that averaging mostly works by variance reduction. Given that in practice covariates are often correlated, our findings on correlated covariates could open the way for a better understanding of why random forests work well in many applications.

作者：C. Revelas、O. Boldea、B. J. M. Werker

作者单位：

学科分类：计算技术、计算机技术

推荐引用：C. Revelas,O. Boldea,B. J. M. Werker.When do Random Forests work?[EB/OL].(2025-04-17)[2025-05-05].https://arxiv.org/abs/2504.12860.点此复制

When do Random Forests work?

When do Random Forests work?

评论