A Weighted-likelihood framework for class imbalance in Bayesian prediction models
Class imbalance occurs when the data used to train a classification model contain unequal numbers of observations in each class. Models built on such data can be biased towards the majority class, with poor predictive performance and generalisation for the minority class. We propose a Bayesian weighted-likelihood (power-likelihood) approach to deal with class imbalance: each observation's likelihood is raised to a weight inversely proportional to its class proportion, with the weights normalised to sum to the number of samples. This embeds cost-sensitive learning directly into Bayesian updating and is applicable to binary, multinomial, and ordered logistic prediction models. Example models are implemented in Stan, PyMC, and Turing.jl, and all code and reproducible scripts are archived on GitHub: https://github.com/stanlazic/weighted_likelihoods. This approach is simple to implement and extends naturally to arbitrary error-cost matrices.
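The reference implementations are in the linked GitHub repository; purely as an illustrative sketch of the weighting scheme described above, the snippet below shows one way it could be written in PyMC for a binary logistic model. The helper balanced_weights and the data names X and y are assumptions for illustration, not the authors' code.

import numpy as np
import pymc as pm

def balanced_weights(y):
    # Weight each observation inversely to its class proportion,
    # then rescale so the weights sum to the number of observations.
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    n, k = y.size, classes.size
    per_class = n / (k * counts)          # w_c = n / (k * n_c), so sum_i w_i = n
    lookup = dict(zip(classes, per_class))
    return np.array([lookup[c] for c in y])

# Hypothetical data: X is an (n, p) design matrix, y is an imbalanced binary outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = rng.binomial(1, 0.1, size=200)
w = balanced_weights(y)

with pm.Model() as model:
    alpha = pm.Normal("alpha", 0.0, 2.5)
    beta = pm.Normal("beta", 0.0, 2.5, shape=X.shape[1])
    p = pm.math.invlogit(alpha + pm.math.dot(X, beta))
    # Power likelihood: each observation's log-likelihood is multiplied by its weight.
    pm.Potential("weighted_loglik", (w * pm.logp(pm.Bernoulli.dist(p=p), y)).sum())
    idata = pm.sample()

The same construction should carry over to multinomial and ordered outcomes by swapping the Bernoulli distribution for a Categorical or OrderedLogistic one, and to arbitrary error-cost matrices by choosing the weights accordingly.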
Stanley E. Lazic
Subject: Computing Technology, Computer Technology
Stanley E. Lazic. A Weighted-likelihood framework for class imbalance in Bayesian prediction models [EB/OL]. (2025-04-23) [2025-05-24]. https://arxiv.org/abs/2504.17013.