
A weighted-likelihood framework for class imbalance in Bayesian prediction models

Main: 6 pages, 3 figures, 3 tables
Bibliography: 3 pages
Abstract

Class imbalance is a pervasive problem in predictive toxicology, where non-toxic compounds typically outnumber toxic ones. Models trained on such data tend to perform well on the majority class but poorly on the minority class, which is the class most relevant for safety assessment. We propose a simple and general Bayesian framework that addresses class imbalance by modifying the likelihood function. Each observation's likelihood is raised to a power inversely proportional to its class proportion, with the weights normalized to preserve the overall information content. This weighted-likelihood (or power-likelihood) approach embeds cost-sensitive learning directly into Bayesian updating. The method is demonstrated using simulated binary data and an ordered logistic model for drug-induced liver injury (DILI). Weighting alters parameter estimates and decision boundaries, improving balanced accuracy and sensitivity for the minority (toxic) class. The approach can be implemented with minimal changes in standard probabilistic programming languages such as Stan and PyMC. This framework provides an easily extensible foundation for developing Bayesian prediction models that better reflect the asymmetric costs of safety-critical decisions.
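The weighting described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that "normalized to preserve the overall information content" means the weights are rescaled to sum to the sample size n, and it uses a plain binary logistic likelihood for illustration. The function names `class_weights` and `weighted_log_lik` are hypothetical.

```python
import numpy as np

def class_weights(y):
    """Per-observation weights inversely proportional to class proportion.

    Assumption: weights are rescaled so they sum to n (the sample size).
    """
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    proportions = counts / y.size
    raw = {c: 1.0 / p for c, p in zip(classes, proportions)}
    w = np.array([raw[c] for c in y])
    return w * y.size / w.sum()

def weighted_log_lik(beta, X, y, w):
    """Power-likelihood on the log scale: sum_i w_i * log p(y_i | x_i, beta).

    Raising a likelihood term to the power w_i multiplies its log by w_i.
    Illustrative model: binary logistic regression.
    """
    eta = X @ beta
    # log p(y=1) = -log(1 + exp(-eta)); log p(y=0) = -log(1 + exp(eta))
    log_p = -np.logaddexp(0.0, np.where(y == 1, -eta, eta))
    return float(np.sum(w * log_p))

# Toy data: 90 majority (0) and 10 minority (1) observations.
y = np.array([0] * 90 + [1] * 10)
w = class_weights(y)

X = np.ones((y.size, 1))   # intercept-only design matrix
beta = np.zeros(1)
ll = weighted_log_lik(beta, X, y, w)
```

Under this normalization each class contributes the same total weight (50 of 100 in the toy data), so the minority class is no longer dominated in the log-likelihood. In a probabilistic programming language, the same quantity would be added to the log posterior in place of the unweighted log-likelihood.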
