Volume 66, Issue 8 e70004
RESEARCH ARTICLE

Addressing Class Imbalance in Bayesian Classification Through Posterior Probability Adjustment

Vahid Nassiri

Corresponding Author

Open Analytics, Antwerp, Belgium

Correspondence: Vahid Nassiri ([email protected])

Fetene Tekle

Johnson & Johnson Innovative Medicine, Discovery Statistics, Beerse, Belgium

Kanaka Tatikola

Johnson & Johnson Innovative Medicine, Discovery Statistics, Beerse, Belgium

Johnson & Johnson Innovative Medicine, Discovery Statistics, Raritan, New Jersey, USA

Helena Geys

Johnson & Johnson Innovative Medicine, Discovery Statistics, Beerse, Belgium

First published: 18 November 2024

ABSTRACT

Class imbalance is a known issue in classification tasks that can bias predictions toward the dominant classes. This paper introduces a novel, straightforward Bayesian framework that adjusts posterior probabilities to counteract the bias introduced by imbalanced data sets. Instead of relying on the mean of the posterior distribution of class probabilities, we propose a method that scales the posterior probability of each class according to its representation in the training data.
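
The abstract does not give the exact adjustment formula, so the following is only an illustrative sketch of one common way to rescale class probabilities by training-set representation: divide each posterior draw by the class's training frequency, renormalize, and then average. The function name `adjust_for_imbalance`, the toy posterior draws, and the class counts are all assumptions made for illustration and are not taken from the article.

```python
import numpy as np

def adjust_for_imbalance(posterior_draws, train_counts):
    """Hypothetical helper: rescale class probabilities by inverse
    training frequency, renormalize each draw, and average the draws."""
    draws = np.asarray(posterior_draws, dtype=float)  # shape (n_draws, n_classes)
    counts = np.asarray(train_counts, dtype=float)    # shape (n_classes,)
    freq = counts / counts.sum()                      # class share in training data
    scaled = draws / freq                             # down-weight dominant classes
    adjusted = scaled / scaled.sum(axis=1, keepdims=True)  # rows sum to 1 again
    return adjusted.mean(axis=0)

# Toy example (made-up numbers): three posterior draws for one observation,
# two classes, with a 900:100 split in the training data.
draws = np.array([[0.80, 0.20],
                  [0.70, 0.30],
                  [0.90, 0.10]])
counts = [900, 100]

plain = draws.mean(axis=0)                    # unadjusted posterior mean
adjusted = adjust_for_imbalance(draws, counts)
print("plain:", plain, "adjusted:", adjusted)
```

In this toy case the unadjusted posterior mean favors the majority class, while the rescaled probabilities favor the minority class. Whether the article's method uses this exact rescaling cannot be determined from the abstract alone.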

Conflicts of Interest

The authors declare no conflicts of interest.

Open Research Badges

Open Data

Data Availability Statement

The data that support the findings of this study are available in the Supporting Information of this article.

This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available in the Supporting Information section.

This article has earned an open data badge “Reproducible Research” for making publicly available the code necessary to reproduce the reported results. The results reported in this article could fully be reproduced.
