Minority oversampling based on the attraction-repulsion Weber problem
Present Address:
Ugo Fiore, Via Gen. Parisi, 13, 80133 Naples, Italy
Summary
Learning from imbalanced datasets, in which one class is underrepresented, is both difficult and important. On the one hand, a limited number of positive examples restricts the generalization ability of classifiers. On the other hand, the class of interest is often interesting precisely because it is rare. The Synthetic Minority Oversampling TEchnique (SMOTE) is a preprocessing method that creates new synthetic examples by interpolating between neighboring minority-class instances. In this work, an enhancement to SMOTE is proposed that characterizes synthetic instances as solutions of attraction-repulsion problems among the neighboring data points. Experimental evaluation shows an improvement in the positive predictive power of classification.
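For orientation, the interpolation step of baseline SMOTE mentioned above can be sketched as follows. This is a minimal illustration of plain SMOTE only, not the attraction-repulsion enhancement proposed in this work. The function name `smote_sample` and its parameters (`k`, `n_new`, `rng`) are hypothetical choices made for this sketch, and it assumes numeric features, Euclidean distance, and at least two minority instances.

```python
import numpy as np

def smote_sample(X_min, k=5, n_new=10, rng=None):
    """Generate n_new synthetic points from minority samples X_min.

    Hypothetical helper for illustration: each synthetic point lies on
    the segment joining a minority instance to one of its k nearest
    minority neighbors (Euclidean distance).
    """
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbors than other points

    # Pairwise Euclidean distances; exclude each point from its own neighbors.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base instance and one of its neighbors for each new point.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a uniformly random position along the segment.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])
```

Because every synthetic point is a convex combination of two existing minority instances, the generated samples never leave the convex hull of the minority class. Only minority neighbors determine where a new point lands in this baseline.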