Volume 38, Issue 8 e12768
ORIGINAL ARTICLE

Meta-heuristic optimization algorithm for predicting software defects

Mahmoud A. Elsabagh

Corresponding Author

Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, Egypt

Correspondence

Mahmoud A. Elsabagh, Department of Machine Learning and Information Retrieval, Artificial Intelligence, Kafrelsheikh University, 33511 Kafrelsheikh, Egypt.

Email: [email protected]

Marwa S. Farhan

Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo, Egypt

Department of Information Systems, Faculty of Informatics and Computer Science, British University in Egypt, Cairo, Egypt

Mona G. Gafar

Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, Egypt

Department of Computer Science, College of Science and Humanities in Al-Sulail, Prince Sattam bin Abdulaziz University, Kharj, Saudi Arabia

First published: 10 August 2021

Correction added on 25 August 2021, after first online publication: Affiliations 2 and 3 have been corrected in this version.

Abstract

Software engineering companies strive to improve software quality by predicting defect-prone software modules. Although various data mining methods have been developed, unstable accuracy rates remain a critical issue owing to the imbalanced nature and high dimensionality of software defect datasets. To deal with this issue, we propose using the spotted hyena algorithm, a novel meta-heuristic optimization algorithm, to predict software defects. The support and confidence of classification rules form the basis of a multi-objective fitness function that lets the spotted hyena algorithm serve as a classifier by finding the fittest classification or standard rules among individuals. Experiments were conducted on four NASA software datasets: JM1, KC2, KC1, and PC3. The spotted hyena classifier achieves accuracies of 85.2%, 84%, 89.6%, and 81.8%, respectively, on these datasets. These accuracy rates are higher than those achieved using other popular data mining techniques. We also discuss other classification measures, such as precision, recall, and confusion matrices, in connection with the experimental results. Moreover, a Gaussian mixture model is used to study the uncertainty quantification of the proposed classifier. The study demonstrates the feasibility of the spotted hyena classifier in four different case studies.
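The abstract does not give the formula for the fitness function, so the following is only a minimal sketch of how support and confidence might be computed for a single classification rule, using the common association-rule definitions. The `Rule` class, the attribute name `loc`, the toy records, and the equally weighted sum in `fitness` are all assumptions made for illustration; they are not taken from the article and may differ from the authors' formulation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Record = Dict[str, float]


@dataclass
class Rule:
    # Hypothetical rule form: IF every condition holds THEN predict `label`.
    conditions: List[Callable[[Record], bool]]
    label: int

    def covers(self, record: Record) -> bool:
        return all(cond(record) for cond in self.conditions)


def support_confidence(
    rule: Rule, records: Sequence[Record], labels: Sequence[int]
) -> Tuple[float, float]:
    # Labels of the records whose attributes satisfy the rule's conditions.
    covered = [y for x, y in zip(records, labels) if rule.covers(x)]
    # Covered records whose true label matches the rule's prediction.
    correct = sum(1 for y in covered if y == rule.label)
    support = correct / len(records) if records else 0.0
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence


def fitness(
    rule: Rule,
    records: Sequence[Record],
    labels: Sequence[int],
    w_support: float = 0.5,
    w_confidence: float = 0.5,
) -> float:
    # Assumption: the two objectives are combined by a weighted sum.
    s, c = support_confidence(rule, records, labels)
    return w_support * s + w_confidence * c


# Toy data (invented): 1 = defective module, 0 = clean module.
records = [{"loc": 120.0}, {"loc": 30.0}, {"loc": 200.0}, {"loc": 150.0}]
labels = [1, 0, 1, 0]
rule = Rule(conditions=[lambda r: r["loc"] > 100], label=1)

print(support_confidence(rule, records, labels))
print(fitness(rule, records, labels))
```

In this toy case the rule covers three of the four records and is correct on two of them, so its support is 2/4 and its confidence is 2/3. A meta-heuristic search would evaluate many candidate rules with such a function and keep the fittest ones.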

CONFLICT OF INTEREST

The authors declare there is no conflict of interest.

DATA AVAILABILITY STATEMENT

Research data are not shared.
