ORIGINAL ARTICLE

Meta-heuristic optimization algorithm for predicting software defects

Corresponding Author

Mahmoud A. Elsabagh

[email protected]

orcid.org/0000-0001-8704-3887

Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, Egypt

Correspondence

Mahmoud A. Elsabagh, Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, 33511 Kafrelsheikh, Egypt.

Email: [email protected]

Marwa S. Farhan

Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo, Egypt

Department of Information Systems, Faculty of Informatics and Computer Science, British University in Egypt, Cairo, Egypt

Mona G. Gafar

orcid.org/0000-0001-7592-840X

Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, Egypt

Department of Computer Science, College of Science and Humanities in Al-Sulail, Prince Sattam bin Abdulaziz University, Kharj, Saudi Arabia

First published: 10 August 2021

https://doi.org/10.1111/exsy.12768

Correction added on 25 August 2021, after first online publication: Affiliations 2 and 3 have been corrected in this version.

Abstract

Software engineering companies strive to improve software quality by predicting defect-prone software modules. Although various data mining methods have been developed, unstable accuracy rates remain a critical issue owing to the imbalanced nature and high dimensionality of software defect datasets. To deal with this issue, we propose the spotted hyena algorithm, a novel meta-heuristic optimization algorithm, for predicting software defects. The support and confidence of classification rules form the basis of a multi-objective fitness function that enables the spotted hyena algorithm to serve as a classifier by finding the fittest classification or standard rules among individuals. Experiments were conducted on four NASA software datasets: JM1, KC2, KC1, and PC3. The spotted hyena classifier achieves accuracies of 85.2%, 84%, 89.6%, and 81.8%, respectively, on these datasets. These accuracy rates are better than those achieved using other popular data mining techniques. We also discuss other classification measures, such as precision, recall, and confusion matrices, in connection with the experimental results. Moreover, the Gaussian mixture model is used to study the uncertainty quantification of the proposed classifier. The study demonstrated the feasible performance of the spotted hyena classifier in four different case studies.
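To make the role of support and confidence concrete, the following minimal Python sketch computes both measures for a single classification rule and combines them into one fitness score. It is an illustration under stated assumptions, not the authors' implementation: the metric names (`loc`, `cyclomatic`), the toy data, the threshold in the rule, and the equal-weight sum used to merge the two objectives are all hypothetical, since the abstract does not specify how the fitness function is formed.

```python
from typing import Callable, Dict, Sequence, Tuple

Module = Dict[str, float]  # hypothetical: software metric name -> value


def rule_support_confidence(
    modules: Sequence[Module],
    labels: Sequence[int],
    antecedent: Callable[[Module], bool],
    target: int = 1,
) -> Tuple[float, float]:
    """Support and confidence of the rule 'antecedent => class == target'.

    support    = (# modules matching antecedent AND labelled target) / (# modules)
    confidence = (# modules matching antecedent AND labelled target)
                 / (# modules matching antecedent)
    """
    n = len(modules)
    covered = [antecedent(m) for m in modules]
    n_covered = sum(covered)
    n_both = sum(1 for c, y in zip(covered, labels) if c and y == target)
    support = n_both / n if n else 0.0
    confidence = n_both / n_covered if n_covered else 0.0
    return support, confidence


def fitness(
    support: float,
    confidence: float,
    w_support: float = 0.5,
    w_confidence: float = 0.5,
) -> float:
    """Assumed scalarisation: a weighted sum of the two objectives."""
    return w_support * support + w_confidence * confidence


# Toy data (invented for illustration): 1 = defective, 0 = clean.
modules = [
    {"loc": 120, "cyclomatic": 15},
    {"loc": 30, "cyclomatic": 2},
    {"loc": 200, "cyclomatic": 22},
    {"loc": 45, "cyclomatic": 12},
]
labels = [1, 0, 1, 0]


def high_complexity(m: Module) -> bool:
    """Hypothetical candidate rule: cyclomatic complexity above 10 => defective."""
    return m["cyclomatic"] > 10


support, confidence = rule_support_confidence(modules, labels, high_complexity)
score = fitness(support, confidence)
print(f"support={support:.3f} confidence={confidence:.3f} fitness={score:.3f}")
```

In a rule-searching optimizer, each individual would encode one such rule, and a score of this kind would rank individuals so that rules that are both frequent and reliable are preferred.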

CONFLICT OF INTEREST

The authors declare there is no conflict of interest.

Open Research

DATA AVAILABILITY STATEMENT

Research data are not shared.

Volume 38, Issue 8

December 2021

e12768
