Meta-heuristic optimization algorithm for predicting software defects
Corresponding Author
Mahmoud A. Elsabagh
Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, Egypt
Correspondence
Mahmoud A. Elsabagh, Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, 33511 Kafrelsheikh, Egypt.
Marwa S. Farhan
Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo, Egypt
Department of Information Systems, Faculty of Informatics and Computer Science, British University in Egypt, Cairo, Egypt
Mona G. Gafar
Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, Egypt
Department of Computer Science, College of Science and Humanities in Al-Sulail, Prince Sattam bin Abdulaziz University, Kharj, Saudi Arabia
Correction added on 25 August 2021, after first online publication: Affiliations 2 and 3 have been corrected in this version.
Abstract
Software engineering companies strive to improve software quality by predicting defect-prone software modules. Although various data mining methods have been developed, unstable accuracy rates remain a critical issue owing to the imbalanced nature and high dimensionality of software defect datasets. To deal with this issue, we propose using spotted hyena, a novel meta-heuristic optimization algorithm, to predict software defects. The support and confidence of classification rules form the basis of a multi-objective fitness function that enables the spotted hyena algorithm to serve as a classifier by finding the fittest classification or standard rules among individuals. Experiments were conducted on four NASA software datasets: JM1, KC2, KC1, and PC3. The spotted hyena classifier achieves accuracies of 85.2%, 84%, 89.6%, and 81.8%, respectively, on these datasets. These accuracy rates are better than those achieved using other popular data mining techniques. We also discuss other classification measures, such as precision, recall, and confusion matrices, in connection with the experimental results. Moreover, a Gaussian mixture model is used to study the uncertainty quantification of the proposed classifier. The study demonstrated the feasible performance of the spotted hyena classifier in four different case studies.
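The abstract does not give the exact form of the fitness function, so the following is only a minimal sketch of how the support and confidence of a single classification rule might be combined into one score. The function name rule_fitness, the equal weights, and the weighted-sum combination are assumptions made for illustration; they are not taken from the paper, and the spotted hyena search itself is not shown.

```python
from typing import Callable, Sequence, Tuple

Row = Sequence[float]


def rule_fitness(
    rows: Sequence[Row],
    labels: Sequence[int],
    antecedent: Callable[[Row], bool],
    consequent: int,
    w_support: float = 0.5,      # assumed weight, not from the paper
    w_confidence: float = 0.5,   # assumed weight, not from the paper
) -> Tuple[float, float, float]:
    """Score the rule "IF antecedent(row) THEN class == consequent".

    Returns (support, confidence, fitness), where fitness is a
    hypothetical weighted sum of the two objectives.
    """
    n = len(rows)
    if n == 0:
        return 0.0, 0.0, 0.0

    # Which rows satisfy the IF part of the rule.
    covered = [antecedent(row) for row in rows]
    n_covered = sum(covered)

    # Rows that satisfy the IF part and also carry the predicted class.
    n_both = sum(1 for c, y in zip(covered, labels) if c and y == consequent)

    support = n_both / n
    confidence = n_both / n_covered if n_covered else 0.0
    fitness = w_support * support + w_confidence * confidence
    return support, confidence, fitness


# Toy data: one metric per module (e.g. a complexity score), label 1 = defective.
toy_rows = [(10.0,), (20.0,), (30.0,), (40.0,)]
toy_labels = [0, 0, 1, 1]
toy_result = rule_fitness(toy_rows, toy_labels, lambda r: r[0] > 15.0, 1)
```

In a rule-based meta-heuristic classifier of this kind, each candidate solution would encode a rule such as the threshold above, and the search would keep the candidates with the highest score. How the two objectives are actually traded off in the paper is not stated in the abstract.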
CONFLICT OF INTEREST
The authors declare there is no conflict of interest.
Open Research
DATA AVAILABILITY STATEMENT
Research data are not shared.