Advances in the software industry have improved open application programming interfaces (APIs), but the diversity of malicious code has grown as well. Many studies have therefore sought to characterize the behavior of malicious code from API data and to determine whether a given executable file contains malicious code. Existing methods detect malicious code by analyzing signature data; detecting mutated malicious code in this manner is time-consuming and yields a high false detection rate (see “Detection of malicious code using the FP-growth algorithm and SVM,” a paper presented at The First International Conference on Software and Smart Convergence, 2017). Herein, we propose a method that analyzes and detects malicious code using association rule mining and a support vector machine (SVM). The proposed method reduces the false detection rate by mining rules from the APIs of malicious and normal code in portable executable (PE) files, grouping patterns using the direct hashing and pruning (DHP) algorithm, and classifying files as malicious or normal using the SVM. With a single SVM model, sensitivity was 71% and precision was 77%. Combining the association rules with the SVM model increased sensitivity to 77% and precision to 81%.
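
The pattern-grouping step can be illustrated with a minimal sketch of DHP-style mining of frequent API pairs. This is not the authors' implementation: the function name `dhp_frequent_pairs`, the bucket count, the support threshold, and the sample API sets are all assumptions chosen for illustration. The idea shown is the one DHP is known for: while counting single items in the first pass, every item pair is also hashed into a small table, and in the second pass a pair is counted only if its bucket total already reaches the minimum support.

```python
from collections import Counter
from itertools import combinations


def dhp_frequent_pairs(transactions, min_support, num_buckets=101):
    """Return frequent 2-itemsets using a DHP-style hash filter.

    transactions: list of sets of API names, one set per file (hypothetical input).
    min_support: minimum number of files that must contain the pair.
    num_buckets: size of the hash table (illustrative default).
    """
    item_counts = Counter()
    bucket_counts = [0] * num_buckets

    # Pass 1: count single items and hash every pair into a bucket.
    for tx in transactions:
        items = sorted(set(tx))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[hash(pair) % num_buckets] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count only pairs made of frequent items whose bucket is heavy
    # enough; a bucket total is an upper bound on each pair hashed into it,
    # so no truly frequent pair is pruned.
    pair_counts = Counter()
    for tx in transactions:
        items = sorted(set(tx) & frequent_items)
        for pair in combinations(items, 2):
            if bucket_counts[hash(pair) % num_buckets] >= min_support:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Illustrative data only: API names imported by four hypothetical files.
sample_files = [
    {"CreateFileA", "WriteFile", "RegSetValueExA"},
    {"CreateFileA", "WriteFile"},
    {"CreateFileA", "WriteFile", "VirtualAlloc"},
    {"VirtualAlloc", "RegSetValueExA"},
]
print(dhp_frequent_pairs(sample_files, min_support=2))
# {('CreateFileA', 'WriteFile'): 3}
```

In a pipeline like the one described, frequent patterns of this kind could be turned into features for the SVM; how the paper encodes them is not stated in this summary.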

REFERENCES

1 National Intelligence Service. Global Open Data, Now. Seoul, South Korea: NIS; 2019; 12: 1-46.
2 Kim H-N, Park J-K, Won Y-H. A study on the malware realtime analysis systems using the finite automata. J Korea Soc Comput Inf. 2013; 18(5): 69-76. doi:10.9708/jksci.2013.18.5.069
3 Wang Y, Cai W-D, Wei P-C. A deep learning approach for detecting malicious JavaScript code. Secur Commun Netw. 2016; 9(11): 1520-1534. doi:10.1002/sec.1441
4 Monnappa KA. Learning Malware Analysis: Explore the Concepts, Tools, and Techniques to Analyze and Investigate Windows Malware. Birmingham, UK: Packt Publishing; 2018.
5 Karyotis V, Khouzani MHR. Malware Diffusion Models for Modern Complex Networks: Theory and Applications. Cambridge, MA: Elsevier; 2016.
6 Cisco. What is the difference: viruses, worms, trojans, and bots? Cisco 2018 Annual Cybersecurity Report. 2017.
7 Symantec. What is the difference between viruses, worms, and trojans? TECH98539. 2016.
8 Shalaginov A, Banin S, Dehghantanha A, Franke K. Machine learning aided static malware analysis: a survey and tutorial. In: Cyber Threat Intelligence. Cham, Switzerland: Springer; 2018; 70: 7-46. Advances in Information Security. doi:10.1007/978-3-319-73951-9_2
9 Kim JW. A Study on Machine Learning-Based Ransomware Detection Model Using Hybrid Analysis [master's thesis]. Seoul, South Korea: Konkuk University; 2016.
10 Kim YS. A Study on the Acceleration and Automation of Dynamic Malware Analysis Based on Cloud Computing [master's thesis]. Daejeon, South Korea: Daejeon University; 2016.
11 Roh JH. On Similarity of Malware and Frequency of Suspicious APIs [master's thesis]. Seoul, South Korea: Yonsei University; 2017.
12 Kwon O-C, Bae S-J, Cho J-I, Moon J-S. Malicious codes re-grouping methods using fuzzy clustering based on native API frequency. J Korea Inst Inf Secur Cryptol. 2008; 18(6A): 115-127.
13 Park JW, Moon ST, Son KW, et al. An automatic malware classification system using string list and apis. J Secur Eng. 2011; 8(5): 611-626.
14 Han KS, Kim IK, Im EK. Malware family classification method using api sequential characteristic. J Secur Eng. 2011; 8(2): 319-335.
15 Agrawal R, Srikant R. Fast algorithms for mining association rules. In: Proceedings of the 20th International Conference on Very Large Data Bases; 1994; Santiago, Chile.
16 Bell J. Machine Learning: Hands-On for Developers and Technical Professionals. Indianapolis, IN: John Wiley & Sons; 2014. doi:10.1002/9781119183464
17 Harrington P. Machine Learning in Action. Greenwich, CT: Manning Publications Co; 2012.
18 Mueller A. Fast Sequential and Parallel Algorithms for Association Rule Mining: A Comparison. Technical Report. College Park, MD: University of Maryland; 1998.
19 Lee H-B, Kim J-H. Performance evaluation of the FP-tree and the DHP algorithms for association rule mining. J KIISE Database. 2008; 35: 199-207.
20 Ju YJ, Shin JH. Detection of malicious code using the FP-growth algorithm and SVM. Paper presented at: The First International Conference on Software and Smart Convergence; June 27-30, 2017; Vladivostok, Russia.

Volume 32, Issue 18

Special Issue on Novel Data Mining Paradigms based on Soft Computing and Machine Learning in the current and upcoming Information Society Revolution (RACS2018). Special Issue on Advanced approaches for information processing in multimedia, decision making and security systems (AdvInfoProc2019)

September 25, 2020

e5483

Detection of malicious code using the direct hashing and pruning and support vector machine
