Volume 32, Issue 18 e5483
SPECIAL ISSUE PAPER

Detection of malicious code using the direct hashing and pruning and support vector machine

YeongJi Ju

Technical Support, TmaxData Corporation, Seoul, South Korea

MinGu Kim

Department of Control and Instrumentation Engineering, Chosun University, Gwangju, South Korea

JuHyun Shin

Corresponding Author

Department of New Industry Convergence, Chosun University, Gwangju, South Korea

JuHyun Shin, Department of New Industry Convergence, Chosun University, 375 Seosuk-dong, Dong-gu, Gwangju 61452, South Korea.

Email: [email protected]

First published: 16 August 2019
Citations: 5

Summary

Although advancements in the software industry have improved open application programming interfaces (APIs), the diversity of malicious code has also increased. Thus, many studies have sought to characterize the behavior of malicious code based on API data and to determine whether a specific executable file contains malicious code. Existing methods detect malicious code by analyzing signature data. Detecting mutated malicious code in this manner is time-consuming and yields a high false detection rate (see “Detection of malicious code using the FP-growth algorithm and SVM,” a paper presented at The First International Conference on Software and Smart Convergence, 2017). Herein, we propose a method that analyzes and detects malicious code using association rule mining and a support vector machine (SVM). The proposed method reduces the false detection rate by mining rules from the API calls of malicious and normal code in portable executable (PE) files, grouping the patterns using the direct hashing and pruning (DHP) algorithm, and classifying files as malicious or normal using the SVM. With a single SVM model, sensitivity was 71% and precision was 77%. Combining the association rules with the SVM model increased sensitivity to 77% and precision to 81%.
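
To make the pattern-mining step more concrete, the following is a minimal Python sketch of the hash-filtering idea behind DHP, restricted to frequent pairs of API names. It is not the authors' implementation: the function name `dhp_frequent_pairs`, the CRC32 bucket hash, the bucket count, the support threshold, and the sample API sets are all assumptions chosen for illustration. The SVM classification step is not shown, since the summary does not say how the mined patterns are encoded as features.

```python
from collections import Counter
from itertools import combinations
import zlib


def dhp_frequent_pairs(transactions, min_support, num_buckets=7):
    """Return {pair: count} for 2-itemsets meeting min_support.

    Simplified DHP sketch: pass 1 counts single items and hashes every
    pair into a small bucket table; pass 2 counts only the pairs whose
    items are frequent AND whose bucket is heavy enough.
    """
    def bucket(pair):
        # Deterministic hash (an illustrative choice, not from the paper).
        return zlib.crc32("|".join(pair).encode()) % num_buckets

    item_counts = Counter()
    bucket_counts = [0] * num_buckets

    # Pass 1: count items and accumulate pair hashes per bucket.
    for items in transactions:
        ordered = sorted(set(items))
        item_counts.update(ordered)
        for pair in combinations(ordered, 2):
            bucket_counts[bucket(pair)] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: trim each transaction to frequent items, then count only
    # pairs that survive the bucket filter. A bucket total is an upper
    # bound on each pair hashed into it, so no frequent pair is lost.
    pair_counts = Counter()
    for items in transactions:
        trimmed = sorted(set(items) & frequent_items)
        for pair in combinations(trimmed, 2):
            if bucket_counts[bucket(pair)] >= min_support:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Hypothetical API sets, one per executable file (illustrative data only).
example_files = [
    {"CreateFileA", "WriteFile", "RegSetValueExA"},
    {"CreateFileA", "WriteFile", "VirtualAlloc"},
    {"CreateFileA", "WriteFile"},
    {"VirtualAlloc", "RegSetValueExA"},
]
frequent = dhp_frequent_pairs(example_files, min_support=2)
print(frequent)
```

In this sketch the bucket table only prunes candidates before the second pass, so hash collisions can let extra candidates through but cannot change the final result. One plausible way to connect this to the classifier, again as an assumption, would be to use the presence or absence of each mined pattern in a file as a binary feature for the SVM.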
