An Intrusion Detection Model Based on Feature Selection and Improved One-Dimensional Convolutional Neural Network
Abstract
The problem of intrusion detection has new solutions thanks to the widespread use of machine learning in the field of network security, but several issues remain. Traditional machine learning approaches to intrusion detection rely on expert experience to choose features, while deep learning approaches suffer from low detection efficiency. In this paper, an intrusion detection model based on feature selection and an improved one-dimensional convolutional neural network is proposed. The model first uses the extreme gradient boosting decision tree (XGboost) algorithm to rank the preprocessed data by feature importance and then, through experimental comparison, selects the 55 features with the highest contribution. The selected features are fed into the improved one-dimensional convolutional neural network (I1DCNN), which is trained to complete the final classification task. The feature selection and improved one-dimensional convolutional neural network (FS-I1DCNN) intrusion detection model not only avoids the traditional machine learning reliance on expert experience for feature extraction but also improves detection efficiency, reducing training time while reducing dimensionality and increasing overall accuracy. The experimental results demonstrate that, compared with the I1DCNN model without feature selection and the conventional one-dimensional convolutional neural network (1DCNN) model, the FS-I1DCNN model's overall accuracy increases by 0.67% and 2.94%, respectively. Its accuracy, precision, recall, and F1-score were also significantly better than those of other intrusion detection models, including SVM and DBN.
1. Introduction
Thanks to the advancement of science and technology, the Internet has become one of the key tools that we cannot live without, but in recent years the growing complexity of the network environment has led to an increasing number of network security incidents around the world [1, 2], and the number of network attacks on various countries keeps rising. Research on intrusion detection technology has therefore become an indispensable part of network security research. Intrusion detection technology has become a dynamic and crucial security protection tool, forming the second line of defense after the firewall [3]. It has since developed into a crucial means of defending against network intrusion in the current era of widely used encrypted traffic [4].
Different intrusion detection models based on machine learning and deep learning are being improved as artificial intelligence technology advances rapidly [5, 6], but both families of methods have disadvantages. SVM, decision trees, and other common algorithms [7–9] are used frequently in traditional machine learning. Their main issue is that, because they rely heavily on expert experience for feature extraction, they often lose sight of the connections between features. Wang et al. [10] proposed a support vector machine based intrusion detection framework that applies a logarithm marginal density ratio transformation to form new transformed features, improving the capability of the SVM detection model. In order to reduce the complexity of the model, Kim et al. [11] established a hierarchically integrated anomaly detection model that employs decision trees to build misuse models and data decomposition to create smaller subsets and single-class SVM models from those subsets. Both of the above use SVM to build detection models; however, SVM training speed drops dramatically when the training data increases substantially [12], so the SVM model is not an ideal choice.
Deep learning has recently been used extensively in the field of intrusion detection, but its main issue has been limited detection performance. Wang et al. [13] proposed an intrusion detection method based on feature optimization and a BP neural network, which improves the detection rate of minority categories while reducing dimensionality. However, because the BP neural network has a large number of parameters, its convergence is relatively slow, leading to low training efficiency. In order to effectively reduce feature dimensionality while maintaining detection performance, Luo and Lu [14] developed a hybrid network attack detection algorithm based on artificial neural networks and genetic algorithms. However, the genetic algorithm itself requires encoding and decoding, resulting in a lengthy training period. In the field of intrusion detection, deep neural networks and convolutional neural networks significantly outperform BP neural networks in terms of training efficiency, but their accuracy still needs to be improved. Zhang et al. [15] established a deep convolutional neural network classification model based on an improved PCA algorithm, which improves detection accuracy while using PCA for dimensionality reduction, but the overall accuracy is relatively low. Yang and Wang [16] proposed an improved convolutional neural network intrusion detection method in which low-level intrusion traffic data is abstractly represented as high-level features by the CNN; the stochastic gradient descent algorithm is used to converge the model and an optimization algorithm is used for parameter tuning, achieving better results. Moreover, compared with deep convolutional neural networks of very high dimensionality, the one-dimensional convolutional neural network is not only lower-dimensional but also a better fit for intrusion detection data with a strong temporal character. Qazi et al. [17] proposed a one-dimensional convolutional neural network-based deep learning system for network intrusion detection and achieved an accuracy rate of 98.96%; however, this system only detects four categories of attacks and is not truly general. Hang et al. [18] proposed an improved one-dimensional convolutional neural network that uses the results of two convolutions as the input of global average pooling and global maximum pooling and combines them with the input data, improving the network intrusion detection rate while reducing the parameters and training time of the model.
The main contributions of this paper are as follows:
- (1) We used oversampling, undersampling, and mean square normalization to process the original dataset.
- (2) We adopted the XGboost feature selection method to select and filter the processed data, which reduces the training time of the model and speeds up operation.
- (3) We designed an improved one-dimensional convolutional neural network (I1DCNN) intrusion detection model and compared different optimization algorithms; the Adam optimization algorithm is finally used to adjust the model parameters dynamically.
The remainder of this article is organized as follows. Section 2 introduces the related knowledge, Section 3 presents the proposed FS-I1DCNN intrusion detection model, and Section 4 provides the experimental results and analysis. Finally, Section 5 draws conclusions and discusses future work.
2. Related Knowledge
2.1. XGboost Feature Selection Algorithm
The XGboost (extreme gradient boosting) algorithm [19] is an evolution of the GBDT (gradient boosting decision tree) algorithm and an efficient system implementation of it. In this paper, we use its tree model to quantify feature importance and select features accordingly.
2.1.1. Decision Tree Models and Their Combinations
XGboost predicts by summing the outputs of K regression trees, where fk(xi) refers to the weight of the leaf node into which the kth tree places sample xi, and F denotes the function space of all trees.
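For reference, a sketch of the standard XGboost formulation from [19], which the symbols above follow, is given below; the regularization term is the usual one from the XGboost paper rather than a reproduction of this paper's exact equations.

```latex
\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in F,
\qquad
\mathcal{L} = \sum_{i} l\left(\hat{y}_i, y_i\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert^{2}
```

Here T is the number of leaves of a tree, w is its vector of leaf weights, and l is the training loss.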
2.1.2. Importance Metrics
As determined by equation (3), X is the set of features assigned to the leaf nodes, Gain is the gain of the Xth feature at its segmentation point, and Cover denotes the number of samples at each node.
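As an illustration of how such importance scores can be obtained in practice, the sketch below uses the xgboost Python package on synthetic stand-in data; the data, model hyperparameters, and the choice of gain-based ranking are assumptions, not the exact configuration of this paper.

```python
import xgboost as xgb
from sklearn.datasets import make_classification

# Synthetic stand-in for the preprocessed intrusion-detection features (assumption)
X, y = make_classification(n_samples=2000, n_features=78, n_informative=20, random_state=0)

model = xgb.XGBClassifier(n_estimators=100, max_depth=6)
model.fit(X, y)

# Gain- and cover-based importance, as reported by the trained booster
booster = model.get_booster()
gain = booster.get_score(importance_type="gain")    # feature name ("f0", "f1", ...) -> average gain
cover = booster.get_score(importance_type="cover")  # feature name -> average cover

# Rank features by gain, highest contribution first
ranked = sorted(gain.items(), key=lambda kv: kv[1], reverse=True)
print(ranked[:10])
```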
2.2. Convolutional Neural Networks
Convolutional neural network (CNN) is one of the representative deep learning networks. It has been successfully applied to a variety of artificial intelligence applications [22], including computer vision and natural language processing [23, 24]. Compared to conventional intelligent algorithms, CNN has much more powerful feature extraction capabilities, with its main components being the convolutional layer, pooling layer, fully connected layer, and softmax layer.
2.2.1. Convolutional Layer
2.2.2. Pooling Layer
2.2.3. Fully Connected Layer
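As a rough illustration of what the convolutional, pooling, and fully connected layers compute, the NumPy sketch below applies a 1D convolution with ReLU, a max-pooling step, and a fully connected softmax output to a toy input; all sizes and weights are arbitrary assumptions used only for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=12)          # toy 1D input (e.g., 12 feature values)
kernel = rng.normal(size=3)      # one convolution kernel of width 3

# Convolutional layer: slide the kernel over the input (valid padding, stride 1)
conv = np.array([np.dot(x[i:i + 3], kernel) for i in range(len(x) - 2)])
conv = np.maximum(conv, 0.0)     # ReLU activation

# Pooling layer: keep the largest value in each window of size 2
pooled = conv[: len(conv) // 2 * 2].reshape(-1, 2).max(axis=1)

# Fully connected layer followed by a softmax output over, say, 4 classes
W = rng.normal(size=(4, pooled.size))
logits = W @ pooled
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(conv.shape, pooled.shape, probs)
```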
3. Intrusion Detection Model Based on FS-I1DCNN
The FS-I1DCNN intrusion detection model proposed in this paper consists of three main modules, with the overall framework shown in Figure 1. The first is the data preprocessing module, which samples, filters, and cleans the original dataset. The second is the XGboost feature screening module, which ranks the data with the XGboost model and, through experimental comparison, filters out the features that contribute most to the model. The third is the I1DCNN traffic detection module, which completes the classification task with an improved one-dimensional convolutional neural network [22]; the best scheme is selected after experimental verification, completing the construction of the FS-I1DCNN intrusion detection model.

3.1. Data Preprocessing
The Canadian Institute for Cybersecurity's CIC-IDS-2017 dataset [25] is used in this study. It was collected from 9 a.m. on July 3, 2017, to 5 p.m. on July 7, 2017, and primarily includes 12 categories of attacks together with regular benign traffic. The dataset contains some missing data, and because it is large and provides sufficient experimental data, records with missing values are simply removed so that they do not affect the experiment. Furthermore, the dataset contains 78 relevant features whose values vary widely in magnitude and order of magnitude. Some feature values are very large, which would degrade model performance if the data were trained on directly. The feature flow duration, for instance, ranges from -1 to 119999993, so the features are normalized to speed up computation and to remove the impact of differing magnitudes on the experimental results.
The normalization is computed as L′ = (L − Lmean)/α, where L, Lmean, and α denote each feature's value, its mean, and its standard deviation, respectively.
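A minimal preprocessing sketch along these lines is shown below, assuming the CIC-IDS-2017 CSV files have been merged into a single file with a "Label" column; the file name, the resampling thresholds, and the use of pandas are placeholders and assumptions, not the paper's exact settings.

```python
import numpy as np
import pandas as pd

# Merged CIC-IDS-2017 records with a "Label" column (the file name is a placeholder)
df = pd.read_csv("cicids2017_merged.csv")

# Remove records that contain missing or infinite values
df = df.replace([np.inf, -np.inf], np.nan).dropna()

# Z-score normalization of every feature column: L' = (L - L_mean) / alpha
feature_cols = [c for c in df.columns if c != "Label"]
std = df[feature_cols].std().replace(0, 1)          # guard against constant columns
df[feature_cols] = (df[feature_cols] - df[feature_cols].mean()) / std

# Simple class rebalancing: undersample very large classes, oversample very small ones
# (the thresholds below are illustrative assumptions)
parts = []
for cls, group in df.groupby("Label"):
    if len(group) > 300000:
        group = group.sample(300000, random_state=0)               # undersample
    elif len(group) < 5000:
        group = group.sample(5000, replace=True, random_state=0)   # oversample
    parts.append(group)
balanced = pd.concat(parts).sample(frac=1.0, random_state=0)       # shuffle rows
```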
The final data categories and quantities obtained following the above data preprocessing are shown in Table 1.
Data categories | The amount of processed data
---|---
Benign | 285324
Bot | 7864
DDoS | 128027
DoS GoldenEye | 10293
DoS Hulk | 231073
DoS slowhttptest | 5499
DoS slowloris | 5796
FTP-Patator | 7938
Heartbleed | 5632
Infiltration | 4608
PortScan | 158930
SSH-Patator | 5897
Web Attack | 16620
3.2. XGboost Feature Selection
When a feature is screened, whether it plays a crucial part in the model determines whether it is kept. To achieve the best classification effect, the corresponding feature contribution degree is generated from the XGboost feature importance index described above, the features are sorted by contribution degree, and experiments are run with different numbers of retained features. Algorithm 1 illustrates the XGboost feature selection algorithm proposed in this paper.
Algorithm 1: The XGboost feature selection algorithm.
Input: Intrusion detection dataset S; feature set F = {t1, t2, ⋯, tn}, where n is the total number of features.
Step 1: Calculate the importance of each feature and sort the features by importance; take the k most important features, with k ≤ n satisfying the screening condition.
Step 2: Acc = 0, i = k // Acc is the accuracy, and i is the number of retained features
Step 3: for i down to 1, step −1 do
Step 4: Input the i retained features into the CNN, and record the accuracy obtained as Acc(i)
Step 5: if Acc(i) > Acc then
Step 6: Acc ⟵ Acc(i)
Step 7: end if
Step 8: end for
Output: the feature-filtered dataset S′
After the above XGboost selection, the dataset S′ is obtained. It has a smaller dimension than the original dataset S, which reduces the subsequent training time and speeds up operation.
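A sketch of this selection procedure is given below. It ranks features with XGboost and then searches over the number of retained features as in Algorithm 1; the synthetic data, the MLP used as a stand-in for the I1DCNN, and the step size of 5 are assumptions made to keep the sketch short and fast.

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier  # stand-in for the I1DCNN (assumption)

# Synthetic stand-in data; in the paper this would be the preprocessed CIC-IDS-2017 set
X, y = make_classification(n_samples=3000, n_features=78, n_informative=25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Rank features by XGboost importance (descending)
ranker = xgb.XGBClassifier(n_estimators=100).fit(X_tr, y_tr)
order = np.argsort(ranker.feature_importances_)[::-1]

# Algorithm 1: try decreasing numbers of retained features and keep the best accuracy
best_acc, best_k = 0.0, None
for k in range(len(order), 0, -5):          # step of 5 instead of 1 to keep the sketch fast
    cols = order[:k]
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200, random_state=0)
    clf.fit(X_tr[:, cols], y_tr)
    acc = clf.score(X_te[:, cols], y_te)
    if acc > best_acc:
        best_acc, best_k = acc, k

print("best number of retained features:", best_k, "accuracy:", round(best_acc, 4))
# The filtered dataset S' keeps only the best_k highest-ranked features
S_prime = X[:, order[:best_k]]
```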
3.3. Improved One-Dimensional Convolutional Neural Network
Traditional convolutional neural networks are mainly built from convolution, pooling, and fully connected layers. Compared with a fully connected neural network, a CNN has fewer parameters for the same number of hidden units [26] and is easier to train [27]. One-dimensional convolutional neural networks are a kind of CNN that is effective for time series problems such as sensor data or fixed-length periodic signals. This paper therefore selects a one-dimensional convolutional neural network as the core of the classification model, in line with the temporal characteristics of the intrusion detection dataset. In addition, because of the high dimensionality of the intrusion detection dataset, a single convolutional layer cannot fully extract the features, so this paper designs an AlexNet-style improved one-dimensional convolutional neural network [28] (I1DCNN), whose structure is illustrated in Figure 2. It consists of four convolutional layers, two maximum pooling layers, one flattening layer, one fully connected layer, and one softmax output layer. To fully extract features while limiting the loss of crucial information, two convolutional layers are applied first, followed by a maximum pooling layer; this convolution-pooling operation is then repeated. The resulting multidimensional data is transformed into a one-dimensional array by the flattening layer, passed to the fully connected layer, and finally output through the softmax function.
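A Keras sketch of this architecture is shown below. The layer sequence follows the description above (two convolutions, max pooling, two convolutions, max pooling, flatten, fully connected, softmax), with 55 selected features per sample and 13 output classes (benign plus 12 attack categories, per Table 1); the filter counts, kernel sizes, and dense-layer width are illustrative assumptions, since the paper does not specify them here.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_i1dcnn(num_features=55, num_classes=13):
    """Improved 1D CNN sketch: conv-conv-pool, conv-conv-pool, flatten, dense, softmax.
    Filter counts and kernel sizes are assumptions, not the paper's exact values."""
    model = keras.Sequential([
        layers.Input(shape=(num_features, 1)),        # each sample: 55 selected features x 1 channel
        layers.Conv1D(32, 3, padding="same", activation="relu"),
        layers.Conv1D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(64, 3, padding="same", activation="relu"),
        layers.Conv1D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),          # fully connected layer
        layers.Dense(num_classes, activation="softmax")
    ])
    model.compile(optimizer=keras.optimizers.Adam(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_i1dcnn()
model.summary()
```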

After the softmax output layer, the Adam optimization algorithm is used in this paper to optimize the improved one-dimensional convolutional neural network. The Adam optimization algorithm is essentially a combination of the momentum method and the RMSprop optimization algorithm. To keep all parameters relatively stable, it dynamically adjusts the learning rate of each parameter using first-order and second-order moment estimates of the gradient [29]. Algorithm 2 illustrates the Adam optimization used in this paper.
Algorithm 2: The Adam optimization algorithm.
Input: Initial parameter θ and step size ε; exponential decay rates of the moment estimates ρ1 and ρ2, which control the momentum term and the RMSprop term, respectively; a small constant δ for numerical stability; the feature-filtered dataset S′ and a minibatch of u training samples {x(1), ⋯, x(u)} drawn from S′, with corresponding targets y(i).
Step 1: Initialize s = 0, r = 0, t = 0 // s is the first-order moment variable, r is the second-order moment variable, and t is the time step
Step 2: while the stopping criterion is not met do
Step 3: g ⟵ (1/u)∇θ∑iL(f(x(i); θ), y(i)) // Compute the gradient
Step 4: t ⟵ t + 1 // Update the number of training steps
Step 5: s ⟵ ρ1s + (1 − ρ1)g // Accumulate the (biased) first-order moment of the gradient
Step 6: r ⟵ ρ2r + (1 − ρ2)g⊙g // Accumulate the (biased) second-order moment of the gradient
Step 7: ŝ ⟵ s/(1 − ρ1^t) // Correct the deviation of the first moment
Step 8: r̂ ⟵ r/(1 − ρ2^t) // Correct the deviation of the second moment
Step 9: Δθ ⟵ −ε ŝ/(√r̂ + δ) // Calculate the update, element-by-element
Step 10: θ ⟵ θ + Δθ // Update the parameters
Step 11: end while
Output: the updated parameter θ′
By adaptively using the mean of the gradient and the mean of the squared gradient, the Adam optimization algorithm improves the classification performance of the improved one-dimensional convolutional neural network built after feature screening.
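For illustration, a minimal NumPy implementation of the update loop in Algorithm 2 on a toy quadratic objective is given below; the decay rates are the commonly used defaults and the step size is chosen for the toy problem, not values reported in the paper.

```python
import numpy as np

def adam_minimize(grad_fn, theta, eps=0.001, rho1=0.9, rho2=0.999, delta=1e-8, steps=1000):
    """Adam update loop following Algorithm 2 (s: first moment, r: second moment)."""
    s = np.zeros_like(theta)
    r = np.zeros_like(theta)
    for t in range(1, steps + 1):
        g = grad_fn(theta)                     # gradient of the loss at theta
        s = rho1 * s + (1 - rho1) * g          # biased first-moment estimate
        r = rho2 * r + (1 - rho2) * g * g      # biased second-moment estimate
        s_hat = s / (1 - rho1 ** t)            # bias-corrected first moment
        r_hat = r / (1 - rho2 ** t)            # bias-corrected second moment
        theta = theta - eps * s_hat / (np.sqrt(r_hat) + delta)   # parameter update
    return theta

# Toy example: minimize f(theta) = ||theta - 3||^2, whose gradient is 2 * (theta - 3)
theta0 = np.zeros(5)
theta_star = adam_minimize(lambda th: 2.0 * (th - 3.0), theta0, eps=0.05, steps=2000)
print(theta_star)   # approaches [3, 3, 3, 3, 3]
```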
4. Experimental Results and Analysis
4.1. Experimental Environment
The experiments were run on the Windows Server 2016 operating system (64-bit) with an Intel(R) Xeon(R) E5-2650 CPU. Python 3.7.9 was used as the programming language, with PyCharm and Anaconda as development tools. TensorFlow (CPU version) is the main deep learning framework, and Keras is used to build the network model; scikit-learn is used for machine learning utilities, and the time module is used to measure training time.
4.2. Evaluation Indicators
In order to ensure the effectiveness of the experiment, accuracy, precision, recall, F1-score, and the ROC curve were adopted as indicators to evaluate the performance of the machine learning model. Among them, the true positive class (TP) refers to the number of positive cases correctly classified; the false negative class (FN) refers to the number of positive cases misclassified as negative; the false positive class (FP) refers to the number of negative cases misclassified as positive; and the true negative class (TN) refers to the number of correctly classified negative cases. The resulting confusion matrix is shown in Table 2.
True class \ Predicted class | Positive | Negative
---|---|---
Positive | TP | FN
Negative | FP | TN
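These four counts combine into the evaluation metrics used below; the definitions are the standard ones:

```latex
\mathrm{Accuracy}  = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall}    = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```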
4.3. Comparison Experiment with Different Numbers of Features
The number of retained features has a significant impact on the experimental results: screening too few features may discard crucial ones, while retaining too many may introduce redundancy that contributes little to the experiment. As shown by the ranking of feature contribution scores in Figure 3, 56 features contribute to the experiment to varying degrees. Therefore, the preprocessed data were screened over these 56 features according to the size of their contribution scores, and comparison experiments were run; the results are shown in Figure 4. The accuracy rate is highest when the number of features is 55, with a clear advantage over other feature counts, and the accuracy gradually decreases as the number of features drops below 55. For this reason, 55 features were chosen as the final number of features for the experiment.


4.4. Model Results and Analysis
The preprocessed dataset was fed into the improved 1D convolutional neural network after feature importance selection. The experimental results are shown in Figure 5. When the number of iterations exceeds 40, the results become stable, so the epoch value is set to 40. The confusion matrix obtained from the experiment is shown in Figure 6, which shows that the classification accuracy of each category reaches more than 90%, including 100% for DDoS, FTP-Patator, Heartbleed, and PortScan and 98% for DoS Hulk, DoS slowloris, Infiltration, SSH-Patator, and Web Attack. The remaining three types of malicious traffic, Bot, DoS GoldenEye, and DoS slowhttptest, reach accuracies of 90%, 94%, and 91%, respectively. In addition, the overall accuracy of the model reaches 99.36%. This shows that the FS-I1DCNN intrusion detection model proposed in this paper performs well on each class of malicious traffic and has a good classification effect.


4.5. Comparison Experiment with I1DCNN and 1DCNN
To demonstrate the superiority of the FS-I1DCNN intrusion detection model, it is compared experimentally with the 1DCNN and I1DCNN models. Their confusion matrices are shown in Figures 7 and 8, with overall accuracy rates of 96.42% and 98.69%, respectively. The I1DCNN model outperforms the 1DCNN model in classification accuracy for all categories except benign and DoS Hulk, for which the two are comparable. Compared with the I1DCNN model without XGboost feature selection, the FS-I1DCNN model proposed in this paper improves every category to some degree, and for a few categories, such as Bot and DoS slowhttptest, the improvement is about 8%. The FS-I1DCNN intrusion detection model thus achieves a notable accuracy improvement for a few categories and a better classification effect overall. Additionally, the average training time per epoch is 903 seconds for FS-I1DCNN and 988 seconds for I1DCNN, a reduction of 85 seconds. This demonstrates that the FS-I1DCNN intrusion detection model improves the classification effect while reducing dimensionality and running time, further enhancing detection efficiency.


4.6. Comparative Experiment of Different Optimization Algorithms
Different optimization algorithms have different application areas and advantages, and they yield different experimental results. Adam can dynamically adjust the learning rate of each parameter using the gradient, enabling adaptive learning and a better classification effect. This paper therefore conducts comparative experiments on various optimization algorithms based on the FS-I1DCNN intrusion detection model, and the outcomes are displayed in Table 3 and Figure 9. At the same learning rate, the Adam optimization algorithm achieves the highest classification accuracy and outperforms Adadelta in accuracy, recall, and F1-score. Its accuracy is also 1% to 3% higher than that of the Adagrad, RMSprop, and Nadam optimization algorithms, so the Adam optimization algorithm has the best classification effect.
Optimization algorithm | Accuracy | Precision | Recall | F1-score
---|---|---|---|---
Adadelta | 0.9763 | 0.9701 | 0.9682 | 0.9698
Adagrad | 0.9775 | 0.9721 | 0.9710 | 0.9733
RMSprop | 0.9823 | 0.9801 | 0.9636 | 0.9691
Nadam | 0.9664 | 0.9558 | 0.9633 | 0.9627
Adam | 0.9936 | 0.9887 | 0.9865 | 0.9877
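A sketch of how such a comparison can be run with Keras is shown below; it reuses the hypothetical build_i1dcnn() helper from the sketch in Section 3.3, and the placeholder random data, epoch count, and batch size are assumptions rather than the paper's settings.

```python
import numpy as np
from tensorflow import keras

# Placeholder data shaped like the selected features (55 per sample, 13 classes)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(2000, 55, 1)); y_train = rng.integers(0, 13, 2000)
X_test  = rng.normal(size=(500, 55, 1));  y_test  = rng.integers(0, 13, 500)

def evaluate(optimizer_name):
    model = build_i1dcnn()   # helper sketched in Section 3.3 (assumed available)
    model.compile(optimizer=optimizer_name, loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(X_train, y_train, epochs=5, batch_size=256, verbose=0)  # short run for the sketch
    return model.evaluate(X_test, y_test, verbose=0)[1]               # test accuracy

# Keras accepts these optimizers by name with their default learning rates
for name in ["adadelta", "adagrad", "rmsprop", "nadam", "adam"]:
    print(name, round(evaluate(name), 4))
```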

4.7. Comparison Experiment with Other Algorithms
To determine the effectiveness of the FS-I1DCNN intrusion detection model proposed in this paper, it was compared on the same dataset with other algorithmic models, including random forest, SVM, DBN, LCNN, and ICNN. The results are shown in Table 4 and Figure 10. Compared with the other five models, the FS-I1DCNN intrusion detection model improves overall accuracy, precision, recall, and F1-score by an average of 4.36%, 1.57%, 2.21%, and 1.75%, respectively. All of the indexes are better than those of the other detection models, demonstrating the model's superior classification performance and applicability.

5. Conclusion
To address the problems of intrusion detection, a model based on FS-I1DCNN is proposed in this paper. After data preprocessing, the 55 features with the highest contribution are selected from the CIC-IDS-2017 dataset using the XGboost feature importance ranking method, and these features are fed into an improved one-dimensional convolutional neural network to complete the final classification task. The FS-I1DCNN intrusion detection model not only resolves the reliance of conventional machine learning approaches on expert experience for feature extraction but also, through the XGboost feature screening method, enhances detection efficiency, reduces training time, and increases overall accuracy. The final experimental results demonstrate that the FS-I1DCNN model outperforms other intrusion detection models across all evaluation indices, achieving more than 90% classification accuracy in each category and 99.36% overall accuracy. In the future, the model will be applied to actual network attack scenarios to further increase its detection efficiency while maintaining high classification performance.
Conflicts of Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Acknowledgments
This research was funded by the New Infrastructure and University Informatization Research Project (Grant no. XJJ202205017).
Open Research
Data Availability
The CIC-IDS-2017 dataset used in this paper is publicly available and is described in reference [25].