Volume 2022, Issue 1, 6733676
Research Article
Open Access

Rotating Machinery Fault Identification via Adaptive Convolutional Neural Network

Luke Zhang (1), Jia Liu (1), Shu Su (1), Tong Lu (1), Chunrong Xue (2), Yinjun Wang (3), Xiaoxi Ding (4, corresponding author), and Yimin Shao (4)

(1) Science and Technology on Reactor System Design Technology Laboratory, Nuclear Power Institute of China, Chengdu 610213, China
(2) Chongqing Research Institute, China Coal Technology Engineering Group, Chongqing 400039, China
(3) School of Mechanical Engineering, Chongqing Technology and Business University, Chongqing 400067, China
(4) State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing 400044, China

First published: 22 July 2022
Academic Editor: Haidong Shao

Abstract

Rotating machinery plays an important role in transportation, the petrochemical industry, industrial production, national defence equipment, and other fields. With the development of artificial intelligence, equipment condition monitoring urgently needs intelligent fault identification methods that can reduce the high false alarm rate under complex working conditions. At present, intelligent recognition models mostly increase the complexity of the network to achieve a high recognition rate, which requires better hardware support and increases the operation time. Therefore, this paper proposes an adaptive convolutional neural network (ACNN) that combines ensemble learning with simple convolutional neural networks (CNNs). The ACNN model consists of an input layer, a subnetwork unit, a fusion unit, and an output layer. The input of the model is a one-dimensional (1D) vibration signal sample, the subnetwork unit consists of several simple CNNs, and the fusion unit weights the outputs of the subnetwork units through a weight matrix. ACNN realizes self-adaptive adjustment of the weight factors through the fusion unit. The adaptive performance and robustness of ACNN for sample recognition under variable working conditions are verified by gear and bearing experiments.

1. Introduction

As key components of mechanical transmission systems, rotating machinery has been widely used in the transmission systems of automobiles [1], ships [2], wind turbines [3], machine tools, etc. However, in actual industrial scenes, it is prone to failure due to harsh service environments and variable speeds and loads [4, 5], and catastrophic accidents may occur if the health state of the equipment is not monitored in a timely manner. Therefore, research on intelligent and efficient recognition models is of great significance for ensuring the healthy operation of equipment [6–8].

At present, common monitoring technologies can be divided into three groups: index-based trend forecast methods, spectrum-based signal analysis methods [9], and data-driven deep learning (DL) methods [10, 11]. The former two rely heavily on expert experience and require more labour input. In the past decade, benefiting from the rapid development of computer systems and intelligent sensing technologies, deep learning methods have attracted much attention. As an end-to-end fault diagnosis technology, deep learning aims to build a learning model and mine the inherent complex mapping between the feature space and fault types by learning from massive labeled data, so as to predict and diagnose unknown samples. Established deep learning methods include the deep belief network (DBN), the auto-encoder (AE) [12], and the convolutional neural network (CNN), which present significant advantages in solving a variety of classification problems. Wang proposed a deep interpolation neural network (DICN) [13], which improves the fault recognition rate of neural networks under time-varying conditions. Eren et al. [14] used a compact 1D-CNN to extract recognition features from raw fault data, with a classification time of less than 1 ms, making it very suitable for the fast monitoring and diagnosis of mechanical equipment. Zhang et al. [15] proposed a deep convolutional neural network with wide first-layer kernels (WDCNN), which uses wide kernels in the first convolutional layer to extract features and suppress high-frequency noise. Liu et al. [16] proposed a multiscale kernel-based residual CNN (MK-ResCNN), which overcomes the vanishing-gradient problem of deep networks and uses multiscale kernels to extract fault features more accurately. Du et al. [17] proposed a convolutional sparse learning model to deconvolve the complex modulation of the transmission path and successfully detected the transient fault impulses in gearbox vibration signals. Huang et al. [18] used a minimax concave penalty function to construct an objective function and constrain the sparsity coefficients; as a result, the repetitive transient information is effectively extracted. Li et al. [19] proposed a power-spectral-entropy-based variational mode decomposition method, introduced it into deep neural networks, and achieved a promising fault recognition rate. Li et al. [20] proposed a framework named WaveletKernelNet, in which the first layer of a standard CNN is replaced with a continuous wavelet transform, achieving an interpretable feature map with clear physical meaning. Sun et al. [21] combined a sparse auto-encoder (SAE) with a DNN and presented an SAE-based CNN to learn more differentiated features from unlabeled data, and experimentally verified its effectiveness. Guo et al. [22] established a hierarchical learning-rate-adaptive deep convolutional neural network, in which a two-dimensional (2D) CNN hierarchical framework with an adaptive learning rate is adopted to recognize bearing fault categories and sizes. Cheng et al. [23] proposed a hybrid time-frequency analysis method, successfully applied to railway bearing fault identification, which can effectively recover fault information from raw signals contaminated by strong noise and other interference.

With the research and development of intelligent recognition methods, models have grown larger and larger in pursuit of high recognition rates, which is not a favourable direction for fault diagnosis: large-scale intelligent recognition models need better hardware support and increase the recognition time, which hinders the industrial application of intelligent recognition methods. Therefore, this paper proposes an adaptive convolutional neural network (ACNN) that combines ensemble learning with simple convolutional neural networks (CNNs). The ACNN model consists of an input layer, a subnetwork unit, a fusion unit, and an output layer. The input of the model is a one-dimensional (1D) vibration signal sample, the subnetwork unit consists of several simple CNNs, and the fusion unit weights the outputs of the subnetwork units by a weight matrix. The weight matrix adjusts the proportion of each subnetwork's output, increasing the influence of subnetwork units that identify correctly and weakening the influence of those that identify incorrectly. By adaptively weighting the outputs of the simple CNNs used as base classifiers, ACNN realizes ensemble learning and improves the recognition rate of the model without significantly increasing the number of network parameters. The adaptive performance and robustness of ACNN for sample recognition under variable working conditions are verified by gear and bearing experiments.

The main innovations and contributions of this paper are summarized as follows.
  • (1)

    ACNN is proposed by combining ensemble learning with a simple 1D-CNN, which can accurately identify rotating machinery faults under unknown working conditions. It provides a new idea and method for intelligent fault diagnosis

  • (2)

    ACNN replaces the fixed combination strategy of traditional bagging ensemble learning with trainable weight parameters, so the combination strategy is refined by continuously optimizing these weights

  • (3)

    The proposed ACNN is generalizable and can also be applied to other machine learning algorithms. Besides, it can also be found that the proposed ACNN not only has good identification performance for multiple load conditions but also shows a strong ability to represent unknown information for samples under varying working conditions, including speed, load, and oil

The rest of this paper is organized as follows. Section 2 presents the theoretical background. In Section 3, the architecture of ACNN is proposed and the training strategy of the model is introduced. In Section 4, the fault diagnosis results of ACNN are compared with those of other network architectures on the gearbox dataset and bearing dataset, and the validity of the model is verified. Section 5 concludes this paper.

2. Theoretical Background

The model proposed in this paper is based on one-dimensional (1D) CNN theory and ensemble learning. The 1D-CNN is essentially a multilayer perceptron that adopts local receptive fields and shared weights. On the one hand, this reduces the number of weights and makes the network easy to optimize; on the other hand, it reduces the risk of overfitting. A 1D-CNN is generally composed of an input layer, 1D convolution layers, activation functions, pooling layers, and a fully connected layer, as shown in Figure 1.

Figure 1. The structure of 1D-CNN.
The calculation formula of 1D convolution is defined as follows:
$$y_i = \sum_{j=1}^{k} w_j \, x_{(i-1)s+j} + b, \quad i \in \mathbb{N}^*, \; i \le \left\lfloor \frac{l-k}{s} \right\rfloor + 1, \tag{1}$$
where l is the length of the 1D input x, k is the length of the convolution kernel w, and s is the convolution stride. $\mathbb{N}^*$ is the set of positive integers. ⌊·⌋ and ⌈·⌉ denote rounding down and up, respectively. $y_i$ is the i-th element of the convolution layer output, x is the 1D original vibration signal, and w and b are the kernel and bias, respectively. The convolution formula is abbreviated as
$$y = w \otimes x + b, \tag{2}$$
where ⊗ denotes the convolution operation and w and b are the kernel and bias, respectively. To reduce network parameters and retain effective signal characteristics, a max-pooling function is applied after each convolution layer as follows:
$$y_i^l = \max_{(i-1)s < j \le is} y_j^{l-1}, \quad 1 \le i \le \left\lceil \frac{\mathrm{len}(y^{l-1})}{s} \right\rceil, \tag{3}$$
where $y_i^l$ is the i-th element of layer l, $y^{l-1}$ is the output of layer l-1, s is the stride of the pooling, len(·) is the length of the vector, ⌈·⌉ denotes rounding up, and the padding type is set to "same". The rectified linear unit (ReLU) is used as the activation function after each convolution operation. ReLU is defined as
$$\mathrm{ReLU}(x) = \max(0, x). \tag{4}$$

The high-dimensional feature map obtained after the convolution operation is input to the pooling layer for subsampling. The most commonly used pooling operation is max-pooling, which divides the feature map into several nonoverlapping regions according to the pooling size and stride, extracts the maximum value of each region as its representative value, and discards the other values of the region. The maximum values of the different regions are then combined in sequence into a new feature map as the output of the pooling layer.
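As a concrete reference, a minimal PyTorch sketch of such a 1D-CNN is given below (the experiments in Section 4 use PyTorch); the channel counts, kernel sizes, and strides here are illustrative assumptions rather than the paper's exact configuration.

```python
# A minimal sketch of a simple 1D-CNN of the kind described above.
# Layer sizes are illustrative assumptions, not the paper's exact settings.
import torch
import torch.nn as nn

class SimpleCNN1D(nn.Module):
    def __init__(self, n_classes: int = 5, in_len: int = 2048):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8, padding=28),  # 1D convolution, Eqs. (1)-(2)
            nn.ReLU(),                                               # activation, Eq. (4)
            nn.MaxPool1d(kernel_size=2, stride=2),                   # max-pooling, Eq. (3)
            nn.Conv1d(16, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=2),
        )
        # Infer the flattened feature length once with a dummy input.
        with torch.no_grad():
            feat_len = self.features(torch.zeros(1, 1, in_len)).numel()
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(feat_len, n_classes),   # fully connected output layer
            nn.Softmax(dim=1),                # per-class probabilities o_l
        )

    def forward(self, x):                     # x: (batch, 1, in_len)
        return self.classifier(self.features(x))
```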

3. Adaptive Convolutional Neural Network

3.1. Motivation

Ensemble learning refers to a machine learning method that integrates multiple base classifiers with certain criteria or strategies in order to obtain a strong learner for the target task. For a complex problem, multiple experts may give different opinions and solutions; if these are discussed and combined into a comprehensive opinion and solution, the result is often better than any single one of them. Ensemble learning completes the learning task based on this idea: for a specific target task, use the sample data to train a few base learners with certain training criteria and strategies, and then use an appropriate fusion criterion or algorithm to integrate the base classifiers into a strong classifier with excellent performance. Figure 2 shows the general structure of ensemble learning. Traditional bagging is an algorithm that combines the outputs of weak learners through a fixed combination strategy. The bagging algorithm not only improves accuracy and stability but also avoids overfitting by reducing the variance of the results. Voting or averaging is a common combination strategy; however, a traditional combination strategy cannot be updated. We replace the traditional combination strategy with network weight parameters and continuously adjust the output weight of each weak classifier through parameter update iterations. Combining the idea of the bagging algorithm with CNNs, the ACNN framework is proposed.

Figure 2. Schematic diagram of ensemble learning.

3.2. ACNN Architecture

The ACNN model is mainly composed of four parts: the input layer, the subnetwork unit, the fusion unit, and the output layer. The input layer receives the time-domain signal and passes it to the subnetwork unit; the architecture is shown in Figure 3. The subnetwork unit is composed of 1D-CNN fault identification subnetworks. The number of subnetworks is consistent with the number of working conditions (speeds or loads) of the samples in the training dataset. Each subnetwork has the same structure and corresponds to a different working condition. ACNN can accurately extract the fault-sensitive information of rotating machinery under variable working conditions and accurately identify the fault type. The fusion unit stores the weight matrix obtained through supervised training, which is used to assign different weights to the output results of the subnetworks; it then performs fusion learning and outputs the final recognition result.

Figure 3. Illustration of the proposed ACNN architecture.

The processing of the input data by the subnetwork unit is shown in Figure 3. During fault identification, a fault sample to be identified is input into the ACNN; the input layer receives the sample and passes it to the subnetwork unit. Each subnetwork receives the sample, is activated, and uses the stored fault type information and sample feature distribution information to identify the sample.

Among them, the subnetwork corresponding to the speed condition of the input sample outputs the original fault identification result with high accuracy, while the other subnetworks show inconsistent responses. The construction of the subnetwork unit therefore transforms the fault identification problem under multiple working conditions into fault identification problems under single working conditions; that is, for each fault sample from a known working condition, there is a subnetwork identification module with high identification accuracy corresponding to that condition. After the subnetwork unit completes the processing of the input data and obtains the original recognition results O1, O2, …, ON, these results are input to the fusion output layer for weighted information fusion learning, as shown in the fusion unit in Figure 3. The identification subnetwork consistent with the working condition of the input sample outputs a high-accuracy fault identification result, while the other subnetworks output low-accuracy results. In other words, after the fusion output layer performs weighted fusion learning on the original identification results O1, O2, …, ON, the high-accuracy results occupy a large proportion of the final identification result and the low-accuracy results are suppressed, so that the accuracy of the final identification result is guaranteed.

The fusion layer uses the weight matrix W obtained through supervised training to perform weighted fusion learning on the original recognition results O1, O2, …, ON. The weight matrix W and the output of the N-th subnetwork are defined as follows:
$$W = \begin{bmatrix} \omega_{11} & \omega_{12} & \cdots & \omega_{1L} \\ \omega_{21} & \omega_{22} & \cdots & \omega_{2L} \\ \vdots & \vdots & \ddots & \vdots \\ \omega_{N1} & \omega_{N2} & \cdots & \omega_{NL} \end{bmatrix}, \tag{5}$$
$$O_N = \left[ o_1^N, o_2^N, \ldots, o_L^N \right], \tag{6}$$
where $\omega_{nl}$ is the weight assigned to fault type l under speed condition n, $O_N$ is the original recognition result vector, $o_L^N$ represents the probability value of the input fault sample having fault type L under condition $V_N$, and L is the number of output nodes of a subnetwork, equal to the number of fault types. The original identification results O1, O2, …, ON of the subnetworks are used as the intermediate input of the fusion unit. They are given different weights by the weight matrix W, and the outputs for the same fault type in the different original identification results are then accumulated, as shown in the following formula (taking fault type 1 as an example):
$$O_1 = \sum_{n=1}^{N} \omega_{n1} \, o_1^n. \tag{7}$$
The final identification result O is as follows:
$$O = \left[ O_1, O_2, \cdots, O_L \right], \quad O_l = \sum_{n=1}^{N} \omega_{nl} \, o_l^n, \tag{8}$$
where $O_L$ represents the probability that the input fault sample belongs to fault type L. The fault type corresponding to the maximum probability is output through the maximum function, and that is the final fault identification result. The maximum function Max(O) sets the largest element in the final identification result O = [O1, O2, ⋯, OL] to 1 and the remaining elements to 0. Therefore, the prediction result of the ACNN model for the input sample is the fault type corresponding to the element 1 in the final fault identification result. The maximum function is as follows:
$$\mathrm{Max}(O) = \left[ f(O_1), f(O_2), \cdots, f(O_L) \right], \tag{9}$$
where
$$f(O_l) = \begin{cases} 1, & O_l = \max\{O_1, O_2, \cdots, O_L\}, \\ 0, & \text{otherwise}. \end{cases} \tag{10}$$
The weighted fusion output result can be obtained from (5) to (8), and its expression is as follows:
$$O = \mathrm{Max}\left( \left[ \sum_{n=1}^{N} \omega_{n1} o_1^n, \; \sum_{n=1}^{N} \omega_{n2} o_2^n, \; \cdots, \; \sum_{n=1}^{N} \omega_{nL} o_L^n \right] \right). \tag{11}$$
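For clarity, the following is a minimal NumPy sketch of the fusion computation in (5)-(11); the function and variable names are ours, not from the paper's code.

```python
# A sketch of the fusion unit's forward pass, Eqs. (5)-(11).
import numpy as np

def fuse(O_sub: np.ndarray, W: np.ndarray) -> np.ndarray:
    """O_sub: (N, L) subnetwork outputs o_l^n; W: (N, L) weight matrix omega_nl."""
    O = (W * O_sub).sum(axis=0)          # O_l = sum_n omega_nl * o_l^n, Eqs. (7)-(8)
    one_hot = np.zeros_like(O)
    one_hot[np.argmax(O)] = 1.0          # Max(O): largest element -> 1, rest -> 0, Eqs. (9)-(10)
    return one_hot
```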
The weight matrix W of the fusion output layer is obtained through supervised training. According to the error between the final recognition result of the fusion output layer and the real label of the input sample, the classical gradient descent algorithm is used to optimize the network parameters, and the back-propagation algorithm is used to transfer the error layer by layer during training. The network parameters are initialized randomly from a normal distribution, with initial values in the range (-1, 1). When the model parameters reach an optimal value after continuous update iterations, the model training is completed. Assuming that the real label of a fault sample set is T and the final fault identification result is O, the error between the final identification result output by the fusion output layer and the real label of the input sample can be calculated as follows:
$$e_p = \frac{1}{p} \sum_{i=1}^{p} \left\| O_P^{(i)} - O_T^{(i)} \right\|^2, \tag{12}$$
where $O_P$ and $O_T$, respectively, represent the final fault identification result output by the model and the real label of the fault sample set, and p is the number of training samples. In the training of the weight matrix W, W is optimized by minimizing the error $e_p$. Given that the final recognition result is obtained through the maximum function, the influence of the maximum function Max(O) on the back propagation of the training error should be considered first. According to the classical chain rule in back propagation and the definition of the maximum function Max(O), the partial derivative of Max(O) with respect to its input variable $O_l$ is calculated as follows:
$$\frac{\partial \, \mathrm{Max}(O)}{\partial O_l} = \lim_{\Delta \to 0} \frac{\mathrm{Max}(O_1, \cdots, O_l + \Delta, \cdots, O_L) - \mathrm{Max}(O_1, \cdots, O_l, \cdots, O_L)}{\Delta}, \tag{13}$$
where $O_l$ is the l-th element in the final recognition result O = [O1, O2, ⋯, OL] and Δ is an infinitesimal quantity. In line with the definition of the maximum function:
$$\frac{\partial \, \mathrm{Max}(O)}{\partial O_l} = \lim_{\Delta \to 0} \frac{(O_l + \Delta) - O_l}{\Delta} = 1. \tag{14}$$
The partial derivative of the maximum function with respect to any input is 1, so the maximum function has no effect on the error back propagation and chain derivation; hence, its influence need not be considered in the training of the weight matrix W. According to the output function of the fusion unit and the classical chain rule in the back-propagation algorithm, the partial derivative of the final recognition result O with respect to the weight matrix W is as follows:
$$\frac{\partial O_l}{\partial \omega_{nl}} = o_l^n. \tag{15}$$
Considering the case of a single training sample, the partial derivative of the error function $e_p$ with respect to the weight matrix W is as follows:
$$\frac{\partial e_p}{\partial \omega_{nl}} = 2 \left( O_l - T_l \right) o_l^n, \tag{16}$$
where $o_l^n$ is the output value of the l-th node of the n-th identification model in the original identification results, $O_l$ is the output value of the l-th node in the final identification result, and $T_l$ is the l-th node value of the real label of the input sample. Based on the classical gradient descent algorithm, the optimization formula for the weight matrix W is as follows:
$$\omega_{nl} \leftarrow \omega_{nl} - \eta \frac{\partial e_p}{\partial \omega_{nl}}, \tag{17}$$
where η is the learning rate in the training process of the weight matrix W.
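A corresponding single-sample sketch of the update in (12)-(17), under the paper's argument above that Max(O) has unit derivative, might look as follows; the names are again illustrative.

```python
# A sketch of the weight-matrix gradient step, Eqs. (12)-(17), for one sample.
import numpy as np

def update_W(W: np.ndarray, O_sub: np.ndarray, T: np.ndarray, lr: float = 0.01) -> np.ndarray:
    """W, O_sub: (N, L); T: (L,) one-hot true label; returns the updated W."""
    O = (W * O_sub).sum(axis=0)              # fused output O_l, Eq. (8)
    dE_dO = 2.0 * (O - T)                    # derivative of the squared error, Eq. (16)
    grad = dE_dO[np.newaxis, :] * O_sub      # dE/d(omega_nl) = 2(O_l - T_l) * o_l^n
    return W - lr * grad                     # gradient descent step, Eq. (17)
```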

3.3. ACNN Training

Through the model training process, the fault feature mapping information under variable conditions is extracted and saved into the identification subnetworks for the different conditions, and the analysis results of the subnetworks are fused through the weighted information fusion learning algorithm to obtain the final fault identification results. The flowchart of the ACNN fault identification method is shown in Figure 4, and a code sketch of the procedure is given after the figure. The training and testing steps are as follows:
  • (1)

    Extract the time-domain fault vibration signals under different conditions in the actual industrial scene, and construct the fault sample set covering all conditions

  • (2)

    Divide the fault sample set into sample sets under the different conditions, and then train the subnetworks and the weight matrix of the ACNN model

  • (3)

    Feed a fault sample from one of the conditions in the same scene into the trained ACNN model to obtain the fault type of this sample

Figure 4. The training and fault identification of ACNN.

4. Experimental Verification and Analysis

The performance of ACNN is verified on a gear dataset and a bearing dataset. The gear dataset [24] came from Chongqing University, and the bearing dataset came from Case Western Reserve University (CWRU) [25].

4.1. Case I: Gear Dataset

  • (1)

    Test platform and dataset description

The schematic diagram of the gear test bench is shown in Figure 5. It consists of five parts: the drive motor, the two-stage spur gear reducer, the speed sensor, the magnetic powder brake, and the control cabinet. The speed of the drive motor and the load of the magnetic powder brake are controlled by the control cabinet, which enables the gearbox to run stably under various speeds and loads. The transmission ratio of the two-stage spur gear reducer is 3.59; the gear ratio of the first transmission stage is 23/39, and that of the second transmission stage is 25/53. The motor is a YVFF-112M-4 DC motor with a rated power of 4 kW, a rated voltage of 380 V, and a maximum speed of 1200 rpm. The magnetic powder brake is a CZ10 model with a rated voltage of 380 V and a rated current of 1.2 A, and it can provide a controllable, stable torque load for the experimental system within 0 to 500 N·m.

Figure 5. Test platform for acquiring vibration signals.

The structural parameters of the gearbox are shown in Table 1. The faulty gear is the intermediate transmission gear with 25 teeth. The gear faults include tooth surface spalling, root crack, tooth surface pitting, and tooth fracture, as shown in Figure 6. Vibration sensors are mounted in the vertical direction on the three transmission shafts. The training and test data are obtained from the vibration signal of the sensor at the middle drive shaft position.

Table 1. Structure parameters of the experimental gearbox.

| Number of high-speed gear teeth | Number of low-speed gear teeth | Transmission ratio | Center distance |
| 23 | 39 | 1.696 | 93 mm |
| 25 | 53 | 2.12 | 117 mm |

Figure 6. Four gears with different fault conditions.

The gear fault and vibration acquisition settings are shown in Table 2.

Table 2. Gear fault experiment conditions and experiment parameter settings.

| Fault type | Fault size (mm) |
| Healthy | None |
| Tooth surface spalling | 60 × 3 × 0.5 |
| Root crack | 60 × 3 |
| Tooth surface pitting | 2 mm |
| Broken tooth | 30% tooth width (18 mm) |

All fault types share the same acquisition settings: input speed of 700, 750, 800, 850, 900, 950, 1000, 1050, and 1100 rpm; sampling frequency of 20480 Hz; sampling time of 15 s.
The vibration was measured at a sampling frequency of 20.48 kHz with an input torque of 200 N·m, and the acquisition time is 15 seconds. In the gear fault simulation experiment, 13 speed conditions are set evenly in the speed range of 500 to 1100 rpm, with an interval of 50 rpm between adjacent speeds. In this experiment, each sample contains 2048 data points, and 1000 samples were obtained for each speed and fault type by sliding sampling with a step of 2048 points (a code sketch of this sampling is given below). To verify the performance of the proposed network, the training sets and test sets use samples with different speeds, and 4 different training sets are set up. The settings of the training and test sets are shown in Table 3. Each training set and test set contains five gearbox states: healthy, tooth surface spalling, tooth root crack, tooth surface pitting, and broken tooth. The rotation speeds of the training sets differ, and the rotation speed of the test sets ranges from 500 to 1100 rpm. The number of subnetworks in ACNN is consistent with the number of working conditions; in this experiment, ACNN contains two subnetworks for training sets A and B and three subnetworks for training sets C and D.
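The sliding sampling described above can be sketched as follows; the function name is illustrative, and the configurable step lets the same helper cover the 2048-point step used here and the 200-point step used later for the bearing data.

```python
# A sketch of sliding-window sampling: cut fixed-length windows from a 1D
# vibration record. win=2048 matches the sample length used in both cases;
# step=2048 gives non-overlapping windows, step=200 gives overlapping ones.
import numpy as np

def sliding_samples(signal: np.ndarray, win: int = 2048, step: int = 2048) -> np.ndarray:
    """Return an array of shape (num_windows, win)."""
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])
```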
  • (2)

    Classification comparison and analysis

Table 3. Training sets and test sets of the models.

| Datasets | Train sets | Test sets | Number of training/testing samples |
| A | 600, 900 rpm | 500 to 1100 rpm | 10000/65000 |
| B | 800, 1100 rpm | 500 to 1100 rpm | 10000/65000 |
| C | 500, 700, 900 rpm | 500 to 1100 rpm | 15000/65000 |
| D | 550, 750, 950 rpm | 500 to 1100 rpm | 15000/65000 |

In order to verify the advantage of ACNN, CNN [14], the residual network (ResNet) [26], the deep CNN with wide first-layer kernels (WDCNN) [15], and the multiscale kernel-based ResCNN (MK-ResCNN) [16] are used as comparison networks. The comparison models and ACNN are built with Python 3.7 and PyTorch 1.7.1. The main configuration of the computer is as follows: CPU: i9-9900K, RAM: 128 GB, GPU: RTX 2080Ti. The five methods (ACNN, CNN, ResNet, WDCNN, and MK-ResCNN) are trained and tested on the datasets in Table 3; the classification results are shown in Figure 7.

Figure 7. Classification performance by different methods on the CU gear datasets.

There are no overlapping samples between the training set and the test set; this verification method is also called fixed-dataset verification, and it better verifies a model's ability to recognize samples from unknown working conditions and its robustness. The recognition rates of CNN, ResNet, WDCNN, MK-ResCNN, and ACNN on dataset A are 91.53%, 94.22%, 93.8%, 94.65%, and 95.12%, respectively. The recognition accuracy of the ACNN model is higher than that of the four comparison models, and its sample recognition rate is also higher on datasets B, C, and D. The average recognition rate of ACNN is 3.69%, 2.67%, 2.33%, and 0.86% higher than that of CNN, ResNet, WDCNN, and MK-ResCNN, respectively. Although the recognition rate of the MK-ResCNN model is close to that of the ACNN model, MK-ResCNN has more than three times as many network parameters and a longer training time. The experimental results show that ACNN has a high recognition rate and strong robustness. The recognition accuracy on datasets C and D is higher than on datasets A and B because the training sets of C and D contain samples from more speeds, which is reasonable. The confusion matrix of the identification result on dataset D is shown in Figure 8.

Figure 8. Confusion matrix of the ACNN model on gear dataset D.

From the confusion matrix of the recognition results, it can be found that the recognition accuracy of healthy samples is the highest. There are 75 tooth surface spalling samples incorrectly classified as tooth surface pitting, and 60 tooth surface pitting samples incorrectly identified as tooth surface spalling, which shows that the fault characteristics of tooth surface spalling and tooth surface pitting are similar. The numbers of tooth root crack samples incorrectly identified as tooth surface spalling and tooth fracture are 21 and 45, respectively. The parameters and computing times of the comparison models on dataset A are shown in Table 4. It can be found that ACNN has fewer parameters than the comparison models and the shortest training and testing times.

Table 4. Parameters and computing time of the models on gear dataset A.

| Models | Training time (s) | Testing time (s) | Model parameter quantity |
| CNN | 0.501 | 0.304 | 211672 |
| ResNet18 | 1.015 | 0.156 | 661508 |
| WDCNN | 0.241 | 0.162 | 99270 |
| MK-ResNet | 3.395 | 1.261 | 835274 |
| ACNN | 0.116 | 0.128 | 54076 |

4.2. Case II: CWRU Bearing Dataset

  • (1)

    Test platform and data description

The experimental data are collected from the accelerometer of a motor-driven mechanical system (Figure 9) at a sampling frequency of 12 kHz. There are four kinds of bearing states: normal, inner ring fault, ball fault, and outer ring fault. The fault sizes of the three fault kinds are 0.007 inch, 0.014 inch, and 0.021 inch; therefore, there are 10 bearing states to be classified. The fault frequencies of the different bearing fault types (inner ring, outer ring, and ball) differ, so we use these data to verify the performance of ACNN.

Figure 9. Test stand for acquiring vibration signals.
In this experiment, to verify the recognition performance of the model for samples with unknown loads, the datasets are divided into training/testing datasets according to the load of the samples. The input sample is the original vibration signal with a length of 2048 points, where 400 samples per load and fault state are obtained by sliding sampling with a step of 200 points, so samples of size 400 × 10 are obtained under each load. The samples of any two or three loads are selected as the training datasets, and the remaining samples are used as the testing datasets. The information on the training and testing datasets is shown in Table 5. There is no intersection between the training datasets and the test datasets, which further verifies the performance of the ACNN model in identifying the health state of samples under unknown loads.
  • (2)

    Classification comparison and analysis

Table 5. Training sets and test sets of the models.

| Datasets | Train datasets | Test datasets | Number of training/testing samples |
| A | 0, 1 hp | 2, 3 hp | 8000/8000 |
| B | 0, 2 hp | 1, 3 hp | 8000/8000 |
| C | 0, 3 hp | 1, 2 hp | 8000/8000 |
| D | 1, 2 hp | 0, 3 hp | 8000/8000 |
| E | 1, 3 hp | 0, 2 hp | 8000/8000 |
| F | 2, 3 hp | 0, 1 hp | 8000/8000 |
| G | 1, 2, 3 hp | 0 hp | 12000/4000 |
| H | 0, 2, 3 hp | 1 hp | 12000/4000 |
| I | 0, 1, 3 hp | 2 hp | 12000/4000 |
| J | 0, 1, 2 hp | 3 hp | 12000/4000 |

In order to verify the advantage of ACNN, CNN [14], ResNet [26], WDCNN [15], and MK-ResCNN [16] are again used as comparison networks. The five methods are trained and tested on the datasets in Table 5; the classification results are shown in Figure 10. The recognition accuracy of ACNN on bearing dataset A is 95.49%, which is 6%, 5.79%, 7.01%, and 13.24% higher than CNN, ResNet, WDCNN, and MK-ResCNN, respectively. On bearing dataset B, the recognition accuracy of ACNN is 93.12%, which is 4.49%, 0.01%, 7.11%, and 10.84% higher than CNN, ResNet, WDCNN, and MK-ResCNN, respectively. The average recognition rates of the ACNN, CNN, ResNet, WDCNN, and MK-ResCNN models on the bearing datasets are 94.06%, 88.33%, 89.78%, 87.50%, and 86.06%, respectively. The average accuracy of ACNN is more than 4.28% higher than that of the comparison models, which proves that ACNN has strong recognition performance and adaptability to samples under variable load conditions. To explore the identification details of the ACNN model, the confusion matrix of the output results on dataset A is shown in Figure 11.

Figure 10. Classification performance by different methods on the CWRU bearing datasets.
Figure 11. Confusion matrix of the ACNN model on bearing dataset A. Health, IF7, IF14, IF21, B7, B14, B21, OF7, OF14, and OF21 represent the health status, 0.007 inch inner ring fault, 0.014 inch inner ring fault, 0.021 inch inner ring fault, 0.007 inch ball fault, 0.014 inch ball fault, 0.021 inch ball fault, 0.007 inch outer ring fault, 0.014 inch outer ring fault, and 0.021 inch outer ring fault, respectively.

From the confusion matrix, it can be found that all health status samples are correctly classified. The number of 0.014 inch inner ring fault samples incorrectly identified as ball faults is 124, and the number of ball fault samples incorrectly identified as outer ring faults is 81, which shows that there are similarities between the inner ring fault characteristics and the ball fault characteristics. The parameters and computing times of the comparison models on dataset A are shown in Table 6. It can be found that ACNN has fewer parameters than the comparison models and the shortest training and testing times.

Table 6. Parameters and computing time of the models on bearing dataset A.

| Models | Training time (s) | Testing time (s) | Model parameter quantity |
| CNN | 0.401 | 0.043 | 211672 |
| ResNet18 | 0.812 | 0.022 | 661508 |
| WDCNN | 0.193 | 0.023 | 99270 |
| MK-ResNet | 2.716 | 0.180 | 835274 |
| ACNN | 0.093 | 0.018 | 54076 |

5. Conclusions

This paper proposes an adaptive convolutional neural network (ACNN) that combines ensemble learning with simple convolutional neural networks. The ACNN model consists of an input layer, a subnetwork unit, a fusion unit, and an output layer. The input of the model is a one-dimensional (1D) vibration signal sample, the subnetwork unit consists of several simple CNNs, and the fusion unit weights the outputs of the subnetwork units by a weight matrix. In the gear and bearing experiments, the performance and robustness of the ACNN model are verified by comparison with the CNN, ResNet, WDCNN, and MK-ResCNN models.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Chongqing Natural Science Foundation (cstc2019jscx-msxmX0360, cstc2019jcyj-msxmX0346), the National Natural Science Foundation of China (Grant No. 51805051), and the Central University Basic Research Fund (2020CDJGFCD002).

Data Availability

Data are available on request.
