Detectable Inspection: Propeller Nick Detection Model Development Using YOLOv8
Abstract
Purpose: Nicks in the propeller can lead to catastrophic failures if not repaired properly, but proper detection can be difficult. To improve nick inspection tasks, the researchers intend to develop a nick detection model using the object detection algorithm, You Only Look Once.
Design/Methodology/Approach: The researchers select the YOLOv8 algorithm. To develop a nick detection model, the researchers collected 4423 nick images under various conditional variable settings. Using 1800 preprocessed nick images, six models are trained with different hyperparameter settings and assessed for performance.
Findings: The best-performing model achieved 0.986 precision, 0.982 recall, and an F1 score of 0.984. Additionally, the researchers performed an explainability analysis using Grad-CAM to check whether this model properly extracts nick features during the training process.
Originality/Value: This research contributes to improving inspectors’ propeller nick inspection maturity using this nick detection model. Therefore, this study can potentially reduce general aviation accidents caused by propeller failure.
1. Introduction
In powered flight, there is arguably not a more critical part in the production of thrust than the propeller. Proper maintenance practices must be followed because the propeller is not only essential for flight, but it is also under a tremendous amount of stress from various sources such as centrifugal force, torque, thrust, and aerodynamic loads [1]. The propeller also operates in dynamic operating conditions. Stresses and operating conditions can lead to defects on the surface of the propeller, such as nicks or cracks, which can lead to the failure of the propeller [1].
Finding and repairing defects while maintaining a smooth, aerodynamic surface are key inspection and maintenance practices. Mechanics and pilots perform visual inspections to ensure the propeller is kept in an airworthy condition. If a defect is found, repairs must be made by a Federal Aviation Administration (FAA)–certified powerplant maintenance technician. Eligibility requirements can be found in 14 CFR Part 65 [2], maintenance practices and scopes in 14 CFR Part 43 [3], and maintenance details in Advisory Circular (AC) 20-37 [1]. Current maintenance practices include both corrective maintenance and preventive maintenance. Corrective maintenance is performed when a defect is detected, to restore the propeller to an airworthy condition. Preventive maintenance, such as minor preservation operations or the replacement of small standard parts, is generally performed on a periodic or time-based schedule. In addition, aircraft operators perform both pre- and postflight inspections around aircraft operations.
Since propeller failure can result in injury, death, or significant equipment failure, it is important for pilots, operators, and technicians to properly identify and repair possible sources of fatigue damage, such as a nick. Machine learning (ML) has been widely adopted in various domains as a decision-support tool to reduce human decision errors. Depending on the algorithm, ML models can generate outputs that provide explanations, predictions, or definitions of certain situations [4]. The principle of ML is to train a computer model on patterns and characteristics from provided samples. The trained model can then recognize objects that resemble the samples [5]. In this research, the researchers intend to develop a nick detection model using the object detection algorithm, You Only Look Once (YOLO).
2. Materials and Methods
2.1. YOLO
YOLO is a convolutional neural network (CNN)–based ML algorithm. The first generation was developed by [6]; it outperforms two other well-established detection algorithms, (a) the deformable parts model (DPM) and (b) the fast region-based convolutional neural network (R-CNN), in terms of speed [6, 7]. YOLOv8 was released in January 2023 by Jocher et al. [8], who report that YOLOv8 leads previous YOLO versions in both accuracy and speed. YOLOv8 also provides five model scales: nano (n, the smallest), small (s), medium (m), large (l), and x-large (x, the largest). As the scale decreases, training and inference become faster, but prediction accuracy decreases. Conversely, as the scale increases, prediction accuracy increases, but training and inference take longer.
- Backbone: YOLOv8 uses a cross stage partial (CSP) [12] backbone to extract the main input features. Spatial pyramid pooling fast (SPPF) and channel fusion (C2f) modules are used to improve computational efficiency and increase accuracy.
- Neck: The neck acts as a bridge between the backbone and the head to improve prediction results. It refines the features aggregated from the backbone, focusing on geometric and conceptual information, using two structures: feature pyramid networks (FPNs) [13] and path aggregation networks (PANs) [14].
- Head: The head predicts objects in a new input using the features extracted by the backbone and neck. YOLOv8 adopts an improved anchor-free approach that helps locate objects in the new input.
- Loss: The loss measures the error between the predicted location and the actual location and consists of three components: distribution focal loss (DFL), classification loss (CLS), and bounding box loss (BOX).

2.2. ML Implementation Studies
The researchers review ML implementation studies for propellers or blades in the aeronautical domain that use borescope inspection tasks to detect defects in gas turbine blades [15–21] and in propellers [22, 23]. Furthermore, the researchers review other YOLO implementation case studies: a YOLOv1 model for inspecting steel strip surfaces that can detect scars, scratches, inclusions, burrs, seams, and iron scale [24]; a YOLOv3 model that can detect aircraft skin defects [25]; a YOLOv3 model for detecting turbine engine blade defects developed by Zhang et al. [26]; and YOLOv5 models for detecting wind turbine blade defects [17, 27, 28].
2.3. Propeller Failure
While AC 20-37 [1] identifies various defect types on the propeller, this research focuses on the nick, a “sharp, notchlike displacement of metal usually found on leading and trailing edges” of propellers ([1], p. 3). Common propeller failures that begin in the tip are frequently caused by a nick on the propeller surface [1, 29]. Caldwell [30, 31] states that a nick is a tip failure initiator; when the aircraft starts, the propeller stirs up objects or material on the ground, strikes them, and a nick is created. Furthermore, when a propeller fails due to a nick, it does not show any structural weaknesses or internal flaws [29, 30]. This means that nick damage, rather than inherent weaknesses or design flaws, plays a significant role in propeller failure. Therefore, finding nicks is a crucial task, and nicks should be found before aircraft takeoff [29–33].
Once a nick is found, it is also essential to repair the propeller as soon as possible [31]. The general steps to repair a nick are to round off sharp areas using a file. Then, the technician removes tool marks and polishes the propeller with ever increasing grits of emery and crocus cloth before inspecting with a magnifying glass. The detailed propeller repair steps, tools, and consumables are described in the original equipment manufacturers’ (OEM) maintenance procedures, or in the absence of OEM manuals, sometimes technicians may use AC 20-37 as approved instructions.
Improper inspection and maintenance can result in a failure to identify and repair nicks, cracks, and dents that can result in the catastrophic failure of the propeller. When propellers fail, injury and damage to the aircraft are substantial. The inability to identify nicks can be influenced by different factors such as risk sensitivity, individual personality, and human error factors [34, 35], including the lack of experience and knowledge. New technologies, such as ML, can be used to support pilots and technicians to identify nicks that may previously have been overlooked in propellers so they can be properly repaired.
3. Problem Generation
The ability of an inspector to perform a proper inspection can vary based on multiple factors. To improve propeller inspection outcomes, the researchers intend to develop a nick detection model using the YOLOv8 algorithm to support inspectors in their propeller inspection tasks. This model can detect nicks that inspectors may miss during inspections.
4. Model Development
To develop a nick detection model, the researchers follow the model development workflow shown in Figure 2. This workflow covers research design and results. In this model development section, nick images are collected and preprocessed, YOLOv8 is selected, and six hyperparameter settings are chosen to train and validate the YOLOv8 models. Next, in the Results section, the researchers select the best-performing nick detection model among the six YOLOv8-M models based on the validation results.

4.1. Nick Image Collection
Since obtaining actual nick images is challenging, the researchers create a test coupon from an unairworthy propeller blade used in an educational setting to train the models with images that closely resemble actual nicks. To achieve this, the researchers create six nicks in the blade using a hammer and aluminum chisel; see Figure 3.

- View angles (a): The view angle refers to the camera’s vertical position relative to the propeller. Images are captured at three angles: 90°, 45°, and 0°, as shown in Figure 4.
- Standpoint range (b): The standpoint refers to the camera’s location on the horizontal plane relative to the propeller. It ranges from 0° to 180°, as shown in Figure 4.
- Light: Inspectors typically use a flashlight to perform night inspections. When a flashlight is shined on the surface of the propeller, a nick, if present, reflects the light differently from the surrounding surface. Inspectors typically do not use flashlights for daytime inspections. Therefore, two light conditions are applied across the view angle and standpoint range variables: in the no-light condition, nick images are collected without a flashlight; in the light condition, a flashlight is directed at the nick in a dark environment, and images of the reflecting nick are collected.
Considering all conditional variables on the six nicks, the researchers collected 4423 images in total. The number of nick images under each condition is provided in Table 1.
Nick | #1 | | | #2 | | | #3 | | |
---|---|---|---|---|---|---|---|---|---|
Standpoint | 0°–180° | | | 0°–180° | | | 0°–180° | | |
View angle | 90° | 45° | 0° | 90° | 45° | 0° | 90° | 45° | 0° |
Light | 99 | 127 | 113 | 158 | 88 | 116 | 106 | 135 | 119 |
No light | 112 | 65 | 133 | 112 | 113 | 117 | 126 | 122 | 128 |
Nick | #4 | | | #5 | | | #6 | | |
Standpoint | 0°–180° | | | 0°–180° | | | 0°–180° | | |
View angle | 90° | 45° | 0° | 90° | 45° | 0° | 90° | 45° | 0° |
Light | 100 | 109 | 113 | 169 | 134 | 184 | 158 | 115 | 110 |
No light | 121 | 138 | 92 | 134 | 113 | 144 | 137 | 131 | 132 |
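As a quick consistency check, the per-condition counts in Table 1 can be summed to confirm the reported total of 4423 images. The sketch below only transcribes the table; the dictionary layout and function names are illustrative choices, not part of the original workflow.

```python
# Image counts transcribed from Table 1.
# Key: nick number. Value: counts at view angles (90°, 45°, 0°) per light condition.
TABLE_1 = {
    1: {"light": (99, 127, 113), "no_light": (112, 65, 133)},
    2: {"light": (158, 88, 116), "no_light": (112, 113, 117)},
    3: {"light": (106, 135, 119), "no_light": (126, 122, 128)},
    4: {"light": (100, 109, 113), "no_light": (121, 138, 92)},
    5: {"light": (169, 134, 184), "no_light": (134, 113, 144)},
    6: {"light": (158, 115, 110), "no_light": (137, 131, 132)},
}


def total_images(table):
    """Sum every cell of the table."""
    return sum(sum(counts) for nick in table.values() for counts in nick.values())


def smallest_cell(table):
    """Smallest number of images collected for any single condition."""
    return min(min(counts) for nick in table.values() for counts in nick.values())


print(total_images(TABLE_1))   # 4423
print(smallest_cell(TABLE_1))  # 65
```

The smallest cell holds 65 images, so every condition contains enough images for a fixed-size random subsample of 50.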
4.2. Nick Image Collection Preprocessing
- Random subsampling: Nick images are collected under the conditional variables shown in Table 1. However, due to limited computational resources, the researchers use Python’s random function to randomly subsample 50 images for each nick under each conditional variable setting (6 nicks × 3 view angles × 2 light conditions × 50 images = 1800 images).
- Image labeling: Since YOLO is a supervised learning algorithm, it requires manual labeling of nick boundaries in all subsampled images. The authors, who hold FAA-certified Airframe and Powerplant licenses, manually labeled the boundaries on the nick images as class “NICK” using V7-Labs [36] software.
- Image partition: Using the same Python random function, 40 of the 50 subsampled images in each setting are selected for training the models, and the remaining 10 are used for validation. Therefore, 1440 images are allocated for training, and 360 images are allocated for validation.
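The subsampling and partition steps above can be sketched as follows. This is a minimal illustration that assumes the images for each setting are available as a list of file paths; the file names, the fixed seed, and the `subsample_and_split` helper are hypothetical and are not taken from the original study.

```python
import random


def subsample_and_split(image_paths, n_sample=50, n_train=40, seed=0):
    """Randomly draw n_sample images from one setting, then use the first
    n_train of the (already randomly ordered) draw for training and the
    rest for validation. The seed is an assumption added for repeatability."""
    rng = random.Random(seed)
    chosen = rng.sample(image_paths, n_sample)
    return chosen[:n_train], chosen[n_train:]


# 6 nicks x 3 view angles x 2 light conditions = 36 settings.
settings = [
    (nick, angle, light)
    for nick in range(1, 7)
    for angle in (90, 45, 0)
    for light in ("light", "no_light")
]

train, val = [], []
for nick, angle, light in settings:
    # Hypothetical file names; 65 placeholder images per setting.
    paths = [f"nick{nick}_{angle}deg_{light}_{i:03d}.jpg" for i in range(65)]
    train_part, val_part = subsample_and_split(paths)
    train.extend(train_part)
    val.extend(val_part)

print(len(train), len(val))  # 1440 360
```

Splitting inside each setting, rather than across the pooled 1800 images, keeps every nick, view angle, and light condition represented in both sets at the same 40:10 ratio.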
4.3. Model Selection
YOLOv8 offers faster inference and higher accuracy than previous YOLO versions [8]. Furthermore, YOLOv8 supports image augmentation, which helps the model fit better. Among the five YOLOv8 model scales, YOLOv8-M is selected because it offers a good balance between accuracy and prediction speed [8].
4.4. Hyperparameters
Since different hyperparameters can affect the performance of each model, three hyperparameters are varied when training the YOLOv8-M model: learning rate, number of epochs, and batch size. These three hyperparameters are combined in different settings to find an optimized model.
The researchers use a grid search [37] for two hyperparameters, batch size and learning rate. The grid search technique trains a model for each combination of candidate values in order to compare the performance of the parameter combinations. The number of epochs is determined automatically by the early stopping algorithm, which can prevent overfitting [9, 23] and avoid wasting computational resources on redundant training.
The researchers use a stochastic gradient descent (SGD) optimizer, set the batch size to 16 and 32, and set the learning rate to 1 × 10⁻², 1 × 10⁻³, and 1 × 10⁻⁴. The early stopping algorithm is set to a patience of 5. If the validation results after each epoch show no model improvement within the given patience, the training process stops, and no further epochs are processed [8]. For example, all training images are used to train the model in the first epoch, and the validation images are then used to validate model performance using the metrics discussed in Section 4.5. If the validation results show no performance improvement for five consecutive epochs, training stops and the model reverts to the best-performing state recorded before those epochs. If the model shows continuous improvement in the validation results, training continues. The trained models’ hyperparameters are shown in Table 2. Note: The number of epochs for each model is determined based on the validation results after each training epoch. For Models (A) and (D), one epoch is the optimized number of epochs; no performance improvements are observed during the five consecutive epochs that follow. For Models (B) and (E), the optimized numbers of epochs are 41 and 31, respectively. For Models (C) and (F), they are 27 and 25, respectively.
Model | Hyperparameters | | | Validation results | | | | |
---|---|---|---|---|---|---|---|---|
Model | Batch | Learning rate | Epoch | Precision | Recall | mAP@50 | mAP@50–95 | F1 score |
(A) | 16 | 1 × 10⁻² | 1 | 0.887 | 0.828 | 0.867 | 0.386 | 0.856 |
**(B)** | **16** | **1 × 10⁻³** | **41** | **0.986** | **0.982** | **0.986** | **0.595** | **0.984** |
(C) | 16 | 1 × 10⁻⁴ | 27 | 0.958 | 0.958 | 0.975 | 0.526 | 0.958 |
(D) | 32 | 1 × 10⁻² | 1 | 0.861 | 0.793 | 0.847 | 0.381 | 0.826 |
(E) | 32 | 1 × 10⁻³ | 31 | 0.983 | 0.983 | 0.983 | 0.579 | 0.983 |
(F) | 32 | 1 × 10⁻⁴ | 25 | 0.966 | 0.953 | 0.974 | 0.527 | 0.959 |
- Note: Bold indicates the final model (B)’s hyperparameters and results.
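The grid search and the patience-based early stopping rule from Section 4.4 can be illustrated with a short, library-free sketch. It does not train a model: the validation scores passed to `early_stopping` are made-up numbers, and the function names are hypothetical. In the Ultralytics package, these settings would presumably map to the `batch`, `lr0`, `optimizer`, and `patience` training arguments, but the exact training call used in this study is not given in the text.

```python
from itertools import product

BATCH_SIZES = (16, 32)
LEARNING_RATES = (1e-2, 1e-3, 1e-4)
PATIENCE = 5


def hyperparameter_grid():
    """All (batch size, learning rate) pairs, in the order of Models (A) to (F)."""
    return list(product(BATCH_SIZES, LEARNING_RATES))


def early_stopping(val_scores, patience=PATIENCE):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    val_scores holds one validation score per epoch (higher is better).
    Training stops once `patience` consecutive epochs fail to beat the
    best score seen so far; the best epoch's state is the one kept.
    """
    best_score = float("-inf")
    best_epoch = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_scores)


print(hyperparameter_grid())
# Made-up scores in which nothing beats the first epoch:
print(early_stopping([0.60, 0.55, 0.58, 0.59, 0.57, 0.56, 0.61]))  # (1, 6)
```

The second call mirrors the situation reported for Models (A) and (D): the first epoch remains the best, so training halts after five further epochs without improvement and epoch 1 is retained.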
4.5. Metrics
5. Results
The performance of each model is evaluated using the validation images, and the results are shown in Table 2. Models (A) and (D) are not competitive with Models (B), (C), (E), and (F) on any metric. Model (B) achieves the best results on four metrics, precision (P), mAP at 50, mAP at 50–95, and F1 score, while Model (E) achieves the best result only on recall (R).
5.1. Determine an Optimized Model
Based on the validation results, the researchers select Model (B) as the final model. The optimized hyperparameters are a batch size of 16, a learning rate of 1 × 10⁻³, and 41 epochs. Model (B) is selected because it achieves the highest F1 score, the harmonic mean of precision and recall, which balances minimizing false positives (FPs) through high precision with capturing more true positives (TPs) through high recall. Therefore, Model (B) is the most suitable for propeller inspection tasks. For example, higher precision indicates fewer FPs, meaning the model is less likely to flag nonexistent defects that can lead to redundant maintenance. Although Model (B) shows a 0.001 decrease in recall compared to Model (E), it maintains its ability to correctly identify TPs while minimizing false negatives (FNs), so that actual defects are not overlooked or incorrectly classified as nondefective.
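The comparison in Table 2 can be reproduced with a few lines of arithmetic. The sketch below transcribes the validation results, recomputes each F1 score as the harmonic mean of precision and recall, and reports the best model per metric; the dictionary keys and function names are illustrative.

```python
# Validation results transcribed from Table 2.
TABLE_2 = {
    "A": {"precision": 0.887, "recall": 0.828, "map50": 0.867, "map50_95": 0.386, "f1": 0.856},
    "B": {"precision": 0.986, "recall": 0.982, "map50": 0.986, "map50_95": 0.595, "f1": 0.984},
    "C": {"precision": 0.958, "recall": 0.958, "map50": 0.975, "map50_95": 0.526, "f1": 0.958},
    "D": {"precision": 0.861, "recall": 0.793, "map50": 0.847, "map50_95": 0.381, "f1": 0.826},
    "E": {"precision": 0.983, "recall": 0.983, "map50": 0.983, "map50_95": 0.579, "f1": 0.983},
    "F": {"precision": 0.966, "recall": 0.953, "map50": 0.974, "map50_95": 0.527, "f1": 0.959},
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def best_model(metric):
    """Name of the model with the highest value for the given metric."""
    return max(TABLE_2, key=lambda name: TABLE_2[name][metric])


for name, row in TABLE_2.items():
    recomputed = f1_score(row["precision"], row["recall"])
    print(name, round(recomputed, 4), row["f1"])

for metric in ("precision", "recall", "map50", "map50_95", "f1"):
    print(metric, best_model(metric))
```

The recomputed F1 values agree with the reported ones to within rounding, and Model (B) ranks first on every metric except recall, where Model (E) leads by 0.001.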
The training summary for Model (B) is shown in Figure 5. The precision–recall curve indicates that the AP at 50 for the “NICK” class is 0.986 (refer to Figure 6a). Figure 6b shows that the normalized x and y coordinates are normally distributed and that the width and height are exponentially distributed. In addition, a label correlogram is provided to show the statistical distributions of the image label variables and their pairwise correlations. The x and y coordinates represent the center point of the bounding box; the width and height represent the size of the bounding box relative to the image size [6, 43–45]. The results for the validation sample images are shown in Figure 7.
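To make the normalized label variables concrete, the sketch below converts a pixel-space bounding box into the center-based, image-relative form described above (x and y of the box center, then width and height, each divided by the image dimension). The box coordinates and image size are hypothetical values chosen for illustration, and `to_yolo_label` is not a function from the original study.

```python
def to_yolo_label(x_min, y_min, x_max, y_max, img_w, img_h, class_id=0):
    """Convert pixel corner coordinates to (class, x_center, y_center, width, height),
    with all four box values expressed as fractions of the image size."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return class_id, x_center, y_center, width, height


# Hypothetical nick annotation: a 40 x 60 pixel box in a 640 x 480 image.
label = to_yolo_label(300, 200, 340, 260, img_w=640, img_h=480)
print(label)
```

Here the box center sits at half the image width, and the box covers 6.25% of the width and 12.5% of the height. Small objects such as nicks yield many small width and height values, which is consistent with a distribution skewed toward zero.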




5.2. Explainability Analysis
The researchers apply the Grad-CAM algorithm to Model (B) to verify how the YOLOv8-M model extracts nick features across its layers. In a CNN, the earlier layers extract general features while the later layers extract detailed features. The researchers therefore feed 10 training images to Model (B), and Grad-CAM indicates that the 18th layer extracts the nick features (refer to Figure 8). This analysis suggests that Model (B) properly extracts the nick features in the 18th layer.
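For readers unfamiliar with Grad-CAM, the core computation can be sketched in plain Python. Each feature map of a chosen layer is weighted by the spatial average of its gradient with respect to the target score, the weighted maps are summed, and negative values are clipped. This is a generic illustration of the technique on toy numbers, not the pipeline used in this study; extracting the activations and gradients from the 18th layer of a trained model is assumed to happen elsewhere.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap.

    activations, gradients: lists of K feature maps, each an H x W nested
    list, taken from the same layer for one input image.
    """
    n_rows = len(activations[0])
    n_cols = len(activations[0][0])

    # Channel weights: global average of each gradient map.
    weights = [sum(sum(row) for row in g) / (n_rows * n_cols) for g in gradients]

    # Weighted sum over channels, followed by ReLU.
    heatmap = [
        [
            max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]

    # Scale to [0, 1] so the map can be overlaid on the input image.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


# Toy example: two 2 x 2 feature maps with opposite gradient signs.
toy_activations = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
toy_gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
print(grad_cam(toy_activations, toy_gradients))  # [[1.0, 0.0], [0.0, 0.0]]
```

In the toy example, only the location activated by the positively weighted channel stays highlighted; the location supported solely by the negatively weighted channel is suppressed by the ReLU.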

6. Next Steps
This study focuses on developing a propeller nick detection model to improve the maturity of propeller nick inspection tasks. While the model is trained on images that closely resemble actual nicks using appropriate hyperparameters and its explainability is analyzed through Grad-CAM, several aspects remain outside the scope of this research. Future research could focus on practical implementation, assessing how well this model can detect actual nicks. Additionally, studies could explore the ergonomic design of a model prototype, its usability and utility for inspectors, and its practical applicability. Further research could also assess measurable and tangible benefits, such as risk reduction.
7. Conclusion
Detecting nicks is a crucial inspection task because nicks initiate tip failure and do not show prefailure symptoms. To improve inspection outcomes, the researchers developed a propeller nick detection model using the YOLO algorithm. This model can serve as the inspectors’ third eye in the propeller nick inspection tasks. This model can detect nicks that inspectors may overlook during inspections. As a result, this study has the potential to enhance inspection accuracy and reduce general aviation accidents caused by propeller failure.
Conflicts of Interest
The authors declare no conflicts of interest.
Funding
This research received no specific grant from any funding agency.
Open Research
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.