Tracking Method for Alpine Skiing Based on Hybrid Deep Learning and Evolutionary Chimp Optimization Algorithm
Abstract
Tracking athletes in high-speed outdoor sports like alpine skiing poses substantial difficulties because of ever-changing movements, environmental variability, and the limitations of traditional tracking technologies, such as intrusive sensors and single-view camera setups. This study proposes a hybrid approach for tracking alpine skiing activities that combines YOLO-v8 with an evolutionary version of the chimp optimization algorithm (CHOA-EVOL) for hyperparameter optimization. The primary goal of this research is to enhance CHOA so that it can optimally adjust the hyperparameters of YOLO-v8, thereby addressing the drawbacks of outdoor sports tracking technology. The hybrid model integrates data from unmanned aerial vehicles (UAVs) and terrestrial cameras to better capture athletes’ rapid rotating motion. The suggested approach is extensively tested and compared against advanced algorithms on the UAV123 dataset and a recently developed alpine skiing dataset (ASD). The results show that the proposed approach achieves high precision and robustness.
1. Introduction
Tracking athletes in sports like alpine skiing is crucial for analyzing performance and carrying out scientific training. Traditional methods, such as handheld sensors, can be intrusive and affect performance, and outdoor races are notoriously difficult to record in full using cameras [1]. Researchers are therefore concentrating on cutting-edge tracking techniques that combine deep learning with correlation filters to address this issue [2–4].
The result is a detailed solution for mapping athletes’ course and speed by capturing how they move at a given moment, using data collected with unmanned aerial vehicles (UAVs) and ground cameras. Athletes in outdoor sports frequently encounter intricate situations because of their curved flight characteristics, which result in high accelerations and unpredictable paths. These situations pose a challenge to existing techniques, which rely on cameras aiming for explicit tracking information [5]. However, N-unKraknet has developed tracking algorithms that meet these challenges, such as building state-of-the-art against dynamic backgrounds under complete pose accuracy while using repetitive motion [6].
Recent improvements in visual tracking have addressed complex tracking scenarios through the introduction of several techniques [7–10]. We have introduced a novel strategy using the hybrid correlation filter method to tackle problems like scale change and partial occlusion problems. To accomplish this, we employ a global filter in conjunction with two mode-dependent local filters, either alone or in combination, to pinpoint the target position precisely.
We use a multiview model with correlation filtering to obtain strong tracking performance by combining information from all accessible views. Multiview trackers are essential for global nonlinear air traffic management schemes, and researchers have recently investigated evolutionary algorithms (EAs) as a means to improve them [11, 12]. These schemes must meet many performance criteria at run time [13]. An innovative approach called a generic applied evolutionary hybrid technique has been created. This strategy combines adaptive multimodel partitioning filters with genetic algorithms to model real-world adaptive systems [14].
Structured correlation filters have been implemented to enhance long-term tracking, particularly in scenarios where objects are partially hidden, by employing a fixed model for initial estimation and a flexible model for the final analysis of the target’s state [10, 15]. The scope of index tracking has expanded to include a hybrid optimization method that integrates evolutionary algorithms with quadratic programming. This technique addresses the NP-hard problem of replicating the performance of stock market indices [16]. High-speed tracking with kernelized correlation filters has been proposed, which provides a fast and effective solution that outperforms leading trackers [17]. Hybrid optimization methods, such as the hybrid sine–cosine algorithm with differential evolution, have shown promise in achieving global optimization together with object tracking, outperforming standard metaheuristic algorithms [18]. Furthermore, the correlation particle filter has been implemented to enhance the reliability of visual tracking. It can deal effectively with occlusions and scale variation and can retain multiple hypotheses for lost tracks [19].
Current tracking methods frequently encounter challenges, including fast motion, occlusions, and environmental conditions unique to outdoor sports such as alpine skiing. Some methods struggle with partial or complete occlusion, while others cannot adapt to changes in scale or incur excessive computational cost [20]. Additionally, single-view designs can limit tracking robustness, as they might not capture all the features needed for accurate tracking [21]. Metaheuristic algorithms, although effective, might struggle in scenarios with a wide variety of requirements or might demand significant processing resources [22]. These shortcomings emphasize the need for more advanced and practical tracking algorithms that overcome these limitations [23, 24].
An advanced version of the chimp optimization algorithm [25] is needed since it can efficiently explore the complicated search spaces involved in tracking filters. Evolutionary, nature-inspired, and meta-heuristic algorithms are renowned for achieving global optimization, making them crucial for discovering optimal solutions in multiobjective situations [10, 26, 27]. The CHOA method can be fine-tuned to make it more robust and flexible to handle various challenging scenarios in visual tracking [28].
Using CHOA-EVOL to fine-tune the hyperparameters of YOLO-v8 [29, 30] is advantageous because it allows the hyperparameter space to be explored systematically to find the best configuration for the tracking model. Letting the model adapt to the characteristics of alpine skiing tracking, such as high-speed motion and diverse environments, improves tracking accuracy and robustness.
The main contributions of this work are as follows:
- Development of an enhanced evolutionary algorithm: We present CHOA-EVOL, an evolutionary version of the chimp optimization algorithm. It addresses the low optimization speed and premature convergence of CHOA, since both optimization speed and population diversity are usually required to solve optimization problems. CHOA-EVOL suits complex hyperparameter tuning tasks because it iteratively maintains diversity across the solution space while remaining sensitive to the hyperparameters being tuned.
- Integration of CHOA-EVOL with YOLO-v8 for hyperparameter optimization: We present a unique combination of CHOA-EVOL and YOLO-v8 for hyperparameter tuning under the conditions that arise in an alpine skiing competition. This integration improves performance because the hyperparameter space is searched effectively and systematically.
- Creation of a novel alpine skiing dataset (ASD): We present a new dataset oriented toward research on tracking in alpine skiing. The dataset covers scenes with occlusions, high motion blur, and environmental changes, which may be helpful for further research in the field.
- Comprehensive experimental evaluation: The proposed method is tested on two datasets: the well-known, publicly available UAV123 dataset and the newly created alpine skiing dataset. In terms of accuracy, success rate, and resilience, the results are better than those of existing tracking algorithms.
- Introduction of a motion distance-based tracking confidence score: A motion distance-based tracking confidence score is proposed to improve detection robustness, mainly when the detector’s confidence level is weak. This metric supports stable tracking performance even in difficult situations such as partial occlusions and rapid motions.
The subsequent sections of the paper are structured as follows: Section 2 pertains to the relevant terms, including CHOA and YOLO-v8; Section 3 presents the proposed model; Section 4 presents experimentation, results, and discussion; and Section 5 presents the conclusion and future work.
2. Notations
This section summarizes the mathematical models and necessary information about CHOA and YOLO-v8 to prepare the models and algorithms.
2.1. Chimp Optimization Algorithm
CHOA is a metaheuristic algorithm that draws inspiration from the behavior and social structure of chimpanzees and was created to address intricate optimization problems in different fields. It integrates principles such as collaboration, exploration, and exploitation to enhance solutions through iterative processes [22, 29, 30].
Within CHOA, the optimization process emulates the behaviors witnessed in chimpanzee societies, wherein individuals exchange knowledge, cooperate in problem-solving, and adjust to dynamic circumstances. By employing a strategy that involves exploration to uncover novel solutions and exploitation to enhance promising ones, CHOA effectively traverses search spaces to locate optimal or nearly optimal solutions.
In the CHOA update equations, x_prey is the best solution discovered so far, x_chimp is the position of a chimp, and t denotes the iteration number. Furthermore, as depicted in Figure 1, the nonlinear coefficient f decreases from 2.5 to 0. Two random numbers, r1 and r2, are generated in the interval [0, 1]. The chaotic map vectors are displayed in Figure 2. A comprehensive explanation of these coefficients and maps can be found in [31].
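As a rough illustration of the quantities named above, the following minimal sketch follows the commonly published form of the CHOA position update rather than anything stated explicitly in this section. The function names (`f_coefficient`, `choa_step`), the quadratic decay curve for the coefficient, and the way the chaotic value is passed in are all assumptions made for illustration; the exact curves and maps are those described in [31].

```python
import numpy as np

def f_coefficient(t, max_iter, f_max=2.5):
    """Nonlinear coefficient decreasing from f_max to 0.

    The quadratic shape is an assumption; only the 2.5 -> 0 range comes from the text.
    """
    return f_max * (1.0 - (t / max_iter) ** 2)

def choa_step(x_chimp, x_prey, t, max_iter, rng, chaos_value):
    """One position update of a single chimp toward the best-known solution."""
    f = f_coefficient(t, max_iter)
    r1 = rng.random(x_chimp.shape)  # random numbers in [0, 1)
    r2 = rng.random(x_chimp.shape)
    a = 2.0 * f * r1 - f            # step-size vector, each entry in [-f, f]
    c = 2.0 * r2                    # weighting of the prey position, in [0, 2)
    m = chaos_value                 # value drawn from a chaotic map (assumed given)
    d = np.abs(c * x_prey - m * x_chimp)  # distance term
    return x_prey - a * d
```

Because the coefficient reaches 0 at the last iteration, the step size vanishes and the chimp lands on the best-known solution, which is the exploitation end of the exploration/exploitation schedule. In the usual formulation this update is evaluated with respect to each of the four leading chimps and the results are averaged.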


2.2. YOLO-v8
YOLO was first introduced in [32] and marked significant progress in object detection. It uses deep neural networks to detect and locate objects in images. The main objective is to predict the bounding boxes and class probabilities for every object in the given image. Over time, YOLO has consistently demonstrated outstanding performance on different datasets, becoming well known in real-time object detection applications such as video streams and robotics.
The YOLO-v8 architecture in Figure 3 comprises multiple components, each of which plays a key role in the process. The backbone network extracts features from the input image. YOLO-v8 uses a backbone based on a cross-stage partial network (CSPNet) architecture, which aims to improve computing efficiency while keeping high accuracy. The neck, a bridge between the backbone and the detection head, adds a spatial pyramid pooling (SPP) module to YOLO-v8. This module uses multiple pooling sizes to collect features at different scales.

The detection head in YOLO-v8 predicts bounding boxes and class probabilities. This element consists of several convolutional layers, followed by a group of anchor boxes to predict bounding boxes and class probabilities. It is worth mentioning that the loss function of YOLO-v8 combines multiple terms, such as objectness loss, classification loss, and bounding box regression loss. The objectness loss is designed to penalize inaccurate predictions of object presence, which is critical for recognizing the existence of an object at a particular place.
After the detection head makes its predictions, YOLO-v8 applies post-processing to refine the detection results. This includes non-maximum suppression, which removes overlapping bounding boxes and keeps the one with the highest score. Anchor boxes help to fine-tune the predicted bounding boxes so that the final detection results are accurate. Through these components and steps, YOLO-v8 demonstrates its capability and flexibility in object detection and localization tasks.
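The non-maximum suppression step mentioned above can be sketched in a few lines. This is a generic greedy NMS written with NumPy for illustration, not the implementation used inside YOLO-v8; the function names, the `(x1, y1, x2, y2)` box layout, and the 0.5 IoU threshold are assumptions.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it, repeat."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]  # survivors for the next round
    return keep
```

For example, with two heavily overlapping boxes around one skier and a third box around a distant gate, only the higher-scoring skier box and the gate box are kept.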
3. Proposed Methodology
This section describes the methodology behind the new framework, in particular how the hyperparameter tuning capabilities of an enhanced evolutionary algorithm, CHOA-EVOL, are integrated into the YOLO-v8 deep learning-based object detection model to achieve robust tracking performance. The proposed CHOA-EVOL algorithm addresses issues such as the loss of population diversity and the inability to search a multidimensional space effectively, which makes it possible to fine-tune the parameters of YOLO-v8 for changing and complex scenarios like those in alpine skiing. The system architecture uses information from several UAVs and ground cameras to capture the fast, curvilinear motions of the athletes. Moreover, the framework integrates advanced detection techniques with a motion distance-based tracking confidence score to maintain reliability, especially under occlusion and sudden changes in motion. Each component of the methodology is discussed in detail, with an emphasis on the novel mechanisms that make the new system work.
3.1. Evolutionary Chimp Optimization Algorithm
Equations (8)–(11) outline the steps for improving the worst-performing solutions. Equation (12) randomly relocates the least effective solutions within the search space. The initial settings of the variables in these equations represent the attacker, barrier, pursuer, and driver, respectively. Ub denotes the upper boundary vector of the search space, and Lb denotes the lower boundary vector. The vectors π1 through π6 are generated randomly and distributed uniformly over the range [0, 1]. These equations redirect search effort from ineffective solutions to more effective ones. This technique makes the search process more manageable, and as the algorithm gets closer to the best solutions, fewer alternatives are missed.
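The random relocation of the weakest solutions can be illustrated with a small sketch. It assumes the usual uniform re-sampling form, new position = Lb + π ⊙ (Ub − Lb) with π uniform in [0, 1], and a minimization problem in which a larger fitness value means a worse solution. The function name `relocate_worst` and the parameter `n_worst` are hypothetical, and the moves toward the four leaders are not reproduced here.

```python
import numpy as np

def relocate_worst(population, fitness, lb, ub, n_worst, rng):
    """Re-sample the n_worst solutions uniformly inside [lb, ub].

    Assumes minimization, so the largest fitness values mark the worst solutions.
    """
    population = population.copy()
    worst_idx = np.argsort(fitness)[-n_worst:]         # indices of the worst solutions
    pi = rng.random((n_worst, population.shape[1]))    # uniform vectors in [0, 1)
    population[worst_idx] = lb + pi * (ub - lb)        # random point within the bounds
    return population, worst_idx
```

Only the selected rows change; the better solutions keep their positions, so exploitation around good regions is not disturbed while the weak solutions restore diversity.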
3.2. Mechanism for Tuning
CHOA-EVOL was developed to single out individuals with lower objective values, in contrast to approaches that only seek to elevate the most eligible candidates. The agents that are least potent in CHOA are the ones who attack, block, pursue, and drive. These agents are restructured using the method described above. However, one potential issue with this strategy is that it may converge too quickly. Individuals with lower objective values must be moved to areas of the search space where objective values are high to achieve this convergence. These individuals may perform better in a more diversified search area. For the search to progress and avoid premature convergence, diversity maintenance must be prioritized, particularly when assessing the least effective alternatives.
The CHOA-EVOL architecture ensures that the best solutions are distributed across the search space by verifying that each iteration’s four best solutions converge. Applying the abovementioned method to the entire search space ensures that solutions will be distributed and retrieved in regions with high goal values. So, reasonable solutions tend toward the ideal solutions, whereas bad ones tend toward regions with greater variety. This calculated maneuver converges on the best answer to the optimization problem by maximizing CHOA’s exploration and exploitation capabilities.
In the first part of the best-fitting solutions, we can find the updated solutions using equations (14)–(17). By randomly selecting an equation from a collection for every solution, we can maintain the similarity of the probability distribution while maintaining a diversity of responses. Since the CHOA-EVOL successfully preserves variety without factor inclusion, an arbitrary re-initialization is unnecessary. Figure 4 displays the schematic of the method that was created.

3.3. Proposed System Architecture
The accuracy of object identification algorithms, particularly YOLO-v8, in precisely tracking alpine skiing activity relies heavily on finely tuned hyperparameters. However, finding the best values for these hyperparameters is difficult because of the large number of dimensions to search through, the unknown relationships between these dimensions, and the costly process of evaluating the fitness at each point. The YOLO-v8 model utilizes around 28 primary hyperparameters to configure different training settings. The hyperparameters, as outlined in Table 1, have a crucial impact on determining the ultimate outcomes. Ensuring the correct initialization of these variables is vital, and it is advisable to use default values that are specifically optimized for YOLO-v8 training from the beginning, especially when there is uncertainty during the initialization process.
Number | Hyperparameter | Complete name |
---|---|---|
1 | lr0 | Initial learning rate |
2 | lrf | Final learning rate (OneCycleLR) |
3 | weight_decay | Optimizer weight decay |
4 | momentum | SGD momentum/Adam beta1 |
5 | warmup_momentum | Warmup initial momentum |
6 | warmup_epochs | Warmup epochs |
7 | warmup_bias_lr | Warmup initial bias learning rate |
8 | cls | Classification loss gain |
9 | box | Box loss gain |
10 | obj | Objectness loss gain |
11 | cls_pw | Classification BCELoss positive weight |
12 | iou_t | IoU training threshold |
13 | obj_pw | Objectness BCELoss positive weight |
14 | fl_gamma | Focal loss gamma (EfficientDet default gamma = 1.5) |
15 | anchor_t | Anchor-multiple threshold |
16 | hsv_s | Image HSV-saturation augmentation |
17 | hsv_h | Image HSV-hue augmentation |
18 | hsv_v | Image HSV-value augmentation |
19 | translate | Image translation |
20 | degrees | Image rotation |
21 | scale | Image scale (± gain) |
22 | perspective | Image perspective |
23 | shear | Image shear |
24 | fliplr | Probability of image flip left-right |
25 | flipud | Probability of image flip up-down |
26 | mixup | Probability of image mixup |
27 | mosaic | Probability of image mosaic |
28 | copy_paste | Probability of segment copy-paste |
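To make the search over the hyperparameters in Table 1 concrete, the sketch below shows one possible way to map an optimizer position in the unit hypercube onto a hyperparameter dictionary. The ranges in `SEARCH_SPACE` are illustrative assumptions, not values from this study, and only a subset of the 28 hyperparameters is shown; `decode` is a hypothetical helper name.

```python
# Hypothetical search ranges (assumptions for illustration, not taken from the paper).
SEARCH_SPACE = {
    "lr0": (1e-5, 1e-1),
    "lrf": (0.01, 1.0),
    "momentum": (0.6, 0.98),
    "weight_decay": (0.0, 0.001),
    "warmup_epochs": (0.0, 5.0),
    "box": (0.02, 0.2),
    "cls": (0.2, 4.0),
    "hsv_h": (0.0, 0.1),
    "fliplr": (0.0, 1.0),
}

def decode(position):
    """Map a position in [0, 1]^d to a hyperparameter dictionary.

    Each coordinate is clipped to [0, 1] and scaled linearly to its range.
    """
    if len(position) != len(SEARCH_SPACE):
        raise ValueError("position length must match the number of hyperparameters")
    hyp = {}
    for u, (name, (low, high)) in zip(position, SEARCH_SPACE.items()):
        u = min(max(u, 0.0), 1.0)           # keep the coordinate inside the unit interval
        hyp[name] = low + u * (high - low)  # linear scaling to the hyperparameter range
    return hyp
```

With this encoding, every candidate produced by the optimizer decodes to a valid training configuration, and the fitness of a candidate would be obtained by training and validating the detector with the decoded values.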
4. Findings From the Experiment
This section presents the verification of the UAV system and the assessment of the selected image sensor configurations. Afterward, the suggested method is evaluated for robustness and performance using hardware-in-the-loop simulations. The method was implemented in PyTorch on an NVIDIA GTX-1650 graphics card with 8 GB of memory. The assessment was conducted on two sets of data: UAV123 [34] and a compilation of image sequences featuring athletes participating in alpine skiing [6].
4.1. Hyperparameter Fine-Tuning Results
Table 2 lists the hyperparameter values before and after optimization with CHOA-EVOL.
Hyperparameter | Initial value | Optimized value |
---|---|---|
lr0 | 0.02 | 0.01 |
momentum | 0.937 | 0.937 |
lrf | 0.01 | 0.2 |
warmup_epochs | 3.1 | 3.0 |
weight_decay | 0.006 | 0.0005 |
warmup_bias_lr | 0.2 | 0.1 |
warmup_momentum | 0.8 | 0.8 |
cls | 0.1 | 0.5 |
box | 0.01 | 0.05 |
cls_pw | 1.0 | 1.0 |
obj_pw | 1.0 | 1.0 |
obj | 1.0 | 1.0 |
iou_t | 0.20 | 0.21 |
anchor_t | 2.0 | 4.0 |
fl_gamma | 0.0 | 0.0 |
hsv_s | 0.5 | 0.7 |
hsv_h | 0.02 | 0.015 |
degrees | 0.1 | 0.0 |
hsv_v | 0.5 | 0.4 |
scale | 0.2 | 0.5 |
translate | 0.1 | 0.1 |
perspective | 0.0 | 0.0 |
shear | 0.1 | 0.0 |
fliplr | 0.2 | 0.5 |
flipud | 0.1 | 0.0 |
mixup | 0.1 | 0.0 |
mosaic | 1.0 | 1.0 |
copy_paste | 0.0 | 0.0 |
Hyperparameter optimization with CHOA-EVOL changed the initial values of several crucial hyperparameters in the YOLO-v8 model. Notably, the initial learning rate was reduced from 0.02 to 0.01, while the final learning rate for the OneCycleLR scheduler was raised considerably from 0.01 to 0.2. The momentum value stayed constant at 0.937, while the weight decay was reduced from 0.006 to 0.0005.
In addition, the warmup epochs were slightly reduced from 3.1 to 3.0, and the initial bias learning rate during warmup was cut from 0.2 to 0.1. The box loss and classification loss gains were raised from 0.01 to 0.05 and from 0.1 to 0.5, respectively. Additional parameters, such as the IoU training threshold and the anchor-multiple threshold, were also adjusted to optimize the model’s performance. These adjustments illustrate how the CHOA-EVOL optimization strategy refines the hyperparameters of the YOLO-v8 model, which can potentially enhance the model’s performance in object detection tasks.
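The changes reported in Table 2 can be summarized programmatically. The short sketch below stores the table as (name, initial, optimized) rows, using lowercase identifiers for the hyperparameter names (with `fl_gamma` for the focal loss gamma row), and lists which values the optimizer altered; the helper name `changed` is hypothetical.

```python
# (name, initial value, optimized value), transcribed from Table 2.
TABLE_2 = [
    ("lr0", 0.02, 0.01), ("momentum", 0.937, 0.937), ("lrf", 0.01, 0.2),
    ("warmup_epochs", 3.1, 3.0), ("weight_decay", 0.006, 0.0005),
    ("warmup_bias_lr", 0.2, 0.1), ("warmup_momentum", 0.8, 0.8),
    ("cls", 0.1, 0.5), ("box", 0.01, 0.05), ("cls_pw", 1.0, 1.0),
    ("obj_pw", 1.0, 1.0), ("obj", 1.0, 1.0), ("iou_t", 0.20, 0.21),
    ("anchor_t", 2.0, 4.0), ("fl_gamma", 0.0, 0.0), ("hsv_s", 0.5, 0.7),
    ("hsv_h", 0.02, 0.015), ("degrees", 0.1, 0.0), ("hsv_v", 0.5, 0.4),
    ("scale", 0.2, 0.5), ("translate", 0.1, 0.1), ("perspective", 0.0, 0.0),
    ("shear", 0.1, 0.0), ("fliplr", 0.2, 0.5), ("flipud", 0.1, 0.0),
    ("mixup", 0.1, 0.0), ("mosaic", 1.0, 1.0), ("copy_paste", 0.0, 0.0),
]

def changed(rows):
    """Return {name: (initial, optimized)} for rows whose value was altered."""
    return {name: (init, opt) for name, init, opt in rows if init != opt}

if __name__ == "__main__":
    for name, (init, opt) in changed(TABLE_2).items():
        print(f"{name}: {init} -> {opt}")
```

Of the 28 hyperparameters, 18 differ between the two columns and 10 are unchanged, including momentum, the warmup momentum, and the mosaic probability.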
4.2. Comparative Analysis of Multi-Objective CHOA-EVOL
In this section, we graphically compare the outcomes of the multiobjective CHOA-EVOL method with those obtained by seven widely recognized multiobjective optimization techniques. The comparison is performed on two separate datasets: the alpine experimental dataset and the UAV123 dataset. The comparison algorithms are the multiobjective customized moth-flame optimization algorithm (MOCMFOA) [35], multiobjective particle swarm optimization with self-adjusting strategy (M3OPSOSA) [36], evolutionary multiobjective seagull optimization algorithm (EMOSOA) [37], multiobjective manta ray foraging optimizer (MOMRFO) [38], interval Pareto front-based multiobjective robust optimization (IPFMORO) [39], fuzzy multiobjective optimization model (FMOOM) [40], and multiobjective CHOA (MOCHOA) [26]. Figures 5 and 6 illustrate the performance of CHOA-EVOL in comparison to the other algorithms for each dataset, offering insight into their relative efficacy.
















Figure 5 compares the multiobjective CHOA-EVOL method with various optimization algorithms on the UAV123 dataset. Each algorithm’s performance is assessed using different measures, which offer a comprehensive perspective on its effectiveness in accomplishing stated objectives.
As shown in Figure 5, CHOA-EVOL outperforms the other algorithms in simultaneously minimizing f1, f2, and f3, as evidenced by its solutions lying closer to the Pareto-optimal front [41]. This figure supports CHOA-EVOL’s capability of achieving the objectives of robust tracking on the UAV123 dataset within a reasonable time frame.
Figure 6 compares the multiobjective CHOA-EVOL algorithm and selected optimization algorithms using the alpine experimental dataset. Notably, CHOA-EVOL solutions are positioned within the range of the Pareto front, demonstrating their adaptability to the varying and challenging conditions in alpine skiing. The visual depiction showcases the algorithms’ performance in several evaluation criteria, providing a comprehensive understanding of their strengths and adaptability to the dataset’s intricacies.
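The notion of Pareto optimality used in the comparison of Figures 5 and 6 can be made concrete with a short sketch. It assumes that all objectives (for example f1, f2, and f3) are minimized; the function names `dominates` and `pareto_front` are illustrative and not taken from the study.

```python
import numpy as np

def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_front(points):
    """Return indices of the non-dominated points (all objectives minimized)."""
    points = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(points):
        dominated = any(dominates(q, p) for j, q in enumerate(points) if j != i)
        if not dominated:
            keep.append(i)
    return keep
```

A solution set is "closer to the front" when fewer of its members are dominated by solutions from competing algorithms once all candidate solutions are pooled.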
4.3. Visual Sensor’s Detection Capability
Our algorithm encounters difficulties such as target distortion, gate obstruction, and swift motion in alpine skiing scenarios, where skiers demonstrate rapid and dynamic movements. These problems highlight the need and practicality of using UAVs and image sensors to track moving targets that have undergone several deformations.
Figure 7 illustrates the experimental arrangement, setup, and results of the detection process. Figure 7(a) shows the visible light sensor detecting the target, which is highlighted with a red rectangle, demonstrating that the sensor can capture the target inside its field of view. Alpine skiers are fast and agile and turn frequently, and the gates obstruct the view of the target. The tracking algorithm must take these factors into account. Therefore, tracking moving targets that undergo several deformations using UAVs and image sensors is both feasible and challenging.





4.4. Evaluation and Key Challenges
We compared our system against seven sophisticated multiobjective algorithms, using precision and success rate as performance metrics. We also consider two additional parameters: the average duration of successful tracking in milliseconds, which measures the average time per frame that the algorithm takes to track a target successfully, and the total operation cost, also in milliseconds, which is the total time spent on making a decision and processing a single frame. Together, these parameters characterize the tracking system in terms of both accuracy and the computational efficiency required for real-time operation. As shown in Table 3, precision was determined from the average Euclidean distance between the center of the predicted bounding box and the manually labeled ground truth. This measurement showed that our algorithm is robust in tracking fast-moving targets, even under occlusion and motion blur. The success rate evaluates how well the predicted target area matches the object’s actual location. Our system consistently performed better than the other methods, which indicates that it copes with changes in object size, partial occlusion, and rapid motion.
Methods | Precision | Success rate | Average duration of successful tracking (ms) | Total operation cost (ms) |
---|---|---|---|---|
MOCMFOA | 0.803 | 0.652 | 150 | 50 |
M3OPSOSA | 0.795 | 0.641 | 140 | 45 |
EMOSOA | 0.786 | 0.633 | 160 | 55 |
MOMRFO | 0.711 | 0.612 | 180 | 60 |
IPFMORO | 0.702 | 0.602 | 185 | 65 |
FMOOM | 0.872 | 0.701 | 130 | 40 |
MOCHOA | 0.896 | 0.733 | 120 | 35 |
MOCHOA-EVOL | 0.911 | 0.744 | 110 | 30 |
Table 3 reports the performance of the different optimization techniques on the alpine dataset, with an emphasis on precision and success rate. Precision reflects how accurately the target is localized, and the success rate reflects how often tracking achieves the desired outcome. These are key measures for assessing the techniques. Precision and success rate differ significantly among the methods studied: MOCMFOA, M3OPSOSA, EMOSOA, MOMRFO, IPFMORO, FMOOM, MOCHOA, and MOCHOA-EVOL.
For the alpine dataset, MOCHOA-EVOL emerged as the best method, with the highest precision (0.911) and the highest success rate (0.744). It is also the most efficient, with the lowest average duration of successful tracking (110 ms) and the lowest operation cost (30 ms), which supports its use in real-time operation.
MOCHOA also tracks accurately, with a precision of 0.896 and a success rate of 0.733. Its average tracking duration of 120 ms and operation cost of 35 ms are only slightly higher than those of MOCHOA-EVOL, so it trails MOCHOA-EVOL by a small margin on every metric in Table 3.
Lastly, some methods, such as MOMRFO and IPFMORO, show lower precision (0.711 and 0.702) and lower success rates (0.612 and 0.602). They are also less efficient, with average tracking durations of 180 and 185 ms and operation costs of 60 and 65 ms, respectively.
MOCHOA and MOCHOA-EVOL stand out as the most effective techniques, with the highest precision and success rates. The results indicate a connection between precision and success rate and highlight the importance of choosing a method according to the application’s needs, taking into account the trade-off between accuracy and efficiency. This investigation offers insight into the comparative effectiveness of optimization methods on the alpine dataset, contributing to the progress of optimization approaches in real-world applications.
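The two accuracy metrics behind Table 3 can be sketched as follows. The code computes the center location error and the overlap (IoU) per frame, then reports the fraction of frames within a threshold. The thresholds of 20 pixels and 0.5 IoU are conventions commonly used in tracking benchmarks and are assumptions here, as are the function names and the `(x1, y1, x2, y2)` box layout.

```python
import numpy as np

def center_errors(pred, gt):
    """Euclidean distance between predicted and ground-truth box centers, per frame."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    pred_center = (pred[:, :2] + pred[:, 2:]) / 2.0
    gt_center = (gt[:, :2] + gt[:, 2:]) / 2.0
    return np.linalg.norm(pred_center - gt_center, axis=1)

def ious(pred, gt):
    """Per-frame intersection over union of predicted and ground-truth boxes."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 2], gt[:, 2])
    y2 = np.minimum(pred[:, 3], gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    return inter / (area_p + area_g - inter)

def precision(pred, gt, dist_threshold=20.0):
    """Fraction of frames whose center error is within dist_threshold pixels (assumed)."""
    return float(np.mean(center_errors(pred, gt) <= dist_threshold))

def success_rate(pred, gt, iou_threshold=0.5):
    """Fraction of frames whose overlap exceeds iou_threshold (assumed)."""
    return float(np.mean(ious(pred, gt) > iou_threshold))
```

Reporting both metrics is useful because a tracker can keep its box centered on the skier while misjudging the box size, which lowers the overlap-based success rate without affecting the center-based precision.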
To assess the proposed tracking model (YOLO-v8 evolved by CHOA-EVOL) fairly, its success rate and precision are compared with those of four benchmark models, namely YOLO-v5 [42], accurate tracking by overlap maximization (ATOM) [43], learning spatially regularized correlation (LSRC) [44], and the scale adaptive kernel correlation filter tracker (SCAKCFT) [45], using the benchmark UAV123 dataset. The results are shown in Table 4.
Methods | Precision | Success rate | Average duration of successful tracking (ms) | Total operation cost (ms) |
---|---|---|---|---|
YOLO-v5 | 0.798 | 0.647 | 170 | 55 |
ATOM | 0.725 | 0.633 | 180 | 60 |
LSRC | 0.711 | 0.625 | 190 | 65 |
SCAKCFT | 0.702 | 0.612 | 200 | 70 |
YOLO-v8-CHOA-EVOL | 0.867 | 0.654 | 160 | 50 |
The results for the benchmark models on the UAV123 dataset are recorded in Table 4. YOLO-v8-CHOA-EVOL has a clear advantage over all other models: it has the highest precision (0.867) and the highest success rate (0.654), and it is also the fastest, with an average duration of successful tracking of 160 ms and an operation cost of 50 ms. It is therefore the most suitable model for real-time operation. YOLO-v5 had a precision of 0.798 and a success rate of 0.647, a good compromise on accuracy, but its tracking duration and operation cost were slightly higher, at 170 and 55 ms, respectively. ATOM and LSRC had longer tracking durations and higher operation costs together with lower precision and success rates, making them less suitable for real-time applications. SCAKCFT was the least effective model and the least suited to real-time applications: it had the longest tracking duration (200 ms) and the highest operation cost (70 ms) of all models. Overall, the model with the lowest operation cost, YOLO-v8-CHOA-EVOL, was also the most effective in terms of accuracy and computational efficiency.
Ultimately, we developed a multisensor system for gathering information in alpine skiing. Furthermore, we presented a robust tracking algorithm that successfully manages typical challenges faced in alpine skiing scenarios. The algorithm’s accuracy advantage is supported by extensive experiments, and it shows potential for scientific training applications. Figures 8 and 9 show the accuracy and precision on the alpine dataset and the UAV123 dataset, respectively.


The initial dataset contains over 4000 images, of which over 700 are publicly available and over 3000 are private. The private images come from UAV footage documenting the operations of the National Alpine Skiing Club of China. Frames showing prominent occlusion, distortion, and motion blur were selected from this footage to produce the private dataset (see Figure 10 for an illustration of the two primary target categories of the dataset: gates and athletes).






To enhance the algorithm’s adaptability, we combine a selection of photos from the VOC2017 dataset [46] with the ASD. During the experiment, the dataset used for Table 5 undergoes three augmentation techniques: vertical flip, horizontal flip, and combined horizontal and vertical flips. These augmentations aim to reduce the influence of the limited number of training samples on the verification of the results.
Methods | Total frames | Correct frames | Error frames | Accuracy (%) |
---|---|---|---|---|
YOLO-v5 | 3500 | 3076 | 424 | 87.90 |
ATOM | 3500 | 2590 | 910 | 74.33 |
LSRC | 3500 | 2975 | 525 | 85.33 |
SCAKCFT | 3500 | 2940 | 560 | 84.33 |
YOLO-v8-CHOA-EVOL | 3500 | 3150 | 350 | 89.99 |
Algorithm performance is assessed by the accuracy rate, computed by dividing the number of frames in which the object is correctly identified by the total number of video frames. The collected video data are first broken down into individual frames. The next step is to count all frames containing targets of each type, as well as the frames with correct and incorrect detections, to determine the accuracy of target identification. The ASD used in this experiment includes instances where targets were not detected or where categories were predicted incorrectly. Table 5 presents the outcomes of the five methods in detecting athletes.
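The frame-counting procedure described above amounts to a simple tally. The sketch below assumes that each frame has already been labeled with one of three hypothetical outcome strings (`"correct"`, `"wrong_class"`, `"missed"`); these labels and the function name `detection_accuracy` are illustrative and not part of the study.

```python
from collections import Counter

def detection_accuracy(frame_results):
    """Tally per-frame outcomes and return frame counts plus accuracy in percent.

    frame_results: iterable of outcome labels, one per frame; any label other
    than "correct" (e.g. "wrong_class", "missed") counts as an error.
    """
    counts = Counter(frame_results)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no frames to evaluate")
    correct = counts["correct"]
    return {
        "total": total,
        "correct": correct,
        "error": total - correct,
        "accuracy_pct": 100.0 * correct / total,
    }
```

Both missed targets and wrongly predicted categories are counted as errors here, which matches the two failure cases mentioned for the ASD.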
The information displayed in Table 5 thoroughly examines the rate of success and accuracy attained by benchmark models and the benchmark ASD. The performance of each approach is assessed by considering the total number of frames processed, the number of correct detections, and the number of erroneous detections.
The YOLO-v5 model is fairly precise, correctly detecting objects in 3076 of 3500 frames for a success rate of 87.90%. Nevertheless, it still records errors in 424 frames, which leaves room for improvement.
ATOM, LSRC, and SCAKCFT show different success rates. ATOM correctly detects objects in 2590 frames, a success rate of 74.33%, well below that of YOLO-v5. LSRC and SCAKCFT both exceed 80%. LSRC is marginally better, with 2975 correct frames and 525 errors against 2940 correct frames and 560 errors for SCAKCFT.
YOLO-v8-CHOA-EVOL is the top performer among the evaluated models, with the highest success rate of 89.99%: it correctly identifies targets in 3150 of 3500 frames and errs in 350. By outperforming the benchmark models, it indicates that the evolutionary hyperparameter adjustments were effective and markedly improved tracking precision.
Because of this result, the model may have applications in more complex object-tracking tasks.
4.5. Discussion
The experiments covered several aspects: verification of the UAV platform, the image sensor settings, simulations, and hardware-in-the-loop studies. Together, these assess the effectiveness and robustness of the proposed strategy. The method was tested thoroughly on the UAV123 dataset and on a collection of athlete image sequences from alpine skiing.
An essential element of the experiments was hyperparameter fine-tuning with CHOA-EVOL. The optimization procedure substantially changed the hyperparameters of the YOLO-v8 model, in particular its learning rate, momentum, and weight decay. These changes were crucial to the model’s improved performance in object detection tasks and demonstrate the effectiveness of CHOA-EVOL for hyperparameter tuning.
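To make the idea of population-based hyperparameter tuning concrete, the sketch below runs a generic elitist evolutionary search over learning rate, momentum, and weight decay. It is not CHOA-EVOL and does not reproduce its update rules. The search ranges in `BOUNDS`, the function names, and `toy_fitness` (a stand-in for a validation score obtained by training the detector) are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, Tuple

Bounds = Dict[str, Tuple[float, float]]
Params = Dict[str, float]

# Hypothetical search ranges; the source does not list the ranges used.
BOUNDS: Bounds = {
    "learning_rate": (1e-4, 1e-1),
    "momentum": (0.6, 0.98),
    "weight_decay": (0.0, 1e-3),
}


def sample(bounds: Bounds, rng: random.Random) -> Params:
    """Draw one candidate uniformly inside the bounds."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}


def mutate(p: Params, bounds: Bounds, rng: random.Random, scale: float = 0.1) -> Params:
    """Perturb each value with Gaussian noise and clip it back into range."""
    child = {}
    for k, (lo, hi) in bounds.items():
        step = rng.gauss(0.0, scale * (hi - lo))
        child[k] = min(hi, max(lo, p[k] + step))
    return child


def search(fitness: Callable[[Params], float], bounds: Bounds, pop_size: int = 12,
           generations: int = 20, elite: int = 4, seed: int = 0) -> Tuple[Params, float]:
    """Keep the best `elite` candidates each generation and refill by mutation."""
    rng = random.Random(seed)
    pop = [sample(bounds, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:elite]
        children = [mutate(rng.choice(parents), bounds, rng)
                    for _ in range(pop_size - elite)]
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)


def toy_fitness(p: Params) -> float:
    """Placeholder objective (higher is better) with an arbitrary optimum."""
    return -((p["learning_rate"] - 0.01) ** 2 * 1e4
             + (p["momentum"] - 0.9) ** 2
             + (p["weight_decay"] - 5e-4) ** 2 * 1e6)
```

Calling `search(toy_fitness, BOUNDS)` returns the best candidate and its score. In a real setting each fitness evaluation would involve training and validating the detector, so the population size and the number of generations dominate the cost of the search.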
In addition, a comparative analysis evaluated the multiobjective CHOA-EVOL method against eight well-established multiobjective optimization strategies. The analysis was conducted on both the UAV123 dataset and the alpine experimental dataset, and visual depictions of the results allowed the algorithms to be compared across several assessment categories.
The investigation also assessed the visual sensor’s detection ability under alpine skiing conditions. It identified several obstacles, such as target distortion, gate occlusion, and fast motion, and it emphasized the need to use UAVs and image sensors to follow moving objects that undergo deformation. The results highlighted the algorithm’s resilience in accurately following fast-moving objects despite occlusion and motion blur.
In addition, the YOLO-v8 tracking model tuned by CHOA-EVOL was evaluated against five benchmark models using the UAV123 dataset. Comparing the algorithms on the UAV123 and alpine experimental datasets, based on visual depictions of the respective outcomes, showed that YOLO-v8-CHOA-EVOL had the best success rate and precision among the approaches compared. This bodes well for its future use in complex object-tracking systems.
In summary, the experiments showed that the proposed strategy is effective and robust, which makes the model a useful contribution to scientific training in alpine skiing. By improving tracking algorithms and optimization techniques, this line of research makes real-world object detection and tracking systems more reliable and accurate.
5. Conclusion
This study presented a novel method for tracking alpine skiing activities that combines YOLO-v8 with CHOA-EVOL for hyperparameter optimization. The main objective was to enhance the chimp optimization algorithm so that it tunes the hyperparameters of YOLO-v8 more effectively, thus addressing current constraints in outdoor sports tracking technology. Our hybrid model combines data from UAVs and ground cameras to capture athletes’ fast, curved movements more completely.
We tested and validated the proposed method against state-of-the-art techniques on the UAV123 dataset and the newly curated ASD. Our system outperforms the compared methods in precision and robustness, particularly when tracking alpine skiing events. The results show that CHOA-EVOL can make YOLO-v8 more useful for alpine skiing, and they underline the value of such tracking for scientific training and performance evaluation.
Despite these advances, the proposed algorithm has some limitations. First, its time complexity, specifically that of CHOA-EVOL, is not well suited to real-time applications on hardware with limited resources. Second, combining UAV and ground camera data requires accurate calibration and synchronization, which can seriously restrict scalability in less controlled situations. Third, the model has only been tested on specific sporting scenarios and does not consider others with different motion characteristics.
Several promising avenues for further research have emerged from this work. First, researchers could explore alternative optimization techniques or develop hybrid models to enhance performance. Furthermore, it is necessary to investigate the scalability and flexibility of the suggested approach in various outdoor activities and conditions. This would provide prospects for broader applications beyond just alpine skiing.
In addition, incorporating supplementary sensor modalities, such as inertial measurement units (IMUs) or radar systems, could enhance the capabilities of the tracking system, offering a more thorough understanding of athletes’ motions and their interactions with the environment. Research efforts should also focus on real-time tracking systems that adapt quickly to the ever-changing conditions of outdoor sporting events and handle dynamic environments efficiently.
Better tracking performance in harsh environments may also be possible with the help of machine learning techniques. Techniques like adversarial training or reinforcement learning could enhance the system’s adaptability. Finally, working with sports scientists and athletes might help improve and confirm the accuracy and usefulness of the tracking system in real-life training and competition scenarios, guaranteeing its practicality and effectiveness in enhancing athletes’ performance and safety.
Ethics Statement
Given that our study primarily utilized freely available video footage of Alpine skiing from sources such as https://github.com/franktpmvu/NeighborTrack, direct interaction with individual players to obtain verbal consent was not applicable. The dataset consisted of publicly accessible recordings of matches where participants’ identities were not individually identifiable, eliminating the need for informed consent.
In accordance with ethical guidelines and regulations, our study protocol was reviewed and approved by Hebei Sports University. Given that the dataset is publicly available and devoid of personally identifiable information, the ethics committee waived the requirement for informed consent.
The Ethics Committee of Hebei Sports University reviewed and approved our research design and consent procedures, ensuring compliance with relevant guidelines and regulations. The committee recognized that publicly available video data did not necessitate informed consent from participants, thereby validating our approach.
Conflicts of Interest
The authors declare no conflicts of interest.
Author Contributions
Xiaohua Wu: writing – original draft preparation, data curation, investigation, resources, visualization.
Yongtao Shi: conceptualization, funding acquisition, supervision, project administration, writing – reviewing and editing.
Mohammad Khishe: methodology, data curation, and software.
All authors read and approved the final manuscript.
Funding
This work was supported by (1) Research on the Optimization of Winter Sports Educational Resources, 2403172, Hebei Province Educational Science “14th Five-Year Plan” Research Topics, and (2) Research on the Coupling Development of Winter Sports Tourism to Help Rural Revitalization in Hebei Province in the Post-Winter Olympics Era, C20231015, Hebei Provincial Department of Human Resources Security.
Open Research
Data Availability Statement
The source data can be found at https://github.com/franktpmvu/NeighborTrack.