Active Learning–Enhanced Ensemble Method for Spatiotemporal Correlation Modeling of Neighboring Bridge Behaviors to Girder Overturning
Abstract
Structural health monitoring (SHM) systems are widely deployed in transportation networks, yet traditional methods often focus on individual bridges, overlooking interdependencies between neighboring structures. This study proposes an active learning–enhanced ensemble learning model to predict the tilt behavior of adjacent bridges by leveraging critical response data from multiple bridges. The ensemble model integrates gradient boosting, random forest, and Gaussian process regressors, providing both predictive means and uncertainty quantification. Active learning iteratively selects the most informative samples, improving model efficiency and reducing data requirements. The model accurately predicts vertical displacement and tilt using responses from neighboring bridges, effectively capturing spatiotemporal correlations and dynamic interactions. Active learning achieves comparable accuracy with just 50% of traditional training samples, demonstrating its efficiency. The results reveal structural interdependencies influenced by stiffness and load distribution variations. The successful prediction of tilt behavior underscores the model’s potential for real-time SHM, early overturning warnings, and enhanced bridge safety.
1. Introduction
Bridge structures play a crucial role in transportation infrastructure, ensuring the efficient movement of people and goods, but prolonged use makes them susceptible to damage [1–3]. In recent years, structural health monitoring (SHM) systems have become essential tools for maintaining the structural performance of bridges [4–6]. These systems continuously collect data on various structural responses, which carry a wealth of information about structural performance [7, 8]. Ensuring the structural safety and reliability of grouped bridges based on this vast amount of data remains a paramount concern for engineers and transportation stakeholders [9].
Traditional SHM methods often focus on single-bridge monitoring and overlook dynamic interactions between neighboring structures. Existing models also struggle with generalization, interpretability, and uncertainty quantification, limiting their applicability in safety-critical scenarios. A key gap is the lack of spatiotemporal correlation modeling, where most studies treat each bridge independently [10–12]. Understanding these spatiotemporal correlations is critical for developing robust predictive models that can provide early warnings of structural issues and optimize maintenance strategies.
The increasing prevalence of overloaded traffic exacerbates the risk of girder overturning for ramps in overpasses, particularly for single-column bridges [13, 14]. This situation presents significant safety hazards and potential structural failures [15–17]. To address these challenges, SHM systems are capable of continuously monitoring loads and responses [18]. Typically, they monitor girder lateral inclinations [19], support reactions, and vertical displacements on both sides [20], offering real-time data to issue warnings and swiftly implement mitigation measures. Dan et al. [21] utilized field data from displacement gauges near the bearings to calculate the girder attitude and subsequently introduced a time-varying reliability analysis method for assessing bridge overturning risk. Unlike directly monitoring vertical displacement, the relative displacement of two girder sides can also be indirectly measured using accelerometers, strain gauges, or inclinometers [18, 22]. Chang and Kim [23] placed fiber Bragg grating strain gauges in a seismic bearing and used its stress–strain curve to estimate the support reaction for structural performance assessment. In the bridge network, detecting potential overturning risks in preceding spans or bridges is more beneficial.
Spatiotemporal correlations are critical for predicting the responses and assessing the performance of one bridge from its neighboring bridges. Cao and Liu [10] proposed a deep learning–based damage localization method for grouped bridges using the spatiotemporal correlation of strain data and established a damage localization index. Cao et al. [11] focused on grouped medium- and small-span bridges and designed a probabilistic method to calculate responses and determine damage using acceleration data. Xu et al. [24] identified suspender tension based on the spatiotemporal correlation between girder strain and suspender tension, using a stacked denoising autoencoder and a convolutional neural network–based long short-term memory model to construct the correlation model. Lei et al. [25] established a U-Net model to identify regional spatiotemporal correlations from the inspection reports of grouped bridges. Similarly, Liu and Zhang [26] extracted the spatiotemporal correlations of bridges and their associated components from the national bridge inventory database.
Faced with the vast amount of data from various grouped bridges, machine learning (ML) techniques prove effective in identifying spatiotemporal correlations and predicting responses [27–32]. Ensemble learning (EL) is a powerful method that aggregates several weak ML predictors to detect hidden features of data and regress targets. In the field of SHM, many EL methods have been applied to structural performance prediction and assessment [33]. Han et al. [34] stacked five heterogeneous ML models into an EL model to predict natural frequencies from environmental data and effectively detect potential damage in space grid steel structures by statistically analyzing prediction residuals and mitigating environmental influences. Sun et al. [35] presented a data-driven approach using an EL model to predict cable vibration amplitudes based on environmental loads and demonstrated the superior performance of the gradient boosting decision tree (GBDT). Yaghoubzadehfard et al. [36] applied advanced ML techniques, including random forest, GBDT, and XGBoost, to detect bridge damage by analyzing changes in modal parameters, demonstrating the efficacy of these models, particularly random forest, in identifying damage location and severity.
Active learning (AL) is a pattern recognition approach that can achieve greater performance with fewer training labels by letting the algorithm itself select the data from which it learns. The goal is to learn a mapping from observations to labels. Most AL techniques are used in structural reliability analysis to enhance the efficiency and accuracy of predicting failure probabilities and assessing structural performance under uncertainty [37, 38]. Dang et al. [39] introduced Bayesian AL to improve precision and efficiency in exacting posterior variance of the failure probability. Zhou et al. [40] designed a new look-ahead learning function for AL that reduces the computational burden by deriving a closed-form expression for the inner integral. AL has seldom been applied in SHM-based structural assessment [41]. Bull et al. [42, 43] introduced cluster-adaptive AL to address the absence of diagnostic labels for SHM, showcasing its effectiveness through experiments. Hughes et al. [44] proposed a risk-based AL approach that queries class-label information based on its expected value, demonstrated with a benchmark, showing improved decision-making performance. Zhu et al. [45] proposed an AL approach for Bayesian deep neural networks to predict the remaining useful life of a battery from limited run-to-failure data by actively selecting samples for labeling.
The contributions of this work can be summarized as follows: (1) The research demonstrates the effectiveness of a surrogate data-driven model in accurately and efficiently predicting the tilt behavior and displacement of adjacent bridges, capturing the spatiotemporal correlations between their responses. (2) The study successfully applies the copula theory to model the dependence structures between different response variables, providing valuable insights into their interdependencies. (3) By employing AL, the model’s performance is significantly improved, particularly in the early stages of training, resulting in a more accurate and efficient prediction model.
The study’s structure is outlined as follows. Section 2 presents the proposed framework for AL-enhanced EL modeling of critical responses among neighboring bridges based on SHM data, involving three main stages. Section 3 provides detailed insight into the EL and AL techniques utilized. Section 4 describes the grouped bridge and its responses, along with copula-based probabilistic modeling to reveal correlations. Section 5 illustrates the model training, AL’s enhancement on predictions, and girder tilt displacement prediction, while Section 6 draws conclusions.
2. General Framework of AL-Enhanced EL for Response Modeling
Figure 1 illustrates a comprehensive framework that integrates AL with EL to model and predict the spatiotemporal responses of neighboring bridges. This framework leverages SHM data to enhance the predictive accuracy and efficiency of the model. The framework has three key components: SHM data preparation, AL-enhanced EL model training, and response prediction with uncertainty for neighboring bridges.

SHM systems are deployed on neighboring bridges to collect critical response data. Data preprocessing involves cleaning the data by removing noise and outliers and handling missing values to ensure high-quality inputs for modeling. A random selection of samples from the collected data forms the training and testing datasets. The training dataset consists of an initial training set and the remaining samples, which are selected by the AL strategy to train the model incrementally. The testing dataset is used to evaluate the model’s performance.
The EL model is built by stacking a gradient boosting regressor, a random forest regressor, and a Gaussian process regressor. It thereby combines boosting and bagging ensemble algorithms to exploit the advantages of both. The final stacked regressor, Gaussian process regression, offers a probabilistic approach to regression: it captures uncertainties in predictions and outputs the standard deviations of predictions, which the AL strategy uses to quantify uncertainty. The uncertainty sampling strategy is entropy-based, with the predictive variance from the regressor serving as the key selection criterion. At each iteration, samples with the highest predictive variance are prioritized for labeling and retraining, ensuring that the model efficiently refines its predictions with minimal data.
With the fine-tuned model, the responses of a bridge can be accurately predicted based on data from neighboring bridges. The model successfully extracts and determines the spatiotemporal correlations between neighboring bridges. Comparing the predicted values with the actual measurements demonstrates the model’s accuracy. The predicted mean and 95% confidence intervals are plotted alongside the true values, showcasing the model’s capability to capture the actual behavior of the bridges.
Predicting structural behavior using data from nearby bridges offers several advantages, including faster real-time responses, improved data reliability, and better insights into structural interdependencies. By leveraging spatiotemporal correlations, this approach enables early detection of issues, such as girder overturning, before failure occurs. It also helps fill in gaps caused by sensor failures or data loss, ensuring continuous monitoring.
3. Details of EL and AL
3.1. EL Methods
EL is a powerful technique in ML that involves stacking multiple models to achieve better predictive performance than any single model could on its own. The fundamental idea is that by aggregating the predictions of several models, the ensemble can mitigate the weaknesses of individual models and improve overall accuracy, robustness, and generalization.
EL methods, such as boosting and bagging, are powerful techniques used to improve the accuracy and robustness of ML models. Boosting methods train models sequentially, with each new model focusing on correcting the errors of the previous ones. This is achieved by adjusting the weights of the training samples based on the errors. Bagging methods, also known as bootstrap aggregating, train models independently on different random subsets of the training data, created through bootstrapping (random sampling with replacement). For regression, the final bagged prediction is made by averaging the predictions of the individual models.
This study introduces a novel EL model by combining three regressors: gradient boosting regressor, random forest regressor, and Gaussian process regressor. The stacking process involves two stages: the base models and the meta-model. The base models, which include the gradient boosting regressor and random forest regressor, are trained in parallel. Each base regressor operates independently, allowing for simultaneous training. Once the base models are trained, their predictions serve as input features for the meta-model, which, in this study, is the Gaussian process regressor. The meta-model is then trained sequentially on these predictions. Figure 2 shows the flowchart of the proposed EL methods.

While the base model training is parallelized, the meta-model training occurs sequentially, as it relies on the output from the base models. The gradient boosting regressor and random forest regressor form the foundation of the base model. Gradient boosting regressor is a powerful boosting-based algorithm that sequentially improves weak learners to minimize prediction errors. It is well suited for handling structured SHM data with potential noise and missing values, as it iteratively refines predictions by focusing on difficult-to-model patterns. Random forest regressor is a bagging-based algorithm that builds multiple decision trees in parallel and averages the outputs to improve prediction stability. It is highly effective in reducing variance and capturing feature interactions without requiring extensive hyperparameter tuning.
The Gaussian process regressor, used in the final layer as the meta-model, provides a probabilistic approach to regression and quantifies the uncertainty of its predictions. It is particularly useful in AL, as it enables the selection of the most informative samples by evaluating prediction confidence.
For an ensemble to be effective, the base models should be diverse, meaning that they should make different types of errors. This diversity ensures that the strengths of some models can compensate for the weaknesses of others.
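The two-stage stacking described above can be sketched with scikit-learn, assuming that library is available. This is only an illustration on synthetic data, not the authors' implementation: the data, the kernel choice (`RBF() + WhiteKernel()`), the 5-fold setting, and all hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
# Synthetic stand-in: four strain-like inputs, one displacement-like target.
X = rng.normal(size=(200, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.normal(size=200)
X_train, y_train, X_test = X[:150], y[:150], X[150:]

stack = StackingRegressor(
    # Base models: one boosting-based and one bagging-based regressor.
    estimators=[
        ("gbr", GradientBoostingRegressor(random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ],
    # Meta-model: a Gaussian process trained on the base-model predictions.
    final_estimator=GaussianProcessRegressor(
        kernel=RBF() + WhiteKernel(), normalize_y=True, random_state=0),
    cv=5,  # out-of-fold base predictions are used to fit the meta-model
)
stack.fit(X_train, y_train)

# Extra predict arguments are forwarded to the final estimator, so the
# Gaussian process can return a standard deviation with the mean.
pred_mean, pred_std = stack.predict(X_test, return_std=True)
```

Here `pred_std` is the quantity that an uncertainty-based sampling strategy would rank; the base models themselves contribute only point predictions.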
3.1.1. Gradient Boosting Regression
Gradient boosting regressor is a powerful ML technique used for regression tasks. It builds an ensemble of weak decision trees in a sequential manner. Each subsequent tree is trained to correct the errors made by the previous trees, thereby improving the overall performance of the ensemble. Gradient boosting uses gradient descent to minimize the loss function. Each new model is trained to predict the negative gradient of the loss function with respect to the current predictions.
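A minimal from-scratch sketch may make the negative-gradient step concrete. It assumes a squared-error loss (for which the negative gradient equals the residual), one-dimensional inputs, and single-split stumps as weak learners; the function names and the toy data are illustrative only.

```python
def fit_stump(x, r):
    """Least-squares regression stump (one split) on 1-D inputs x for targets r."""
    best = None
    for t in sorted(set(x))[:-1]:          # candidate thresholds
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda v: lm if v <= t else rm


def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Boosted stumps for the loss 0.5 * (y - f(x))^2."""
    f0 = sum(y) / len(y)                   # initial constant prediction
    pred = [f0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the squared-error loss = current residuals.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda v: f0 + learning_rate * sum(s(v) for s in stumps)


# Toy data (made up): a smooth nonlinear target.
x = [i / 10 for i in range(40)]
y = [xi ** 2 for xi in x]
model = gradient_boost(x, y)

y_bar = sum(y) / len(y)
mse_const = sum((yi - y_bar) ** 2 for yi in y) / len(y)
mse_boost = sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
```

Each round fits a stump to what the current ensemble still gets wrong, so `mse_boost` on the training data falls below the constant-prediction baseline `mse_const`.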
3.1.2. Random Forest Regression
Bagging is another EL method for aggregating weak learners. The random forest regressor bags multiple decision trees during training and outputs the average prediction of the individual trees for regression tasks. This method improves predictive accuracy and controls overfitting by leveraging the collective power of multiple trees.
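The bagging idea can be sketched in a few lines. This simplified illustration uses single-split stumps on one input and omits the random feature selection that a full random forest adds; the names, the seed, and the toy step-shaped data are assumptions.

```python
import random


def fit_stump(x, y):
    """Least-squares regression stump on 1-D inputs; constant if no split exists."""
    best = None
    for t in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((yi - lm) ** 2 for yi in left)
               + sum((yi - rm) ** 2 for yi in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:                       # all sampled inputs identical
        const = sum(y) / len(y)
        return lambda v: const
    _, t, lm, rm = best
    return lambda v: lm if v <= t else rm


def bagged_stumps(x, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap resample and average their predictions."""
    rng = random.Random(seed)
    n = len(x)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]   # sampling with replacement
        models.append(fit_stump([x[i] for i in idx], [y[i] for i in idx]))
    return lambda v: sum(m(v) for m in models) / len(models)


# Toy data (made up): a step response.
x = [i / 10 for i in range(40)]
y = [0.0 if xi < 2.0 else 1.0 for xi in x]
bag = bagged_stumps(x, y)
low, high = bag(0.5), bag(3.5)
```

Because every stump sees a different resample, the split locations differ slightly from model to model, and averaging smooths out that variability.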
3.1.3. Gaussian Process Regression
Gaussian process regressor is a nonparametric Bayesian regression model for modeling complex relationships between input variables and their corresponding continuous output variables. It is based on the concept of modeling the joint distribution of the data using Gaussian processes, which allows for flexible and probabilistic regression predictions. The radial basis function kernel is selected for Gaussian process regression since it effectively captures smooth, nonlinear relationships in the data, including complex spatiotemporal correlations. Additionally, it improves the Gaussian process’s ability to deliver realistic uncertainty estimates, which is essential for AL.
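As a rough illustration of how a Gaussian process with a radial basis function kernel yields both a mean and an uncertainty estimate, the following NumPy sketch computes the standard posterior for one-dimensional inputs. The length scale, noise level, and toy data are assumed values, and no hyperparameter fitting is performed.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Unit-variance RBF kernel matrix between 1-D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)


def gp_posterior(x_train, y_train, x_test, length_scale=1.0, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test, length_scale)
    L = np.linalg.cholesky(K)                       # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)              # prior variance k(x, x) = 1
    return mean, np.sqrt(np.maximum(var, 0.0))


# Toy data (made up): observations of a smooth function on [0, 5].
x_train = np.linspace(0.0, 5.0, 12)
y_train = np.sin(x_train)
x_test = np.array([2.5, 9.0])    # one point inside the data, one far outside

gp_mean, gp_std = gp_posterior(x_train, y_train, x_test)
```

The predicted standard deviation is small inside the sampled range and grows back toward the prior value far from the data, which is the behavior that uncertainty-based sampling relies on.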
Table 1 compares different meta-models used in SHM, listing the strengths of each and its specific limitations in the context of this study. These trade-offs are important to consider when selecting an appropriate method for a given application.
Meta-model | Strengths | Limitations in this study |
---|---|---|
Neural networks | Highly flexible, effective for large datasets | Requires extensive tuning, lacks direct uncertainty quantification |
Support vector regression | Good for high-dimensional spaces | Computationally expensive for large-scale datasets |
Linear regression | Simple and interpretable | Cannot capture nonlinear relationships in SHM data |
Gaussian process regression | Provides uncertainty estimates, nonparametric, effective for small-to-medium datasets | Computational cost increases with dataset size |
3.2. Uncertainty-Based AL
Uncertainty-based AL is a strategy in ML where the algorithm actively selects the most informative data points from which it can learn [38]. This approach is particularly useful in scenarios where labeling data is expensive or time-consuming. The central idea is to use the model’s uncertainty in its predictions to identify which data points would most improve the model if labeled and added to the training dataset. Entropy-based uncertainty sampling is selected as the primary criterion because it effectively captures model uncertainty, ensures efficient data selection, and remains computationally feasible.
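The link between the entropy criterion and the predictive variance can be made explicit under the assumption that the predictive distribution at an input is Gaussian, as it is for a Gaussian process. The symbols below (mean \mu, variance \sigma^2, unlabeled pool \mathcal{U}, selected sample x^*) are notation introduced here for illustration.

```latex
% Assumed Gaussian predictive distribution: y | x ~ N(\mu(x), \sigma^2(x))
H\left[y \mid x\right] = \tfrac{1}{2}\,\ln\!\left(2\pi e\,\sigma^{2}(x)\right)
\quad\Longrightarrow\quad
x^{*} = \arg\max_{x \in \mathcal{U}} H\left[y \mid x\right]
      = \arg\max_{x \in \mathcal{U}} \sigma^{2}(x)
```

Since the differential entropy of a Gaussian increases monotonically with its variance, ranking candidates by entropy and ranking them by predictive variance select the same samples.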
After selecting the most uncertain samples with the highest entropy, the true labels for these samples are obtained from an oracle. These newly labeled samples are then added to the training dataset, enhancing their diversity and informativeness. Subsequently, the model is retrained with this augmented dataset, which incorporates the new, informative samples. This iterative process continues, with the model repeatedly selecting the most uncertain samples, acquiring their labels, and retraining, until a desired level of model accuracy is achieved. By focusing on the most uncertain samples, the model efficiently improves its performance with minimal labeled data.
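The query-label-retrain loop can be sketched as follows. This is a simplified pool-based illustration, not the study's pipeline: a fixed-hyperparameter Gaussian process stands in for the full ensemble, the `oracle` function is hypothetical, and the loop stops after a fixed number of queries instead of at a target accuracy.

```python
import numpy as np


def gp_predict(x_tr, y_tr, x_te, length_scale=1.0, noise=1e-4):
    """GP posterior mean and standard deviation with a unit-variance RBF kernel."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale ** 2)
    K = k(x_tr, x_tr) + noise * np.eye(len(x_tr))
    Ks = k(x_tr, x_te)
    mean = Ks.T @ np.linalg.solve(K, y_tr)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def oracle(x):
    """Hypothetical labelling source; in SHM this would be the measured response."""
    return np.sin(x)


pool = np.linspace(0.0, 10.0, 101)   # candidate inputs (made up)
labeled = [0, 5, 10]                 # indices of the initial training set
rmse_history = []

for _ in range(10):                  # assumed budget of 10 queries
    mean, std = gp_predict(pool[labeled], oracle(pool[labeled]), pool)
    rmse_history.append(float(np.sqrt(np.mean((mean - oracle(pool)) ** 2))))
    std[labeled] = -1.0                      # never re-select labeled samples
    labeled.append(int(np.argmax(std)))      # query the most uncertain sample

# Evaluate once more after the final query has been added.
mean, _ = gp_predict(pool[labeled], oracle(pool[labeled]), pool)
rmse_history.append(float(np.sqrt(np.mean((mean - oracle(pool)) ** 2))))
```

In this toy setting the queried points spread out over the unexplored part of the pool, and `rmse_history` records how the error over the whole pool changes as they are added.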
4. Measured Bridge for Case Studies
4.1. Grouped Bridges With Deployed SHM Systems
This study selected a group of interconnected overpasses in a city in southern China, comprising four neighboring bridges (Bridge 1, Bridge 2, Bridge 4, and Bridge 5). Due to the importance of these overpasses in the local transportation network, numerous sensors were installed to monitor their operational status. This study focuses on monitoring data from three of these bridges (Bridge 1, Bridge 2, and Bridge 5). The location of these sensors and the relationship of the three bridges are depicted in Figure 3.

Bridge 1 is a prestressed concrete continuous curved box girder bridge with four continuous units in total, each consisting of three spans. The span combinations are as follows: (3 × 30) m + (3 × 30) m + (30 + 2 × 35) m + (35 + 43 + 28) m. The box girder has a single-box single-cell cross section with a total width of 9.5 m. The third span is located on a horizontal curve with a radius of 103.75 m.
Bridge 2 is a north–south straight bridge with a total length of 274.8 m. The superstructure is a prestressed concrete continuous box girder with span combinations of (3 × 27) m + (2 × 27 + 23.5) m + (30 + 32 + 26.3 + 28) m. The box girder has a single-box single-cell cross section with a total width of 10.5 m.
Bridge 5 connects Bridge 1 and Bridge 2 with a single-span curved girder segment, with a total length of 56 m. The superstructure consists of 2 × 28 m of prestressed concrete continuous box girders, with a single-box three-cell cross section. Rubber expansion joints are installed at the abutments and segment connections. The transition pier cap beams are designed as hidden cap beams, and the transition pier bearings use rectangular plate rubber bearings.
Due to the single-column pier single-support design of Bridges 1 and 2, along with the relatively narrow bridge deck, it is essential to focus on the anti-overturning stability under eccentric loading conditions. Therefore, displacement gauges are installed on both sides of the girder bottoms of these two bridges to monitor the rotational status and assess the overturning risk. Bridge 5, with a large box-section girder, supports four lanes of traffic. With the city’s development, overloading incidents occur from time to time. Consequently, strain sensors have been installed at the bottom of the girder to monitor the bearing performance.
This study selected eight sensors from the group of bridges for spatiotemporal correlation modeling: two displacement gauges on both sides of the girder bottom of Bridge 1, two displacement gauges on both sides of the girder bottom of Bridge 2, and four strain gauges at the midspan of the girder bottom of Bridge 5. The sensors on the three bridges are all located on neighboring spans, and the load fields, such as traffic flow and temperature, are interconnected. The strain on Bridge 5 and the displacements on Bridges 1 and 2 exhibit spatiotemporal correlations. The proposed model in the study is designed to be flexible and can be extended to various types of monitoring data, beyond just bridge tilt and displacement prediction.
Optimizing sensor placement is essential not only for modeling spatial correlations but also for monitoring multiple structural behaviors. In this study, longitudinal sensor spacing is determined based on the specific monitoring tasks of different sensor types. Strain gauges are positioned to capture the bending performance of the monitored span, while displacement gauges are placed atop piers to track bearing displacements. If the sole focus is on spatial correlation, the average speed of vehicles on the monitored lane is considered to optimize sensor spacing for timely tilt warnings.
4.2. Measured Critical Responses
The critical response data collected from the SHM systems of three bridges are gathered over a period of time. These raw data are preprocessed to enhance the data quality. The locally weighted scatterplot smoothing (LOWESS) technique is used to reduce sensor noise while preserving essential structural response characteristics. Outliers are identified using z-score analysis and subsequently reviewed for potential data errors. Outlier data points of one sensor are removed for all sensors at the same time. No missing data occurred. Following preprocessing, the refined data are consolidated into a structured database for subsequent spatial analysis. This database forms the basis for further modeling of spatiotemporal correlations.
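The synchronized outlier removal described above (a time step is dropped for all sensors when any one sensor is flagged) can be sketched as follows. The threshold of 3 and the synthetic records are assumptions, since the text does not state the cutoff, and the manual review step is not reproduced.

```python
from statistics import mean, pstdev


def flag_outlier_rows(records, threshold=3.0):
    """Return indices of time steps where any sensor has |z-score| > threshold."""
    sensors = list(records[0].keys())
    stats = {}
    for s in sensors:
        column = [row[s] for row in records]
        stats[s] = (mean(column), pstdev(column))
    flagged = set()
    for i, row in enumerate(records):
        for s in sensors:
            mu, sigma = stats[s]
            if sigma > 0 and abs(row[s] - mu) / sigma > threshold:
                flagged.add(i)   # one bad sensor flags the whole time step
                break
    return flagged


# Synthetic stand-in for two channels (values are made up, not measured data).
records = [{"B1L": 0.1 * (i % 5), "B5LL": 10.0 + (i % 3)} for i in range(30)]
records[12]["B1L"] = 5.0         # inject a spike in one sensor only

flagged = flag_outlier_rows(records)
cleaned = [row for i, row in enumerate(records) if i not in flagged]
```

Dropping the whole time step keeps the channels aligned, so every remaining row still pairs inputs and targets measured at the same instant.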
The eight responses are labeled as follows: left displacement gauge of girder bottom of Bridge 1 (B1L), right displacement gauge of girder bottom of Bridge 1 (B1R), left displacement gauge of girder bottom of Bridge 2 (B2L), right displacement gauge of girder bottom of Bridge 2 (B2R), left side strain gauge of girder bottom of Bridge 5 (B5LL), middle left strain gauge of girder bottom of Bridge 5 (B5ML), middle right strain gauge of girder bottom of Bridge 5 (B5MR), and right side strain gauge of girder bottom of Bridge 5 (B5RR).
As depicted in Figure 4, it is evident that these responses display distinctive characteristics for each bridge. The girder displacement of B1L and B1R fluctuates within a range of 1 cm, whereas the girder displacement of B2L and B2R varies within a narrower range of 0.5 cm. These differences likely stem from variations in structural stiffness and applied loads. The strain measured by the strain gauges on Bridge 5 falls within a range of 30 με for each gauge. These fluctuations underscore the dynamic nature of structural performance and emphasize the significance of accounting for seasonal variations when evaluating bridge behavior and integrity. The EL framework can capture long-term variations in bridge behavior by learning patterns across different time periods. GPR as the meta-model could incorporate uncertainty estimation, allowing the model to adapt to seasonal shifts.








4.3. Copula-Based Probabilistic Modeling
The copula-based probabilistic modeling primarily serves to reveal the dependence structure between critical structural responses of neighboring bridges. The overturning performance of Bridge 1 and Bridge 2 is directly related to the difference in displacement measurements on their respective bottom sides. For Bridge 5, as the locations of the strain sensors are close to the displacement meters of Bridge 1 and Bridge 2, it is assumed that vehicles do not shift laterally over such a short distance between the ramps. Therefore, the difference between the two strain sensors on Bridge 5, corresponding to the respective displacement gauges of Bridge 1 and Bridge 2, is also calculated to represent the difference in vehicle loads on the sides of the ramps, which is directly relevant to overturning.
Furthermore, the four difference responses are labeled as follows: lateral tilt displacement of Bridge 1 (B1diff), lateral tilt displacement of Bridge 2 (B2diff), strain difference of the left side of Bridge 5 (), and strain difference of the right side of Bridge 5 (). Since the plotted responses in Figure 5 are directly derived from the original response differences, they also exhibit similar features to those in Figure 4 for each bridge.




The inherent correlations among the structural responses of neighboring bridges can first be revealed from a probabilistic perspective. The copula theory provides a powerful tool for determining dependence among a set of variables. Gaussian copulas are particularly useful for capturing nonlinear relationships among variables when Gaussian mixture distribution models are used as the marginal distributions. While the individual Gaussian components in the marginal distributions assume a linear relationship, their combination can approximate nonlinear relationships. This allows for flexible parameter adjustments to effectively model data behavior, and the approach can be extended to more sophisticated models.
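A small sketch can illustrate how a Gaussian copula separates the dependence structure from the marginals. Note the substitution: where the study fits Gaussian mixture marginals, this example uses rank-based empirical marginals to keep the code short. The variable names and the synthetic data are assumptions.

```python
import math
import random
from statistics import NormalDist


def normal_scores(values):
    """Map values to standard-normal scores through their empirical ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) is a pseudo-observation strictly inside (0, 1)
        scores[i] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)


# Synthetic pair with a monotone but nonlinear dependence (made-up data).
rng = random.Random(0)
strain_diff = [rng.gauss(0.0, 1.0) for _ in range(500)]
tilt_diff = [math.exp(s) + rng.gauss(0.0, 0.02) for s in strain_diff]

# Gaussian-copula correlation: correlation of the normal scores.
rho_copula = pearson(normal_scores(strain_diff), normal_scores(tilt_diff))
# Plain linear correlation of the raw values, for comparison.
rho_raw = pearson(strain_diff, tilt_diff)
```

Because the copula works on transformed marginals, `rho_copula` reflects the strength of the monotone dependence even when the raw relationship is far from linear.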
Figure 6 illustrates the copula-based joint density for critical responses of neighboring bridges. To explore the spatial correlation between neighboring spans, pairs are constructed between B1diff and , as well as between B2diff and . These visualizations offer valuable insights into the interdependencies between different response variables over time. The correlation between B1diff and appears more concentrated than that between B2diff and , indicating that the behavior of responses between B2diff and involves more uncertainties. The overloading on the left side of Bridge 5 may visibly contribute to the significant tilt of Bridge 1. This observation aligns with the larger range of B1diff compared to B2diff in Figure 5. The copula analysis quantifies the joint dependence structure among response variables, offering insights into their probabilistic correlations. As for temporal correlation, the SHM system continuously collects response data over time, allowing the model to learn temporal patterns in displacement and tilt behavior. The meta-model within the ensemble framework could account for time-dependent variations by incorporating sequential training and uncertainty quantification.


5. AL-Enhanced EL Spatiotemporal Modeling
5.1. Modeling
In the model tuning, Bayesian optimization is used to optimize the hyperparameters of the proposed ensemble model. Bayesian optimization not only identifies the optimal settings for key parameters, such as the learning rate and number of trees for the gradient boosting and random forest regressors, as well as the kernel parameters for the Gaussian process regression, but also provides insight into the sensitivity of the model’s performance to these hyperparameters. By systematically exploring the hyperparameter space, Bayesian optimization reveals the impact of parameter variations on prediction accuracy and overall model stability.
The AL-enhanced EL model for predicting the vertical displacement of Bridge 1 and Bridge 2 from measured strains of neighboring Bridge 5 is trained and tested with the split dataset with the ratio of 7 : 3. Figure 7 shows the prediction results with the confidence intervals in the untrained test dataset for four target displacements on Bridge 1 and Bridge 2. Since the test dataset contains thousands of samples, Figure 7 only shows 200 samples for a clear illustration. Any discrepancies between the predicted and actual values are minimal and fall within the confidence intervals, suggesting that the model effectively captures the underlying behavior of bridge performance.




It can be seen that the prediction mean and the ground truth match well over the time series. The confidence intervals around the predictions provide a measure of the uncertainty, with narrower intervals suggesting higher confidence in the predictions. In Figure 7, the gray shading represents the confidence interval. The confidence intervals for B1L, B1R, and B2L are narrower than that for B2R, reflecting more precise predictions for both sides of Bridge 1 and for the vertical displacement on the left side of Bridge 2. The confidence interval for B2R is slightly wider, reflecting greater uncertainty, but the predictions are still within an acceptable range.
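For a Gaussian predictive distribution, a 95% interval is commonly taken as the mean plus or minus 1.96 standard deviations. The sketch below builds such intervals and checks how many true values fall inside them; the numbers are invented for illustration and are not results from the study.

```python
def interval_95(pred_mean, pred_std, z=1.96):
    """Lower and upper 95% bounds for Gaussian predictions."""
    return [(m - z * s, m + z * s) for m, s in zip(pred_mean, pred_std)]


def coverage(y_true, intervals):
    """Fraction of true values that fall inside their interval."""
    inside = sum(lo <= y <= hi for y, (lo, hi) in zip(y_true, intervals))
    return inside / len(y_true)


# Made-up predictions, standard deviations, and measurements.
pred_mean = [1.0, 2.0, 3.0, 4.0]
pred_std = [0.1, 0.1, 0.2, 0.1]
y_true = [1.05, 2.30, 2.90, 4.10]

bounds = interval_95(pred_mean, pred_std)
cov = coverage(y_true, bounds)   # the second measurement lies outside its band
```

A coverage close to the nominal level on held-out data suggests that the reported standard deviations are neither too optimistic nor too conservative.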
The predictions are then compared with the ground truth data in scatter plots to evaluate the model’s performance on all test data. The coefficient of determination (R2) [46] is used as the evaluation metric. The results, depicted in Figure 8, show that the predicted vertical displacements for B1L, B1R, and B2L exhibit a higher degree of correlation with the ground truth than those for B2R. These three predictions have R2 values over 95%, and the largest R2 value is 98.29% for the prediction of B2L. When only one of the three stacked regressors is used, the R2 value is lower than that of the stacked EL model.
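For reference, the coefficient of determination compares the residual sum of squares with the total sum of squares around the mean. A minimal sketch with made-up numbers:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y_true, y_pred)   # SS_res = 0.10, SS_tot = 5.0
```

A value of 1 indicates a perfect fit, while a value of 0 means the model does no better than predicting the mean of the measurements.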




These predicted vertical displacements closely follow the ground truth data, with minor deviations. This indicates that the vertical displacement of B1L, B1R, and B2L is more sensitive to the loads on Bridge 5, likely due to their structural proximity and interdependencies. The close alignment between predicted and ground truth data for both bridges demonstrates the reliability of the AL-enhanced EL method. The trained EL model also successfully models the spatiotemporal correlation of responses of neighboring bridges.
5.2. AL Enhancement in Model Performance
AL is used to iteratively refine the model by selecting the most uncertain samples for labeling and retraining. Specifically, the entropy-based approach quantifies the uncertainty of each prediction. At each iteration, samples with the highest predictive uncertainty are selected for labeling, ensuring that the most informative data points are incorporated into the training process. The tuning strategy helps balance the exploration of high-uncertainty regions with the exploitation of areas where the model already performs well. This adaptive approach maintains high accuracy and efficiency even as data characteristics evolve over time.
This study trains the model with sample ratios ranging from 0.1 to 0.8 of the entire training datasets. Figure 9(a) shows the effect of the initial training data on model performance, measured by R2. The results indicate that increasing the sampling ratio improves the model’s accuracy, with a higher sample ratio leading to better prediction accuracy.


However, AL does not necessarily enhance the model’s accuracy or efficiency directly. Instead, the accuracy improves with a higher sampling ratio, while a lower sampling ratio helps reduce training time, thus improving efficiency. The model trained without AL, using the entire dataset, achieves an R2 value of around 99% for each response. Figure 9(b) shows that when the AL-enhanced model uses only 40% of the dataset, the accuracy difference between it and the baseline model is less than 2%. This suggests that a smaller sample ratio can still maintain high accuracy while reducing training time.
The training dataset is selected based on an AL strategy, which prioritizes the most informative samples to improve model efficiency and accuracy. This approach mitigates the risk of introducing bias due to arbitrary sample selection. Additionally, the selected dataset comprises real-world SHM measurements collected over an extended period, minimizing selection bias that could arise from short-term or artificially controlled experiments.
5.3. Tilt Predictions
The tilt predictions for Bridge 1 and Bridge 2 are calculated from the predictive means of the trained model, using the differences between B1L and B1R and between B2L and B2R, as described in Section 4.3. The trained model leverages the AL-enhanced EL approach, which ensures that the most informative samples are used to improve prediction accuracy.
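A minimal sketch of this calculation is given below, assuming the tilt indicator is the pointwise difference between the predicted left and right vertical displacements of one bridge. The exact definition is the one in Section 4.3; the function names, the optional conversion to a rotation angle, and the `bearing_spacing` parameter are hypothetical additions for illustration.

```python
import math


def displacement_difference(left, right):
    """Pointwise difference between left and right predicted displacements."""
    if len(left) != len(right):
        raise ValueError("left and right series must have the same length")
    return [l - r for l, r in zip(left, right)]


def tilt_angle(difference, bearing_spacing):
    """Convert a displacement difference into a rotation angle in radians.

    Assumption: both arguments share the same length unit, and bearing_spacing
    is the transverse distance between the two measurement points. This
    conversion is not stated in the text and is shown only as an option.
    """
    return math.atan2(difference, bearing_spacing)


# Illustrative predicted means for the left and right sides of one bridge.
b1l_pred = [1.0, 2.5, 3.0]
b1r_pred = [0.5, 2.5, 3.5]
b1_diff = displacement_difference(b1l_pred, b1r_pred)
```

The same function applies to B2L and B2R to obtain the second bridge's difference series.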
Figure 10 compares the tilt predictions with the ground truth measurements for the two bridges. The predicted tilt values align closely with the ground truth data for both Bridge 1 and Bridge 2. The model captures the dynamic behavior of the bridge tilts and reflects the structural responses accurately over time. The successful prediction of bridge tilt behavior indicates that the model can be used as a predictive tool for monitoring the structural health of bridges. Specifically, the model has potential applications in real-time overturning warnings, providing early alerts based on predicted tilt values.
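One way such an alert could be built from the model outputs is sketched below. It is an assumption-laden illustration, not a procedure from this study: the function `tilt_alerts`, the warning threshold, and the multiplier `z` are hypothetical, and the rule simply flags a time step when the predicted tilt magnitude plus a multiple of its predictive standard deviation reaches the threshold.

```python
def tilt_alerts(pred_means, pred_stds, threshold, z=2.0):
    """Return the indices of time steps that should trigger a warning.

    A step is flagged when |mean| + z * std >= threshold, so that uncertain
    predictions close to the limit are treated conservatively.
    Assumption: threshold and z are chosen by the bridge operator.
    """
    if len(pred_means) != len(pred_stds):
        raise ValueError("means and standard deviations must align")
    return [
        i
        for i, (m, s) in enumerate(zip(pred_means, pred_stds))
        if abs(m) + z * s >= threshold
    ]


# Illustrative values only (not measurements from the monitored bridges).
means = [0.10, -0.45, 0.20]
stds = [0.05, 0.05, 0.20]
alerts = tilt_alerts(means, stds, threshold=0.5)
```

Using the upper bound rather than the mean alone makes the rule more cautious where the ensemble is less certain, at the cost of more frequent alerts.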


The use of AL in this study, despite the availability of real-time sensor data, is justified by several key factors: (1) Efficient data utilization: Although the data are collected in real time, not all data points contribute equally to the model’s learning process. AL helps identify and prioritize the most informative samples that can significantly improve the model’s performance. This selective approach ensures that the model focuses on the most relevant data, reducing the time and computational resources needed for training. (2) Adapting to changing conditions: Bridge responses can vary due to factors like environmental changes, traffic patterns, and structural aging. AL enables the model to adapt to these changes by continually selecting the most informative new data, ensuring that the model remains accurate and relevant over time.
6. Conclusions
1. The model demonstrates high accuracy in predicting the vertical displacement and tilt of Bridge 1 and Bridge 2. The close match between the predicted and actual values indicates that the trained model effectively captures the dynamic behavior of the bridges and extracts spatial and temporal correlations among neighboring bridges. This performance results from the combined strengths of the EL approach and the AL strategy.
2. By employing AL, this study shows that the model’s performance improves significantly, especially in the early stages of training, making the model both more accurate and more efficient. The comparison with the baseline model underscores the effectiveness of AL in maintaining model accuracy with only 40% of the samples.
3. The spatial and temporal correlations between the responses of the neighboring bridges are well represented in the model. The stronger correlation observed for B1diff, compared with the weaker correlation observed for B2diff, suggests different structural interdependencies, reflecting varying levels of stiffness and load distribution.
4. The successful prediction of bridge tilt behavior shows that the model can be used for real-time SHM. It has the potential to provide early warnings of possible overturning, allowing for timely interventions and enhancing bridge safety.
The methodology can be expanded to incorporate more complex models and additional data sources, including environmental factors, traffic load variations, and datasets from various geographical locations and bridge configurations. Incorporating temperature, wind speed, and traffic loads will provide a more comprehensive understanding of how these variables impact bridge behavior. Further refinement of AL techniques will continue to improve model adaptability.
Disclosure
The conclusions and opinions in this paper are those of the authors and do not necessarily reflect those of the bridge operator or the inspection company.
Conflicts of Interest
The authors declare no conflicts of interest.
Author Contributions
Ru An: software, data curation, investigation, and project administration. Mengjin Sun: formal analysis, methodology, and writing – original draft. You Dong: writing – review and editing, and supervision. Lu Guo: resources and software. Lei Jia: validation, writing – review and editing, visualization, and funding acquisition. Xiaoming Lei: conceptualization, methodology, software, writing – original draft, and visualization. Ru An and Mengjin Sun, the first two authors, contributed equally to this work.
Funding
The authors would like to acknowledge financial support from the National Key R&D Program of China (2022YFC3801100 and 2022YFC3801102), Shenzhen Science and Technology Program (KJZD20240903102742055), Natural Science Foundation of Shanghai (24ZR1460700), Shanghai Research Institute of Building Sciences Co. Ltd. (KY10000038.20230043), and the Hong Kong Polytechnic University (P0054456).
Open Research
Data Availability Statement
All data included in this study are available from the corresponding author upon request.