The use of surrogate models to assist evolutionary algorithms in antenna optimization has achieved significant research outcomes. The construction of a surrogate model depends primarily on two aspects: the selection of the dataset, and the model’s structure and performance. This paper proposes a novel dataset selection method aimed at enhancing the performance of the constructed surrogate model. Additionally, building on the Bayesian neural network (BNN) and leveraging the advantages of long short-term memory (LSTM) in handling sequence data, a BNN-LSTM surrogate model is introduced. After training, this surrogate model is used as the fitness evaluation function, enabling optimization design based on the differential evolution (DE) algorithm. Experimental validations are conducted using the optimizations of a dual-frequency slotted patch antenna and a rectangular cut-corner ultrawideband antenna as examples. Results demonstrate that the proposed surrogate model exhibits high accuracy, providing guidance for antenna optimization.

1. Introduction

During antenna optimization design, global optimization (GO) algorithms typically require full-wave electromagnetic (EM) simulation software to evaluate the performance of the antenna. A large number of full-wave EM simulations is therefore needed during optimization to obtain the optimal design, making the optimization cost unacceptable. To replace computationally expensive full-wave EM simulation software, machine learning (ML) methods have been widely introduced into the field of antenna optimization design. Trained ML-based surrogate models can replace full-wave EM simulation software to predict antenna performance [1]. The prediction time of surrogate models is almost negligible compared to full-wave EM simulation time. The main computational cost of this approach therefore lies in establishing the dataset and training the model, which significantly reduces the time spent calling full-wave EM simulation software during optimization. Nakmouche et al. [2] used artificial neural networks (ANNs) as surrogate models for optimizing dual-band antennas with H-slot defect grounding structures. Dong, Li, and Wang [3] employed particle swarm optimization (PSO) to enhance the accuracy of the radial basis function neural network (RBFNN) in antenna modeling. They combined the optimized RBFNN with multiobjective evolutionary algorithms for the multiobjective optimization design of antennas with multiparameter structures. Gao, Tian, and Chen [4] proposed a semisupervised collaborative training algorithm based on Gaussian processes (GPs) and support vector machines (SVMs) for antenna modeling. They validated the proposed algorithm’s effectiveness in replacing time-consuming EM simulation software through optimizations of benchmark functions and Yagi microstrip antennas. Koziel and Ogurtsov [5] used space mapping as the optimization engine and response surface approximation as the surrogate model to optimize antennas. Additionally, ML methods such as Kriging models [6], polynomial regression [7], and decision tree regression [8] are also used as surrogate models to assist in antenna optimization design.

The long short-term memory (LSTM) network is a specialized type of recurrent neural network (RNN) that exhibits clear advantages in handling time series data. A time series is a sequence of data points or observations arranged in chronological order. In time series analysis, time typically serves as the independent variable, while the observations serve as the dependent variable. The chronological order of such data is crucial for analysis because it reflects potential temporal correlations and trends among data points. LSTM networks have been widely applied in various domains such as stock market price prediction [9], forecasting of oil production [10], and natural language processing [11]. In the field of antenna optimization, LSTM can also leverage its advantages because the dataset used to construct surrogate models can be viewed as time series data, with frequency playing the role of time and the labels arranged in order of frequency.

Bayesian neural networks (BNNs) differ from traditional deterministic ANNs such as the multilayer perceptron (MLP) by incorporating Bayesian statistical concepts and methods into the modeling process. They introduce Bayesian inference for prediction, providing not only predicted label values but also estimates of the uncertainty associated with the predictions. Liu et al. [12] explored the advantages of BNNs over GPs and applied them in combination with differential evolution (DE) algorithms to antenna optimization design, demonstrating the effectiveness of BNNs as surrogate models. This paper combines BNN with LSTM to form a Bayesian neural network-long short-term memory (BNN-LSTM) surrogate. The combination imbues the surrogate model with model uncertainty, thereby integrating the advantages of LSTM in time series prediction with the BNN’s ability to provide statistically grounded estimates of prediction uncertainty.

It is well known that the selection of the dataset is crucial for constructing surrogate models. For a larger parameter optimization range, it is necessary to sample more parameter combinations in the optimization space to ensure the fitting effectiveness of the surrogate model, which increases the cost of dataset construction. Conversely, a smaller parameter optimization range may cause optimization algorithms to become trapped in local optima. Latin hypercube sampling (LHS) [13, 14] is widely used for initial dataset selection. However, although LHS can cover the parameter optimization space comprehensively, achieving precise predictions throughout the entire parameter space still requires a large number of parameter combinations, thereby increasing computational costs. In the field of antenna optimization design, due to spatial constraints, the optimal parameter space is typically not large. Therefore, our goal is to identify this space and ensure that the surrogate exhibits good modeling performance within it.

This paper introduces a novel surrogate model for antenna optimization that combines the LSTM network and the BNN. It also proposes a new method for generating datasets aimed at reducing the optimization space for antenna parameters. Within this reduced space, the surrogate is constructed to enhance the accuracy of antenna modeling. The remainder of this paper is organized as follows. Section 2 provides a brief overview of related works. Section 3 details the proposed methods. Section 4 demonstrates the effectiveness of the method through antenna modeling and optimization cases. Finally, Section 5 concludes the paper and outlines future research directions.

2. Related Works

2.1. LSTM

RNNs are a type of ANN specifically designed to handle sequential data. Unlike traditional feedforward ANNs, RNNs possess the ability to remember sequential data [15]. LSTM is a variant of the RNN initially designed to address the difficulty that RNNs have in learning long-term dependencies [16]. Compared to traditional RNNs, LSTM introduces three gates (the input gate, forget gate, and output gate) along with a cell state. These structures enable LSTM to better handle long-term dependencies within sequences. The structure is illustrated in Figure 1, where Circle 1 represents the forget gate, Circle 2 the input gate, Circle 3 the output gate, and Circle 4 the cell state update process.

Figure 1. Unit structure diagram of LSTM network.

The forget gate determines which information from the previous cell state to discard based on the previous output and the current input. It first integrates the previous output and the current input into a vector and then passes this vector through a Sigmoid neural layer to compress values between 0 and 1. Finally, it element-wise multiplies the result with the previous cell state, thereby selectively forgetting some information from the previous cell state. The formula for the forget gate is as follows:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{1}$$

Here, f_t represents the output of the forget gate, σ denotes the Sigmoid function, W_f, b_f are, respectively, the weight matrix and bias term of the forget gate, and [h_t−1, x_t] denotes the concatenation of the previous cell output and the current input.

The input gate controls which data to incorporate into the cell state. It first uses the Tanh function to extract information from the current data, generating a candidate cell state. Then, it applies the Sigmoid function to selectively add the extracted information into the cell state. The formula for the input gate is as follows:

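$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{2}$$

$$\tilde{C}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right) \tag{3}$$

Here, i_t represents the output of the input gate, W_i, b_i are the weight matrix and bias of the input gate, respectively, C̃_t is the candidate cell state, and W_c, b_c are the weight matrix and bias of the candidate cell state, respectively.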

The current cell state is obtained by adding the previously forgotten cell state to the currently selectively extracted candidate cell state, which is expressed by the formula:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4}$$

where C_t represents the current cell state.

The output gate computes the current output value. Its operation is similar to the input gate, with the difference being that the current cell state passes through a Tanh layer.

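$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{5}$$

$$h_t = o_t \odot \tanh\left(C_t\right) \tag{6}$$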

Here, o_t represents the output of the output gate, W_o and b_o are, respectively, the weight matrix and bias of the output gate, and h_t is the output of the LSTM unit.

By introducing three gates and a cell state, LSTM effectively addresses the vanishing-gradient and long-term dependency issues of traditional RNNs, thereby providing significant advantages and practical value in sequence modeling and prediction tasks.
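The gate operations described above can be sketched as a single forward step of an LSTM unit. This is a minimal NumPy illustration, not the implementation used in this paper: the helper names (`lstm_step`, `init_params`), the random initialization, and the toy dimensions are assumptions made only for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of an LSTM unit (forget, input, cell update, output)."""
    z = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])   # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde                     # cell state update
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # output gate
    h_t = o_t * np.tanh(c_t)                               # unit output
    return h_t, c_t

def init_params(input_dim, hidden_dim, rng, scale=0.1):
    """Hypothetical initializer: small random weights, zero biases."""
    params = {}
    for gate in ("f", "i", "c", "o"):
        params[f"W_{gate}"] = rng.normal(0.0, scale, (hidden_dim, hidden_dim + input_dim))
        params[f"b_{gate}"] = np.zeros(hidden_dim)
    return params

rng = np.random.default_rng(0)
params = init_params(input_dim=3, hidden_dim=4, rng=rng)
h, c = np.zeros(4), np.zeros(4)
for x_t in rng.normal(size=(5, 3)):   # a toy sequence of 5 steps
    h, c = lstm_step(x_t, h, c, params)
```

Because the output is the product of a Sigmoid gate and a Tanh of the cell state, every component of `h` stays strictly between −1 and 1.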

2.2. BNNs

BNNs, in essence, can be understood as regularizing ANNs by introducing uncertainty into the weights, akin to integrating predictions over an infinite set of ANNs sampled from a certain weight distribution. Traditional ANNs have fixed weights after training. In contrast, BNNs consider weights to follow a Gaussian distribution with mean μ and variance δ, and each weight follows a distinct Gaussian distribution [17], as illustrated in Figure 2.

During prediction, a BNN samples a weight value from each Gaussian distribution, and each sampled network is equivalent to a traditional ANN. After multiple sampling predictions and averaging, the predicted value of the BNN is obtained, which means that a BNN is also equivalent to an ensemble model.

BNNs seek the posterior distribution of parameters P(w|D). According to Bayesian theory, we have

$$P(w|D) = \frac{P(D|w)\,P(w)}{P(D)} = \frac{P(D|w)\,P(w)}{\int P(D|w)\,P(w)\,dw} \tag{7}$$

Here, P(w) denotes the prior distribution of parameters, P(D|w) represents the likelihood function, and P(D) is referred to as the evidence. However, the evidence P(D) = ∫P(D|w)P(w)dw is an intractable multiple integral. Therefore, variational inference is introduced into BNNs to compute the posterior distribution [18]. The main idea of variational inference is to approximate the true posterior distribution P(w|D) using a distribution q(w|θ) controlled by variational parameters θ. This transforms the problem of computing the posterior distribution of model parameters into an optimization problem of finding the optimal variational parameters. The optimal variational parameters are found by minimizing the Kullback–Leibler (KL) divergence between q(w|θ) and P(w|D). The loss function of a BNN is defined as

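$$\mathcal{L}(D, \theta) = \mathrm{KL}\left[q(w|\theta)\,\|\,P(w)\right] - \mathbb{E}_{q(w|\theta)}\left[\log P(D|w)\right] = -\mathrm{ELBO} \tag{8}$$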

where ELBO means evidence lower bound objective. The first term after the equality sign represents the KL divergence between the variational posterior of the model parameters and the prior, known as the complexity cost. The value of the second term depends on the training data and is referred to as the likelihood cost. Once trained, the BNN can be used for prediction.
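The sampling-and-averaging view of BNN prediction and the complexity cost can be illustrated with a single Bayesian linear layer. This is a hedged sketch: the function names, the standard normal prior, the number of samples, and the toy inputs are assumptions for the example, and the closed-form KL expression applies only when both the variational posterior and the prior are factorized Gaussians.

```python
import numpy as np

def sample_weights(mu, sigma, rng):
    """Draw one weight sample w ~ N(mu, sigma^2), independently per weight."""
    return mu + sigma * rng.standard_normal(np.shape(mu))

def bnn_linear_predict(x, mu_w, sigma_w, mu_b, sigma_b, n_samples=200, seed=0):
    """Monte Carlo prediction: each weight sample acts as one deterministic network."""
    rng = np.random.default_rng(seed)
    preds = np.array([
        x @ sample_weights(mu_w, sigma_w, rng) + sample_weights(mu_b, sigma_b, rng)
        for _ in range(n_samples)
    ])
    # Mean is the ensemble prediction; standard deviation is the uncertainty estimate.
    return preds.mean(axis=0), preds.std(axis=0)

def kl_to_prior(mu_q, sigma_q, mu_p=0.0, sigma_p=1.0):
    """Complexity cost KL[N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2)], summed over weights."""
    mu_q, sigma_q = np.asarray(mu_q, float), np.asarray(sigma_q, float)
    return float(np.sum(np.log(sigma_p / sigma_q)
                        + (sigma_q**2 + (mu_q - mu_p)**2) / (2.0 * sigma_p**2) - 0.5))

x = np.array([[1.0, 2.0, 3.0], [0.5, 0.0, -1.0]])   # two toy inputs, three features
mu_w, sigma_w = np.array([0.2, -0.1, 0.4]), np.full(3, 0.05)
mean, std = bnn_linear_predict(x, mu_w, sigma_w, mu_b=0.1, sigma_b=0.05)
complexity_cost = kl_to_prior(mu_w, sigma_w)
```

Setting every standard deviation to zero collapses the layer to a traditional deterministic one, which is the sense in which each sample behaves like an ordinary ANN.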

2.3. DE Algorithm

The DE algorithm is a stochastic heuristic search algorithm based on population differences. It was proposed by Storn and Price [19] in 1997 to solve Chebyshev polynomial problems. Due to its simplicity, robustness, and fast convergence [20], it has been widely applied in various fields. The DE algorithm performs mutation, crossover, and selection operations based on the differential vector between parent individuals. The basic idea is to start from a randomly generated initial population and create a new individual by adding the weighted difference vector of two individuals in the population to a third individual according to certain rules. The new individual is then compared with a predetermined individual from the current population. If the fitness value of the new individual is better than that of the compared individual, the new individual replaces the old one in the next generation; otherwise, the old individual is retained. Through continuous iterative computation, the DE algorithm preserves good individuals, eliminates poor individuals, and guides the search process towards the optimal solution.

DE algorithm uses two different vectors from the population to perturb an existing vector, performing differential operation to achieve mutation:

$$v_i(g) = x_{r_1}(g) + F \cdot \left(x_{r_2}(g) - x_{r_3}(g)\right) \tag{9}$$

In the equation, x_r1, x_r2, and x_r3 are three randomly selected distinct individuals from the current g_th generation population, and they should not be the same as the target individual, that is, i ≠ r₁ ≠ r₂ ≠ r₃. v_i(g) represents the i_th mutated individual generated in g_th generation, where F is the mutation factor.

The crossover operation is aimed at increasing the diversity of the population, as follows:

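$$u_{ij}(g) = \begin{cases} v_{ij}(g), & \text{if } \mathrm{rand}(0, 1) \le CR \text{ or } j = j_{\mathrm{rand}} \\ x_{ij}(g), & \text{otherwise} \end{cases} \tag{10}$$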

Here, u_ij(g) represents the j_th component of the i_th crossover individual generated in g_th generation, CR ∈ [0, 1] is the crossover factor, and j_rand is a random integer from [1, 2, ⋯, D], where D is the number of parameters being optimized. The index j_rand ensures that at least one component of the trial individual after crossover is provided by the mutated individual.

DE algorithm uses a greedy method to select offspring from both the target individual and the trial individual based on their fitness values.

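$$x_i(g+1) = \begin{cases} u_i(g), & \text{if } f\left(u_i(g)\right) \le f\left(x_i(g)\right) \\ x_i(g), & \text{otherwise} \end{cases} \tag{11}$$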

where x_i(g + 1) represents the offspring selected for the next generation and f(∗) denotes the fitness function of the optimization problem. Through the aforementioned mutation, crossover, and selection operations, the population evolves to the next generation and iterates repeatedly until the algorithm reaches the predefined maximum number of iterations or the population’s optimal solution meets the specified error precision. This paper adopts the DE/current-to-best/1 mutation strategy, meaning in Formula (9), x_r1 represents the best individual of g_th generation.
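The mutation, crossover, and selection operations described in this section can be sketched as a compact minimizer. This is an illustrative NumPy sketch under stated assumptions rather than the code used in this paper: the function name, the default values of F and CR, the clipping of mutants to the bounds, and the sphere test function are all choices made for the example. The `best_as_base` flag replaces the first random individual with the current best one, as described for Formula (9) above.

```python
import numpy as np

def differential_evolution(fitness, bounds, pop_size=30, F=0.5, CR=0.9,
                           max_gen=150, best_as_base=False, seed=0):
    """Minimize `fitness` inside box `bounds` using mutation, crossover, selection."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    lower, upper = bounds[:, 0], bounds[:, 1]
    dim = lower.size
    pop = lower + rng.random((pop_size, dim)) * (upper - lower)
    fit = np.array([fitness(x) for x in pop])

    for _ in range(max_gen):
        best = pop[np.argmin(fit)]
        new_pop, new_fit = pop.copy(), fit.copy()
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            others = [k for k in range(pop_size) if k != i]
            r1, r2, r3 = rng.choice(others, size=3, replace=False)
            base = best if best_as_base else pop[r1]
            v = base + F * (pop[r2] - pop[r3])           # mutation
            v = np.clip(v, lower, upper)                 # assumption: keep mutants in bounds
            mask = rng.random(dim) <= CR
            mask[rng.integers(dim)] = True               # j_rand: at least one mutant component
            u = np.where(mask, v, pop[i])                # crossover
            fu = fitness(u)
            if fu <= fit[i]:                             # greedy selection
                new_pop[i], new_fit[i] = u, fu
        pop, fit = new_pop, new_fit

    k = int(np.argmin(fit))
    return pop[k], fit[k]

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

best_x, best_f = differential_evolution(sphere, [(-5, 5)] * 3)
```

In surrogate-assisted optimization, `fitness` would wrap the surrogate prediction instead of an analytic function, so each evaluation costs far less than a full-wave simulation.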

3. The Proposed Method

3.1. Data Selection

To demonstrate the proposed method clearly, we take the modeling of the frequency characteristics of the dual-frequency slotted patch antenna shown in Figure 3 as an example. The design requires the antenna to exhibit dual resonances within the frequency range of 1.5–3 GHz. To comprehensively cover the antenna parameter space, LHS is employed to sample 100 sets of parameters from the multidimensional parameter space. HFSS simulation software is used to obtain the frequency characteristic for each set of parameters. Borrowing the idea of fitness evaluation for offspring selection in the genetic algorithm (GA), an appropriate fitness function is first defined based on the antenna design requirements, and fitness values are calculated from the corresponding frequency characteristic of the antenna. The fitness values of the parameter sets are compared and sorted from smallest to largest (a smaller fitness value indicates better antenna performance in this study), and the parameter sets with fitness values in the top 30% are selected as the new dataset. This dataset is analyzed to determine the maximum and minimum values of each antenna parameter, establishing a new parameter range that narrows the antenna optimization scope and thereby enhances the accuracy of the surrogate model. Once the range of antenna modeling parameters has been reduced, the 30 retained sets are too few to construct the surrogate model. Therefore, an additional 130 sets of samples are obtained using LHS within the new modeling parameter range, and HFSS simulations are conducted to obtain the dataset. The newly obtained dataset is combined with the dataset selected in the first round, for a total of 160 sets of data. Subsequently, the same method is used for the second round of selection, which retains the top 100 sets of antenna parameters.

The primary purpose of the first selection is to reduce the modeling parameter range, while the second selection is aimed not only at further narrowing the antenna optimization scope but also at acquiring sufficient data for training the surrogate model. Eighty samples are randomly selected from these 100 sets as the training set for the antenna surrogate model, while the remaining 20 samples serve as the test set to validate its performance. The process flowchart of the data selection method is shown in Figure 4.
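The two-round selection procedure above can be sketched as follows. This is a hedged illustration only: the basic LHS routine, the helper names, the two-parameter unit space, and in particular `toy_fitness` (a hypothetical stand-in for an HFSS simulation followed by fitness evaluation) are assumptions introduced for the example. The sample counts (100, top 30%, 130, top 100, 80/20 split) follow the text.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng):
    """Basic LHS: one sample per equal-width stratum in every dimension."""
    bounds = np.asarray(bounds, dtype=float)
    dim = bounds.shape[0]
    strata = np.stack([rng.permutation(n_samples) for _ in range(dim)], axis=1)
    unit = (strata + rng.random((n_samples, dim))) / n_samples
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])

def select_best(params, fitness, n_keep):
    """Keep the n_keep parameter sets with the smallest (best) fitness."""
    order = np.argsort(fitness)[:n_keep]
    return params[order], fitness[order]

def shrink_bounds(params):
    """New range per parameter: [min, max] over the retained sets."""
    return np.column_stack([params.min(axis=0), params.max(axis=0)])

def toy_fitness(params):
    """Hypothetical stand-in for HFSS simulation plus fitness evaluation."""
    return np.sum((params - 0.3) ** 2, axis=1)

rng = np.random.default_rng(0)
s0 = np.array([[0.0, 1.0], [0.0, 1.0]])            # illustrative initial space S0

# Round 1: 100 LHS samples in S0, keep the top 30%, derive S1.
x0 = latin_hypercube(100, s0, rng)
x_keep, f_keep = select_best(x0, toy_fitness(x0), n_keep=30)
s1 = shrink_bounds(x_keep)

# Round 2: 130 new LHS samples in S1, merge to 160 sets, keep the top 100, derive S2.
x1 = latin_hypercube(130, s1, rng)
x_all = np.vstack([x_keep, x1])
f_all = np.concatenate([f_keep, toy_fitness(x1)])
x_final, f_final = select_best(x_all, f_all, n_keep=100)
s2 = shrink_bounds(x_final)

# Random 80/20 split of the 100 retained sets.
idx = rng.permutation(len(x_final))
train_idx, test_idx = idx[:80], idx[80:]
```

By construction each new range is contained in the previous one, so the space can only stay the same or shrink from S0 to S1 to S2.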

3.2. The BNN-LSTM Surrogate Model

LSTM networks can retain both long-term and short-term information, offering significant advantages in predicting time series data. In antenna simulation, the frequency characteristic ∣S11∣ of an antenna can be viewed as a time series, where frequency points represent time steps, and the ∣S11∣ values corresponding to these frequency points can be considered as labels. However, using only frequency points as input does not meet the requirements for constructing a surrogate model because different parameter combinations also affect the antenna’s performance. Therefore, in this study, both antenna dimensional parameters and frequency points are used together as input features for the LSTM network to model the antenna’s performance. Thus, at each step, the input consists of the antenna parameters combined with a frequency point, while the output is the corresponding ∣S11∣ value at that frequency point. In this study, the LSTM model employs a two-layer structure, in which the first layer has an input dimension equal to the feature dimension and an output dimension of 64, and the second layer has an input dimension of 64 and an output dimension of 64.

Following the LSTM, a single-layer BNN integrates the features computed by the LSTM layers, mapping them into the sample label space. The BNN layer has an input dimension of 64, which corresponds to the output of the final LSTM layer. The output dimension is 1, representing the predicted ∣S11∣ value of the antenna at the corresponding antenna parameters and frequency point. This forms a hybrid surrogate model of LSTM and BNN, named the BNN-LSTM model in this paper. The predictive uncertainty provided by the BNN layer helps guide the DE algorithm more effectively toward the optimal parameters of the antenna. Figure 5 illustrates the process by which the surrogate model predicts the ∣S11∣ parameters of the antenna.
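The input format described above, in which every step pairs the antenna parameters with one frequency point, can be sketched as below. The function name `build_sequence`, the ordering of features (parameters first, frequency last), and the absence of any normalization are assumptions for illustration; the 1.5–3 GHz sweep with a 10 MHz step and the eight parameter values follow the dual-frequency antenna example in this paper.

```python
import numpy as np

def build_sequence(antenna_params, freqs):
    """Return an array of shape (n_freq, n_params + 1); each row is [params..., f]."""
    antenna_params = np.asarray(antenna_params, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    tiled = np.tile(antenna_params, (freqs.size, 1))   # repeat parameters at every step
    return np.column_stack([tiled, freqs])

# 1.5-3 GHz with a 10 MHz step gives 151 frequency points (time steps).
freqs = np.linspace(1.5, 3.0, 151)

# Eight geometric parameters in mm: W, L, WR, WU, LR, PR, LH, D.
params = [67.41, 57.27, 1.44, 5.19, 7.26, 4.88, 53.00, 4.10]

seq = build_sequence(params, freqs)   # one input sequence for the LSTM
```

Each row of `seq` is one LSTM step with a feature dimension of 9, and the label for that step would be the ∣S11∣ value at the same frequency point.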

3.3. Optimization Using DE Algorithm Based on the Proposed BNN-LSTM Surrogate Model

After the BNN-LSTM surrogate model is trained with the dataset obtained by the data selection method of Section 3.1, the network can predict the antenna performance. The optimization then focuses on the antenna physical parameters within the range covered by the 100 parameter sets obtained in Section 3.1. During the optimization process, the surrogate model is used to predict the antenna’s performance and provide the corresponding prediction uncertainty. The fitness calculation during optimization differs from that used for data selection. Here, the lower confidence bound (LCB) preselection method [21] is used. Formula (12) combines predicted values with predictive uncertainty to balance the exploration and exploitation capabilities of the DE algorithm. In this study, the optimization algorithm employs the DE algorithm with a maximum of 50 evolution generations and a population size of 100 individuals per generation.

$$y_{lcb}(x) = \hat{y}(x) - \omega\,\hat{s}(x) \tag{12}$$

Here, y_lcb(x) represents the predicted LCB, ŷ(x) denotes the predicted mean value, and ω is a constant used to control the confidence level, set to 2 in this study. ŝ(x) represents the prediction uncertainty. Figure 6 illustrates the antenna optimization process that combines the DE algorithm with the trained surrogate model, referred to as the BNN-LSTM-DE (Bayesian neural network-long short-term memory-differential evolution) algorithm in this paper.
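As a small illustration of the LCB computation, the sketch below forms the bound from repeated stochastic predictions of a surrogate. The function name and the choice to estimate the uncertainty as the standard deviation across Monte Carlo samples are assumptions; ω = 2 follows the text.

```python
import numpy as np

def lower_confidence_bound(sampled_preds, omega=2.0):
    """LCB = predicted mean - omega * predicted uncertainty.

    sampled_preds has shape (n_samples, n_points): one row per stochastic
    forward pass of the surrogate (an assumption about how uncertainty is estimated).
    """
    sampled_preds = np.asarray(sampled_preds, dtype=float)
    mean = sampled_preds.mean(axis=0)
    std = sampled_preds.std(axis=0)
    return mean - omega * std

# Two sampled predictions at two points: only the first point is uncertain.
preds = [[1.0, 2.0],
         [3.0, 2.0]]
lcb = lower_confidence_bound(preds)
```

For a minimization problem, a point with large uncertainty receives a lower (more optimistic) value, which encourages the DE algorithm to explore it.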

4. Experiment and Discussion

4.1. Evaluation Metrics

This paper employs mean square error (MSE), mean absolute error (MAE), and R-square (R²) as metrics for evaluating the model performance. MSE and MAE are commonly used metrics for assessing regression models. MSE measures the average of the squared differences between predicted values and true values, while MAE measures the average of the absolute differences. R² measures the goodness of fit of a regression model, representing the proportion of the total variance of the data that is explained by the model.

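$$\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2 \tag{13}$$

$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right| \tag{14}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{m}\left(y_i - \bar{y}\right)^2} \tag{15}$$

Here, m represents the sample size, y_i denotes the true value of the i_th observation, ŷ_i represents the model’s predicted value for the i_th observation, and ȳ denotes the average value of all observations.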
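The three metrics can be computed directly from their definitions. The function name and the toy values below are illustrative assumptions, not data from this paper.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R^2) for paired true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    mae = np.mean(np.abs(err))
    # R^2 = 1 - residual sum of squares / total sum of squares.
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return mse, mae, r2

mse, mae, r2 = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

In this toy case a single prediction is off by 2, so MSE is 1.0, MAE is 0.5, and R² is 0.2, which shows how MSE penalizes one large error more heavily than MAE does.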

4.2. Optimization Design of Dual-Frequency Slotted Patch Antenna

To examine the effectiveness of the BNN-LSTM-DE method based on the data selection method proposed in this paper, Example 1 considers the optimization design of a dual-frequency slotted patch antenna, as shown in Figure 3. It is well known that traditional rectangular patch antennas have a single resonance frequency. To introduce an additional resonance frequency, L-shaped slots and rectangular gaps are used to load the rectangular antenna [22]. When constructing a surrogate model, the physical parameters of the optimized antenna mainly include the dimensions and positions of the slotted gaps, and their corresponding search space is shown in Table 1 as the initial parameter optimization space S₀. The antenna uses Rogers RT/duroid 5880 with a thickness of h = 3.175 mm and a relative dielectric constant of ϵ_r = 2.2 as the substrate, fed by a coaxial probe at feeding positions f_x = 22 mm, f_y = 9 mm.

Table 1. Parameter optimization space of the dual-frequency slotted patch antenna.

Optimized parameters	Initial parameter optimization space S₀ (mm)	First selection: parameter optimization space S₁ (mm)	First selection: reduction percentage (relative to space S₀)	Second selection: parameter optimization space S₂ (mm)	Second selection: reduction percentage (relative to space S₁)
W	[65, 70]	[65.2, 69.94]	5.2%	[65.2, 69.8]	2.9%
L	[55, 70]	[55, 68.65]	9.0%	[55, 68.6]	0.4%
WR	[1, 8]	[1.32, 7.9]	6.0%	[1.32, 7.8]	1.5%
WU	[1, 8]	[1.17, 6.84]	19.0%	[1.20, 6.8]	1.2%
LR	[1, 10]	[1.06, 9.87]	2.1%	[1.06, 9.8]	0.8%
PR	[0, 5]	[0.08, 4.97]	2.2%	[0.15, 4.9]	2.9%
LH	[40, 54]	[40.54, 53.93]	4.4%	[40.6, 53]	7.4%
D	[1, 5]	[1, 4.87]	3.3%	[1.2, 4.86]	5.4%

The antenna is designed to operate at frequencies of 1.9 and 2.4 GHz. In the data selection process, the fitness function is set as

()

where f₁ = 1.85 GHz, f₂ = 1.95 GHz, f₃ = 2.35 GHz, f₄ = 2.45 GHz, and |S₁₁(f)| represents the return loss magnitude of the antenna at frequency point f.

Data collection is conducted using HFSS simulation software, and the sweep range is 1.5–3 GHz with a step size of 10 MHz. This allows for a comprehensive observation of the antenna’s frequency characteristics and their variations within the sweep range.

In Table 1, the initial antenna parameter optimization space is denoted as S₀, and the space is denoted as S₁ after the first selection and S₂ after the second selection. Table 1 shows that, after the first selection, the optimization range of parameter space S₁ decreased by an average of approximately 6.4% compared to the initial parameter space S₀. After the second selection, the optimization range of parameter space S₂ decreased by an average of approximately 2.81% compared to parameter space S₁. The largest reduction in optimization range occurs after the first selection, primarily because the first selection is aimed at finding the optimal space and therefore retains fewer parameter sets, whereas the second selection is mainly intended to gather sufficient effective data for training the model. The data selection process utilizes the selection principle of GA, wherein the fitness function is used to select datasets. After data selection, the remaining data are those with better fitness values. The improvement in population fitness generally accompanies the reduction of the optimization space, thereby enhancing the accuracy of the surrogate model.
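The reduction percentages can be checked with a short calculation, assuming each percentage is defined as the relative decrease in the width of a parameter’s range (an interpretation that reproduces the tabulated values). The ranges below are the S₀ and S₁ columns of Table 1; the function and variable names are illustrative.

```python
# Parameter ranges in mm, copied from the S0 and S1 columns of Table 1.
s0 = {"W": (65, 70), "L": (55, 70), "WR": (1, 8), "WU": (1, 8),
      "LR": (1, 10), "PR": (0, 5), "LH": (40, 54), "D": (1, 5)}
s1 = {"W": (65.2, 69.94), "L": (55, 68.65), "WR": (1.32, 7.9), "WU": (1.17, 6.84),
      "LR": (1.06, 9.87), "PR": (0.08, 4.97), "LH": (40.54, 53.93), "D": (1, 4.87)}

def reduction_percent(old_range, new_range):
    """Relative decrease of the range width, in percent."""
    old_width = old_range[1] - old_range[0]
    new_width = new_range[1] - new_range[0]
    return 100.0 * (1.0 - new_width / old_width)

reductions = {name: reduction_percent(s0[name], s1[name]) for name in s0}
average_reduction = sum(reductions.values()) / len(reductions)
```

Under this definition, the per-parameter values agree with the first-selection column of Table 1 after rounding, and their mean is approximately 6.4%.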

Table 2 presents the modeling performance of the BNN-LSTM model trained using different datasets in this study. To ensure equal computational costs across different models, a total of 230 HFSS simulations is allotted to each model. The difference lies in the method of dataset selection. Model M₀ is trained using 230 samples obtained through LHS within the initial optimization range S₀. For model M₁, 100 samples are first obtained through LHS within the initial optimization range S₀, and the top 30% of samples according to fitness value are selected, resulting in 30 samples. After the first selection, an additional 130 samples are obtained through LHS within the new optimization range S₁, giving 160 samples in total. Model M₂ is trained using the 100 samples with the best fitness values selected from the 160 samples of model M₁. Each model’s training set consists of 80% of its total data, with the remaining 20% reserved for testing. From Table 2, it can be seen that model M₀, trained using the initial dataset, performs the worst. Next is model M₁, trained with one selection and resampling process. Model M₁ shows an improvement of approximately 4.1% in R² and reductions of 17.3% and 13.0% in MSE and MAE, respectively, compared to model M₀. This indicates a significant improvement in model M₁’s performance over model M₀. However, considering the magnitudes of R², MSE, and MAE, model M₁ is still inadequate as a trustworthy surrogate model. The best-performing model is M₂, trained with data that underwent two selection processes. Model M₂ exhibits an increase in R² of 9.5% compared to model M₁, with reductions in MSE and MAE of 51.4% and 25.4%, respectively. Model M₂ shows a significant performance improvement over model M₁, demonstrating the effectiveness of the data selection method.

The large variation in parameters within the larger optimization range S₀ inevitably increases the modeling difficulty, resulting in the poorer performance metrics of model M₀. Although the training dataset of model M₁ is smaller than that of model M₀, the parameter range of model M₁ is reduced, leading to improved performance. The same applies to model M₂.

Table 2. Modeling performance of different optimization ranges and datasets of the dual-frequency slotted patch antenna.

Metrics	Model M₀	M₁ relative to M₀	Model M₁	M₂ relative to M₁	Model M₂
MSE	1.907	17.3%	1.577	51.4%	0.766
R²	0.812	4.1%	0.845	9.5%	0.925
MAE	0.638	13.0%	0.555	25.4%	0.414
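The relative changes listed in Table 2 can be reproduced from the absolute metric values, assuming each percentage is the absolute change divided by the value of the earlier model. The dictionary layout and function name are illustrative.

```python
# (M0, M1, M2) values copied from Table 2.
table2 = {
    "MSE": (1.907, 1.577, 0.766),
    "R2": (0.812, 0.845, 0.925),
    "MAE": (0.638, 0.555, 0.414),
}

def relative_change_percent(old, new):
    """Magnitude of the change from old to new, relative to old, in percent."""
    return 100.0 * abs(new - old) / abs(old)

changes = {
    name: (round(relative_change_percent(m0, m1), 1),
           round(relative_change_percent(m1, m2), 1))
    for name, (m0, m1, m2) in table2.items()
}
```

For MSE and MAE the change is a decrease, while for R² it is an increase; the computed magnitudes match the "relative to" columns of Table 2.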

In Table 3, the proposed BNN-LSTM surrogate model is compared with several commonly used ML surrogate models, including ANN, K-nearest neighbor, and decision tree, all trained on the same dataset as model M₂. As shown in Table 3, the BNN-LSTM model outperforms the other three models in terms of MSE, MAE, and R². This is attributed to LSTM’s advantages in handling sequential data, while the BNN layer acts as an ensemble of multiple models, enhancing predictive performance. This demonstrates the effectiveness of the proposed surrogate model for antenna modeling.

Table 3. Comparison of modeling performance of different surrogate models of the dual-frequency slotted patch antenna.

Model	MSE	MAE	R²
Artificial neural network	2.744	0.660	0.446
K-nearest neighbor	1.984	0.461	0.805
Decision tree	3.452	0.423	0.660
BNN-LSTM	0.766	0.414	0.925

Figure 7 depicts the optimal fitness curve and the population average fitness curve when optimizing antenna parameters using the trained surrogate model M₂ combined with the DE algorithm. Both curves decrease continuously. The decreasing trend of the average fitness curve indicates that the optimization algorithm successfully guides the population towards better solutions, while the decrease in the optimal fitness curve indicates that the algorithm continues to find better solutions during the optimization process. The final flattening of both curves indicates that the optimization algorithm has converged on the optimal solution within the optimization space.

The parameters of the dual-frequency slotted patch antenna optimized using the proposed algorithm are shown in Table 4. Figure 8 illustrates the ∣S11∣ simulated by HFSS software based on the optimal parameters. Figure 9 presents the antenna radiation patterns at 1.9 and 2.4 GHz. From Figure 8, it can be observed that the ∣S11∣ values of the antenna at 1.9 and 2.4 GHz are both less than −10 dB, meeting the design requirements, which demonstrates the effectiveness of the proposed BNN-LSTM-DE algorithm for optimal antenna design.

Table 4. Optimal results of the dual-frequency slotted patch antenna.

Parameters	W	L	WR	WU	LR	PR	LH	D
Value (mm)	67.41	57.27	1.44	5.19	7.26	4.88	53.00	4.10

4.3. Optimization Design of Rectangular Cut-Corner Ultrawideband (UWB) Antenna

Example 2 is the rectangular cut-corner UWB antenna [23]. It is well known that traditional rectangular patch antennas have a relatively narrow operating bandwidth. To increase the bandwidth, two symmetrical rectangular slots are cut at both ends of the patch, as shown in Figure 10. The parameters to be optimized for the antenna are set as [X1, X2, X3, X4, X5], and the corresponding search space is given as the initial parameter optimization space S₀ in Table 5. The antenna uses an FR4 dielectric substrate measuring 31 mm in length, 24 mm in width, and 1 mm in thickness, fed by a microstrip line that is 10 mm long and 2 mm wide, with a ground plane on the back.

Table 5. Parameter optimization space of the rectangular cut-corner UWB antenna.

Optimized parameters	Initial parameter optimization space S₀ (mm)	First selection: parameter optimization space S₁ (mm)	First selection: reduction percentage (relative to space S₀)	Second selection: parameter optimization space S₂ (mm)	Second selection: reduction percentage (relative to space S₁)
X1	[12, 18]	[12, 17.93]	1.1%	[12.08, 17.93]	1.3%
X2	[18, 21]	[18.1, 20.98]	4.0%	[18.11, 20.97]	0.7%
X3	[1, 5]	[1.21, 4.98]	5.8%	[1.31, 4.73]	9.3%
X4	[1, 5]	[1.14, 4.86]	7.0%	[1.17, 4.81]	2.2%
X5	[5, 10]	[7.85, 9.98]	57.4%	[7.89, 9.98]	1.9%

The design goal of the antenna is to achieve ∣S11∣ values less than −10 dB in the frequency range of 3.1–10.6 GHz. The fitness function during the optimization process is set as

()

Here, f₁ = 2 GHz, f₂ = 3.1 GHz, f₃ = 10.6 GHz, and f₄ = 12 GHz; |S₁₁(f)| represents the return loss magnitude of the antenna at frequency f.

The data are also collected using HFSS simulation software, with a frequency range from 2 to 12 GHz and a sweep step size of 100 MHz. In Table 5, the optimization space S₁ is reduced by approximately 15.1% on average compared to the initial optimization space S₀. The optimization space S₂ is reduced by approximately 3.08% on average compared to optimization space S₁. This result further confirms the effectiveness of the data selection method. As shown in Table 6, the model M₀ trained on the initial dataset has the worst performance. The model M₁, trained on the dataset after one selection, performs better. R² of model M₁ improved by approximately 3.3% compared to model M₀, while the MSE and MAE decreased by 5.0% and 15.2%, respectively. This indicates that model M₁ improves on model M₀. The best-performing model is M₂, trained on the dataset after two selection processes. This model achieved a 1.3% increase in R² compared to model M₁, with reductions in MSE and MAE of 6.0% and 1.2%, respectively. These results demonstrate the effectiveness of the proposed method in improving model prediction performance at the same computational cost.

Table 6. Modeling performance of different optimization ranges and datasets of the rectangular cut-corner UWB antenna.

Metrics	Model M₀	Improvement of M₁ over M₀	Model M₁	Improvement of M₂ over M₁	Model M₂
MSE	2.976	5.0%	2.826	6.0%	2.658
R²	0.880	3.3%	0.909	1.3%	0.921
MAE	1.065	15.2%	0.903	1.2%	0.892
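
The relative changes listed in the table above can be recomputed from the metric values themselves. The sketch below assumes that an improvement means a decrease for MSE and MAE and an increase for R²; the helper name `relative_improvement` is illustrative. Since the tabulated metrics are rounded to three decimals, a recomputed percentage can differ from the reported one in the last digit (for example, the MSE change from M₁ to M₂).

```python
# Metric values copied from the table above.
m0 = {"MSE": 2.976, "R2": 0.880, "MAE": 1.065}
m1 = {"MSE": 2.826, "R2": 0.909, "MAE": 0.903}
m2 = {"MSE": 2.658, "R2": 0.921, "MAE": 0.892}


def relative_improvement(prev, new):
    """Relative improvement in percent of `new` over `prev`.

    Error metrics (MSE, MAE) improve when they decrease;
    R2 improves when it increases.
    """
    result = {}
    for name, old_value in prev.items():
        change = 100.0 * (new[name] - old_value) / old_value
        result[name] = change if name == "R2" else -change
    return result


imp01 = relative_improvement(m0, m1)  # M1 relative to M0
imp12 = relative_improvement(m1, m2)  # M2 relative to M1

for label, imp in (("M1 vs M0", imp01), ("M2 vs M1", imp12)):
    print(label, {name: round(value, 1) for name, value in imp.items()})
```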

In Table 7, different surrogate models are used to model the rectangular cut-corner UWB antenna, all trained on the same dataset as model M₂. As shown in Table 7, the BNN-LSTM model achieves the lowest MSE and MAE and the highest R² among the four models. This indicates the effectiveness of the proposed surrogate model for antenna modeling.

Table 7. Comparison of modeling performance of different surrogate models of the rectangular cut-corner UWB antenna.

Model	MSE	MAE	R²
Artificial neural network	4.018	1.247	0.863
K-nearest neighbor	15.037	2.515	0.550
Decision tree	17.118	2.390	0.488
BNN-LSTM	2.658	0.892	0.921
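
For reference, the three metrics compared in the table above can be computed from simulated and predicted responses as sketched below. The standard textbook definitions of MSE, MAE, and R² are assumed here; the function name `regression_metrics` and the four sample values are purely illustrative and are not data from the paper.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R2 for predicted versus reference values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    ss_res = float(np.sum(err ** 2))                         # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))    # total sum of squares
    return {"MSE": mse, "MAE": mae, "R2": 1.0 - ss_res / ss_tot}


# Illustrative |S11| samples in dB (made up for this example).
y_true = [-5.0, -12.0, -20.0, -9.0]
y_pred = [-6.0, -11.0, -18.0, -9.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Lower MSE and MAE and a higher R² (closer to 1) indicate a better fit, which is the sense in which the models in the table are ranked.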

Figure 11 shows the best fitness curve and the average fitness curve of the population during the optimization of the rectangular cut-corner UWB antenna. As seen in Figure 11, both curves gradually decrease and stabilize as the number of generations increases, indicating that the optimization algorithm converges to the best solution found within the optimization space.
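
The behavior of the two curves can be illustrated with a minimal DE loop that records the best and the average fitness of the population in every generation. This is only a sketch under stated assumptions: a simple sphere function (`toy_fitness`) stands in for the surrogate-based fitness, the DE/rand/1/bin variant with in-place replacement is used, and the population size, F, CR, and bounds are arbitrary illustrative choices rather than the settings of the paper.

```python
import numpy as np


def toy_fitness(x):
    """Hypothetical stand-in for the surrogate-based fitness (sphere function)."""
    return float(np.sum(x ** 2))


def differential_evolution(fitness, bounds, pop_size=20, generations=50,
                           F=0.5, CR=0.9, seed=0):
    """Minimal DE/rand/1/bin that logs best and mean fitness per generation."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fit = np.array([fitness(ind) for ind in pop])
    best_curve, mean_curve = [fit.min()], [fit.mean()]

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), low, high)
            # Binomial crossover; at least one gene comes from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep the trial only if it is not worse.
            f_trial = fitness(trial)
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial
        best_curve.append(fit.min())
        mean_curve.append(fit.mean())

    return pop[fit.argmin()], best_curve, mean_curve


best, best_curve, mean_curve = differential_evolution(toy_fitness, [(-5.0, 5.0)] * 5)
print(best_curve[0], best_curve[-1], mean_curve[-1])
```

Because of the greedy selection step, no individual ever gets worse, so both recorded curves are non-increasing, which matches the qualitative shape described for Figure 11.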

The dimensions of the rectangular cut-corner UWB antenna obtained with the proposed algorithm are listed in Table 8. Figure 12 illustrates the |S₁₁| values obtained from HFSS using these parameters. As seen in Figure 12, the −10 dB bandwidth of the antenna fully covers the UWB operating frequency range of 3.1–10.6 GHz. Figure 13 presents the two-dimensional radiation patterns at frequencies of 3, 6, and 9 GHz, where the green curve represents the E-plane and the red curve represents the H-plane. As shown in Figure 13, the antenna radiation pattern in the H-plane is approximately circular, while the E-plane pattern resembles the shape of the digit “8.” Some distortion occurs at higher frequencies, but the patterns still meet the expected requirements. This example further demonstrates the effectiveness of the proposed algorithm for antenna optimization.
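
The bandwidth criterion used above can be checked numerically on a simulated sweep. The sketch below assumes the sweep is available as two arrays (frequency in GHz and |S₁₁| in dB) sampled from 2 to 12 GHz in 100 MHz steps; the function name `covers_band` and the synthetic traces are illustrative and are not the HFSS data of the paper.

```python
import numpy as np


def covers_band(freq_ghz, s11_db, f_low=3.1, f_high=10.6, threshold_db=-10.0):
    """True if |S11| stays below the threshold at every sample inside the band."""
    freq_ghz = np.asarray(freq_ghz, dtype=float)
    s11_db = np.asarray(s11_db, dtype=float)
    # Small tolerance so that grid points at the band edges are included
    # despite floating-point rounding of the frequency grid.
    in_band = (freq_ghz >= f_low - 1e-9) & (freq_ghz <= f_high + 1e-9)
    return bool(np.all(s11_db[in_band] < threshold_db))


# Sweep grid: 2-12 GHz in 100 MHz steps gives 101 points.
freq = np.linspace(2.0, 12.0, 101)

# Synthetic traces for illustration only: one matched across the band,
# one that rises above -10 dB in the upper part of the band.
s11_good = np.where((freq >= 2.95) & (freq <= 10.85), -15.0, -5.0)
s11_bad = np.where((freq >= 2.95) & (freq <= 9.95), -15.0, -8.0)

print(covers_band(freq, s11_good))
print(covers_band(freq, s11_bad))
```

A check of this kind only certifies the sampled frequencies; a narrow resonance between two samples would require a finer sweep to detect.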

Table 8. Optimal results of the rectangular cut-corner UWB antenna.

Parameters	X1	X2	X3	X4	X5
Value (mm)	17.92	20.97	2.98	4.81	9.07

5. Conclusion

This paper integrates an LSTM network with a BNN to form a novel surrogate model for EM optimization. In addition, a new data selection method is proposed to enhance the modeling capability of the surrogate model. The EM optimization is then carried out by the DE algorithm using the trained surrogate model. The proposed algorithm is validated by optimizing a dual-frequency slotted patch antenna and a rectangular cut-corner UWB antenna. Experimental results demonstrate that the data selection method improves the modeling accuracy, and the optimized designs confirm the performance of the proposed algorithm. Future research will address the modeling and optimization of antennas with multiple objectives, including radiation pattern, gain, and reflection characteristics.

Conflicts of Interest

The authors declare no conflicts of interest.

Author Contributions

Jinlong Sun: conceptualization, methodology, software, investigation, formal analysis, writing–original draft; Yubo Tian: funding acquisition, resources, supervision, writing–original draft, writing–review and editing; Zhiwei Zhu: data curation, writing–original draft. All authors agree to be accountable for the content and conclusions of the article.

Funding

This work was supported by the Natural Science Foundation of Guangdong Province of China under Grant No. 2023A1515011272, the Tertiary Education Scientific Research Project of Guangzhou Municipal Education Bureau of China under No. 202234598, the Special Project in Key Fields of Guangdong Universities of China under No. 2022ZDZX1020, and the Engineering Technology Center of Guangdong Province Universities of China under No. 2022GCZX004.

Open Research

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

1 Liu B., Akinsolu M. O., Song C., Hua Q., Excell P., Xu Q., Huang Y., and Imran M. A., An efficient method for complex antenna design based on a self adaptive surrogate model-assisted optimization technique, IEEE Transactions on Antennas and Propagation. (2021) 69, no. 4, 2302–2315, https://doi.org/10.1109/TAP.2021.3051034.
2 Nakmouche M. F., Allam A. M. M. A., Fawzy D. E., Lin D. B., and Sree M. F. A., Development of H-slotted DGS based dual band antenna using ANN for 5G applications, 2021 15th European Conference on Antennas and Propagation (EuCAP), 2021, Dusseldorf, Germany, https://doi.org/10.23919/EuCAP51087.2021.9411213.
3 Dong J., Li Y., and Wang M., Fast multi-objective antenna optimization based on RBF neural network surrogate model optimized by improved PSO algorithm, Applied Sciences. (2019) 9, no. 13, https://doi.org/10.3390/app9132589.
4 Gao J., Tian Y., and Chen X., Antenna optimization based on co-training algorithm of Gaussian process and support vector machine, IEEE Access. (2020) 8, 211380–211390, https://doi.org/10.1109/ACCESS.2020.3039269.
5 Koziel S. and Ogurtsov S., Rapid design optimization of antennas using space mapping and response surface approximation models, International Journal of RF and Microwave Computer-Aided Engineering. (2011) 21, no. 6, 611–621, https://doi.org/10.1002/mmce.20555.
6 Pietrenko-Dabrowska A. and Koziel S., Antenna modeling using variable-fidelity EM simulations and constrained co-kriging, IEEE Access. (2020) 8, 91048–91056, https://doi.org/10.1109/ACCESS.2020.2993951.
7 Easum J. A., Nagar J., Werner P. L., and Werner D. H., Efficient multiobjective antenna optimization with tolerance analysis through the use of surrogate models, IEEE Transactions on Antennas and Propagation. (2018) 66, no. 12, 6706–6715, https://doi.org/10.1109/TAP.2018.2870338.
8 Khan M. R., Zekios C. L., Bhardwaj S., and Georgakopoulos S. V., Performance of random forest algorithm in high-dimensional surrogate modeling of antennas, 2021 IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting (APS/URSI), 2021, Singapore, Singapore, 1445–1446, https://doi.org/10.1109/APS/URSI47566.2021.9703847.
9 Cao J., Li Z., and Li J., Financial time series forecasting model based on CEEMDAN and LSTM, Physica A: Statistical Mechanics and its Applications. (2019) 519, 127–139, https://doi.org/10.1016/j.physa.2018.11.061.
10 Sagheer A. and Kotb M., Time series forecasting of petroleum production using deep LSTM recurrent networks, Neurocomputing. (2019) 323, 203–213, https://doi.org/10.1016/j.neucom.2018.09.082.
11 Nammous M. K. and Saeed K., Natural language processing: speaker, language, and gender identification with LSTM, Advanced Computing and Systems for Security. (2019) 883, 143–156, https://doi.org/10.1007/978-981-13-3702-4_9.
12 Liu Y., Liu B., Ur-Rehman M., Imran M. A., Akinsolu M. O., Excell P., and Hua Q., An efficient method for antenna design based on a self-adaptive Bayesian neural network-assisted global optimization technique, IEEE Transactions on Antennas and Propagation. (2022) 70, no. 12, 11375–11388, https://doi.org/10.1109/TAP.2022.3211732.
13 Tian J., Tan Y., Zeng J., Sun C., and Jin Y., Multiobjective infill criterion driven Gaussian process-assisted particle swarm optimization of high-dimensional expensive problems, IEEE Transactions on Evolutionary Computation. (2019) 23, no. 3, 459–472, https://doi.org/10.1109/TEVC.2018.2869247.
14 Wang H., Jin Y., and Doherty J., Committee-based active learning for surrogate-assisted particle swarm optimization of expensive problems, IEEE Transactions on Cybernetics. (2017) 47, no. 9, 2664–2677, https://doi.org/10.1109/TCYB.2017.2710978.
15 Lipton Z. C., Berkowitz J., and Elkan C., A critical review of recurrent neural networks for sequence learning, 2015, https://arxiv.org/abs/1506.00019.
16 Hochreiter S. and Schmidhuber J., Long short-term memory, Neural Computation. (1997) 9, no. 8, 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735.
17 Blundell C., Cornebise J., Kavukcuoglu K., and Wierstra D., Weight uncertainty in neural networks, 2015, https://arxiv.org/abs/1505.05424.
18 Blei D. M., Kucukelbir A., and McAuliffe J. D., Variational inference: a review for statisticians, Journal of the American Statistical Association. (2017) 112, no. 518, 859–877, https://doi.org/10.1080/01621459.2017.1285773.
19 Storn R. and Price K., Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces, Journal of Global Optimization. (1997) 11, no. 4, 341–359, https://doi.org/10.1023/A:1008202821328.
20 Neri F. and Tirronen V., Recent advances in differential evolution: a survey and experimental analysis, Artificial Intelligence Review. (2010) 33, no. 1–2, 61–106, https://doi.org/10.1007/s10462-009-9137-2.
21 Dennis J. and Torczon V., Managing approximation models in optimization, Multidisciplinary Design Optimization: State-of-the-Art. (1997) 5, 330–347.
22 Aldhafeeri A. and Rahmat-Samii Y., Brain storm optimization for electromagnetic applications: continuous and discrete, IEEE Transactions on Antennas and Propagation. (2019) 67, no. 4, 2710–2722, https://doi.org/10.1109/TAP.2019.2894318.
23 Fan W., Antenna optimization design based on intelligent optimization algorithm and proxy model, 2022, BUPT.
