Integrating solar energy with hydropower plants marks a significant step forward in renewable energy innovation. The combination ensures a consistent power supply by balancing the fluctuations of solar energy with the predictable storage provided by hydropower. This research aims to predict high solar irradiance at hydropower plants to maximize active power generation. A novel hybrid decomposed residual ensembling deep learning model (SBL_TSRA_RW) is used to predict the irradiance; it combines seasonal-trend decomposition using Loess (STL) and the autoregressive integrated moving average (ARIMA) model with a Bidirectional LSTM (Bi-LSTM) predictor and the Whale Optimization Algorithm (WOA). Various forecasting methods, including the STL-Bi-LSTM, SBL_TSA_R, SBL_TA_RS, and SBL_TSRA_R models, are assessed to determine their effectiveness in predicting solar radiation. The results show the accuracy of the proposed model, with RMSE and MAE values of 1.85 W/m² and 1.31 W/m², respectively. The proposed SBL_TSRA_RW model is more accurate than the Bi-LSTM, STL-Bi-LSTM, SBL_TSA_R, SBL_TA_RS, and SBL_TSRA_R models, with RMSE value reductions of 517%, 217%, 151%, 98%, and 1%, respectively.

1. Introduction

Renewable energy is increasingly being used to offset the effects of climate change and global warming. The incorporation of renewable energy sources into existing energy grids is gaining momentum [1]. Future energy supply will most likely rely heavily on solar and wind energy. Solar power is a rapidly developing renewable energy source, but it is unreliable and intermittent, which affects grid management [2]. Solar energy’s volatility and randomness also pose considerable integration challenges [3]. Adopting hybridization and forecasting helps ensure a reliable energy supply from intermittent sources.

The hybridization approach integrates various renewable energy sources, such as hydro with floating solar, hydro with wind, and hydro, solar, and wind together, to provide reliable renewable energy [4]. Hybridization-based research offers a sustainable solution to the world’s growing energy demands by combining the advantages of solar photovoltaic power plants with pumped storage power plants [5]. The efficiency of the combination is significantly enhanced by adopting floating photovoltaic (FPV) technology [6].

Installing floating PV panels on hydropower plants worldwide has the potential to generate over 4.4 GW of electricity with a surface coverage of just 2%, resulting in approximately 6270 TWh of electricity [7]. This type of hybridization is especially important for countries like India, where land for large-scale solar installations is limited [8]. Floating PV panels provide a cost-effective alternative that accelerates project timelines by avoiding complex land acquisition processes and preserving land for agricultural use. Additionally, covering reservoir areas with floating solar panels reduces water evaporation, increases water availability, and enhances hydropower generation. The cooling effect of the water further boosts the efficiency of PV panels installed on reservoirs [9].

The integration of floating PV panels with hydropower increases the flexibility of both energy sources, creating a “virtual battery” that supplies solar power during the day and relies on hydropower during periods of low solar radiation or at night [10]. Additionally, hydropower reservoirs for solar projects provide integrated grid connections and access to existing infrastructure, such as roads. Consequently, countries worldwide are implementing policies to encourage the adoption of hybrid solar systems based on hydropower, aiming to boost renewable energy production and reduce dependence on traditional hydropower alone [11].

Developing a highly dynamic power grid based on accurate solar forecasts offers a cost-effective hybrid technological solution to mitigate solar variability. Integrating the forecasting model presents significant advantages for power grid management and real-time power forecasting [12]. It also aids energy managers in optimizing solar power generation, managing energy storage, and ensuring grid stability through accurate predictions. Surplus energy can be stored or redirected when periods of high solar radiation are predicted, while backup power sources can be activated during low solar radiation periods. These predictive capabilities allow for real-time adjustments in energy flows, preventing grid overloads and ensuring efficient power distribution. As energy systems evolve, embedding this model within smart grids will enhance dynamic decision-making, improve overall grid stability, and support the integration of renewable energy sources, ultimately contributing to a more sustainable and reliable power supply. Therefore, these forecasts are essential for grid regulation, load-responsive production, power planning, and meeting feed-in obligations. As demand for solar energy integration rises, the need grows for diverse solar forecasting methods across various temporal and spatial resolutions, which enable cost-effective PV energy integration [13].

Accurately forecasting irradiance is complicated by clouds and dust, which pose challenges for physical models. As a result, statistical methods such as autoregressive moving averages, support vector machines, and artificial neural networks are commonly used. However, these approaches suffer from low accuracy and limited scalability with large datasets, and they have difficulty capturing long-term dependencies. Deep learning-based models, particularly the multilayer perceptron, long short-term memory, and gated recurrent unit structures, are widely utilized to handle such complex nonlinear problems [14]. Fine-tuning the hyperparameters of these deep learning models can enhance prediction accuracy and is commonly achieved using optimization techniques such as genetic algorithms, whale optimization, and particle swarm optimization.

2. Literature Review

Solar irradiance was predicted using various forecasting models, from traditional methods to cutting-edge hybrid ensemble deep learning approaches [15]. Ghislain et al. [16] introduce a statistical spatiotemporal model to enhance the short-term forecasting accuracy of photovoltaic production. Addressing the nonstationarity issue of production series, they propose a stationarity process that significantly reduces forecasting errors compared to raw inputs. Their time series model, leveraging spatiotemporal dependencies among distributed power plants, offers low computational requirements and improves forecasting performance. Since solar irradiance data is nonstationary due to factors like clouds and seasons, traditional time series models struggle to capture nonlinearities, leading to poor prediction accuracy. Additionally, the large volumes of current data make these models less suitable. As a result, researchers have shifted their focus from traditional models to machine and deep learning approaches. Alzahrani et al. [17] introduced a solar irradiance prediction model employing deep recurrent neural networks LSTM, emphasizing preprocessing, supervised training, and postprocessing stages. The study utilized data from Canadian solar farms to demonstrate the effectiveness of LSTM, yielding a mean RMSE of 0.077 for the test dataset, outperforming other methods. Notably, LSTM’s enhanced data handling and feature extraction capabilities make it a promising tool for accurate solar irradiance prediction, overcoming deficiencies in traditional statistical methods like ARMA, and signaling its suitability for practical applications. Hosseini et al. [18] introduce a GRU-based approach for Direct Normal Irradiance forecasting, offering computational efficiency while maintaining accuracy comparable to LSTMs. They assess univariate and multivariate GRU, optimized using historical irradiance, weather, and cloud cover data from the LRSS solar facility in Colorado. 
Comparative analysis against LSTM models demonstrates the superiority of the proposed multivariate GRU, exhibiting a 34.42% improvement in RMSE and a 41.31% enhancement in MAPE. Additionally, the multivariate GRU showcases efficiency in forecasting multiple time horizons without compromising accuracy, highlighting its potential as a computationally effective solution for irradiance prediction. The research of Liu et al. [19] centered on Bi-directional Long Short-Term Memory (Bi-LSTM) neural networks. The study utilizes an 11-year weather dataset from NASA [20], employing preprocessing techniques such as Automatic Time Series Decomposition and Pearson correlation to enhance data quality by removing noisy values and selecting relevant features. A comparative analysis of Bi-LSTM against LSTM and MLP showed Bi-LSTM to be effective for solar irradiance forecasting. Individual models, however, have limits, driving researchers to explore hybrid models, which combine two or more models to exploit the strengths of each. The forecasting experiments of Babu et al. [21] and Reikard [22] highlight the dominance of ARIMA and the hybrid model (ARIMA + ANN) across different resolutions and datasets. ARIMAs excel in capturing seasonal cycle transitions due to differencing at the 24-h horizon, while struggling with weather variability. Neural networks generally outperform regressions but lag behind ARIMAs, primarily due to training difficulties. The model choice depends heavily on resolution: ARIMA performs better at lower resolutions dominated by seasonal cycles, whereas regressions or neural networks may excel at higher frequencies by capturing short-term patterns. Hence, the combined approach (ARIMA + ANN) outperformed both ARIMA and ANN applied individually. Yinghao et al. [23] developed and applied a real-time re-forecasting ANN-GA optimization approach to predict intra-hour power generation from a 48 MWe PV plant. 
The re-forecasting approach notably improved forecast skills for time horizons of 5, 10, and 15 min across all baseline models, demonstrating its effectiveness in mitigating errors inherent in various forecasting methodologies. Huaizhi et al. [24] introduced a hybrid approach combining WT and DCNN for deterministic PV power forecasting. WT decomposes the signal into frequency series, enhancing its outlines and behaviors, while DCNN extracts nonlinear features from each frequency. Additionally, they propose a probabilistic PV power forecasting model integrating deterministic methods with spline quantile regression to assess probabilistic information in PV power data. Application of these methods to PV data from Belgian PV farms shows improved forecasting accuracy across seasons and prediction horizons compared to traditional models. This makes the model suitable for processing real-time data in power plant operations. Cunha et al. [25] focus on improving the accuracy of the hybrid forecasting model by combining STL with SES and ARMA models. STL handles high-frequency time series data, while SES and ARMA models fit the trend and remainder components, respectively. Applied to wind speed forecasts, the hybrid model achieved a significant 101.39% decrease in MAPE compared to ECS forecasts. Using Dense, LSTM, Conv1D, and STL, Njogho et al. [26] evaluate seven ensemble models with hydro data from the Lom Pangar reservoir. The results underline the superiority of the multivariate STL-dense model and highlight the importance of incorporating multiple inputs and models for improved prediction accuracy. Jun et al. [27] address the problem of managing air pollution with a large and dynamic dataset through accurate prediction of pollutant concentrations. 
The proposed ARIMA-Whale Optimization Algorithm (WOA)-LSTM model combines ARIMA for linear data extraction and WOA-LSTM for nonlinear prediction, where the LSTM hyperparameters are optimized using the Whale algorithm. A comparative analysis with other models shows the superior performance of ARIMA-WOA-LSTM in pollutant prediction accuracy, overall model accuracy, and prediction stability. Moreover, the combined model outperforms the individual models in these aspects, underlining the effectiveness of the proposed approach for air pollution management. Hang et al. [28] introduce a novel method for predicting aviation failure events by integrating seasonal-trend decomposition using Loess (STL) with a hybrid model comprising a transformer and autoregressive integrated moving average (ARIMA). STL decomposition isolates trend, seasonal, and remainder components, enhancing understanding of event characteristics. The transformer handles trend prediction, addressing computational efficiency issues, while ARIMA manages seasonal and remainder components, reducing complexity without sacrificing accuracy. Evaluation using ASRS data shows that the STL-Transformer-ARIMA model outperforms single models, demonstrating superior accuracy and robustness in predicting aviation failure events.

The literature thus points to hybrid ensemble deep learning models, which integrate the strengths of diverse forecasting techniques to enhance prediction accuracy and robustness. Specifically, the STL model captures the trend, seasonal, and residual components with high interpretability. Statistics-based predictors are easy to use and allow for the quantification of each component’s effect. While they perform well with independent samples, these approaches often introduce significant uncertainty when dealing with complex, nonlinear relationships. Deep learning predictors excel at modeling such complex nonlinearities by learning from data samples and self-optimizing based on newly labeled data, which improves the understanding of solar irradiance trends and enhances forecast confidence. Additionally, a Bi-LSTM-based deep learning model is used to forecast the trend component, while the residual components are predicted using a novel multihead attention mechanism and parallel processing, which significantly improves accuracy for long sequence datasets. Hence, this research follows a sequence of decomposition using STL, remainder (residual) decomposition using ARIMA, prediction using Bi-LSTM, optimization using the WOA, and recomposition of the component results to improve the solar irradiance estimations.

The objectives of the proposed work are:

1. To develop a highly accurate hybrid ensemble intelligent deep learning forecasting model through the decomposed residual ensemble technique.
2. To predict the high solar irradiance on hydropower houses to achieve high active power by reducing the error.
3. To design the feasible hybrid model that supports the installation of hybrid hydro solar powerhouses to enhance reliability in hybrid green power generation with FPV advantages.

3. Theories and Methods

The individual models incorporated into the proposed hybrid ensemble model are detailed below: (i) STL, (ii) ARIMA, (iii) Bi-LSTM, and (iv) whale optimization.

3.1. STL

STL is a widely adopted algorithm for time series analysis, valued for its efficacy in decomposing a time series into its constituent components. The STL decomposition technique, introduced by R. B. Cleveland et al. in 1990, is a highly versatile and robust approach [29].

$$Y_t = T_t + S_t + R_t$$

This method partitions the time series $Y_t$ into three components: the remainder term $R_t$, representing irregular variations; the trend term $T_t$, capturing low-frequency patterns; and the seasonal term $S_t$, reflecting high-frequency variations.
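As a rough illustration of the additive split into trend, seasonal, and remainder terms, the following minimal sketch uses a classical moving-average decomposition rather than the Loess-based STL procedure itself. The function name `additive_decompose`, the period of 4, and the toy series are assumptions made purely for illustration.

```python
def additive_decompose(series, period):
    """Classical additive decomposition (NOT Loess-based STL): y = T + S + R."""
    n = len(series)
    half = period // 2
    # Centred moving average as a simple trend estimate (edges left as None).
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        if period % 2 == 0:
            # 2 x period moving average: half weight on the two end points.
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        trend[i] = total / period
    # Seasonal term: mean detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(series[i] - trend[i])
    means = [sum(b) / len(b) for b in buckets]
    centre = sum(means) / period
    seasonal = [means[i % period] - centre for i in range(n)]
    # Remainder: whatever the trend and seasonal terms do not explain.
    remainder = [None if trend[i] is None else series[i] - trend[i] - seasonal[i]
                 for i in range(n)]
    return trend, seasonal, remainder


# Toy series: linear trend plus a repeating 4-step pattern (illustrative values).
pattern = [2.0, -1.0, -2.0, 1.0]
y = [0.5 * t + pattern[t % 4] for t in range(24)]
trend, seasonal, remainder = additive_decompose(y, period=4)
```

For this noise-free toy series the recovered trend is the underlying line and the remainder is essentially zero; with real irradiance data, a Loess-based STL implementation from a statistical package would normally be used instead.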

3.2. ARIMA Model

The model comprises three components: Autoregressive (AR), Integrated (I), and Moving Average (MA) parts. It is essential for the considered time series data to be stationary to apply the ARIMA model [30].

()

The Augmented Dickey-Fuller (ADF) test is used to check the stationarity of the data before the q, s, and d values are selected. The AR order (q) and MA order (s) were determined using the partial autocorrelation function (PACF) and the autocorrelation function (ACF), respectively. The ‘d’ value is determined by differencing the data until it becomes stationary [31].
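The role of differencing and of the autocorrelation function in order selection can be illustrated with a small sketch. It uses only the Python standard library; the helper names `difference` and `acf` and the toy series are assumptions, and a formal ADF test (available in statistical packages) is not reproduced here.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(v * v for v in dev)
    return [sum(dev[i] * dev[i - k] for i in range(k, n)) / var
            for k in range(max_lag + 1)]


# Toy non-stationary series: a linear trend plus an alternating component.
y = [2.0 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(40)]
dy = difference(y, d=1)   # one difference removes the linear trend
rho = acf(dy, max_lag=2)  # strong negative lag-1 and positive lag-2 correlation
```

In practice, the lags at which the ACF and PACF cut off suggest candidate MA and AR orders, which are then compared using information criteria.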

3.3. Bi-LSTM Model

The rapid expansion of deep learning has been accompanied by new technologies and architectures, leading to increased integration in various research domains. LSTM structures, a popular model in deep learning for time series prediction, have been employed in the advanced development of solar forecasting techniques presented in Figure 1. Addressing forecasting challenges, such as long-term trends, seasonal fluctuations, and random noise, LSTMs excel in modeling temporal patterns within data sequences. LSTMs distinguish themselves through memory cells and information transfer mechanisms between units. This unique capability enables them to process operational data sequences, extracting temporal information crucial for minimizing forecast errors.

Figure 1. Schematic diagram for LSTM cell.

The Bi-LSTM network comprises multiple LSTM cells operating bi-directionally as in Equations (3)–(9). The forward layer processes information from the previous and current blocks, while the backward layer, with a reverse direction, utilizes information from future blocks.
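The bidirectional idea can be sketched with a deliberately tiny, scalar LSTM cell. The code below follows the standard LSTM gate formulation (forget, input, and output gates plus a candidate state), whose notation may differ from Equations (3)–(9); the weight values, key names, and helper functions are hypothetical and untrained.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One scalar LSTM cell step with forget (f), input (i), output (o) gates."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate state
    c = f * c_prev + i * g       # updated cell (memory) state
    h = o * math.tanh(c)         # updated hidden state
    return h, c


def run_lstm(sequence, w):
    """Run the cell over a sequence and collect the hidden states."""
    h, c, outputs = 0.0, 0.0, []
    for x in sequence:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs


def run_bilstm(sequence, w_fwd, w_bwd):
    """Pair forward-pass and backward-pass hidden states at each time step."""
    forward = run_lstm(sequence, w_fwd)
    backward = run_lstm(sequence[::-1], w_bwd)[::-1]
    return list(zip(forward, backward))


# Hypothetical, untrained weights chosen only to make the sketch runnable.
GATE_KEYS = ["wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg"]
w_fwd = dict.fromkeys(GATE_KEYS, 0.5)
w_bwd = dict.fromkeys(GATE_KEYS, -0.3)
seq = [0.2, 0.5, 0.9, 0.4]
states = run_bilstm(seq, w_fwd, w_bwd)
```

Each element of `states` holds one hidden state that has seen only earlier inputs and one that has seen only later inputs; a full Bi-LSTM layer combines the two before the output layer.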

3.4. WOA

The WOA draws inspiration from the foraging behavior of humpback whales. The method operates in three steps and assumes that the current best candidate solution is either the target prey or close to the optimal solution [32]. Once the best search agent is identified, the remaining agents update their positions to converge toward it [33].

Step 1: Mathematical Modelling
()

()
In this equation, “t” represents the current iteration, while “J” and “K” denote coefficient vectors. “Sp” signifies the position vector of the prey, and “S” indicates the position vector of a whale. S^∗ is updated whenever a better solution is found in an iteration. The vectors J and Z are then computed using the equations
()

()
The range of fluctuation for variable Z is also reduced by z. In essence, Z is a random number within the interval [−z, z], where the value of z decreases from 2 to 0 over successive iterations.
Step 2: Exploitation Phase
Two strategies are used to mathematically emulate the bubble-net behavior of humpback whales:
1. Shrinking mechanism
The mathematical representation of the behavior of humpback whales encircling prey during the hunt is as follows:
()

()
2. Spiral updating position
Initially, the process involves calculating the distance between the whale at position (S, T) and the prey at position (S^∗, T^∗). Subsequently, a spiral equation is formulated between the whale’s position and that of its prey to replicate the helix-shaped movement observed in humpback whales [34].
()
In the equation, K^′ denotes the distance between the i-th whale and the prey, c is a constant determining the shape of the logarithmic spiral, and n is a random number in the range [−1, 1].
()
The humpback whales exhibit simultaneous movement patterns, swimming in both a diminishing circle and spiral around their prey. Apart from the bubble-net technique, humpback whales also engage in random prey searching. The mathematical representation of this search process is as follows.
Step 3: Search for prey
The search for prey (exploration) follows a similar approach based on variation of the A vector. Humpback whales naturally engage in random search behaviors depending on their respective locations. An ‘A’ value higher than one or lower than minus one is employed to force search agents to move away from a designated reference whale. During the exploration phase, the position of a search agent is updated based on a randomly chosen search agent, rather than the best one discovered thus far as in the exploitation phase [35]. By employing this method alongside |A| > 1, which promotes exploration, the WOA effectively conducts global searches. The mathematical representation is as follows:
()

()
where S_rand represents a randomly selected position vector, corresponding to a random whale chosen from the current population. The conditions that activate the method to search are as follows:
()
Both the global search capability of WOA and its susceptibility to local optima stem from the random nature of the variable “p” and from the fact that a global search only occurs when “p” is less than 0.5.
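The three steps can be sketched in a compact, standard-library implementation. The sketch uses the conventional notation of the WOA literature (position X, coefficient values A and C, control parameter a decreasing from 2 to 0, spiral constant b, random numbers l and p), which may not map one-to-one onto the symbols used above. The function name `woa_minimize`, the sphere test function, the per-dimension |A| check, and all parameter values are assumptions for illustration.

```python
import math
import random


def woa_minimize(objective, dim, bounds, n_agents=30, n_iter=200, b=1.0, seed=0):
    """Minimal Whale Optimization Algorithm sketch for minimization."""
    rng = random.Random(seed)
    lo, hi = bounds
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_f = objective(best)
    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter      # decreases linearly from 2 towards 0
        for k, x in enumerate(agents):
            p = rng.random()
            l = rng.uniform(-1.0, 1.0)
            partner = agents[rng.randrange(n_agents)]  # used only when exploring
            new = []
            for j in range(dim):
                A = 2.0 * a * rng.random() - a          # lies in [-a, a]
                C = 2.0 * rng.random()
                if p < 0.5:
                    # |A| < 1: shrink towards the best agent (exploitation);
                    # otherwise move relative to a random agent (exploration).
                    ref = best[j] if abs(A) < 1.0 else partner[j]
                    value = ref - A * abs(C * ref - x[j])
                else:
                    # Spiral (bubble-net) move around the best agent.
                    dist = abs(best[j] - x[j])
                    value = dist * math.exp(b * l) * math.cos(2 * math.pi * l) + best[j]
                new.append(min(hi, max(lo, value)))     # keep inside the bounds
            agents[k] = new
            f = objective(new)
            if f < best_f:
                best, best_f = new[:], f
    return best, best_f


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_f = woa_minimize(sphere, dim=2, bounds=(-10.0, 10.0))
```

The sphere function is used only because its minimum is known; in the forecasting setting, the objective would instead be a validation error of the model being tuned.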

4. Proposed Hybrid Ensemble Forecasting Methodology

The proposed hybrid ensemble approach is explained in detail through a step-by-step process, illustrated in Figure 2, with the workflow depicted in Figure 3(b). The workflow, outlined through subsections, meticulously elaborates each step denoted as S1, S2, … S7 in Figure 3(b), offering a comprehensive understanding of the process.

4.1. Data Collection and Preprocessing

4.1.1. Data Collection

A probabilistic analysis is performed for 252 large-scale hydropower locations listed by the Ministry of Energy, India, using 40 years of day-ahead solar irradiance data from the NASA POWER database, with the latitude and longitude of the selected locations taken from Global Energy Monitor. Based on this analysis, twenty-five hydropower houses are selected [36].

The results of the probabilistic analysis identify the following twenty-five hydropower plants as the most suitable locations for implementing optimal PV-hydro hybrid systems: Nagarjuna Sagar (AP), Pulichinthala (TG), Kodaya (TN), Idukki (KL), Kalinadi (KT), Rihand (UP), Sewa (JK), Tangnu Romai (HP), Khatima (UD), Jawahar Sagar (RJ), Mukerian (PB), Ukai (GT), Tillari (MH), Omkareshwar (MP), Hasdeo Bango (CG), Panchet (JD), Maithon (WB), Teesta (SM), Chiplima (OS), Ranganadi (AR), Kopili (AS), New Umtru (MG), Loktak (MR), Tuirial (MZ), and Doyang (NG). The identified hydropower plant locations are shown in Figure 3(a). The average solar irradiance at these hydropower locations is shown in Figure 4.

4.1.2. Data Preprocessing

Each single-location dataset contains only 14,560 samples with sixteen features, which is insufficient for hybrid deep learning models such as LSTM, GRU, and their ensembles. Therefore, the twenty-five datasets [37] are merged to form a large dataset of 364,000 samples, divided into 80% training and 20% test sets. Minor gaps were rectified using mean or median imputation and forward/backward filling. Rows or columns with significant missingness were eliminated to maintain data integrity. This comprehensive dataset encompasses twenty-five locations and includes only highly correlated features such as the clearness index, wet bulb temperature, surface pressure, solar irradiance, earth skin temperature, specific humidity, relative humidity, corrected precipitation, temperature, dew point, and wind speed and wind direction at 2 m and 10 m, among others.

The correlation analysis performed on solar irradiance indicates several significant correlations with the other features, as shown in Figure 5. The strongest correlation is with wet-bulb temperature (0.87), followed by specific humidity at 2 m (0.68), showing that these parameters have a significant positive impact. Dew point also correlates at 0.67, while air temperature has a moderate correlation of 0.59, indicating that temperature, humidity, and dew point variables are important in forecasting solar irradiance. Other features with a positive but weaker correlation include surface temperature at 0.52 and wind speed at 2 m at 0.43. Wind speed at 10 m shows a lower but still positive correlation of 0.40, followed by precipitation and relative humidity at 2 m, with correlations of 0.20 and 0.07, respectively.
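For reference, a correlation coefficient of this kind can be computed as in the short sketch below, which assumes the Pearson coefficient is the measure used. The helper name `pearson` and the toy values are hypothetical and are not taken from the study data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Placeholder values for illustration only.
irradiance = [3.0, 4.0, 5.0, 6.0, 7.0]
wet_bulb = [10.0, 12.0, 14.0, 16.0, 18.0]   # perfectly linear in irradiance
humidity = [80.0, 70.0, 75.0, 60.0, 65.0]   # tends to fall as irradiance rises

r_wet_bulb = pearson(irradiance, wet_bulb)
r_humidity = pearson(irradiance, humidity)
```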

Additionally, we have added the latitude and longitude columns into the merged dataset to predict solar irradiance in particular locations. Another reason for merging the datasets is that using the model individually on each dataset will increase the computational time. Also, the model produces different errors for various locations, which leads to inaccuracies in forecasting solar irradiance for each location.
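The merging step can be sketched as follows, under the assumptions that gaps are filled by forward and then backward filling and that the 80/20 split is a simple ordered split. The helper names, coordinates, and values are hypothetical placeholders rather than study data.

```python
def fill_gaps(values):
    """Forward-fill missing entries (None), then backward-fill any leading gap."""
    filled = list(values)
    for i in range(1, len(filled)):
        if filled[i] is None:
            filled[i] = filled[i - 1]
    for i in range(len(filled) - 2, -1, -1):
        if filled[i] is None:
            filled[i] = filled[i + 1]
    return filled


def merge_sites(sites):
    """Stack per-site records into one table, tagging each row with lat/lon."""
    merged = []
    for site in sites:
        for value in fill_gaps(site["irradiance"]):
            merged.append({"lat": site["lat"], "lon": site["lon"],
                           "irradiance": value})
    return merged


def train_test_split(rows, train_fraction=0.8):
    """Ordered split: the first 80% of rows for training, the rest for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical toy records; coordinates and values are placeholders.
sites = [
    {"lat": 10.0, "lon": 77.0, "irradiance": [None, 5.1, None, 5.4, 5.0]},
    {"lat": 24.0, "lon": 83.0, "irradiance": [4.2, None, 4.6, 4.4, None]},
]
rows = merge_sites(sites)
train, test = train_test_split(rows)

# Size of the merged dataset described in the text: 25 sites x 14,560 samples.
merged_size = 25 * 14_560
```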

4.2. Data Decomposition Using STL

STL involves several key parameters for effective operation. These include the number of samples used for computation during a single cycling period, the length of the low-pass estimation window, seasonal smoother, and trend smoother. Adjusting these parameters allows for fine-tuning the decomposition process and achieving optimal results in extracting trend (T), seasonal (S), and remainder (R) components from time series data. This technique utilizes robust locally weighted regression to decompose time series data into three components: (1) trend data, (2) seasonal data, and (3) remainder data. Since the additive STL model offers a more effective means of capturing the temporal characteristics of sequences exhibiting substantial periodicity, STL decomposes input data as described in the formula.

$$I_{STL} = O_T + O_S + O_R$$

where I_STL represents the original input samples, and O_T, O_S, and O_R represent the output trend, seasonal, and remainder series, respectively. To further quantitatively analyze the seasonal component, the ACF was calculated, showing pronounced peaks at lags corresponding to integer multiples of 12, with relative stability observed beyond a lag of 13. The study concluded that the STL model exhibits excellent fit and significant annual periodicity with ns (the length of seasonal smoothing) set to 13, and the residual distributions remain relatively stable. Consequently, ns was set to 13 to achieve optimal performance in the solar irradiance dataset decomposition. Based on this understanding, predictors forecast the decomposed seasonal, trend, and remainder components following the generative process.

4.3. Remainder Data Decomposition or Data Double Decomposition

The three decomposed datasets (trend, seasonal, and remainder) are loaded into Google Colab. The remainder dataset is passed directly into the ARIMA time series model, with the ADF test, the Akaike information criterion (AIC), and the Bayesian information criterion (BIC) used for model checking and selection, and the q, s, and d values are determined through the autocorrelation, partial autocorrelation, and differencing functions [38].

()

where P_R represents the observation for the predictor variable at time t, while α_k (for k = 1, 2, …, q), β_t (for t = 1,2, …, s) and γ_u (for u = 1, 2, …, d) denote the autoregressive and moving average coefficients, respectively. Additionally, {ε_c} denotes a white noise process. Further, the predictions are made with the ARIMA model, and the residuals are calculated as the difference between predicted and actual residual values. The final residual values are obtained through data double decomposition or remainder decomposition.
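The residual step can be illustrated with a deliberately simplified stand-in: a first-order autoregressive model fitted by least squares in place of a full ARIMA model. The helper names `fit_ar1` and `ar1_residuals` and the toy remainder series are assumptions; the residuals are taken here as actual minus predicted values.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] (a stand-in for ARIMA)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    phi = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
           / sum((a - mean_x) ** 2 for a in x))
    c = mean_y - phi * mean_x
    return c, phi


def ar1_residuals(series, c, phi):
    """Actual minus one-step-ahead predicted values."""
    return [series[t] - (c + phi * series[t - 1]) for t in range(1, len(series))]


# Toy remainder series generated exactly by y[t] = 1 + 0.5 * y[t-1].
remainder = [10.0]
for _ in range(9):
    remainder.append(1.0 + 0.5 * remainder[-1])

c, phi = fit_ar1(remainder)
residuals = ar1_residuals(remainder, c, phi)  # these feed the next model stage
```

Because the toy series is generated without noise, the fitted coefficients recover the generating values and the residuals vanish; with real remainder data, the residuals carry the structure that the linear model could not explain.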

4.4. Data Standardization

Min-max normalization was applied to standardize the dataset and mitigate dimensional discrepancies. This method linearly rescales the values to the range between 0 and 1. Mathematically, it is expressed as

$$x_n = \frac{x - x_{min}}{x_{max} - x_{min}}$$

where x is the original value, x_n is the normalized value, and x_max and x_min denote the maximum and minimum values in the dataset, respectively. Min-max normalization brought features with different value ranges onto a common scale, ensuring that numerical features remained consistently scaled.
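A minimal sketch of this rescaling, together with the inverse transform needed to map predictions back to physical units, is given below. The function names and sample values are assumptions for illustration.

```python
def min_max_normalize(values):
    """Linearly rescale values to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def min_max_denormalize(normalized, lo, hi):
    """Invert the rescaling using the original minimum and maximum."""
    return [v * (hi - lo) + lo for v in normalized]


irradiance = [2.0, 4.0, 6.0, 10.0]          # placeholder values
scaled = min_max_normalize(irradiance)      # [0.0, 0.25, 0.5, 1.0]
restored = min_max_denormalize(scaled, min(irradiance), max(irradiance))
```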

4.5. Predictions Using Bi-LSTM Model

Bidirectional LSTM is distinguished by its ability to consider both past and future contextual information: it processes the input data in ascending (forward) order and in descending (reverse) order, so that each prediction draws on context from both directions. The trend and seasonal data from the STL model and the residuals from the ARIMA model are passed individually into Bi-LSTM models. Each model runs its input through the LSTM cells bidirectionally and predicts the trend, seasonal, and remainder components, respectively. Optimizing hyperparameters is pivotal in maximizing the performance of deep learning models, necessitating the fine-tuning of parameters such as batch size (BS), learning rate (LR), neuron (NN) count, and epochs for optimal results. Initially, the Bi-LSTM model was trained with the Adam optimizer and default settings of a LR of 0.01, 100 NN, a BS of 64, and 100 epochs. Subsequently, these parameters were refined using the WOA. Monitoring training and testing errors across epochs revealed a notable decrease in training loss within the initial 25 epochs, followed by a slower decline until convergence around the 70th epoch. Similarly, the test loss exhibited a comparable pattern, reaching convergence around the 60th epoch. Based on these findings, the study determined that the model attains high accuracy when trained for 100 epochs. This iterative process was repeated for various training/test splits and epoch configurations. The same approach was applied to tune hyperparameters for the Bi-LSTM models handling trend, seasonal, and decomposed remainder data. The model is applied to the merged group of 25 datasets rather than to each dataset individually because of computational time and efficiency concerns. Applying the model to individual datasets would entail an additional 72 Bi-LSTM optimizations and 24 ARIMA parameter adjustments across the 25 datasets. Therefore, consolidating the datasets simplifies the optimization process, and the larger dataset increases efficiency considerably.
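One way to connect a population-based optimizer to these hyperparameters is to let each search agent hold a position in the unit hypercube and decode it into a concrete configuration, as sketched below. The search ranges, the name `SEARCH_SPACE`, and the function `decode_position` are hypothetical; the ranges used in the study are not stated in the text.

```python
def decode_position(position, space):
    """Map a position in [0, 1]^n to a dictionary of hyperparameters."""
    params = {}
    for value, (name, (lo, hi), kind) in zip(position, space):
        value = min(1.0, max(0.0, value))      # clip to the unit interval
        scaled = lo + value * (hi - lo)
        params[name] = int(round(scaled)) if kind == "int" else scaled
    return params


# Hypothetical search ranges, chosen only for illustration.
SEARCH_SPACE = [
    ("batch_size", (16, 128), "int"),
    ("learning_rate", (0.0001, 0.01), "float"),
    ("neurons", (32, 256), "int"),
    ("epochs", (50, 150), "int"),
]

example = decode_position([0.0, 1.0, 0.0, 0.5], SEARCH_SPACE)
```

The optimizer's objective would then train a Bi-LSTM with the decoded settings and return a validation error such as RMSE, which is the quantity the search agents try to minimize.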

4.6. Re-Composition of Predictions

The trend, seasonal, and remainder predictions from the three WOA-Bi-LSTM models are ensembled to obtain the final prediction results. These predictions are then compared using various graphs and tables in the Results and Discussions section.
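Under the additive model, recomposition amounts to summing the three component predictions at each time step. The sketch below assumes exactly that; the function name `recompose` and the numbers are placeholders.

```python
def recompose(trend_pred, seasonal_pred, remainder_pred):
    """Additive recomposition: final forecast = trend + seasonal + remainder."""
    return [t + s + r for t, s, r in zip(trend_pred, seasonal_pred, remainder_pred)]


# Placeholder component predictions for two time steps.
final = recompose([5.0, 6.0], [1.0, -1.0], [0.5, 0.25])
```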

5. Metrics

Metrics play a crucial role in evaluating the performance and accuracy of solar radiation prediction models. These metrics provide quantitative measures that help to test the reliability and effectiveness of predictions to support decision-making processes in various applications related to solar energy [39]. Some of the key metrics used in solar irradiance prediction are as follows.

5.1. Mean Absolute Error (MAE)

MAE is the average absolute difference between predicted and actual solar irradiance values. It provides an unbiased assessment of forecast accuracy, assigning equal importance to all discrepancies in the data.

()

MAE is suitable for regression problems and overall forecast evaluation, with smaller values indicating better forecasting performance.

5.2. Root Mean Square Error (RMSE)

The MSE is computed by averaging the squared differences between the actual and predicted solar irradiance values. RMSE is the square root of the average of the squared differences between predicted and actual solar irradiance values.

()

RMSE is highly regarded as a performance evaluation metric because squaring the errors gives greater weight to large deviations, which makes it sensitive to outliers in the data. Because it prioritizes larger errors, RMSE is often preferred as the primary error metric. Normalized RMSE (NRMSE) quantifies the overall deviations in larger datasets.

()
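The three metrics can be computed as in the following sketch. Normalizing the RMSE by the range of the actual values is an assumption, since other conventions (such as dividing by the mean) are also in use; the function names and sample values are illustrative.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed convention)."""
    return rmse(actual, predicted) / (max(actual) - min(actual))


# Placeholder values, not results from the study.
actual = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 5.0, 5.0, 7.5]

mae_value = mae(actual, predicted)      # 0.5
rmse_value = rmse(actual, predicted)    # sqrt(0.375)
nrmse_value = nrmse(actual, predicted)  # sqrt(0.375) / 3
```

In this toy case the single error of 1.0 pulls the RMSE above the MAE, which illustrates the greater weight that RMSE places on large deviations.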

6. Results and Discussions

The predictions of the trend, seasonal, and residual components derived from the proposed model, comparing the actual and predicted values of the STL-decomposed components, are presented in Figures 6, 7, and 8. The recomposed predictions from the proposed model, shown as scatter plots of actual versus predicted values together with various error plots, are presented in Figures 9, 10, and 11. The experimental results were obtained using Jupyter Notebook and Google Colab with the necessary Python libraries installed. We employed a graphics processing unit (GPU), enabled through the CUDA and cuDNN software tools, to expedite processing, especially given the large number of epochs. This GPU integration significantly accelerated computations compared to standard CPU processing. Notably, Google Colab, with its GPU capabilities, was also leveraged for comparative analyses involving deep learning models.

LSTM-based models have shown higher accuracy than Transformer models in several settings [40]. In our case, the results of the Transformer and Bi-LSTM models are close to each other, with Bi-LSTM performing better, especially in mean absolute error for a single layer. Therefore, this research uses the Bi-LSTM model as the benchmark.

The third approach, STL-Bi-LSTM, decomposes the data into trend, seasonal, and residual components using STL. The three components are fed individually into three Bi-LSTM models, and their predictions are combined to produce the ensembled result. In the SBL_TSA_R model, only the decomposed residuals are passed to an ARIMA model, while the trend and seasonal components are fed directly into Bi-LSTM models after STL decomposition. In the SBL_TA_RS model, the residual and seasonal components are predicted with ARIMA, while the trend component is passed through a Bi-LSTM model. In the SBL_TSRA_R model, the decomposed residuals are passed through an ARIMA-Bi-LSTM model, while the seasonal and trend components are fed into Bi-LSTM models. In each case, the predictions of the three component models are combined to obtain the ensembled result. Finally, the Bi-LSTM model in the SBL_TSRA_R approach is whale-optimized to obtain the final results of the proposed SBL_TSRA_RW model.

ARIMA is not applied to the trend component because the trend data are nonstationary, which renders the ADF test ineffective; models that use ARIMA for trend data may therefore not yield accurate results. Table 1 compares the different models. The results demonstrate the accuracy of the proposed decomposed residual ensembling bidirectional long short-term memory (SBL_TSRA_RW) model, which achieves the lowest RMSE and MAE values.
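The routing of STL components to models in each variant can be summarized in a small lookup table, followed by the recombination step. This is only a sketch: the text states that the component predictions are "combined", and the elementwise sum used here is an assumption based on STL being an additive decomposition. The component forecasts are hypothetical numbers, and no Bi-LSTM or ARIMA model is actually trained.

```python
# Model assigned to each STL component (T = trend, S = seasonal, R = residual),
# following the description of each variant in the text.
COMPONENT_MODELS = {
    "STL-Bi-LSTM": {"T": "Bi-LSTM", "S": "Bi-LSTM", "R": "Bi-LSTM"},
    "SBL_TSA_R": {"T": "Bi-LSTM", "S": "Bi-LSTM", "R": "ARIMA"},
    "SBL_TA_RS": {"T": "Bi-LSTM", "S": "ARIMA", "R": "ARIMA"},
    "SBL_TSRA_R": {"T": "Bi-LSTM", "S": "Bi-LSTM", "R": "ARIMA-Bi-LSTM"},
}


def recompose(component_forecasts):
    """Sum the T, S, and R forecasts elementwise (assumed additive ensembling)."""
    trend = component_forecasts["T"]
    seasonal = component_forecasts["S"]
    residual = component_forecasts["R"]
    if not (len(trend) == len(seasonal) == len(residual)):
        raise ValueError("component forecasts must have equal length")
    return [t + s + r for t, s, r in zip(trend, seasonal, residual)]


# Hypothetical per-component forecasts in W/m² for three time steps.
forecasts = {
    "T": [400.0, 402.0, 404.0],
    "S": [50.0, -10.0, 30.0],
    "R": [1.5, -0.5, 0.25],
}

combined = recompose(forecasts)
print(combined)  # [451.5, 391.5, 434.25]

# None of the variants routes the trend component to ARIMA.
assert all(models["T"] == "Bi-LSTM" for models in COMPONENT_MODELS.values())
```

Keeping the routing in one table makes it easy to see that the variants differ only in how the seasonal and residual components are handled.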

Table 1. Comparative results for the models used in the work.

Model	RMSE (W/m²)	MAE (W/m²)	Model configuration
Transformer	11.85	8.65	BS = 64, NN = 100, Epochs = 100, LR = 0.01
Bi-LSTM	11.42	8.20	BS = 64, NN = 100, Epochs = 100, LR = 0.01 (reference model)
SBL_TSA_R	5.87	4.08	BS = 64, NN = 100, Epochs = 100, LR = 0.01 for Bi-LSTM (T and S), ARIMA (2, 0, 2) for R components
STL-Bi-LSTM	4.65	3.17	BS = 64, NN = 100, Epochs = 100, LR = 0.01 (T, R, and S)
SBL_TA_RS	3.67	2.46	BS = 64, NN = 100, Epochs = 100, LR = 0.01 for Bi-LSTM (T), ARIMA (2, 0, 2) for R, S components
SBL_TSRA_R	1.87	1.32	BS = 64, NN = 100, Epochs = 100, LR = 0.01 for Bi-LSTM (T, S, and R), ARIMA (2, 0, 2) for R, S components
SBL_TSRA_RW	1.85	1.31	BS = 64, NN = 67, 90 and 66, Epochs = 100, LR = 0.08, 0.09, and 0.003 for Bi-LSTM (T, S, and R), ARIMA (2, 0, 2) for R components
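The relative RMSE differences quoted in the abstract and conclusion can be checked against Table 1. The sketch below computes two quantities for each baseline: the excess of the baseline RMSE over the proposed model's RMSE, expressed relative to the proposed model, and the conventional percentage reduction relative to the baseline. The variable names are illustrative; the RMSE values are taken from Table 1.

```python
# RMSE values in W/m² from Table 1.
baseline_rmse = {
    "Transformer": 11.85,
    "Bi-LSTM": 11.42,
    "SBL_TSA_R": 5.87,
    "STL-Bi-LSTM": 4.65,
    "SBL_TA_RS": 3.67,
    "SBL_TSRA_R": 1.87,
}
proposed_rmse = 1.85  # SBL_TSRA_RW

excess_pct = {}     # (baseline - proposed) / proposed
reduction_pct = {}  # (baseline - proposed) / baseline
for name, value in baseline_rmse.items():
    diff = value - proposed_rmse
    excess_pct[name] = 100.0 * diff / proposed_rmse
    reduction_pct[name] = 100.0 * diff / value
    print(f"{name:12s} excess = {excess_pct[name]:6.1f}%  "
          f"reduction = {reduction_pct[name]:5.1f}%")
```

Rounded to whole percentages, the excess values for Bi-LSTM, SBL_TSA_R, STL-Bi-LSTM, SBL_TA_RS, and SBL_TSRA_R are 517%, 217%, 151%, 98%, and 1%, which matches the quoted figures when they are read as differences relative to the proposed model's RMSE. The reduction relative to each baseline is necessarily below 100%.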

7. Conclusion

This study investigates forecasting approaches for solar irradiance prediction, ranging from bidirectional LSTM models to more advanced hybrid techniques. The proposed SBL_TSRA_RW model is used to identify hydropower house locations in India with high solar irradiance, where maximum power can be generated and a reliable power supply supported. The study also highlights the usefulness of combining intelligent time-series forecasting approaches, such as ARIMA, with deep learning in renewable energy forecasting. By incorporating trend, seasonal, and residual fluctuations, the model gives a thorough picture of solar energy trends at HSHP locations. Based on the proposed SBL_TSRA_RW model, the optimal locations for generating maximum power are KL, AP, and TN. Combining the datasets improved the consistency of error differences across all locations, enhancing the overall accuracy. The proposed SBL_TSRA_RW model has a lower RMSE than the Bi-LSTM, SBL_TSA_R, STL-Bi-LSTM, SBL_TA_RS, and SBL_TSRA_R models, whose RMSE values exceed it by 517%, 217%, 151%, 98%, and 1%, respectively, relative to the proposed model's RMSE (Table 1). The RMSE and MAE values of SBL_TSRA_R (1.87 and 1.32 W/m²) and SBL_TSRA_RW (1.85 and 1.31 W/m²) are close to each other, which indicates a slight improvement from whale optimization. This study also reduces computation time and improves model performance by consolidating the data from twenty-five locations into a single dataset rather than processing each dataset individually. The results validate the effectiveness of the proposed decomposed residual ensembling Bi-LSTM (SBL_TSRA_RW) model, paving the way for further advancements in forecasting for renewable energy systems. This work helps to initiate the installation of hybrid hydropower plants that improve the reliability of energy supply from generation to consumer, with the benefit of FPV systems.

Furthermore, the precise and reliable GHI predictions produced by the proposed method assist microgrid management, allowing operators to optimize energy storage, cut costs, and increase reliance on hybrid hydro-solar power. The model's forecasts enable real-time adjustments of energy flows to avoid grid overloads and maintain efficient power distribution. As energy systems improve, integrating this model into smart grids will allow dynamic decision-making, increase overall grid stability, and facilitate the integration of renewable energy sources for a stable power supply. It also aids in attaining the Sustainable Development Goals.

7.1. Future Recommendations

Future research should focus on real-time adaptive forecasting systems that dynamically change in response to incoming data. This technology would enable more responsive energy storage management and ensure efficiency by better matching renewable energy characteristics and equitable cost allocation. In addition, model development tailored to specific climate zones such as tropical, temperate, and arid regions could increase the model’s accuracy and applicability across diverse environments.

Nomenclature

SI: Solar irradiance
T2MWET: Wet-bulb temperature
T2M: Air temperature
T2MDEW: Dew point temperature
TS: Surface temperature
QV2M: Specific humidity
PS: Surface pressure
RH2M: Relative humidity at 2 meters
PRECTOTCORR: Precipitation
WS10M: Wind speed at 10 meters
WS2M: Wind speed at 2 meters
WD10M: Wind direction at 10 meters
ADF: Augmented Dickey-Fuller
AIC: Akaike information criterion
BIC: Bayesian information criterion
ARIMA: Autoregressive integrated moving average
T: Trend
S: Seasonal
R: Remainder
STL: Seasonal-trend decomposition using loess
Bi-LSTM: Bidirectional LSTM
WOA: Whale optimization algorithm
RMSE: Root mean square error
NRMSE: Normalized root mean square error
MAE: Mean absolute error
MAPE: Mean absolute percentage error
PV: Photovoltaic
FPV: Floating photovoltaic
PACF: Partial autocorrelation function
ACF: Autocorrelation function
AR: Autoregressive
I: Integrated
MA: Moving average
ANN: Artificial neural network
LSTM: Long short-term memory
GRU: Gated recurrent unit
RNN: Recurrent neural network
GA: Genetic algorithm
PSO: Particle swarm optimization
DNI: Direct normal irradiance
ATSD: Automatic time series decomposition
KNN: K-nearest neighbour
WT: Wavelet transform
DCNN: Deep convolutional neural network
QR: Quantile regression
ECS: Environmental control system
SES: Simple exponential smoothing
LR: Learning rate
NN: Neuron number
BS: Batch size
GPU: Graphics processing unit

Conflicts of Interest

The authors declare no conflicts of interest.

Author Contributions

All authors contributed to the study, conception, and model design. All authors commented on the manuscript. All authors read and approved the final manuscript.

Funding

No funding was received for this manuscript.

Open Research

Data Availability Statement

Data that were used to obtain the results are available from the corresponding author upon request.

References

1 Nehrir M. H., Wang C., Strunz K. et al., A Review of Hybrid Renewable/alternative Energy Systems for Electric Power Generation: Configurations, Control, and Applications, IEEE Transactions on Sustainable Energy. (2011) 2, no. 4, 392–403, https://doi.org/10.1109/TSTE.2011.2157540, 2-s2.0-80053204862.
2 Eltayeb W. A., Somlal J., Kumar S., and Rao S. K., Design and Analysis of a Solar-Wind Hybrid Renewable Energy Tree, Results in Engineering. (2023) 17, https://doi.org/10.1016/J.RINENG.2023.100958.
3 Vyas B. K., Adhwaryu A., and Bhaskar K., Planning and Developing Large Solar Power Plants: A Case Study of 750 MW Rewa Solar Park in India, Cleaner Engineering and Technology. (2022) 6, https://doi.org/10.1016/J.CLET.2022.100396.
4 Solomin E., Sirotkin E., Cuce E., Selvanathan S. P., and Kumarasamy S., Hybrid Floating Solar Plant Designs: A Review, 2021, MDPI AG, https://doi.org/10.3390/en14102751.
5 Iweh C. D. and Akupan E. R., Control and Optimization of a Hybrid Solar PV – Hydro Power System for Off-Grid Applications Using Particle Swarm Optimization (PSO) and Differential Evolution (DE), Energy Reports. (2023) 10, 4253–4270, https://doi.org/10.1016/j.egyr.2023.10.080.
6 Mishra V. and Jain K., Satellite Based Assessment of Artificial Reservoir Induced Landslides in Data Scarce Environment: A Case Study of Baglihar Reservoir in India, Journal of Applied Geophysics. (2022) 205, https://doi.org/10.1016/J.JAPPGEO.2022.104754.
7 Kakoulaki G., Gonzalez Sanchez R., Gracia Amillo A. et al., Benefits of Pairing Floating Solar Photovoltaics with Hydropower Reservoirs in Europe, Renewable and Sustainable Energy Reviews. (2023) 171, https://doi.org/10.1016/J.RSER.2022.112989.
8 Jurasz J. and Ciapała B., Solar–hydro Hybrid Power Station as a Way to Smooth Power Output and Increase Water Retention, Solar Energy. (2018) 173, 675–690, https://doi.org/10.1016/j.solener.2018.07.087, 2-s2.0-85051108595.
9 El Hammoumi A., Chtita S., Motahhir S., and El Ghzizal A., Solar PV Energy: From Material to Use, and the Most Commonly Used Techniques to Maximize the Power Output of PV Systems: A Focus on Solar Trackers and Floating Solar Panels, 2022, Elsevier Ltd, https://doi.org/10.1016/j.egyr.2022.09.054.
10 Toufani P., Karakoyun E. C., Nadar E., Fosso O. B., and Kocaman A. S., Optimization of Pumped Hydro Energy Storage Systems under Uncertainty: A Review, 2023, Elsevier Ltd, https://doi.org/10.1016/j.est.2023.109306.
11 Bhimaraju A., Mahesh A., and Nirbheram J. S., Feasibility Study of Solar Photovoltaic/grid-Connected Hybrid Renewable Energy System with Pumped Storage Hydropower System Using Abandoned Open Cast Coal Mine: A Case Study in India, Journal of Energy Storage. (2023) 72, https://doi.org/10.1016/J.EST.2023.108206.
12 Sebastianelli A., Serva F., Ceschini A., Paletta Q., Panella M., and Le Saux B., Machine Learning Forecast of Surface Solar Irradiance from Meteo Satellite Data, Remote Sensing of Environment. (2024) 315, https://doi.org/10.1016/J.RSE.2024.114431.
13 Gupta A., Gupta K., and Saroha S., Solar Irradiation Forecasting Technologies: A Review, Strategic Planning for Energy and the Environment. (2023) 39, no. 3–4, 319–354, https://doi.org/10.13052/spee1048-4236.391413.
14 Kumari P. and Toshniwal D., Deep Learning Models for Solar Irradiance Forecasting: A Comprehensive Review, 2021, Elsevier Ltd.
15 Bansal R. C. and Pandey J. C., Load Forecasting Using Artificial Intelligence Techniques: A Literature Survey, International Journal of Computer Applications in Technology. (2005) 22, no. 2/3, 109–119, https://doi.org/10.1504/IJCAT.2005.006942, 2-s2.0-21244433585.
16 Agoua X. G., Girard R., and Kariniotakis G., Short-Term Spatio-Temporal Forecasting of Photovoltaic Power Production, IEEE Transactions on Sustainable Energy. (2018) 9, no. 2, 538–546, https://doi.org/10.1109/TSTE.2017.2747765, 2-s2.0-85029179011.
17 Alzahrani A., Shamsi P., Dagli C., and Ferdowsi M., Solar Irradiance Forecasting Using Deep Neural Networks, Procedia Computer Science, 2017, Elsevier B.V, 304–313, https://doi.org/10.1016/j.procs.2017.09.045, 2-s2.0-85039995734.
18 Hosseini M., Katragadda S., Wojtkiewicz J., Gottumukkala R., Maida A., and Chambers T. L., Direct Normal Irradiance Forecasting Using Multivariate Gated Recurrent Units, Energies. (2020) 13, no. 15, https://doi.org/10.3390/en13153914.
19 Liu Q., Darteh O. F., Bilal M. et al., A Cloud-Based Bi-directional LSTM Approach to Grid-Connected Solar PV Energy Forecasting for Multi-Energy Systems, Sustainable Computing: Informatics and Systems. (2023) 40, https://doi.org/10.1016/J.SUSCOM.2023.100892.
20 POWER, DAVe, https://power.larc.nasa.gov/data-access-viewer/.
21 Babu C. N. and Reddy B. E., A Moving-Average Filter Based Hybrid ARIMA–ANN Model for Forecasting Time Series Data, Applied Soft Computing. (2014) 23, 27–38, https://doi.org/10.1016/J.ASOC.2014.05.028, 2-s2.0-84903579343.
22 Reikard G., Predicting Solar Radiation at High Resolutions: A Comparison of Time Series Forecasts, Solar Energy. (2009) 83, no. 3, 342–349, https://doi.org/10.1016/j.solener.2008.08.007, 2-s2.0-59649097842.
23 Chu Y., Pedro H. T. C., and Coimbra C. F. M., Hybrid Intra-hour DNI Forecasts with Sky Image Processing Enhanced by Stochastic Learning, Solar Energy. (2013) 98, no. PC, 592–603, https://doi.org/10.1016/j.solener.2013.10.020, 2-s2.0-84887950005.
24 Wang H., Yi H., Peng J. et al., Deterministic and Probabilistic Forecasting of Photovoltaic Power Based on Deep Convolutional Neural Network, Energy Conversion and Management. (2017) 153, 409–422, https://doi.org/10.1016/j.enconman.2017.10.008, 2-s2.0-85033725701.
25 Cunha J. L. R. N. and Pereira C. M. N. A., A Hybrid Model Based on STL with Simple Exponential Smoothing and ARMA for Wind Forecast in a Brazilian Nuclear Power Plant Site, Nuclear Engineering and Design. (2024) 421, https://doi.org/10.1016/J.NUCENGDES.2024.113026.
26 Tebong N. K., Simo T., Takougang A. N., and Ntanguen P. H., STL-Decomposition Ensemble Deep Learning Models for Daily Reservoir Inflow Forecast for Hydroelectricity Production, Heliyon. (2023) 9, no. 6, https://doi.org/10.1016/j.heliyon.2023.e16456.
27 Luo J. and Gong Y., Air Pollutant Prediction Based on ARIMA-WOA-LSTM Model, Atmospheric Pollution Research. (2023) 14, no. 6, https://doi.org/10.1016/j.apr.2023.101761.
28 Zeng H., Zhang H., Guo J., Ren B., Cui L., and Wu J., A Novel Hybrid STL-Transformer-ARIMA Architecture for Aviation Failure Events Prediction, Reliability Engineering & System Safety. (2024) 246, https://doi.org/10.1016/J.RESS.2024.110089.
29 Zhang B., Song C., Jiang X., and Li Y., Electricity Price Forecast Based on the STL-TCN-NBEATS Model, Heliyon. (2023) 9, no. 1, https://doi.org/10.1016/j.heliyon.2023.e13029.
30 As’ad M., Finding the Best ARIMA Model to Forecast Daily Peak Electricity Demand, https://ro.uow.edu.au/asearc/12.
31 Chodakowska E., Nazarko J., Nazarko Ł., Rabayah H. S., Abendeh R. M., and Alawneh R., ARIMA Models in Solar Radiation Forecasting in Different Geographic Locations, Energies. (2023) 16, no. 13, https://doi.org/10.3390/en16135029.
32 Jafari M., Chaleshtari M. H. B., Khoramishad H., and Altenbach H., Minimization of Thermal Stress in Perforated Composite Plate Using Metaheuristic Algorithms WOA, SCA and GA, Composite Structures. (2023) 304, no. Jan, https://doi.org/10.1016/j.compstruct.2022.116403.
33 Syama S., Ramprabhakar J., Anand R., and Guerrero J. M., A Hybrid Extreme Learning Machine Model with Lévy Flight Chaotic Whale Optimization Algorithm for Wind Speed Forecasting, Results in Engineering. (2023) 19, https://doi.org/10.1016/j.rineng.2023.101274.
34 Saleh I., Borhan N., Yunus A., and Rahiman W., Comprehensive Technical Review of Recent Bio-Inspired Population-Based Optimization (BPO) Algorithms for Mobile Robot Path Planning, IEEE Access. (2024) 12, 20942–20961, https://doi.org/10.1109/ACCESS.2024.3362638.
35 Lydia A., Francis S., and Sagayaraj Francis F., A Survey of Optimization Techniques for Deep Learning Networks, International Journal for Research in Engineering Application & Management (IJREAM). (2019) 05, https://doi.org/10.35291/2454-9150.2019.0100.
36 Ramachandra T. V., Jain R., and Krishnadas G., Hotspots of Solar Potential in India, 2011, Elsevier Ltd, https://doi.org/10.1016/j.rser.2011.04.007.
37 Konduru S., C N., and Bansal R. C., Forecasting Solar Irradiance for the Strategic Integration of Hybrid Hydro and Solar Photovoltaic Systems in Rural Indian Regions, International Journal of Modelling and Simulation. (2024) 1–19, https://doi.org/10.1080/02286203.2024.2403014.
38 Marhic B. and Masson J.-B., Occupancy Forecasting Using Two ARIMA Strategies, https://www.researchgate.net/publication/336553179.
39 Zhang J., Florita A., Hodge B. M. et al., A Suite of Metrics for Assessing the Performance of Solar Power Forecasting, Solar Energy. (2015) 111, 157–175, https://doi.org/10.1016/j.solener.2014.10.016, 2-s2.0-84910629924.
40 Martin-Cirera A., Nowak M., Norton T., Auer U., and Oczak M., Comparison of Transformers with LSTM for Classification of the Behavioural Time Budget in Horses Based on Video Data, Biosystems Engineering. (2024) 242, 154–168, https://doi.org/10.1016/J.BIOSYSTEMSENG.2024.04.014.
