A New Strategy for Short-Term Load Forecasting
Abstract
Electricity is difficult to store, so forecasting electricity demand remains an important problem. Accurate short-term load forecasting (STLF) plays a vital role in power systems because it is an essential part of power system planning and operation and underlies many applications. Because an individual forecasting model often does not perform well for STLF, this paper presents a hybrid model based on the seasonal ARIMA model and the BP neural network to improve forecasting accuracy. First, the seasonal ARIMA model is used to forecast the day-ahead electric load demand. Then, taking the residual load demand series obtained in this step as the original series, the subsequent residual series is forecasted by the BP neural network. Finally, the forecasted residual series is added to the load demand series forecasted by the seasonal ARIMA model to obtain the final load demand forecast. Case studies show that the new strategy improves the accuracy of STLF.
1. Introduction
Load forecasting has always been an important topic for power systems, especially STLF, which is fundamental to many applications such as economic generation, system security, and management and planning [1]. Basic operation functions such as unit commitment, economic dispatch, fuel scheduling, and unit maintenance can be performed more efficiently with an accurate forecast [2]. However, load forecasting is a difficult task because the load at a given hour depends not only on the load at the previous hour but also on the load at the same hour on the previous day and on the load at the same hour on the same day of the previous week. STLF is also difficult because of the nonlinear and random-like behavior of system load, weather conditions, and variations in the social and economic environment [3]. How to improve forecasting accuracy therefore remains a difficult and critical problem.
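The three dependencies named above can be read as three lags of the load series. The following is a minimal sketch, assuming an hourly series so that the lags are 1, 24, and 168 samples; the function name `lagged_inputs` and the toy series are illustrative and do not come from the paper.

```python
def lagged_inputs(load, t, samples_per_day=24):
    """Return the three past loads that the text names as drivers of load[t].

    Assumes a regularly sampled series. With hourly data the lags are
    1 (previous hour), 24 (same hour yesterday), and 168 (same hour on the
    same weekday of the previous week).
    """
    lags = (1, samples_per_day, 7 * samples_per_day)
    if t < max(lags):
        raise ValueError("not enough history before index t")
    return [load[t - lag] for lag in lags]


# Toy series in which each value equals its own index, so the lags are easy to read.
series = list(range(200))
print(lagged_inputs(series, 180))
```

For data sampled at another rate, only `samples_per_day` changes, provided the first lag is read as the previous sample rather than the previous hour.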
Over the past years, a wide variety of techniques have been developed to improve the accuracy of STLF. For example, [4] presented a hybrid fuzzy modeling method for STLF that employs the orthogonal least squares method to create the fuzzy model and a constrained optimization algorithm to perform the parameter learning. Another fuzzy modeling technique was used for STLF in [5]. Yang and Stenzel proposed a new regression tree method for STLF in [6]: both increment and nonincrement trees were built from the historical data to provide the data space partition and input variable selection; a support vector machine was then applied to the samples of the regression tree nodes for further fine regression; and the results of the different tree nodes were integrated by a weighted average to obtain the comprehensive forecasting result. Based on the state space and Kalman filter approach, a novel time-varying weather and load model for the STLF problem was proposed in [7], where a time-varying state space model was used to model the load demand on an hourly basis, while the Kalman filter was used recursively to estimate the optimal load forecast parameters for each hour of the day. Considering that STLF is always affected by a variety of nonlinear factors, a mapping function was defined for each factor to identify the nonlinearity in [8]. Several other typical approaches for STLF can be found in [9–12].
The seasonal ARIMA (SARIMA) model is frequently employed to forecast data with a seasonal component. For instance, Choi et al. [13] used a hybrid SARIMA wavelet transform method for sales forecasting. Egrioglu et al. [14] proposed a hybrid approach based on SARIMA and a partial high order bivariate fuzzy time series forecasting model and applied it to two real seasonal time series. In addition, Chen and Wang [15] developed a hybrid of SARIMA and support vector machines to forecast the production values of the machinery industry in Taiwan. Because load demand series always contain a seasonal component, the seasonal ARIMA model is adopted in this paper.
The BP neural network is a typical feed-forward network. Inputs are propagated forward through the network structure, and the training function then revises the weight matrices and thresholds of the network in the backward direction; in this way the BP neural network builds a model from the training samples and then applies the trained model to the samples to be predicted [16]. The BP neural network model has been applied to a wide range of forecasting problems. For example, Ke et al. [17] used a genetic algorithm-BP neural network to forecast electric power industry loans, and Li et al. [18] applied the BP neural network to predicting the mechanical properties of porous NiTi shape memory alloy prepared by thermal explosion reaction. The BP neural network can also be used for evaluation and classification: Li and Chen [19] used the BP neural network algorithm to study the sustainable development evaluation of highway construction projects, and Bao and Ren [20] presented a wetland landscape classification of the Dalinor Lake area based on the BP neural network. Because the BP neural network can approximate the underlying function to an arbitrary degree of accuracy, it is also used to build the hybrid model of this paper.
Both ARIMA and BP neural network models have achieved success in their own linear or nonlinear domains. Although a large number of models have been applied to load demand forecasting, further techniques for STLF should be sought to improve predictive capability. For this purpose, this paper proposes a hybrid model that combines the seasonal ARIMA model and the BP neural network. First, the seasonal ARIMA model is used to forecast the day-ahead load demand; then the BP neural network is used to forecast the residual series. Finally, the forecasted residual series is added to the forecasted load demand to obtain the final load demand.
The remainder of this paper is organized as follows. Section 2 introduces the theory of the combined forecasting model. Section 3 presents the seasonal ARIMA model and the BP neural network. Section 4 presents a case study that forecasts the electricity load of the state of South Australia (SA), Australia. Section 5 concludes the paper.
2. The Combined Forecasting Model
3. The Hybrid Model
3.1. Review of the Seasonal ARIMA Model
- (i) Identification of the SARIMA$(p, d, q)(P, D, Q)_s$ structure: use the autocorrelation function (ACF) and partial autocorrelation function (PACF) to determine a tentative model structure.
- (ii) Estimation of the unknown parameters.
- (iii) Goodness-of-fit tests on the estimated residuals.
- (iv) Forecasting of future outcomes based on the known data.
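The identification step relies on the sample ACF. The following is a minimal sketch of how it can be computed; the function name and the toy periodic series are assumptions made for illustration, and the PACF is not covered here.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; the ACF is undefined")
    acf = []
    for lag in range(max_lag + 1):
        num = sum(centered[t] * centered[t - lag] for t in range(lag, n))
        acf.append(num / denom)
    return acf


# Toy series with period 4, standing in for a seasonal load curve.
toy = [1.0, 3.0, 2.0, 0.0] * 12
acf = autocorrelation(toy, 8)
print([round(r, 3) for r in acf])
```

A pronounced peak at a lag equal to the period (lag 4 in this toy series) is the kind of pattern that suggests a seasonal term of that length.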
The modeling procedure therefore consists of four steps: identification, estimation, diagnostic checking, and forecasting. In the identification step, appropriate candidate models are selected from all possible models by determining whether an AR, MA, or ARMA process is suitable and choosing its order. In the estimation step, the parameters are estimated by ordinary least squares (OLS) or, in some cases, by nonlinear estimation methods. The estimated AR and MA parameters should then be checked for stationarity and invertibility, respectively. In the diagnostic checking step, the adequacy of each estimated ARIMA model is assessed; in particular, the AR and MA parts should satisfy the unit-circle conditions, and the residuals should be consistent with the normality assumption. In the forecasting step, the estimated ARIMA models that satisfy these assumptions are used to produce forecasts.
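For reference, the model being identified and estimated can be written in the standard textbook form below. The notation (backshift operator $B$, polynomials $\phi_p$, $\Phi_P$, $\theta_q$, $\Theta_Q$, and white noise $\varepsilon_t$) is the conventional one and is an assumption here; sign conventions for the MA polynomials vary between texts.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\, y_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t ,
\qquad
\begin{aligned}
\phi_p(B)     &= 1 - \phi_1 B - \cdots - \phi_p B^{p}, &
\Phi_P(B^{s}) &= 1 - \Phi_1 B^{s} - \cdots - \Phi_P B^{Ps},\\
\theta_q(B)   &= 1 - \theta_1 B - \cdots - \theta_q B^{q}, &
\Theta_Q(B^{s}) &= 1 - \Theta_1 B^{s} - \cdots - \Theta_Q B^{Qs}.
\end{aligned}
```

In this form, stationarity and invertibility require the roots of the AR and MA polynomials, respectively, to lie outside the unit circle (equivalently, their inverse roots lie inside it).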
3.2. Brief Introduction to the Back Propagation (BP) Neural Network
Artificial neural networks (ANNs) are a typical class of intelligent learning algorithms and are widely used in practical applications such as pattern classification, function approximation, optimization, forecasting, and automatic control [24, 25]. In this section, we introduce the standard multilayer feed-forward neural network (FNN). An FNN is a multilayer perceptron network, in contrast to the single-layer perceptron, which can solve only linearly separable classification problems. To increase the classification ability of the network, a multilayer structure is required. Because hidden-layer neurons give the multilayer network better classification and memory ability, the corresponding learning algorithm became a focus of research. In 1986, Rumelhart et al. put forward the BP algorithm, which solves the problem of learning the connection weights of the hidden layers in a multilayer network and gives a complete mathematical derivation. Because the BP algorithm overcomes the inability of the simple perceptron to solve XOR and similar problems, it became the main learning algorithm for multilayer perceptrons and an important, widely used neural network model.
BP, one of the most popular techniques in the field of neural networks, is a supervised learning method whose principle is to use steepest gradient descent to reduce the approximation error to an arbitrarily small value. The learning process consists of two parts: forward propagation and back-propagation. In forward propagation, information passes from the input layer through the hidden layer to the output layer, and the state of each layer of neurons affects only the state of the next layer. If the output layer does not produce the desired output, the process switches to back-propagation, and the error signal is returned along the original connection paths [26]. On the way back, the connection weights of the neurons in each layer are modified; this process is iterated until the error signal falls within the permitted range. Thus two signals circulate in a multilayer feed-forward network. (1) Working signal: the signal that propagates forward after the input is applied until the actual output is produced at the output side; it is a function of the inputs and the weights. (2) Error signal: the difference between the actual network output and the desired output; it propagates backward from the output layer, layer by layer [27].
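The forward and backward passes described above can be illustrated on the XOR problem. This is a minimal sketch in plain Python, not the network used in this paper: the layer sizes, learning rate, random seed, squared-error loss, and sigmoid output node are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_bp(samples, n_hidden=3, lr=0.5, epochs=1000, seed=0):
    """Train a one-hidden-layer network with plain back-propagation.

    Returns (predict, history); history holds the mean squared error
    measured at the start of each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # One row of input weights per hidden node; the last entry is its bias.
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    # Weights of the single output node; the last entry is its bias.
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(x):
        xb = list(x) + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
        out = sigmoid(sum(w * v for w, v in zip(w_o, y + [1.0])))
        return y, out

    history = []
    for _ in range(epochs):
        grad_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        grad_o = [0.0] * (n_hidden + 1)
        sq_err = 0.0
        for x, target in samples:
            y, out = forward(x)                # working signal: forward pass
            err = out - target                 # error signal at the output
            sq_err += err * err
            delta_o = err * out * (1.0 - out)
            for j, v in enumerate(y + [1.0]):
                grad_o[j] += delta_o * v
            for j in range(n_hidden):          # send the error back one layer
                delta_h = delta_o * w_o[j] * y[j] * (1.0 - y[j])
                for i, v in enumerate(list(x) + [1.0]):
                    grad_h[j][i] += delta_h * v
        history.append(sq_err / len(samples))
        # Steepest-descent update of all connection weights.
        for j in range(n_hidden + 1):
            w_o[j] -= lr * grad_o[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                w_h[j][i] -= lr * grad_h[j][i]

    return (lambda x: forward(x)[1]), history


xor = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
predict, history = train_bp(xor)
print(f"error at first epoch: {history[0]:.4f}, at last epoch: {history[-1]:.4f}")
```

The gradients are accumulated over all samples before each update (batch mode); updating after every sample is an equally common variant.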
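- (i) Hidden layer stage: the outputs of all neurons in the hidden layer are calculated as follows:

$$net_j = \sum_{i} w_{ji}\, x_i, \qquad y_j = f_H(net_j).$$

Here $x_i$ is the $i$th input, $w_{ji}$ is the weight connecting input $i$ to hidden node $j$, $net_j$ is the activation value of the $j$th node, $y_j$ is the output of the hidden layer, and $f_H$ is called the activation function of a node, usually a sigmoid function as follows:

$$f_H(x) = \frac{1}{1 + e^{-x}}.$$

- (ii) Output stage: the outputs of all neurons in the output layer are given as follows:

$$O_k = f_O\Bigl(\sum_{j} w_{kj}\, y_j\Bigr),$$

where $w_{kj}$ is the weight connecting hidden node $j$ to output node $k$ and $f_O$ is the activation function of the output layer.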
Hence, the BP model of (1) in fact performs a nonlinear functional mapping from the past observations $(y_{t-1}, y_{t-2}, \ldots, y_{t-n})$ to the future value $y_t$; that is, $y_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-n}, w) + \varepsilon_t$, where $w$ is a vector of all parameters and $f$ is a function determined by the network structure and connection weights. Thus, the neural network is equivalent to a nonlinear autoregressive model. Note that expression (7) implies one output node in the output layer, which is typically used for one-step-ahead forecasting.
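The nonlinear autoregressive reading of the network can be sketched as a single forward pass from the last $n$ observations to one output. The function below is illustrative only: its name, the explicit bias terms, and the linear output node are assumptions, and the weights in the usage lines are arbitrary values rather than trained ones.

```python
import math


def nar_forecast(history, hidden_weights, hidden_biases, out_weights, out_bias):
    """One-step-ahead forecast y_t = f(y_{t-1}, ..., y_{t-n}; w).

    history        : past observations, most recent last
    hidden_weights : one list of n weights per hidden node
    The output node is linear, which is an assumption of this sketch.
    """
    n = len(hidden_weights[0])
    if len(history) < n:
        raise ValueError("need at least n past observations")
    lags = history[-1:-n - 1:-1]  # y_{t-1}, y_{t-2}, ..., y_{t-n}
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        net = sum(w * y for w, y in zip(weights, lags)) + bias
        hidden.append(1.0 / (1.0 + math.exp(-net)))  # sigmoid activation
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias


# Arbitrary weights for a network with n = 2 lags and 2 hidden nodes.
value = nar_forecast(
    history=[0.2, 0.4, 0.1],
    hidden_weights=[[0.5, -0.3], [0.1, 0.8]],
    hidden_biases=[0.0, 0.1],
    out_weights=[1.0, -1.0],
    out_bias=0.05,
)
print(value)
```

Multi-step forecasts can be produced from such a one-output model by feeding each forecast back into `history`, at the cost of accumulating error.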
4. Simulation Results
The electric load demand data used for the simulation were sampled in the state of South Australia (SA), Australia, at half-hour intervals, so each day contains 48 load demand observations. Figure 1 shows the load demand of SA from June 2, 2007, to July 14, 2007.

Figure 1 shows a significant similarity in load demand on the same day of each week; in other words, the load demand on the same weekday contains a seasonal component. The seasonal ARIMA model is therefore well suited to forecasting the day-ahead load demand from the historical load demand on the same weekday of the preceding weeks. Using the data of June 2, June 9, and June 16, 2007, the electric load demand on June 23 is forecasted. In the same way, that is, using the load demand data on the same weekday of three consecutive weeks to forecast the load demand on that weekday of the following week, the load demand on June 30, July 7, and July 14 is forecasted. Before forecasting, the parameter values must be estimated. Since each day contains 48 observations, s = 48; the other parameters can be estimated from the ACF and PACF plots. The parameter values used in forecasting the load demand on June 23, June 30, July 7, and July 14 are listed in Table 1. As an example, the ACF and PACF plots used in forecasting the load demand on June 23 by the seasonal ARIMA model are shown in Figures 2 and 3, respectively.
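The way the input series is assembled from the same weekday of three earlier weeks can be sketched as follows. The function name, the day-indexed data layout, and the toy values are assumptions for illustration; no real load data are used.

```python
def same_weekday_history(daily_loads, target_index, n_weeks=3, samples_per_day=48):
    """Concatenate the daily curves of the same weekday from the previous n_weeks.

    daily_loads  : daily curves in calendar order, one list per day
    target_index : position of the day to be forecast
    """
    if target_index - 7 * n_weeks < 0:
        raise ValueError("not enough earlier weeks before the target day")
    history = []
    for k in range(n_weeks, 0, -1):  # oldest week first
        day = daily_loads[target_index - 7 * k]
        if len(day) != samples_per_day:
            raise ValueError("each day must contain samples_per_day values")
        history.extend(day)
    return history


# Toy data: 43 days (June 2 to July 14), each day filled with its own index.
daily_loads = [[float(d)] * 48 for d in range(43)]
history = same_weekday_history(daily_loads, target_index=21)  # index 21 = June 23
print(len(history), history[0], history[48], history[96])
```

The resulting series has 3 × 48 = 144 points with a seasonal period of 48, which is the series a seasonal model with s = 48 would be fitted to before forecasting the next 48 values.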
Parameters | Forecasting load demand on June 23 | Forecasting load demand on June 30 | Forecasting load demand on July 7 | Forecasting load demand on July 14 |
---|---|---|---|---|
p | 1 | 2 | 1 | 1 |
d | 1 | 1 | 1 | 1 |
q | 1 | 1 | 2 | 1 |
P | 0 | 1 | 0 | 1 |
D | 1 | 1 | 1 | 1 |
Q | 1 | 1 | 0 | 1 |


With the estimated parameters in Table 1, the seasonal ARIMA models produce the load demand forecasts for June 23, June 30, July 7, and July 14. The forecasted load demand is shown in Figure 4.






Once the training data and the number of neurons in each layer have been determined, the training process can be carried out. Figure 7 shows how the training error of the BP neural network varies with the epoch number, where the maximum number of epochs is 1000.

The forecasting can then be carried out with the trained network. For forecasting, the residual load demand data on June 30 and July 7 at time t are used as inputs, from which the residual load demand at the same time on July 14 is forecasted. The forecasting results are plotted in Figure 8.

Finally, adding this forecasted residual series to the load demand forecasted by the seasonal ARIMA model gives the final load demand forecast, which is shown in Figure 9.
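The bookkeeping of the hybrid scheme (residuals from the first stage, paired residual inputs for the network, and the final sum) can be sketched as below. The helper names and the toy numbers are hypothetical, and the residual is taken as actual minus forecast, which is the sign convention under which adding the forecasted residual corrects the seasonal ARIMA forecast.

```python
def _check_lengths(a, b):
    if len(a) != len(b):
        raise ValueError("series must have equal length")


def residual_series(actual, sarima_forecast):
    """Part of the load that the seasonal ARIMA stage failed to explain."""
    _check_lengths(actual, sarima_forecast)
    return [a - f for a, f in zip(actual, sarima_forecast)]


def residual_inputs(residuals_week1, residuals_week2):
    """Pair the residuals of two earlier days at the same time of day."""
    _check_lengths(residuals_week1, residuals_week2)
    return [[r1, r2] for r1, r2 in zip(residuals_week1, residuals_week2)]


def combine(sarima_forecast, residual_forecast):
    """Final forecast = seasonal ARIMA forecast + forecasted residual."""
    _check_lengths(sarima_forecast, residual_forecast)
    return [f + r for f, r in zip(sarima_forecast, residual_forecast)]


# Toy numbers only, three time steps instead of 48.
res_a = residual_series([105.0, 98.0, 120.0], [100.0, 100.0, 115.0])
res_b = residual_series([104.0, 97.0, 118.0], [101.0, 99.0, 114.0])
inputs = residual_inputs(res_a, res_b)       # one 2-element input per time step
final = combine([102.0, 101.0, 116.0], [4.0, -2.0, 4.5])
print(inputs)
print(final)
```

In the paper's setting each series would hold 48 half-hourly values, and the second argument of `combine` would come from the trained BP network.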

Figure 10 shows a box-and-whisker plot with two boxes, whose lines mark the lower quartile, median, and upper quartile of the load demand forecasted by the single seasonal ARIMA model and by the combined model. Each box includes a notch at the position of the median.

Models | RMSE | MAPE (%) |
---|---|---|
Individual seasonal ARIMA model | 260.7376 | 15.98 |
Combined model | 97.1366 | 5.13 |


From Table 2 and Figure 11, it can be seen that the RMSE falls from 260.7376 for the individual seasonal ARIMA model to 97.1366 for the combined model, while the MAPE is reduced from 15.98% to 5.13%. Therefore, the combined model improves the load forecasting accuracy compared with the individual seasonal ARIMA model.
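The two error measures in Table 2 can be computed as sketched below. The formulas are the usual definitions of RMSE and MAPE, assumed here to match those behind Table 2; the helper `relative_reduction` is an illustrative addition that expresses the reported improvement as a fraction.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def relative_reduction(before, after):
    """Fraction by which an error measure shrinks from `before` to `after`."""
    return (before - after) / before


# Values reported in Table 2.
rmse_cut = relative_reduction(260.7376, 97.1366)
mape_cut = relative_reduction(15.98, 5.13)
print(f"RMSE reduced by {rmse_cut:.1%}, MAPE reduced by {mape_cut:.1%}")
```

With the reported values, the combined model cuts the RMSE by roughly 62.7% and the MAPE by roughly 67.9% relative to the individual seasonal ARIMA model.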
The performance of the individual seasonal ARIMA model and the combined model in forecasting the load demand is also evaluated by a comparison of means. The result is shown in Figure 12, where group 1, group 2, and group 3 represent the actual load demand, the load demand forecasted by the individual seasonal ARIMA model, and the load demand forecasted by the combined model, respectively.

As shown, no group has a mean significantly different from that of group 1; that is, there is no significant difference between the mean of the actual load demand and the mean of the load demand forecasted by either the individual seasonal ARIMA model or the combined model. However, the range of group 3 overlaps more with the variation range of the actual load demand than that of group 2 does; thus, the combined model forecasts the load demand more accurately than the individual seasonal ARIMA model.
5. Conclusions
Unlike the usual combined forecasting models, this paper presents a new strategy for STLF based on combined models. First, because many real-life series are periodic, models such as SARIMA, which can extract the periodicity contained in the data, are often used to model and predict periodic time series. Second, this paper proposes using the error series of the SARIMA predictions to predict the residual series one day ahead, and adding this residual series to the load value predicted by the SARIMA model for the same day to improve the accuracy of the model. However, the residuals of the load values predicted by the SARIMA model do not share a common tendency or regularity, so the method for predicting the subsequent residual series must be chosen carefully. Because neural networks are well suited to fitting nonlinear functions, this paper uses a neural network model, which can capture the nonlinear relation between the inputs and outputs, to predict the subsequent residual series, rather than regression or other models that impose explicit requirements on the form of the data; this further improves the accuracy of the predicted residuals. Furthermore, according to the characteristics of the model, this paper constructs a validity criterion that measures the effectiveness of the model. Finally, applying this combination method to electricity load demand forecasting in South Australia shows that it improves prediction precision: relative to the 15.98% error of the single SARIMA model, the hybrid model based on SARIMA and the neural network reduces the load prediction error to 5.13%, and the validity criterion increases from 0.8402 to 0.9487. The simulation results demonstrate that the new strategy for STLF achieves a satisfactory improvement in forecasting accuracy.
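As a purely arithmetic observation, the two reported validity values coincide with one minus the corresponding MAPE expressed as a fraction. The short check below illustrates this; it is an assumption about how the numbers relate, not a statement of how the validity criterion is defined, and the function name is hypothetical.

```python
def one_minus_mape(mape_percent):
    """Return 1 - MAPE, with MAPE given in percent."""
    return 1.0 - mape_percent / 100.0


reported = [
    ("individual seasonal ARIMA model", 15.98, 0.8402),
    ("combined model", 5.13, 0.9487),
]
for name, mape_percent, validity in reported:
    print(name, round(one_minus_mape(mape_percent), 4), validity)
```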
Acknowledgments
This work was supported by the Natural Science Foundation of P. R. China (90912003, 61073193), the Key Science and Technology Foundation of Gansu Province (1102FKDA010), the Natural Science Foundation of Gansu Province (1107RJZA188), and the Fundamental Research Funds for the Central Universities (lzujbky-2012-47, lzujbky-2012-48).