Volume 14, Issue 8 pp. 953-966
Research Article

Time series modelling of power output for large-scale wind fleets

Alexander Sturt

Corresponding Author

Department of Electrical and Electronic Engineering, Imperial College, London, SW7 2AZ UK

E-mail: [email protected]

Goran Strbac

Department of Electrical and Electronic Engineering, Imperial College, London, SW7 2AZ UK

First published: 11 February 2011

ABSTRACT

Simulations of power systems with high wind penetration need to represent the stochastic output of the wind farms. Many studies use historic wind data directly in the simulation. However, even if historic data are used to drive the realized wind output in scheduling simulations, a model of the wind's statistical properties may be needed to inform the commitment decisions for the dispatchable units. There are very few published studies that fit models to the power output of nation-sized wind fleets rather than the output at a single location. We fitted a time series model to hourly, time-averaged, aggregated wind power data from New Zealand, Denmark and Germany, based on univariate, second-order autoregressive drivers. Our model is designed to reproduce the asymptotic distribution of power output, the diurnal variation and the volatility of power output over timescales up to several hours. For the cases examined here, it was also found to provide a generally good representation of the overall distribution of power output changes and the variation of volatility with power output level, as well as an acceptable representation of the distribution of calm periods. Copyright © 2011 John Wiley & Sons, Ltd.

1 INTRODUCTION

A great deal of research has been devoted to the development of time series models for the wind speed at a fixed location.1-10 These models, which are generally based on autoregressive (AR), autoregressive moving average, Markov chain or spectral methods, have been developed either to assist with short-term wind speed prediction or for use in power system simulations. We are concerned here with the latter application. Such models are hard to extrapolate to a large fleet because a single wind speed can only pertain to a single location, whereas a country-scale power system may contain hundreds of wind farms distributed over hundreds or thousands of kilometres. One solution would be to use a multivariate model such as a vector autoregressive one,11, 12 but these models can run into calibration difficulties because of the very large number of parameters that are required if many sites are to be represented. Klöckl and Papaefthymiou12 described this problem and some techniques that could be used to mitigate it.

A simpler approach is to model the wind power instead of the wind speeds, as Papaefthymiou and Klöckl13 did for a single location using a Markov chain Monte Carlo technique. This technique could be equally well applied to aggregated wind output for a large fleet, for which some long-term and transitional statistical properties have recently been analysed by Louie.14, 15

Other authors (e.g. Holttinen and Pedersen,16 Ummels et al.,17 Meibom et al.,18 Gibescu et al.,19 Wind Logics, Inc. and EnerNex Corporation20 and Van Hulle et al.21) have avoided the problem by using historic wind data in their simulations, instead of synthesized data. Historic data, if available, are generally preferable to synthetic time series because there is less concern about whether the data are sufficiently realistic, and correlations with other variables such as temperature are captured in a natural way. However, there are several advantages to having the capability to generate synthetic time series with statistical properties that have been matched to historic data:
  • Some knowledge of the wind statistics may be needed by the power system simulation tool. For example, if the commitment decisions for the thermal units are made in a stochastic setting or if there are dynamic reserve requirements to cater for a given worst-case percentile drop in wind output, the unit commitment algorithm will need to know the distribution of wind power changes over the following few hours.
  • If the quantity of available data is limited, only short simulations will be possible with the historic data, and there will be little indication of the simulation error. Monte Carlo simulations, based on time series models, can produce results with quantifiable standard errors, and furthermore, these errors can be reduced through variance reduction techniques such as importance sampling. The analyst should bear in mind, however, that a synthetic time series, although long, does not contain any more information than the historic time series to which it has been calibrated and cannot be expected to account for longer-term climatic variations. The alternative of using long historical data sets is becoming increasingly viable, thanks to the efforts of the meteorological community, such as ERA-40.22
  • By categorizing the wind resource in different regions in terms of parameters in a common model, we can build up a database that relates these parameters to regional properties such as overall wind conditions, region size, terrain type and wind farm density. This database can then be used to generate ‘best guess’ time series for a region for which there are no available data.
  • Real-world data sets invariably suffer from missing or erroneous data. Time series models offer a practical way to fill the gaps or to find candidate erroneous values.
  • The parameters of a model are much more compact than the historic data set to which they were calibrated, can be quickly set up to generate sample data and can be shared between researchers more easily.

We have developed a wind model for our single-bus power system operation simulations, for generating synthetic wind power samples and also for supplying statistical information to the stochastic scheduling algorithm. It is a low-order, univariate AR model that can be fitted to historic, aggregated, time-averaged wind power output data from a geographically diverse fleet of wind farms, without explicitly modelling the wind speeds. The objectives of the model are: to reproduce the same long-term (asymptotic) distribution of power outputs as the historic data, to reproduce the same diurnal variation of the average power output, and to reproduce the same average power output changes over timescales up to around 4 h, which is a typical start-up time assumed for combined cycle gas turbines and therefore represents a crucial lead time for scheduling thermal reserve plant.

The technical development of this paper is divided into three sections. In the next section, we develop the methodology behind our model. Section 3 introduces three data sets to which we have fitted our model, from New Zealand, Denmark and Germany. Results from the calibration are discussed in Section 4. These include an assessment of the fit to some transitional properties to which the model is not explicitly calibrated.

2 METHODOLOGY

Gaussian linear time series models23 are straightforward to fit to historic time series data and have well-understood asymptotic and transitional properties. However, they cannot be used directly to generate samples of wind power because real wind processes are non-Gaussian and have diurnal and seasonal components. Our approach is to use a Gaussian linear AR process as the underlying driver but transform it by adding on a periodic diurnal term and then, by using a memoryless non-linear function, transform the process into one with the required asymptotic distribution. Seasonal variations are accounted for by dividing the year into a small number of seasons and fitting separate models to each season.

The underlying driver of the model is an AR(p) process, which is driven by independent N(0,1) variables εk:
$$X_k = \sum_{n=1}^{p} \varphi_n X_{k-n} + \sigma \varepsilon_k \tag{1}$$

We scale the parameters so as to normalize the asymptotically stationary distribution of Xk to N(0,1). This allows us to fit the model in two stages. First, we find the transformation function and the diurnal term that reproduce the historic overall power output distribution and diurnal variation of the mean. Second, we find the AR parameters that provide the best fit to the short-term dynamics.
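As an illustration of the AR(p) driver, the following Python sketch simulates a process of the usual form X_k = Σ φ_n X_{k−n} + σ ε_k, which is what equation 1 is assumed to express. The function name `simulate_ar`, the burn-in length and the seed handling are illustrative choices and are not taken from the paper.

```python
import random

def simulate_ar(phi, sigma, n_steps, burn_in=1000, seed=0):
    """Simulate X_k = sum_n phi[n-1] * X_{k-n} + sigma * eps_k, eps_k ~ N(0, 1).

    The first `burn_in` values are discarded so that the returned samples are
    close to the asymptotic (stationary) regime. Hypothetical helper.
    """
    rng = random.Random(seed)
    p = len(phi)
    history = [0.0] * p                      # most recent value first
    out = []
    for step in range(burn_in + n_steps):
        x = sum(c * h for c, h in zip(phi, history)) + sigma * rng.gauss(0.0, 1.0)
        history = ([x] + history)[:p]        # shift the lag window
        if step >= burn_in:
            out.append(x)
    return out
```

With the parameters scaled so that the stationary variance is one (for an AR(1) driver, σ² = 1 − φ1²), the sample variance of a long simulated series should be close to one.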

The diurnal variation is introduced by an additive term μk with a period of NT, the number of time steps per day (typically 24 or 48):
$$X'_k = X_k + \mu_k \tag{2}$$
which means that the cumulative distribution function (CDF) of X′k, taken over all times of day, is
$$F(x) = \frac{1}{N_T} \sum_{i=0}^{N_T-1} N(x - \mu_i) \tag{3}$$
where N(x) is the standard cumulative Gaussian distribution.
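A minimal sketch of this CDF in Python, assuming that equation 3 is an equally weighted mixture of Gaussian CDFs shifted by the diurnal offsets μi (one per time step of the day). The name `shifted_cdf` is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian, N(0, 1)

def shifted_cdf(x, mu):
    """CDF of X' = X + mu_i with X ~ N(0, 1), pooled over all times of day i."""
    return sum(STD_NORMAL.cdf(x - m) for m in mu) / len(mu)
```

When all the offsets are zero, the function reduces to the standard Gaussian CDF N(x).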
Most wind speed model-fitting studies, such as that of Brown et al.,2 use a power transformation to convert the (approximately Weibull) wind speeds to an approximately Gaussian variable. This class of transformations is not suitable if, as here, the historic input takes the form of wind power rather than wind speed. The beta distribution has been suggested by Louie15 for aggregated wind output. An alternative, implemented here and in a study by Klöckl and Papaefthymiou,12 is to estimate the distribution non-parametrically by direct observation of the historic data. We first estimate the CDF of the historic power output values Pk using Parzen windowing and then choose a transformation function W(·) that is guaranteed to generate values with the same CDF as Pk given that its argument is distributed like X′k. We can achieve this by writing
$$W(x) = C^{-1}\bigl(F(x)\bigr) \tag{4}$$
where C(x) is the CDF of Pk and F(x) is the CDF of X′k given by equation 3.
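The construction can be sketched in Python as follows, assuming that the transformation maps x to the power level whose Parzen-windowed CDF value equals the CDF value of x in the diurnally shifted Gaussian domain. The kernel standard deviation of 0.02 p.u. is the value quoted in Section 4; the function names, the bisection-based inversion and the restriction of the search to the interval [0, 1] p.u. are assumptions made for this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian, N(0, 1)

def parzen_cdf(p, samples, bandwidth=0.02):
    """Parzen-window estimate of the CDF C(p), using a Gaussian kernel."""
    return sum(STD_NORMAL.cdf((p - s) / bandwidth) for s in samples) / len(samples)

def transformation_w(x, mu, samples, bandwidth=0.02, tol=1e-9):
    """W(x): the power level p in [0, 1] p.u. with C(p) = F(x).

    F is the pooled CDF of the shifted Gaussian variable X'. Limiting the
    bisection to [0, 1] keeps W bounded (an assumption of this sketch).
    """
    target = sum(STD_NORMAL.cdf(x - m) for m in mu) / len(mu)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if parzen_cdf(mid, samples, bandwidth) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because both CDFs are non-decreasing, the resulting W is non-decreasing in x, which is what allows it to be inverted later when the historic data are mapped back to the Gaussian domain.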
We stipulate a diurnal variation such that the expected power output at discrete time-of-day i (= k mod NT) is $\bar{P}_i$, which means that, since Xk ∼ N(0,1),
$$\bar{P}_i = \int_{-\infty}^{\infty} W(x + \mu_i)\,\phi(x)\,dx \tag{5}$$
where ϕ(x) is the standard Gaussian probability density function.

The choice of the transformation function W(·) and the vector (μi) can therefore give rise to any desired overall distribution of power output as well as the diurnal variation of its expectation. This is a more parsimonious formulation than that in Klöckl and Papaefthymiou,12 which requires separate transformation functions for each time of the day but is able to represent the diurnal variation of the distribution more fully. The determination of W(·) and μi from C(·) and ($\bar{P}_i$) requires an iterative method because equations 3-5 represent a set of coupled non-linear equations. Details of a simple solution method are given below.

First, we note that the equations are overdetermined because an arbitrary constant can be added to all the μi without affecting the solution. If ($\bar{P}_i$) is inconsistent with C(·), then there will be no solution, whereas if they are consistent, then there are infinitely many solutions, and we can arbitrarily add the following constraint:
$$\sum_{i=0}^{N_T-1} \mu_i = 0 \tag{6}$$
This constraint allows a solution using the following simple algorithm that converges after a few iterations:
  • Start with an initial guess μi = 0 for all times-of-day i.
  • Calculate the transformation function W(·) from equations 3 and 4 (we represented it with a piecewise linear approximation containing a few hundred segments).
  • For each i, use a binary search on μi to satisfy equation 5 without updating W.
  • Subtract the average value of μi from each element of μi, in order to satisfy equation 6.
  • If the solution has converged, stop; otherwise, go to the second step.
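The steps above can be sketched in Python as follows. This is an illustration under several simplifying assumptions, not the authors' implementation: the inverse CDF is taken from the sorted samples directly (rather than from a Parzen-windowed estimate), W is stored on a fixed grid of 240 linear segments, the Gaussian expectation is evaluated by a simple weighted sum on the same grid, and all function and variable names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist

STD = NormalDist()                                   # standard Gaussian
GRID = [-6.0 + 0.05 * j for j in range(241)]         # x grid for W and quadrature
_PDF = [STD.pdf(z) for z in GRID]
_TOTAL = sum(_PDF)
WEIGHTS = [v / _TOTAL for v in _PDF]                 # normalized Gaussian weights

def interp(x, xs, ys):
    """Piecewise-linear interpolation, held constant outside [xs[0], xs[-1]]."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x) - 1
    t = (x - xs[j]) / (xs[j + 1] - xs[j])
    return ys[j] + t * (ys[j + 1] - ys[j])

def expected_power(shift, w_vals):
    """Approximate E[W(X + shift)] for X ~ N(0, 1)."""
    return sum(w * interp(z + shift, GRID, w_vals) for z, w in zip(GRID, WEIGHTS))

def fit_w_and_mu(samples, p_bar, max_iter=50, tol=1e-6):
    """Return (W on GRID, mu) matching the sample distribution and diurnal means."""
    ordered = sorted(samples)
    n = len(ordered)
    probs = [(j + 0.5) / n for j in range(n)]        # CDF levels of sorted samples
    mu = [0.0] * len(p_bar)                          # step 1: initial guess
    for _ in range(max_iter):
        # Step 2: W(x) = C^{-1}(F(x)), F being the pooled CDF of X + mu_i.
        w_vals = [interp(sum(STD.cdf(x - m) for m in mu) / len(mu), probs, ordered)
                  for x in GRID]
        # Step 3: bisection on each mu_i with W held fixed.
        new_mu = []
        for target in p_bar:
            lo, hi = -5.0, 5.0
            for _step in range(40):
                mid = 0.5 * (lo + hi)
                if expected_power(mid, w_vals) < target:
                    lo = mid
                else:
                    hi = mid
            new_mu.append(0.5 * (lo + hi))
        # Step 4: subtract the average so that the mu_i sum to zero.
        mean_mu = sum(new_mu) / len(new_mu)
        new_mu = [m - mean_mu for m in new_mu]
        # Step 5: stop once the mu_i no longer move.
        change = max(abs(a - b) for a, b in zip(new_mu, mu))
        mu = new_mu
        if change < tol:
            break
    return w_vals, mu
```

The target means passed as `p_bar` must be consistent with the samples, in the sense that their average equals the sample mean; otherwise no exact solution exists, as noted above.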
Having thus determined the function W(·) and the vector (μi) to satisfy the required long-term statistics of the power output, our next task is to reproduce the desired transitional statistics. This is achieved by manipulating the parameters of the AR model [equation 1] while maintaining the requirement that its asymptotic distribution is N(0,1). One way to fit the data is to transform the historic power output into values of X using
$$X_k = W^{-1}(P_k) - \mu_k \tag{7}$$
and to obtain the AR parameters (φn) using standard fitting methods such as the Yule–Walker equations,23 which in the case of an AR(1) model reduce to
$$\varphi_1 = \frac{E[X_k X_{k-1}]}{E[X_k^2]} \tag{8}$$
If X follows an AR(1) process, then its asymptotic variance is
$$\mathrm{Var}(X_k) = \frac{\sigma^2}{1 - \varphi_1^2} \tag{9}$$
and since we have designed μi and W(X′) to produce the correct long-term statistics when Xk ∼ N(0,1), it follows that
$$\sigma = \sqrt{1 - \varphi_1^2} \tag{10}$$
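A short Python sketch of the AR(1) fit, assuming that the X values have zero mean and that the AR(1) Yule–Walker estimate is the ratio of the lag-1 product sum to the lag-0 product sum. The function name `fit_ar1` is hypothetical.

```python
import math

def fit_ar1(x):
    """Return (phi1, sigma) for a zero-mean series x.

    phi1 is the lag-1 Yule-Walker estimate; sigma is then fixed by the
    requirement that the stationary variance of the driver equals one.
    """
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x)
    phi1 = num / den
    return phi1, math.sqrt(1.0 - phi1 ** 2)
```

As a consistency check against Table 1, φ1 = 0.9861 (New Zealand, winter) gives σ = √(1 − 0.9861²) ≈ 0.166, matching the tabulated value.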
If X follows an AR(2) process
$$X_k = \varphi_1 X_{k-1} + \varphi_2 X_{k-2} + \sigma \varepsilon_k \tag{11}$$
then the Yule–Walker equations can be used to derive φ1 and φ2 as the solution to the linear system
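$$\begin{pmatrix} c_0 & c_1 \\ c_1 & c_0 \end{pmatrix} \begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \tag{12}$$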
where ci = E[XkXk−i], and σ then follows from the formula derived by Box and Jenkins23
$$\mathrm{Var}(X_k) = \frac{(1 - \varphi_2)\,\sigma^2}{(1 + \varphi_2)\left[(1 - \varphi_2)^2 - \varphi_1^2\right]} \tag{13}$$
and the requirement that Var(Xk) = 1, as before.
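The AR(2) case can be sketched in Python as below, assuming the standard two-equation Yule–Walker system in the autocovariances c0, c1 and c2 and the standard Box and Jenkins expression for the stationary variance of an AR(2) process. The function names are hypothetical.

```python
import math

def yule_walker_ar2(c0, c1, c2):
    """Solve the 2x2 Yule-Walker system for (phi1, phi2) by Cramer's rule."""
    det = c0 * c0 - c1 * c1
    phi1 = c1 * (c0 - c2) / det
    phi2 = (c0 * c2 - c1 * c1) / det
    return phi1, phi2

def sigma_for_unit_variance(phi1, phi2):
    """Noise scale sigma that gives an AR(2) process a stationary variance of one."""
    return math.sqrt((1.0 + phi2) * ((1.0 - phi2) ** 2 - phi1 ** 2) / (1.0 - phi2))
```

The σ values in Table 1 are consistent with this normalization: for example, φ1 = 1.6854 and φ2 = −0.7001 (Denmark, spring) give σ ≈ 0.094.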

The appropriate order of the AR model can only be determined by careful analysis of the underlying data on a case-by-case basis. If there is no trending in the data (i.e. no significant autocorrelation between successive increments in the X domain), then a first-order model will suffice. However, any smoothing effect from the aggregation of large numbers of wind farms will manifest itself in trending behaviour, so that a second-order (or higher) model may be necessary. (The smoothing effect will be further enhanced if the time series data consist of time-averaged values, as in the case studies documented here.) The Akaike information criterion, autocorrelation function and partial autocorrelation function (PACF) can be used to help the analyst choose appropriate model orders to provide the best fit to the X domain values. We used PACFs for each data set to help determine appropriate model orders and confirmed our choice by observing the goodness of fit of the volatility in the P domain in each case over a range of time horizons.
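For illustration, the PACF can be computed from the sample autocorrelations with the Durbin–Levinson recursion. The Python sketch below is one standard way of doing this and is not necessarily the procedure used in this study; the function names are hypothetical.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations rho_0..rho_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[k + h] * dev[k] for k in range(n - h)) / c0
            for h in range(max_lag + 1)]

def pacf_from_acf(rho):
    """Durbin-Levinson recursion; rho[0] must be 1. Returns PACF at lags 1, 2, ..."""
    pacf = []
    prev = []                                    # coefficients of the order k-1 fit
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf
```

For an AR(p) process the theoretical PACF vanishes beyond lag p, so a PACF that is negligible after lag 1 points to a first-order model, whereas a significant lag-2 value points to at least a second-order one.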

3 CASE STUDIES

We tested our model by calibrating it to historic wind power data from New Zealand, Denmark and Germany, as described below.
  • New Zealand

    We used the same wind power data that were used to generate the scenarios in a study by Strbac et al.24 These data were generated from 2 years (2005 and 2006) of wind speed time series at 15 sites located throughout New Zealand, the northernmost and southernmost sites being of the order of 2000 km apart. Wind speeds were converted to wind power using representative wind turbine characteristics and compensating for topological and array effects, outages, hysteresis and local losses. The result of this adjustment was that the maximum output at each site was 93.9% of nameplate capacity. Some of the sites already had wind farms in operation, and the wind power data were validated by comparison with real turbine output data at those sites. The original study24 used varying wind farm capacities for the different sites. In the present study, we used the same wind farm capacity for each site, expressing the aggregate output in per unit of the aggregate nameplate capacity. Most of the data were available at a 10 min resolution, but we averaged the values in groups of three to obtain half-hourly time-averaged data. The average capacity factor of the aggregated time series was 39.3%.

  • Denmark

    A record of the hourly time-averaged power output from the hundreds of wind farms in Western and Eastern Denmark is available from the website of Energinet.dk, the Danish transmission system operator (TSO). We used aggregated data from the two regions from 16 September 2003 to 15 September 2009, during which time the total wind capacity of the whole of Denmark remained roughly constant. The megawatt power output values were converted to per-unit values by dividing each value by a nominal capacity of 3080 MW, which was chosen such that the maximum power output for each region was 0.95 p.u., allowing for losses of 0.05 p.u., on very windy days. This led to an average capacity factor of 24.5% over the period.

  • Germany

    The four TSOs in Germany maintain records of quarter-hourly time-averaged wind power output, representing the wind farms in their respective regions, and make these data available on their websites. We derived the hourly time-averaged values for the years 2006–2008 for the regions under the jurisdiction of Vattenfall Europe Transmission and Transpower (formerly E.On Netz), which between them contain the majority of the installed wind capacity in Germany. During this time, there was a substantial expansion of the wind fleet in Germany. We estimated per-unit values by dividing each megawatt value by a nominal capacity for each region that increased in a piecewise linear fashion over time, such that the maximum output in any year was 0.95 p.u. This led to an average capacity factor of 21.7%. The data for 4 November 2006 contained major disruption in the wind feed-in because of the large-scale system disturbance that occurred that evening25; the data for the affected hours were smoothed out.

These three cases provided contrasting challenges for the model: a small number of very dispersed sites with a high capacity factor and half-hourly resolution (New Zealand), a large number of sites in a much smaller region and hourly resolution (Denmark), and a very large installation with hourly resolution (Germany).

All cases show significant seasonal and diurnal periodicity, indicating that a multiseasonal model with diurnal adjustment is necessary (Figure 1). For the Danish and German cases, capacity factors were considerably higher in winter than in summer. In the New Zealand case, there was no significant difference in the average power output, but there was significantly more diurnal variation in summer than in winter. We fitted separate models to four seasons in the Danish and German cases: spring (March–May), summer (June–August), autumn (September–November) and winter (December–February). For the New Zealand case, we used just two seasons: summer (October–March) and winter (April–September).

Figure 1. Seasonal and diurnal variations of average power output for New Zealand, Denmark and Germany.

A model should be calibrated and validated using separate sections of the data, if it is to have credible predictive ability. This was not practical for the New Zealand case because of the rather short available data set. For the Danish and German cases, we calibrated the models to the first two-thirds of the data only, so that the last third could be used as an independent validation data set. However, in all cases, we present histograms for each sample year in order to give an idea of the likely degree of interannual variability of their respective wind regimes.

4 MODELLING PRACTICALITIES AND RESULTS

The target power output CDFs C(P) were estimated by Parzen windowing, using a Gaussian kernel with a standard deviation of 0.02 p.u. in each case. The target average outputs ($\bar{P}_i$), representing the expected power output for each time step during the day, were also measured from the data (Figure 1). The values were slightly noisy, so we smoothed them by replacing each element of ($\bar{P}_i$) with a moving average of itself, the previous value and the following one. This prevented the model from generating partially predictable jumps at the same time each day. We then used the iterative technique described in Section 2 to find the transformation function W(·) and the vector (μi) that would provide an exact match to the measured long-term distribution and diurnal variation. Figure 2 shows two examples of the transformation function. Its sigmoid shape ensures that the near-Gaussian variable X′ is transformed into one with the correct power output distribution, that is bounded between zero and the maximum allowable value.
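The three-point smoothing of the diurnal means can be sketched in Python as follows. Treating the day as periodic, so that the first and last time steps are neighbours, is an assumption of this sketch, and the function name is hypothetical.

```python
def smooth_diurnal(p_bar):
    """Replace each diurnal mean by the average of itself and its two neighbours.

    The vector is treated as circular: the last time step of the day is the
    predecessor of the first (an assumption, since the means are periodic).
    """
    n = len(p_bar)
    return [(p_bar[i - 1] + p_bar[i] + p_bar[(i + 1) % n]) / 3.0 for i in range(n)]
```

With circular wrap-around, the smoothing leaves the daily average of the means unchanged, so the smoothed targets stay consistent with the overall power output distribution.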

Figure 2. Transformation function W(X′) fitted to New Zealand and Denmark winter data. The steeper Danish curve is a reflection of the smaller geographical area covered by the Danish wind farms, which means that the aggregated output is more likely to be very high or very low than that in the New Zealand case. The short-term volatility is a function of both the steepness of the curve and the parameters of the underlying AR driver.

Next, we transformed the historic power output data (Pk) to normalized values (Xk) and calculated the PACFs to inform the choice of model order. Figure 3 shows the first 12 elements of the PACFs for the winter data sets of each region. It suggests that, although a first-order model is adequate for the New Zealand data, second-order ones are necessary in the case of Denmark and Germany. This was confirmed by fitting AR models of various orders and examining the fit of volatilities of the transformed (power output) process for horizons up to 8 h. Table 1 shows the fitted AR parameters. When fitting AR models with the Yule–Walker equations, we excluded values of P corresponding to flat regions of the W curve (Figure 2), as X is undefined in these regions. All cases show the parameters calculated from the Yule–Walker equations, with the exception of the New Zealand summer data for which the volatility parameter σ was lowered slightly (and φ1 increased) in order to improve the fit to the dynamics of the transformed (power output) process. The fact that some manual adjustment was required demonstrates that the best fit in the X domain does not necessarily lead to the best fit in the P domain because of the non-linear transformation that links the two domains. For example, portions of the time series corresponding to regions of the W curve that are nearly flat (e.g. where the power output is close to zero) may give rise to residuals with different properties from those that arise from other parts of the curve. However, failure to model these differences will not necessarily compromise the performance of the model in the P domain, since large changes in X only cause small changes in P in these regions.

Figure 3. Partial autocorrelation functions.

Table 1. Fitted AR parameters.
Case         Season   φ1       φ2        σ
New Zealand  Summer   0.9879    0        0.155
New Zealand  Winter   0.9861    0        0.166
Denmark      Spring   1.6854   −0.7001   0.094
Denmark      Summer   1.6144   −0.6296   0.106
Denmark      Autumn   1.6848   −0.6990   0.092
Denmark      Winter   1.7517   −0.7640   0.076
Germany      Spring   1.6611   −0.6797   0.109
Germany      Summer   1.5380   −0.5582   0.133
Germany      Autumn   1.6330   −0.6455   0.094
Germany      Winter   1.6341   −0.6416   0.073

Two 10 day samples of the time series are shown in Figure 4, for historic and simulated cases in New Zealand and Denmark. The much noisier signal for the New Zealand case, with far fewer sites, can be easily identified. The diurnal variations for the Danish summer case, peaking just after midday, are clearly visible in both historic and simulated cases. The simulated power output samples are visually very similar to the historic ones, but we tested the goodness of fit of the model in a more quantitative fashion, by comparing the dynamics of the power output time series for the historic and simulated cases. It should be noted that, although the model has enough degrees of freedom to allow an exact calibration to the long-term distribution and diurnal variation, there is only one free parameter in an AR(1) model and two in an AR(2) model, to fit all the desired transitional properties. The following sections examine the performance of the model with regard to the power output distribution, the volatility over a range of time horizons, the occurrence of large changes, the relationship of expected power output change with the power output level and the occurrence of periods of low output (calms).

Figure 4. Ten-day samples of hourly averaged time series comparing historic data with simulation for (a) New Zealand and (b) Denmark. The smoother appearance of the Danish data is due to the much greater number of sites.

4.1 Power output distribution

The model is explicitly fitted to the long-term (asymptotic) power output distribution of the calibration data set. In this section, we examine its performance with regard to each year of historic data from both calibration and validation data sets.

Figure 5(a)(i),(b)(i),(c)(i) shows power output histograms for each year of the New Zealand, Denmark and Germany data sets, respectively. The diamond and square markers show calibration and validation data, respectively. It can be seen that there is a fair amount of interannual variation although the overall shapes are broadly similar in all years.

Figure 5. (i) Power output distributions and (ii) power output volatility curves for historic data (markers) and simulations (lines). Separate values for each year of historic data are shown. Diamond markers show calibration data, and square markers show validation data. The upper (blue) lines and lower (pink) lines show 97.5th and 2.5th percentile values, respectively, from many years of simulation. The black lines show mean values from the simulations.

Some interannual variability is also seen in simulation despite the model's fixed parameters because different samples of the random number drivers are used for different simulation years. The black line shows average values from 1000 annual simulations, whereas the upper (blue) and lower (pink) curves show, for each histogram bucket, the 97.5th and 2.5th percentile values from the simulations, respectively. Hence, if the model correctly represents the interannual variability, we expect 95% of historic data to lie within these curves. It can be seen that both calibration and validation data sets are generally within the simulated range, indicating that the interannual variation in simulation is generally of the correct order of magnitude, although the quantity of validation data is insufficient to assess this precisely. Two possible exceptions are the rather calm year 3 (green diamonds) and the rather windy year 4 (purple diamonds) of the Danish calibration data; in both cases, simulation produces such extreme capacity factors for 1 year in every 15 years or so.
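The percentile bands can be computed along the following lines. This Python sketch assumes that each simulated year has already been reduced to a list of histogram bucket frequencies; the linear interpolation rule for the percentiles and all function names are illustrative assumptions.

```python
def percentile(values, q):
    """Linearly interpolated percentile of values, for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

def simulation_band(histograms):
    """Per bucket: (2.5th percentile, mean, 97.5th percentile) over simulated years.

    histograms holds one list of bucket frequencies per simulated year.
    """
    n_years = len(histograms)
    return [(percentile(bucket, 2.5), sum(bucket) / n_years, percentile(bucket, 97.5))
            for bucket in zip(*histograms)]

def fraction_inside(historic, band):
    """Share of the buckets of one historic year that fall inside the band."""
    inside = sum(1 for v, (lo, _mean, hi) in zip(historic, band) if lo <= v <= hi)
    return inside / len(historic)
```

A value of `fraction_inside` well below 0.95 for many historic years would suggest that the simulated interannual variability is too narrow.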

4.2 Power output volatility

An important statistic for a wind power process is its volatility over time horizons up to a few hours. Combined cycle gas turbines, which represent the bulk of the dispatchable capacity in many power systems, have start-up times of the order of 4 h, so that the amount of available spinning reserve cannot be adjusted at shorter lead times than this. The degree of wind uncertainty over 4 h or so is therefore a key determinant of operating reserve requirements in many countries. Hence, it is important that the model captures the wind dynamics on these timescales if it is to be applied to systems with thermal reserves.

Figure 5(a)(ii),(b)(ii),(c)(ii) shows how the mean absolute change in power output increases with time horizon up to 24 h, for each year in the historic data sets for New Zealand, Denmark and Germany, respectively. As with the power output distributions, some interannual variability is present in both historic and simulated data, and the range of variability in simulation is represented by the 2.5th and 97.5th percentile curves as well as the mean level. For the New Zealand case, the fit is good for the calibration range (up to 8 h), after which the model diverges, slightly underestimating the magnitude of power output changes seen in the historic data. This can be attributed to the lack of a trending capability in the first-order model. However, if the time horizon increases further, the historic and simulated volatilities converge again (not shown). This is because the wind power values at two sufficiently distant points in time are independent, and the expected difference between the two values is therefore the expected difference between the two independent draws from the underlying distribution. Since the model reproduces the historic long-term distribution exactly, it is guaranteed to reproduce the historic volatility as measured over a sufficiently long time horizon. For the New Zealand model, convergence occurs at timescales of around 48 h.
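The volatility statistic and its long-horizon limit can be sketched in Python as follows. The horizon is expressed in time steps of the series, the limit is evaluated from the empirical distribution of the same series, and the function names are hypothetical.

```python
def mean_abs_change(p, horizon):
    """Mean absolute change in power output over `horizon` time steps."""
    diffs = [abs(p[k + horizon] - p[k]) for k in range(len(p) - horizon)]
    return sum(diffs) / len(diffs)

def long_horizon_limit(p):
    """E|A - B| for independent draws A and B from the empirical distribution of p.

    Uses the sorted-sample identity
    sum_{i,j} |x_i - x_j| = 2 * sum_j (2j - n + 1) * x_(j), with j = 0..n-1.
    """
    ordered = sorted(p)
    n = len(ordered)
    return 2.0 * sum((2 * j - n + 1) * v for j, v in enumerate(ordered)) / (n * n)
```

Plotting `mean_abs_change` against the horizon gives a volatility curve of the kind described here, which should approach `long_horizon_limit` once the horizon exceeds the decorrelation time of the process.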

For the Danish and German cases, the second-order fit, as found by the Yule–Walker equations, gives somewhat high volatilities at all timescales, with the Danish year 3 data (green diamonds) showing slightly lower volatility than the 2.5th percentile level from simulation. Nevertheless, the validation data (square markers) are well within the simulated range.

4.3 Distribution of changes in power output

As well as reproducing the expected power output changes over a range of timescales, the model must also reproduce the correct distribution of these changes. Of particular concern is the incidence of sudden large changes, particularly downward ones, because a system that is operated with inadequate reserves is vulnerable to involuntary load shedding if the wind feed-in drops too quickly. We looked at the performance of the model with regard to the distribution of power output changes over 1 and 4 h. Figure 6 shows histograms of absolute power output changes over these horizons for each region, for each historic sample year and for the range of simulated values. Occurrence rates of the full range of power output changes, and their interannual variability, are generally well represented by the model, although there is a slight tendency to under-represent the occurrence of the most extreme changes (occurring up to a few times per year) in the Danish case. The outlier in the Danish 4 h 0.60–0.65 bucket is due to the violent windstorm that struck Denmark on 8 January 2005, causing the majority of the Danish wind fleet to cut out over a short period. The lack of an explicit representation of turbine cut-out prevents the model from reproducing these rare ramping phenomena.
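A histogram of absolute power output changes can be built as in the Python sketch below. The bucket width of 0.05 p.u. is inferred from the 0.60–0.65 bucket mentioned above; the function name and the handling of the top bucket are assumptions.

```python
def change_histogram(p, horizon, width=0.05):
    """Count absolute changes |P[k + horizon] - P[k]| in buckets of `width` p.u.

    Changes of 1 p.u. or more, if any, are placed in the last bucket.
    """
    n_buckets = int(round(1.0 / width))
    counts = [0] * n_buckets
    for k in range(len(p) - horizon):
        change = abs(p[k + horizon] - p[k])
        counts[min(int(change / width), n_buckets - 1)] += 1
    return counts
```

For hourly data, `horizon=1` and `horizon=4` correspond to the 1 h and 4 h horizons; applying the function to each historic year and to each simulated year gives the occurrence rates that are compared in the histograms.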

Figure 6. Distributions of absolute power output changes over (i) 1 h and (ii) 4 h horizons for historic data (markers) and simulations (lines). Separate values for each year of historic data are shown. Diamond markers show calibration data, and square markers show validation data. The upper (blue) lines and lower (pink) lines show 97.5th and 2.5th percentile values, respectively, from many years of simulation. The black lines show mean values from the simulations. An absence of a marker or a line segment over a particular bucket indicates no occurrences in that bucket.

The log-linear appearance of some of the power output change histograms is indicative of a Laplace distribution, as has been noted by several other authors15, 26 over a range of timescales and for regions of various sizes. The New Zealand data, which represent a 2000 km region, seem to be somewhat sub-Laplace, which should be expected for very large regions because of the central limit theorem, as noted by Louie.14

4.4 Relationship between power output level and power output changes

The previous sections demonstrate that our aggregated wind fleet model reproduces the long-term distribution, the diurnal variation and the distribution of power output changes over timescales up to a few hours, for the three cases examined in this work. In this section and the next, we examine some other transitional statistics to which our model is not explicitly calibrated. If the model represents the physics of the system at some level, then we might hope that these statistics could nevertheless be captured, at least qualitatively.

The volatility of power output for a single wind turbine is a function of the power output level itself. This is partly due to the lower wind speed volatility at low wind speed levels, but more importantly, the power output volatility is affected by the non-linear nature of the wind turbine power curve, which maps wind speed to wind power. At very low wind speeds, the power output is zero, so that the power output volatility is also zero. At high wind speeds between about 12 and 25 m s−1, when the turbine curve has reached saturation, the power output volatility is also theoretically zero. At moderate wind speeds, small changes in wind speed can have a large effect on the power output, so that the volatility is high. At very high wind speeds (typically around 25 m s−1), the turbine cuts out to prevent damage, resulting in near-instantaneous jumps between maximum output and zero and extreme volatility of power output.
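The dependence of power volatility on the slope of the power curve can be summarized by a first-order approximation. This is a sketch under stated assumptions rather than a result from the paper: P(v) denotes an idealized single-turbine power curve, σ_v and σ_P the short-term standard deviations of wind speed and power, and v_ci, v_r and v_co hypothetical cut-in, rated and cut-out speeds (the text suggests roughly 12 and 25 m s−1 for the latter two).

```latex
\sigma_P \;\approx\; \left|\frac{\mathrm{d}P}{\mathrm{d}v}\right| \sigma_v ,
\qquad
\frac{\mathrm{d}P}{\mathrm{d}v} = 0
\quad \text{for } v < v_{\mathrm{ci}} \ \text{ and for } \ v_{\mathrm{r}} < v < v_{\mathrm{co}}
```

The approximation gives zero volatility on the flat parts of the curve and the highest volatility where the curve is steepest. It breaks down at v_co, where the curve is discontinuous and a small change in wind speed produces a jump between full and zero output.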

A similar behaviour is to be expected when many turbines are aggregated because of the correlation between wind speeds at the different sites, although the volatility will be reduced by aggregation, particularly at very high outputs because not all turbines will cut out at the same time if they are distributed over a wide area.

It may be important to reproduce this behaviour in a power system simulation because the amount of thermal plant online, and hence the maximum amount of available spinning reserve, will decrease as the wind power production increases. We expect our model to reproduce some of the behaviour. In a sense, the diurnally shifted AR variable, X, is a measure of the overall ‘windiness’ of the region, and the transformation function W(·) (Figure 2) is a ‘power curve’ for the aggregated wind fleet. Unlike a real turbine power curve, however, our aggregated power curve does not have an explicit cut-out, so that we might expect the model to underpredict the volatility at very high power outputs.

Figure 7 shows the mean absolute change over a 60 min time horizon for the three cases. In New Zealand [Figure 7(a)], the small number of highly dispersed sites leads to a high overall volatility at high wind levels as a result of turbine cut-out; as there are only 15 sites, cut-out at a single site would lead to an aggregate power output drop of nearly 0.07 p.u. As expected, the model fails to reproduce this and exhibits considerably lower volatility than the historic data at high power levels. However, for the Danish [Figure 7(b)] and German [Figure 7(c)] cases, which represent hundreds of sites, the model provides a good match to the historic properties, with peak volatility occurring at around 0.60 p.u., although the model volatility is generally higher than the historic data in the German case.

Figure 7. Variation of 60 min mean absolute change in aggregate power output with power output level. Changes are shown between every second value of the time series for New Zealand (a) and between consecutive values for Denmark (b) and Germany (c).
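A statistic of this kind can be computed by grouping each change according to the power output level from which it starts. The sketch below is illustrative only: the function name, the 0.05 p.u. bucket width and the choice to condition on the level at the start of the interval are assumptions, not details given in the paper. The `lag` argument would be 2 for the half-hourly New Zealand series and 1 for the hourly Danish and German series.

```python
import numpy as np

def mean_abs_change_by_level(power, lag=1, bucket=0.05):
    """Mean absolute change over `lag` steps, grouped by starting power level.

    Returns (means, counts); means[i] is NaN for buckets with no samples.
    Assumes power lies in [0, 1] p.u.
    """
    power = np.asarray(power, dtype=float)
    start = power[:-lag]
    abs_change = np.abs(power[lag:] - start)
    n_buckets = int(round(1.0 / bucket))
    # Bucket index of the starting level; a level of exactly 1.0 joins the top bucket.
    idx = np.minimum((start / bucket).astype(int), n_buckets - 1)
    sums = np.bincount(idx, weights=abs_change, minlength=n_buckets)
    counts = np.bincount(idx, minlength=n_buckets)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = np.where(counts > 0, sums / counts, np.nan)
    return means, counts

# Made-up series for illustration only.
means, counts = mean_abs_change_by_level([0.12, 0.22, 0.13, 0.62, 0.52], lag=1)
```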

4.5 Incidence of calm periods

Power systems with a very high wind penetration will have to rely on mechanisms to shift demand from calm (or high load) periods to windy (or low load) periods, either by using bulk storage (such as pumped hydro) or through demand-side measures. Since the amount of shiftable energy is necessarily limited, such systems will be vulnerable to extended calm periods with very low wind feed-in. By fitting our model to the long-term wind power distribution, we can be sure that it will reproduce the overall proportion of time spent in any given power level range, but that does not guarantee that it will reproduce the incidence of continuous periods within that range. The incidence of very long calms will be affected by the dynamics of anticyclones over hours or days, beyond the time horizons to which the model is fitted. This will be largely irrelevant from the point of view of simulating short-term operation of wind-thermal systems, which is the target application for the model. However, it is a useful exercise to examine the fidelity of the model with respect to these long-term phenomena, in order to judge whether its application can be extended to storage-integrated systems where the reproduction of longer-term dynamics will become more relevant.

We tested the model's ability to reproduce the incidence of calm periods by recording, for every occasion when the aggregate power output dropped below a given threshold, the number of time steps it took for the power output to rise back above the threshold. The histograms of these periods were then compared for historic and simulated data. The thresholds used were 0.05 p.u. for Denmark and Germany and 0.10 p.u. for New Zealand (because the proportion of time spent below 0.05 p.u. in New Zealand was very low). Simulations were performed over 1000 year periods for each region and compared with the entire historic data sets (calibration plus validation). The results are shown in Table 2.
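The run-length procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, calms that touch the start or end of the record are treated as complete (an assumption), and the bin upper bounds are taken from the period column of Table 2.

```python
import numpy as np

# Upper bounds (hours) of the calm-duration bins used in Table 2.
UPPER_BOUNDS_H = [1, 2, 4, 8, 12, 24, 48, 96, 168, 336]

def calm_durations(power, threshold):
    """Length in time steps of every run continuously below `threshold`."""
    below = np.asarray(power, dtype=float) < threshold
    # Pad with False so that runs touching either end of the record are closed.
    padded = np.concatenate(([False], below, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return ends - starts

def annual_calm_rates(durations_steps, step_hours, n_years):
    """Occurrences per year in each duration bin; calms over 336 h are dropped."""
    hours = np.asarray(durations_steps) * step_hours
    # Index of the first bin whose upper bound is >= the calm duration.
    idx = np.searchsorted(UPPER_BOUNDS_H, hours, side="left")
    counts = np.bincount(idx, minlength=len(UPPER_BOUNDS_H) + 1)
    return counts[:len(UPPER_BOUNDS_H)] / n_years

# Made-up hourly series for illustration only.
series = [0.2, 0.04, 0.03, 0.2, 0.01, 0.3, 0.02, 0.02, 0.02]
durations = calm_durations(series, threshold=0.05)
rates = annual_calm_rates(durations, step_hours=1.0, n_years=1.0)
```

Applying the same two functions to the historic record and to a long simulated series gives the pairs of columns compared in the table.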

Table 2. Mean annual occurrence rates of calms: periods during which the time-averaged power output series is continuously below a low threshold.
(a) New Zealand: threshold = 0.10 p.u.  (b) Denmark: threshold = 0.05 p.u.      (c) Germany: threshold = 0.05 p.u.
Period      Occurrence/year             Period      Occurrence/year             Period      Occurrence/year
(h)         Historic      Simulation    (h)         Historic      Simulation    (h)         Historic      Simulation
0.5–1       63.0          56.7          1           14.5          11.3          1           18.0          15.8
1.5–2       24.0          17.0          2           14.5          9.0           2           15.3          11.2
2.5–4       14.5          14.9          3–4         18.5          15.6          3–4         23.3          18.8
4.5–8       16.5          12.0          5–8         25.2          24.3          5–8         18.3          25.4
8.5–12      9.0           5.4           9–12        13.8          17.8          9–12        16.3          15.5
12.5–24     6.0           6.0           13–24       24.7          26.6          13–24       30.0          23.3
24.5–48     0             2.1           25–48       15.5          13.7          25–48       11.3          13.8
48.5–96     0             0.40          49–96       3.5           3.0           49–96       3.0           4.4
97.5–168    0             0.02          97–168      0.17          0.13          97–168      0.67          0.44
169–336     0             0             169–336     0.17          0.001         169–336     0             0.0

In all regions, the model underpredicts the frequency of calms of 2 h or below, ranging from a 15% underprediction for New Zealand to a 30% underprediction for Denmark. However, it should be noted that such short calm periods occur for only a few tens of hours per year and are unlikely to be material in a power system simulation. There is generally good agreement in the distribution of calms of between 2 h and 7 days. In the case of Denmark, the frequency of very long calms is underpredicted by the model. The 6 years of Danish data contained one 243 h (10 day) calm, beginning on 1 May 2008, whereas the longest calm in a 1000 year simulation was only 179 h (7 days). The materiality of this inaccuracy will depend on the nature of the system being analysed. For example, it would result in an underestimate of the loss-of-load probability for a power system that relies on a 7 day storage system to back up the intermittency of the wind fleet.
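The quoted underprediction figures can be checked directly from the first two rows of each panel of Table 2, which together cover calms of 2 h or below. The short calculation below is only an arithmetic check; the variable and function names are illustrative.

```python
# Occurrences per year of calms of 2 h or below (first two rows of Table 2).
SHORT_CALMS = {
    "New Zealand": {"historic": [63.0, 24.0], "simulated": [56.7, 17.0]},
    "Denmark": {"historic": [14.5, 14.5], "simulated": [11.3, 9.0]},
    "Germany": {"historic": [18.0, 15.3], "simulated": [15.8, 11.2]},
}

def underprediction(historic, simulated):
    """Fractional shortfall of the simulated rate relative to the historic rate."""
    return 1.0 - sum(simulated) / sum(historic)

shortfalls = {region: underprediction(**rates) for region, rates in SHORT_CALMS.items()}
for region, shortfall in shortfalls.items():
    print(f"{region}: {100 * shortfall:.0f}% underprediction")
# New Zealand: 15% underprediction
# Denmark: 30% underprediction
# Germany: 19% underprediction
```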

5 CONCLUSIONS

This paper has shown that a time series of hourly or half-hourly, time-averaged power outputs for large-scale wind fleets can be readily modelled using a low-order AR model, in conjunction with a diurnal adjustment and non-linear transformation function. The long-term distribution of the power output, as well as the diurnal variation of mean power output, can be represented exactly by using a straightforward calibration procedure, and the interannual variability of the distribution in simulated data sets is of the same order of magnitude as for the historic ones analysed in this work.

Another important statistic, when integrating wind with thermal plant that is constrained by start-up times or ramping limits, is the behaviour of power output changes up to a few hours. A first-order model (New Zealand) or a second-order one (Denmark, Germany) can capture the average change over time horizons up to around 8 h. The need for a second-order model implies that a Markov chain representation of an aggregated wind fleet would require two state variables, i.e. the current wind power and the wind power at the previous hour, so that a large transition matrix would be required to cover the entire state space with a fine granularity.
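The size of the Markov chain implied by a second-order model can be quantified with a short worked example. Suppose, purely for illustration, that the power output is discretized into N levels (N is not a quantity defined in this paper). The state is then the pair of current and previous levels, so that:

```latex
\text{states} = N^{2},
\qquad
\text{transition matrix entries} = N^{2} \times N^{2} = N^{4},
\qquad
\text{structurally non-zero entries} \le N^{3}
```

The last bound arises because a state (previous, current) can only move to a state whose "previous" component equals the old "current" one, leaving N possible successors per state. With N = 20 (levels of 0.05 p.u.), this gives 400 states, 160 000 matrix entries and at most 8000 non-zero transition probabilities to be estimated.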

The distribution of changes is close to a Laplace distribution, as has been noted by several other authors working with data sets from around the world. Data sets from very large regions (e.g. New Zealand) have somewhat sub-Laplace tails because of the effects of the central limit theorem. Despite being based on a Gaussian time series, the model provides similarly fat tails in the distribution of changes because of the non-linearity of the transformation function, which makes the short-term volatility a function of the current wind output level. For large-scale fleets with hundreds of sites, the historic data show a similar relationship, with a peak volatility at around 0.6 p.u. and much lower levels at very high or very low outputs. With fewer sites, the lack of an explicit model of turbine cut-out leads to an underestimate of the volatility at high outputs. For the same reason, the model does not reproduce the large, rapid changes in aggregate output caused by wide-scale turbine cut-out during storms that affect a high proportion of the fleet.
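The way in which a non-linear transformation of a Gaussian series produces fat-tailed changes can be illustrated numerically. The sketch below is not the calibrated model from this paper: the AR(2) coefficients are arbitrary stationary values, the logistic function is a hypothetical stand-in for the transformation function W(·), and the diurnal adjustment is omitted.

```python
import numpy as np

def simulate_ar2(n, phi1, phi2, rng):
    """Gaussian AR(2) series, scaled to unit sample standard deviation."""
    eps = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(2, n):
        x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + eps[t]
    return x / x.std()

def logistic_power_curve(x, steepness=1.5):
    """Hypothetical stand-in for W(.): maps 'windiness' to output in (0, 1) p.u."""
    return 1.0 / (1.0 + np.exp(-steepness * x))

def kurtosis(values):
    """Pearson kurtosis, which is 3 for a Gaussian distribution."""
    centred = values - values.mean()
    return np.mean(centred ** 4) / np.mean(centred ** 2) ** 2

rng = np.random.default_rng(0)
x = simulate_ar2(200_000, phi1=1.2, phi2=-0.25, rng=rng)  # illustrative coefficients
power = logistic_power_curve(x)

kurt_driver = kurtosis(np.diff(x))      # changes in the Gaussian driver
kurt_power = kurtosis(np.diff(power))   # changes in the transformed output
```

In this toy setting, the changes in the driver have a kurtosis close to 3, whereas the changes in the transformed series should have a noticeably higher value, because the slope of the transformation, and hence the local volatility, varies with the output level.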

The model provides an acceptable representation of the distribution of periods of low output (calms) from 2 h to 7 days in length. This statistic may be important if the wind fleet is to be integrated with medium-term storage or demand-side measures. The frequency of very short calms is underestimated by 15–30% in the cases considered here, and the frequency of very long calms, occurring less than once per year, is also underestimated in the Danish case because synoptic effects are not explicitly modelled.

The supply and demand of many power systems are dependent on temperature as well as wind because of, for example, heat-led combined heat and power plants and electrical heating. Since wind and temperature are not independent variables, care should be taken when using an independent wind model in simulations of temperature-dependent systems.

One limitation of the model lies in its univariate nature. Recent wind integration studies, such as TradeWind,21 focus on the use of power transfers between interconnected country-sized European regions to minimize the overall impact of wind intermittency. Since these power flows are constrained by the limited transmission capacity between regions, multiregion studies need to model the wind feed-in separately for each region. In theory, our model could be extended to a multivariate version in order to account for the correlations between regions. Care would be needed to ensure that the long-term joint distribution is adequately represented, as well as the correlations between power output changes in different regions.

ACKNOWLEDGEMENTS

We would like to thank Meridian Energy Ltd for the use of their New Zealand wind data and the Danish TSO Energinet.dk for the help in interpreting the Danish wind data. We would also like to thank Dr Martin Clark for the many helpful comments regarding the mathematical derivations and the two anonymous reviewers for their helpful suggestions that have led to several improvements. This work was supported by the UK's Engineering and Physical Sciences Research Council via the Flexnet Consortium, under Grant EESC-PS1219.
