Estimation of Monthly Sunshine Duration in Turkey Using Artificial Neural Networks
Abstract
This paper introduces an artificial neural network (ANN) approach for estimating monthly mean daily values of global sunshine duration (SD) for Turkey. Three different ANN models, namely, GRNN, MLP, and RBF, were used in the estimation process. A climatic variable (cloud cover) and two geographical variables (day length and month) were used as input parameters in order to obtain monthly mean SD as output. The datasets of 34 stations spread across Turkey were split into two parts. The first part, covering 21 years (1980–2000), was used for training and the second part, covering the last six years (2001–2006), was used for testing. Statistical indicators have shown that the GRNN and MLP models produced better results than the RBF model and can be used safely for the estimation of monthly mean SD.
1. Introduction
The importance of solar energy is increasing because the demand for energy grows from day to day while the world’s energy resources are limited. For this reason, a large number of studies have been carried out, and will continue to be carried out, in order to benefit from solar energy, which is free of charge and clean. On the other hand, significant variations of solar radiation over long periods cause important climatic changes which affect life on Earth in many respects. For example, a decrease in the amount of solar radiation was observed between the 1950s and the early 1980s, which is known as global dimming, and this trend reversed from the second half of the 1980s, which is known as global brightening [1–6]. It is therefore crucial to know and analyze the variations of solar radiation in time and space. Global solar radiation measurements are generally made with actinographs, which are often not reliable because the thermal sensitivity of the mechanical components of their sensors needs routine calibration. More accurate measurements can be made by constructing networks of calibrated modern pyranometers, but this is not the case in many countries because this type of instrument is expensive. It has been shown that solar radiation is highly correlated with the sunshine duration (SD) [7–10]. This means that if one has information about the SD over an area then one also has information about the solar radiation over that area, and vice versa. Hence, long-term accurate observations of global SD are very important and necessary for climatological and other applications [1, 6, 7, 11–13]. In fact, SD has been measured at many locations around the world more accurately than solar radiation, with cheaper instruments, and over long time periods. For instance, it has been measured at the Ankara station in Turkey since 1928 [14].
SD measurements have been made in The Netherlands since 1901 [9] and at eight meteorological stations in Taiwan since 1910 or earlier [15]. Records of sunshine hours have been available from four stations in Ireland since the late 19th century [16]. Measurements of the duration of sunshine in the United States began in 1891 at 20 U.S. Weather Bureau (USWB) stations [17]. Although SD can be measured at almost all meteorological stations, the networks of meteorological stations are still insufficient or limited in some parts of the world because of geographical and sometimes financial problems. This means that some countries have limited or sometimes unreliable SD databases and maps. To overcome this problem, estimation techniques are being developed for areas where SD measurements are not available or reliable, especially for remote and inaccessible regions. Up to now, many researchers have concentrated on the estimation of global solar radiation, so numerous studies can be found in the literature. In contrast, only a limited number of studies have addressed the estimation of SD, its relation with geographic and climatological variables, and its variation in time and space.
Reddy developed an empirical model to calculate sunshine from total cloud amount [18]. The relationship between point cloudiness and sunshine-derived cloud cover was investigated by Raju and Kumar using data recorded at 34 stations in India [19]. Rangarajan et al. proposed a cubic regression equation between monthly mean values of the fraction of the sky and the duration of bright sunshine, and using this relationship they computed SD from cloud cover data [20]. Harrison and Coombes derived a second-order regression equation which defines the statistical relationship between long-term averages of monthly cloud shade and point cloudiness using data from 43 Canadian weather stations [21]. Yin used monthly data from 729 stations worldwide to find a generic algorithm that captures the global variation of bright SD data in relation to temperature, precipitation, and geographic location [22]. El-Metwally proposed a nonlinear relationship based on maximum and minimum air temperatures and cloud cover fraction for estimating relative SD [23]. Matzarakis and Katsoulis estimated the mean annual and seasonal duration of bright sunshine from an empirical formula which depends on the distance of each station from the nearest coast, the height above sea level of each station, the percentage of land cover around each station, and the latitude and longitude of each station [24]. Kandirmaz proposed a simple model for the estimation of daily global SD and constructed a spatially continuous map of SD over Turkey using meteorological geostationary satellite data [25]. Robaa established a simple model using observed cloud data and estimated SD for any region in Egypt [26]. Kandirmaz and Kaba showed that the statistical relation between the daily satellite-derived cloud cover index and measured SD is quadratic rather than linear [27].
Recently, artificial neural networks (ANNs) have been extensively used to solve problems in many areas, such as optimization, prediction, and modeling in engineering, climate science, pattern classification, and economics. ANNs are effective tools for modeling nonlinear systems [28–31]. Many previous studies have shown that ANN techniques are a strong alternative to classical regression models for the prediction of global solar radiation. In ANN based solar radiation studies, meteorological, geographical, and climatological variables such as SD, temperature, cloud cover, relative humidity, wind speed, vapor pressure, precipitation, elevation, latitude, longitude, month, and satellite recorded or derived variables were used as input variables for obtaining the solar radiation as an output [32–45]. Although many ANN based studies on the estimation of solar radiation exist in the literature, there have been only two studies on the estimation of SD. Jervase et al. generated contour maps of sunshine hours and sunshine ratios for Oman using a radial basis function (RBF) neural network model [46]. In their study latitude, longitude, altitude, and month of the year were used as input parameters. Mohandes and Rehman estimated the SD over Saudi Arabia using two neural network algorithms in which maximum possible day length, extraterrestrial solar radiation at the particular location, latitude, longitude, altitude, and month number were used as input parameters [47]. Likewise, only a few studies have been devoted to the estimation, determination, and distribution of SD for Turkey. Aksoy analyzed the changes in SD for the Ankara station in Turkey [48]. Kandirmaz [25] and Kandirmaz and Kaba [27] used satellite data for predicting the SD in Turkey. Şahin developed a simple model and estimated SD for some stations in Turkey [49]. Trends of the measured SD of 36 stations were analyzed by Yildirim et al. [50].
The present work is the first study in which an ANN approach is proposed for the estimation of SD for Turkey. In order to get the best possible results, we constructed three different ANN models in which cloud cover, day length, and month were used as input parameters.
2. Data Sources and Methodology
This research was conducted for Turkey, which is located between latitudes 36° and 42°N and longitudes 26° and 45°E and has seven different climatic zones, namely, Black Sea, Marmara, Eastern Anatolian, Southeastern Anatolian, Mediterranean, Aegean, and Central Anatolian [51]. Its average annual solar radiation was determined as 1311 kWh/m²-year (3.6 kWh/m²-day) and its annual average total SD as 2640 hours (7.2 hours/day) for the period from 1966 to 1982 [52]. Ground-measured daily SD and cloud fraction data were collected from 34 selected stations of the Turkish State Meteorological Service (TSMS). These stations cover almost the whole country and thereby reflect all of its climatic properties. The geographical positions of these stations can be seen in Figure 1.

Previous studies have shown that SD can be related to cloud cover, air temperature, precipitation, relative humidity, wind speed, and geographical variables, or to a combination of some of these variables [53–58]. Clouds, consisting of liquid water droplets or ice particles, reduce incoming solar radiation in many ways before it reaches the earth's surface. A physical explanation of the interactions between clouds and solar rays is rather difficult because these interactions depend on the size and shape of the droplets or particles, the total mass of water, and its spatial distribution. Other meteorological variables may affect the incoming solar radiation, but the main and greatest effect comes from clouds [10, 27].
The MBE is used to test whether the proposed model tends to overestimate or underestimate the measured values, and it generally provides good information on long-term performance. The RMSE, on the other hand, generally provides valuable information on short-term performance and measures the difference between measured and estimated values. The closer the values of RMSE, %RMSE, MBE, MAE, and %MAE are to zero, the better the performance of the models.
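The indicators named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the exact formulas used in the study: the bias is taken as estimated minus measured (so a positive MBE means overestimation), and the percentage forms are normalized by the mean of the measured values. The function name `error_metrics` is hypothetical.

```python
import math


def error_metrics(measured, estimated):
    """Return MBE, MAE, RMSE and percentage forms for paired samples.

    Assumptions (not taken from the paper): differences are computed as
    estimated minus measured, and %MAE / %RMSE are normalized by the
    mean of the measured values.
    """
    if len(measured) != len(estimated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    diffs = [e - m for m, e in zip(measured, estimated)]
    mean_measured = sum(measured) / n
    mbe = sum(diffs) / n                      # signed bias
    mae = sum(abs(d) for d in diffs) / n      # mean absolute error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return {
        "MBE": mbe,
        "MAE": mae,
        "RMSE": rmse,
        "%MAE": 100.0 * mae / mean_measured,
        "%RMSE": 100.0 * rmse / mean_measured,
    }
```

Because positive and negative differences cancel in the MBE, a model can have an MBE near zero and still a large RMSE, which is why the two are reported together.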
2.1. Artificial Neural Network
An ANN consists of operation units (nodes), modeled on biological neurons, linked together according to a specific architecture. ANNs generally have one input layer, one output layer, and one or more hidden layers. Hidden and output layers contain activation functions. Neurons in adjacent layers are interconnected. Inputs are multiplied by connection weights to obtain product terms. The sum of these products and the biases is then passed to a transfer function, layer by layer, up to the output layer. The result of the output layer contains the total effect of all the neurons in the network [34, 60].
2.1.1. Generalized Regression Neural Network
The Nadaraya-Watson kernel based general regression neural network (GRNN) was introduced by Specht in 1991 [61]. It has been used in many applications ever since, including function approximation, prediction, control, medical diagnosis, engineering, speech recognition, and 3D modeling [62–69]. In many studies GRNN performed better at function approximation than feedforward networks and other statistical neural networks on some datasets [62, 64–66, 68–70]. Although GRNN was proposed for function approximation purposes [62], in some works it has been applied to classification problems with small modifications [65, 66]. The main advantages of GRNN are fast learning, consistency, and optimal regression with a large number of samples [67]. GRNN has four layers: input, pattern, summation, and output, as shown in Figure 2.
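The four-layer structure can be illustrated with a minimal Nadaraya-Watson sketch. It assumes a Gaussian kernel with a single smoothing parameter `sigma`. The function name `grnn_predict`, the default `sigma`, and the toy data layout are illustrative assumptions, and the Matlab implementation used in the study may differ.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Predict one value with a Gaussian-kernel GRNN (illustrative sketch)."""
    # Pattern layer: one Gaussian kernel per stored training sample.
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    # Summation layer: kernel-weighted sum of targets and plain sum of kernels.
    numerator = sum(w * y for w, y in zip(weights, train_y))
    denominator = sum(weights)
    if denominator == 0.0:
        raise ZeroDivisionError("query is too far from all samples for this sigma")
    # Output layer: ratio of the two sums (a kernel-weighted average).
    return numerator / denominator
```

There is no iterative weight training in this scheme: the training samples are stored directly, and only `sigma` needs to be tuned, which is consistent with the fast learning noted above.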

2.1.2. Multilayer Perceptron (MLP)
Unlike a single perceptron, a multilayer perceptron (MLP) contains an input layer, an output layer, and one or more hidden layers with computation nodes [73]. Figure 3 shows an example of an MLP architecture.
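A forward pass through such a network can be sketched as follows. The single hidden layer, the tanh hidden activation, and the linear output node are assumptions chosen for illustration, and the weights are hypothetical placeholders rather than trained values from the study.

```python
import math


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a one-hidden-layer MLP with a single linear output.

    hidden_weights holds one row of input weights per hidden node.
    """
    # Hidden layer: weighted sum of inputs plus bias, passed through tanh.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: weighted sum of hidden activations plus bias (linear).
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
```

In practice the weights and biases would be fitted to the training data by a learning algorithm such as backpropagation, and the number of hidden nodes would be varied to find the best architecture.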

2.1.3. Radial Basis Function Network
The Radial Basis Function Neural Network (RBF-NN) is a three-layer feed-forward network applicable to various regression and classification problems. The structure of the RBF network is given in Figure 4.
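The three layers can be sketched as an input vector, a hidden layer of radial basis units, and a linear output. The Gaussian basis function, the shared `width`, and the function name `rbf_forward` are assumptions made for illustration. The centers and output weights are hypothetical and would normally be fitted to training data.

```python
import math


def rbf_forward(inputs, centers, width, output_weights, output_bias=0.0):
    """Forward pass of a Gaussian RBF network with a single linear output."""
    # Hidden layer: each unit responds to the distance from its center.
    activations = [
        math.exp(-sum((x - c) ** 2 for x, c in zip(inputs, center)) / (2.0 * width ** 2))
        for center in centers
    ]
    # Output layer: linear combination of the hidden activations.
    return sum(w * a for w, a in zip(output_weights, activations)) + output_bias
```

Each hidden unit responds most strongly to inputs near its center, so the output is a weighted blend of local responses.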

3. Results and Discussion
In this study three different ANN models, namely, GRNN, MLP, and RBF, were employed in order to estimate the monthly mean daily SD for 34 stations in Turkey. The inputs of the networks were the month and the monthly mean values of cloud coverage and day length, and the output was the monthly mean daily SD. Data belonging to the selected 34 stations were subdivided into two separate datasets. The first part, covering the first 21 years (1980–2000), was used in the training process and the second part, covering the last six years (2001–2006), was used in the testing process. The models were constructed and optimized by varying the number of neurons in the hidden layer. The Matlab ANN toolbox was used for all modeling.
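The chronological split and the input/output layout described above can be sketched as follows. The record layout (a dictionary with `year`, `month`, `cloud_cover`, `day_length`, and `sd` keys) and both function names are hypothetical, chosen only to illustrate the idea. The study itself used the Matlab ANN toolbox.

```python
def split_by_year(records, last_training_year=2000):
    """Split monthly records into (training, testing) lists by calendar year."""
    training = [r for r in records if r["year"] <= last_training_year]
    testing = [r for r in records if r["year"] > last_training_year]
    return training, testing


def to_sample(record):
    """Return (inputs, target) for one monthly record.

    Inputs follow the three variables named in the paper: month,
    monthly mean cloud cover, and monthly mean day length.
    The target is the monthly mean daily SD.
    """
    inputs = (record["month"], record["cloud_cover"], record["day_length"])
    return inputs, record["sd"]
```

Splitting by year rather than at random keeps the test period (2001–2006) strictly later than the training period, so the test measures how well the models extrapolate to years they have not seen.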
Monthly mean daily SD values were estimated for each station by the proposed ANNs and their averages were compared with the values recorded at the meteorological stations from 2001 to 2006. Figure 5(a) compares the observed values with the estimated values for each station for the considered period. In general, the simulated values were very close to those observed at the meteorological stations. To further clarify these results, MAE and RMSE were calculated for each model and station; they are shown in Figures 5(b) and 5(c), respectively.



It can be concluded from Figure 5(b) that the GRNN model produced the closest results to the observed values at 14 stations (Rize, Trabzon, Bursa, Erzincan, Erzurum, Sivas, Ankara, Kirikkale, Denizli, Adana, Elazig, Mersin, Izmir, and Antalya). The MLP model produced the closest results at 11 stations (Kocaeli, Samsun, Kastamonu, Zonguldak, Tokat, Afyon, Istanbul, Yozgat, Iskenderun, Adiyaman, and Van), and the RBF model at the remaining nine stations (Bolu, Sakarya, Kutahya, Kahramanmaras, Gaziantep, Kayseri, Konya, Diyarbakir, and Sanliurfa). Lower MAE values (less than 0.2 h) were found for the Bolu, Zonguldak, Tokat, Ankara, Konya, Izmir, Van, and Antalya stations, while higher MAE values (greater than 0.4 h) were found for the Bursa, Kahramanmaras, and Sanliurfa stations by all three models. In terms of RMSE, the MLP model produced the lowest values at 17 stations (Bolu, Kocaeli, Samsun, Kutahya, Kastamonu, Afyon, Istanbul, Ankara, Iskenderun, Kirikkale, Denizli, Yozgat, Adana, Diyarbakir, Adiyaman, Van, and Antalya), the GRNN model at 15 stations (Rize, Trabzon, Sakarya, Tokat, Bursa, Erzincan, Erzurum, Kahramanmaras, Gaziantep, Sivas, Konya, Sanliurfa, Elazig, Mersin, and Izmir), and the RBF model at two stations (Zonguldak and Kayseri). The RMSE values were less than 1.0 h at all stations for the GRNN and MLP models. The RBF model produced high RMSE values (greater than 1.0 h) for the Bolu, Sakarya, Samsun, Kastamonu, Erzincan, Istanbul, Gaziantep, Iskenderun, Kirikkale, and Yozgat stations. The average of the observed monthly mean daily SD was estimated as 6.7723 h, 6.7952 h, and 0.6715 h by the GRNN, MLP, and RBF models, with MBE of 0.0488 h, 0.0719 h, and −0.0544 h; MAPE of 9.88%, 9.36%, and 10.75%; RMSE of 0.7971 h, 0.6256 h, and 1.1279 h; and %RMSE of 11.75%, 9.30%, and 16.78%, respectively. The correlation coefficient (R) of the linear regression of observed versus predicted values was 0.9530, 0.9563, and 0.8902, respectively.
The range of monthly mean daily SD values over all stations is around 4.0 h (the minimum is around 4.4 h and the maximum around 8.4 h), which indicates that the country has geographically and climatologically different zones. Generally, lower SD values were observed for stations located in the Black Sea Region (Rize, Trabzon, Bolu, Samsun, Kastamonu, and Zonguldak), the Marmara Region (Sakarya, Istanbul, Kocaeli, and Bursa), which is highly affected by the Black Sea climate, the northern part of East Anatolia (Erzincan and Erzurum), and Central Anatolia (Kutahya and Tokat). This was expected because the Black Sea coasts have many more cloudy days than the other regions and receive the greatest amount of rainfall. On the other hand, higher SD values were observed for stations in the Southeastern Anatolia (Sanliurfa, Diyarbakir, Adiyaman, and Gaziantep), Mediterranean (Antalya, Mersin, Adana, Iskenderun, and Kahramanmaras), Aegean (Izmir and Denizli), and Central Anatolia (Ankara, Kirikkale, Konya, Kayseri, and Yozgat) regions and in the southern part of East Anatolia (Van, at a latitude of 38.47°N). It is clear that cloudiness increases as one goes from lower latitudes (the south of the country) to higher latitudes (the north of the country).
The results of the current study are comparable with those of previous studies in which neural network approaches and other methodologies were used for other geographical locations. Jervase et al. used an RBF neural network model and found that the estimated values deviated from the measured values by between 0 and 1 hour for the stations in Oman [46]. Mohandes and Rehman obtained minimum mean absolute percent errors (MAPE) of 2.3% and 2.7% for the Al-Madina station and maximum values of 22.9% and 16.7% for the Al-Numas station in Saudi Arabia using PSO and SVM methods, respectively [47]. The duration of sunshine varies between 7.4 h and 9.4 h per day and its average daily value is approximately 8.89 h for Saudi Arabia [47] and 9.5 h for Oman [74]. Because of their locations, both of these countries have fewer cloudy days and more solar energy and SD than Turkey. This is probably why Mohandes and Rehman [47] and Jervase et al. [46] did not use cloudiness as one of the input parameters in their studies. The effect of cloudiness on SD is more pronounced in Turkey, and cloudiness has therefore been used as an input parameter in the present study.
El-Metwally estimated relative SD for six sites in Egypt. MBE% and RMSE% values varied from −0.2% to −13.3% and from 2.3% to 14.5%, respectively [23]. The temporal and spatial distribution of bright sunshine hours over Greece was estimated by Matzarakis and Katsoulis, and the correlation coefficient (R) and RMSE were found to be 0.87 and 9.90 h, 0.58 and 6.15 h, 0.89 and 4.69 h, 0.86 and 6.22 h, and 0.84 and 5.33 h for annual sunshine, winter, spring, summer, and autumn, respectively [24]. In the study of Robaa, the relative SD was estimated from cloud data using three empirical formulae for Egypt. The relative percentage error (e), mean percentage error (MPE), MBE, and RMSE ranged from −7.2698% to +3.7908%, −0.6240% to +0.8069%, −0.0053 to +0.0070, and 0.0046 to 0.0160, respectively [26]. A simple model was set up by Stanghellini [75] to evaluate monthly SD for various sites in Italy on the basis of the mean daily cloudiness. The monthly MBE was found to be in the range of −17.3 h to 14.9 h.
4. Conclusions
This paper presents a study on the estimation of monthly mean daily SD using three ANN methods, GRNN, MLP, and RBF, which were applied to 34 stations in Turkey. Month, day length, and cloud coverage data were selected and used as input parameters of the constructed models. Since day length can be calculated from astronomical factors and cloud cover can be obtained visually, monthly mean SD can be determined very accurately for any region by choosing an appropriate ANN model, without using any measuring instrument, provided that a sufficient historical SD and cloud cover database exists. The statistical indicators have shown that the GRNN and MLP models work better than the RBF model. The results obtained here are considered good given that Turkey has geographically and climatologically diverse zones, so that the distribution of SD over the country is not homogeneous. It should also be noted that the visual determination of cloud coverage is subjective, which may introduce some error and affect the accuracy of the models used here.
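As an illustration of how day length can be computed from astronomical factors alone, the sketch below uses a widely used approximation: Cooper's formula for the solar declination combined with the sunset hour angle. The paper does not state which formula was used, so this choice and the function name `day_length_hours` are assumptions.

```python
import math


def day_length_hours(latitude_deg, day_of_year):
    """Approximate astronomical day length in hours.

    Assumes Cooper's declination formula and ignores atmospheric
    refraction and the finite size of the solar disc.
    """
    # Solar declination (radians) for the given day of the year (1..365).
    declination = math.radians(23.45) * math.sin(
        2.0 * math.pi * (284 + day_of_year) / 365.0
    )
    # Cosine of the sunset hour angle, clamped to handle polar day and night.
    cos_ws = -math.tan(math.radians(latitude_deg)) * math.tan(declination)
    cos_ws = max(-1.0, min(1.0, cos_ws))
    sunset_hour_angle = math.acos(cos_ws)
    # The sun moves 15 degrees per hour, so day length = 2 * ws / 15 degrees.
    return 24.0 * sunset_hour_angle / math.pi
```

For example, `day_length_hours(39.9, 172)` gives the approximate day length near the June solstice at a latitude close to that of Ankara. Monthly mean values can be obtained by averaging over the days of each month.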
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.