Information Analysis of Catchment Hydrologic Patterns across Temporal Scales
Abstract
The catchment hydrologic cycle takes on different patterns across temporal scales. The transition between event-scale hydrologic processes and the mean annual water-energy correlation pattern requires further examination to reach a self-consistent understanding. In this paper, the temporal scale transition revealed by observation and simulation was evaluated in an information theoretical framework named Aleatory Epistemic Uncertainty Estimation. The Aleatory Uncertainty refers to the posterior uncertainty of runoff given the observations of the input variables. The Epistemic Uncertainty refers to the increase in posterior uncertainty due to the imperfect decoding of observations in models. Daily hydrometeorological observations in 24 catchments were aggregated to scales from 10 days to 1 year before implementing the information analysis. The estimated information contents and flows of hydrologic terms across temporal scales were related to the catchments’ seasonality type. The results also showed that the information distilled by the monthly and annual water balance models applied here did not correspond to that provided by observations at temporal scales from two months to half a year. This calls for a better understanding of seasonal hydrologic mechanisms.
1. Introduction
A major task of the hydrologic community is to quantify the components of the hydrologic cycle. Each component should be determined either by observation or by an independent governing equation to guarantee the solvability of the problem. The accuracy of observations and the domain of validity of governing equations usually change with scale. The term scale here refers to the characteristic time (or length) of a process, observation, or model [1]. While large scale hydrologic patterns are expected to emerge by integrating detailed event-scale hydrologic control functions along spatial and temporal paths, such a reductionist approach often fails to distill the dominant factors that contribute to a catchment’s long-range hydrologic behaviour. On the other hand, the holistic perspective has been widely adopted to provide a coarse explanation of a catchment’s mean annual water balance [2, 3]. A bridge between the reductionist and holistic paradigms is required to reach a self-consistent understanding of hydrologic temporal scale transition [4].
One practical attempt toward this goal is to extend the existing mean annual models to hydrologic simulation at smaller temporal scales. The classical Budyko curve [5], which relates the ratio of a catchment’s actual evapotranspiration (E) to precipitation (P) to the dryness index (PE/P, where PE denotes potential evapotranspiration), cannot exert first-order control on the water balance at these scales because it excludes the impact of soil moisture change [6–8]. By including the soil moisture storage term (S), the extended models can be applied to seasonal, and even monthly, hydrologic simulation and prediction.
As stated above, incorporating a new variable increases the system’s degrees of freedom, which should be compensated by introducing a new independent governing equation. In the Budyko curve, the water supply P is partitioned into actual evapotranspiration (E) and runoff (Q), with the evapotranspiration demand being PE. Accordingly, the adjusted models partition precipitation in multiple steps according to the competition for water between catchment replenishment demand and evapotranspiration demand. Table 1 lists an analysis of some widely accepted water balance models following this framework.
Model | Replenishment demand | Evapotranspiration demand | Constitutive functions |
---|---|---|---|
ABCD [32] | b | PE | E = ΔS × (1 − exp(−PE/b)) |
DWBM [33] | Smax − S + PE | PE | |
TPWB [17] | SC | PE | |

Note: symbols without explicit explanation are model parameters.
Because they share the same constraints at the extreme zero-order and first-order boundary conditions, where water supply far exceeds or falls far short of demand, the curves above take on similar shapes and achieve similarly satisfactory performance in monthly scale simulation. Each model requires 2 (in TPWB) to 4 (in ABCD or DWBM) parameters to adjust the curve concavity and position to fit observations. The statistical characteristics of state variables and parameters differ as the modelling scale changes [9]. Given the wide temporal scale gap between event-scale hydrologic processes and the annual-mean water balance, it is worthwhile to examine more closely the data-revealed hydrologic pattern and model performance during scale transition.
Variables in large temporal scale hydrologic models are aggregations of the corresponding variables in small scale models. The goal of these models is to determine the control of the aggregated variables on the total water balance of the period, which is determined by the within-scale temporal distribution of hydrologic events and the catchment’s storage capacity [10]. For instance, given the same average water and energy supply, catchments with uniformly distributed rainfall and large storage capacity tend to generate less direct runoff and more evapotranspiration. These determinants are simplified as the state variable S in the water balance models. The motivation of this paper was to quantify the data-revealed (potential) and model-revealed (achieved) control of the mean values of hydrometeorological variables on catchments’ average water balance over different temporal scales. The estimations were implemented within an information theoretical framework named Aleatory Epistemic Uncertainty Estimation (AEUE) [11], chosen for its mathematical elegance in assessing such control problems under insufficient information.
The rest of the paper is structured as follows. In Section 2, the definitions and properties of AEUE are briefly introduced before clarifying its logical extensions and technical adaptations in this work. Section 3 describes the data. The results and their interpretations are given in Sections 4 and 5. The last section draws conclusions and recommends directions for future work.
2. Methodology
2.1. Aleatory and Epistemic Uncertainty in Hydrologic Simulation

Each term in the equations above is named in correspondence with that in Bayes’ Theorem. Explicitly, H(X∣Y) and h(X∣Y) are called conditional entropy, which represent the posterior uncertainty of X given the knowledge of Y. H(X) and h(X) represent the prior uncertainty. I(X; Y) is called mutual information. It represents the information contribution one variable provides to the other.
Equations (7) and (8) construct the Aleatory Epistemic Uncertainty Estimation (AEUE) framework. The sum of AU and EU is the posterior uncertainty of the hydrologic system given the simulation system.
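For orientation, the standard identity linking prior uncertainty, posterior uncertainty, and mutual information, together with a decomposition consistent with the verbal definitions of AU and EU used in this paper, can be sketched as follows. The symbol $\mathbf{X}_{\mathrm{in}}$ for the set of input observations is an assumption introduced here for illustration, and the exact form should be checked against the framework's reference [11].

```latex
\begin{aligned}
I(X;Y) &= H(X) - H(X \mid Y), \\
\mathrm{AU} &= H(Q \mid \mathbf{X}_{\mathrm{in}}) = H(Q) - I(Q; \mathbf{X}_{\mathrm{in}}), \\
\mathrm{EU} &= I(Q; \mathbf{X}_{\mathrm{in}}) - I(Q; Q_s), \\
\mathrm{AU} + \mathrm{EU} &= H(Q) - I(Q; Q_s) = H(Q \mid Q_s).
\end{aligned}
```

Under this reading, the last line restates that the sum of AU and EU is the posterior uncertainty of observed runoff given the simulated runoff.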
2.2. Extending AEUE for Temporal Scale Information Analysis
To implement AEUE across temporal scales, daily hydrometeorological observations were aggregated to different temporal scales. The aggregated data were used to estimate each term in (7) and (8). To make the information analysis more explicit, we adopted the strategy of gradually expanding the set of input variables and the number of lagged steps and tracking the resulting decrease of AU. The decrease is attributed to the information contribution of the newly included variables. For example, AU(Q; P) − AU(Q; P, PE) (which simplifies to I(Q; P, PE) − I(Q; P) according to (7)) represents the information contribution of including the energy supply constraint (PE) in the simulation, while AU(Q; P, PE) − AU(Q; P, Pformer, PE, PEformer) (which simplifies to I(Q; P, Pformer, PE, PEformer) − I(Q; P, PE)) represents the information contribution of the hydrologic lagging effects of former calculation steps; in other words, it denotes the significance of soil moisture memory.
The EU of two typical water balance models was estimated. The Two-Parameter Water Balance (TPWB) model [17] was preferred for its simplicity and satisfactory performance at the monthly scale. Its structure is listed in Table 1. The other model was the Budyko model, which combines the Budyko curve [18] with the mass conservation equation. The most significant distinction between the two models is that TPWB adopts an iterative structure. The performance of an iterative model depends on the capacity of its state variable to distill information about the system’s lagging effects and on the capacity of its constitutive functions to utilize the distilled information. These two factors were distinguished by comparing I(Q; P, Pformer, PE, PEformer), I(Q; P, PE, S), and I(Q; Qs), where S represents the model’s state variable and Qs represents simulated runoff. The difference between the first two terms indicates the representativeness of the state variable, and the difference between the last two terms indicates the data processing efficiency of the constitutive functions.
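The aggregation and input-expansion steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and it assumes that aggregation means summing over consecutive non-overlapping windows and that incomplete trailing windows are dropped.

```python
def aggregate(daily, window):
    """Sum a daily series over consecutive, non-overlapping windows.

    Assumption: trailing days that do not fill a whole window are dropped.
    """
    n_windows = len(daily) // window
    return [sum(daily[i * window:(i + 1) * window]) for i in range(n_windows)]


def lagged_inputs(series_list, n_steps):
    """Build input vectors from the current and (n_steps - 1) former values.

    series_list: equally long aggregated series, e.g. [P, PE].
    Returns (inputs, first), where inputs[j] belongs to time step first + j.
    """
    length = len(series_list[0])
    first = n_steps - 1  # earlier steps lack a full set of former values
    inputs = []
    for t in range(first, length):
        row = []
        for series in series_list:
            # Current value first, then progressively older values.
            row.extend(series[t - lag] for lag in range(n_steps))
        inputs.append(tuple(row))
    return inputs, first


# Example: ten-day totals of a short daily record, then inputs with one lag.
p_10day = aggregate([1.0] * 45, 10)           # four complete windows
pe_10day = aggregate([2.0] * 45, 10)
inputs, first = lagged_inputs([p_10day, pe_10day], n_steps=2)
```

Each tuple in `inputs` would then be paired with the runoff of the same time step when estimating a term such as I(Q; P, Pformer, PE, PEformer).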
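The structural difference between the two models can be illustrated with a short Python sketch. The constitutive functions used here are assumptions, since they are not reproduced in this text: the Budyko curve is taken in its commonly cited form E/P = sqrt(φ tanh(1/φ)(1 − exp(−φ))) with φ = PE/P, and TPWB is taken in the form usually attributed to [17], with E = C × PE × tanh(P/PE) and Q = (S + P − E) × tanh((S + P − E)/SC). The function names, the parameter name `c`, and the cap on E are hypothetical choices made for this example.

```python
import math


def budyko_runoff(p, pe):
    """Runoff from the Budyko curve plus mass conservation (Q = P - E)."""
    if p <= 0.0:
        return 0.0
    if pe <= 0.0:
        return p  # no evaporative demand: all supply leaves as runoff
    phi = pe / p  # dryness index
    e_ratio = math.sqrt(phi * math.tanh(1.0 / phi) * (1.0 - math.exp(-phi)))
    return p - e_ratio * p


def tpwb_run(p_series, pe_series, c, sc, s0=0.0):
    """Iterate an assumed TPWB form; the state S carries memory between steps.

    Returns (q, e, s), where s[t] is the storage entering step t and s[-1]
    is the storage after the last step.
    """
    s = [s0]
    q, e = [], []
    for p, pe in zip(p_series, pe_series):
        e_t = c * pe * math.tanh(p / pe) if pe > 0.0 else 0.0
        e_t = min(e_t, s[-1] + p)      # assumption: E cannot exceed available water
        available = s[-1] + p - e_t
        q_t = available * math.tanh(available / sc)
        e.append(e_t)
        q.append(q_t)
        s.append(available - q_t)      # mass conservation updates the state
    return q, e, s


# Example: the Budyko model has no state, whereas TPWB depends on S.
q_budyko = [budyko_runoff(p, pe) for p, pe in zip([100.0, 50.0], [60.0, 80.0])]
q_tpwb, e_tpwb, s_tpwb = tpwb_run([100.0, 50.0], [60.0, 80.0], c=0.8, sc=500.0)
```

In this sketch the Budyko runoff of a step depends only on that step's P and PE, while the TPWB runoff also depends on the storage left by earlier steps, which is the iterative structure referred to in the text.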
Given the analysis above, the explicit AEUE framework estimates the terms as listed in Table 2.
Classification | Estimated terms |
---|---|
Observation focused | h(Q) |
Observation focused | I(Q; P), I(Q; P, Pformer) |
Observation focused | I(Q; P, PE), I(Q; P, Pformer, PE, PEformer) |
Observation focused | I(Q; P, Pformer, PE, PEformer, Qformer) |
Model focused | TPWB: I(Q; Qs), I(Q; P, PE, S) |
Model focused | Budyko: I(Q; Qs) |
2.3. High Dimensional Mutual Information Estimator
An intuitive explanation of (10) is that it estimates mutual information from statistics that describe the average density of samples within a window opened around each sample point. Numerical experiments showed that even sample sizes below 30 produce satisfactory results. For a rigorous proof, please refer to Kraskov et al. [22].
In practice, the support vector regression was implemented using the LIBSVM package [29]. The radial basis function kernel was adopted for its satisfactory performance. The data were first normalized to [−1, 1] to balance the impact of terms with different dimensions. Results were sensitive to the penalty parameter c and the kernel parameter g, both of which were automatically calibrated with the particle swarm optimization algorithm [30].
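The nearest-neighbour idea behind this estimator can be sketched in plain Python. The block below follows the first algorithm of Kraskov et al. as it is commonly stated, I ≈ ψ(k) + ψ(N) − ⟨ψ(n_x + 1) + ψ(n_y + 1)⟩, which is assumed here to be the variant meant by (10). The function names, the brute-force neighbour search, and the synthetic sample generator are illustrative choices, not the authors' code.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function psi(n) for a positive integer n."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def chebyshev(a, b):
    """Maximum-norm distance between two vectors."""
    return max(abs(u - v) for u, v in zip(a, b))


def ksg_mutual_information(xs, ys, k=3):
    """Estimate I(X; Y) in nats from paired samples (lists of tuples)."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        dx = [chebyshev(xs[i], xs[j]) for j in range(n)]
        dy = [chebyshev(ys[i], ys[j]) for j in range(n)]
        # Window size: distance to the k-th neighbour in the joint space.
        joint = sorted(max(dx[j], dy[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Count neighbours strictly inside the window in each marginal space.
        nx = sum(1 for j in range(n) if j != i and dx[j] < eps)
        ny = sum(1 for j in range(n) if j != i and dy[j] < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def gaussian_pair_sample(n, noise, seed):
    """Hypothetical test data: y = x + noise * e, with x and e standard normal."""
    rng = random.Random(seed)
    xs = [(rng.gauss(0.0, 1.0),) for _ in range(n)]
    ys = [(x[0] + noise * rng.gauss(0.0, 1.0),) for x in xs]
    return xs, ys


# Example: strongly dependent pair versus an unrelated pair.
xs, ys = gaussian_pair_sample(200, noise=0.1, seed=1)
mi_dependent = ksg_mutual_information(xs, ys)
_, unrelated = gaussian_pair_sample(200, noise=0.1, seed=2)
mi_independent = ksg_mutual_information(xs, unrelated)
```

Because each sample is a tuple, the same function accepts multi-dimensional inputs such as (P, Pformer, PE, PEformer). The brute-force search scales quadratically with sample size, which is acceptable for the short aggregated series considered here.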
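The normalization step can be sketched as a simple min-max scaling of each input column to [−1, 1]. This is an illustrative assumption about how the scaling was done; the function name is hypothetical, and the LIBSVM calls and the particle swarm calibration are not reproduced here.

```python
def scale_to_unit_range(column):
    """Linearly map a list of values so that min -> -1 and max -> +1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zero.
        return [0.0 for _ in column]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in column]


# Example: precipitation and potential evapotranspiration on different scales.
p_scaled = scale_to_unit_range([0.0, 50.0, 100.0])
pe_scaled = scale_to_unit_range([2.0, 3.0, 6.0])
```

After scaling, no single variable dominates the kernel distance merely because of its units or magnitude.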
3. Data
Twenty-four catchments with daily hydrologic records (including P, PE, and Q) from the MOPEX data set [31] were selected for the cross temporal scale information analysis. According to their temporal water-energy distribution patterns, the selected catchments were classified into 4 groups: weak seasonality with synchronous rainfall-energy distribution (WS), weak seasonality with asynchronous rainfall-energy distribution (WA), strong seasonality with synchronous rainfall-energy distribution (SS), and strong seasonality with asynchronous rainfall-energy distribution (SA). The classification was based on the amplitude and phase of a sine curve fitted to the average daily rainfall. If the amplitude was less than 0.45, the catchment was classified as weak seasonality. If the phase of rainfall was opposite to that of potential evapotranspiration, the catchment was classified as asynchronous. The general conditions of the catchments are listed in Table 3. The vegetation, soil type, land use, and other specific catchment information are available from the following link: ftp://hydrology.nws.noaa.gov/pub/gcip/mopex/US_Data/.
Climate type | ID | Location | Area (km²) | Pmean (mm) | PEmean (mm) | Qmean (mm) |
---|---|---|---|---|---|---|
WA | 02143000 | 81°W, 36°N | 215 | 1299 | 882 | 553 |
WA | 02165000 | 82°W, 34°N | 611 | 1252 | 965 | 539 |
WA | 02329000 | 84°W, 31°N | 2953 | 1321 | 1101 | 330 |
WA | 02375500 | 87°W, 31°N | 9886 | 1452 | 1061 | 549 |
WA | 02478500 | 86°W, 31°N | 6967 | 1440 | 1055 | 489 |
WS | 05585000 | 91°W, 40°N | 3349 | 922 | 993 | 232 |
WS | 06908000 | 93°W, 39°N | 2901 | 1001 | 1066 | 261 |
WS | 07019000 | 91°W, 38°N | 9811 | 1006 | 959 | 303 |
WS | 07177500 | 96°W, 36°N | 2344 | 948 | 1259 | 221 |
WS | 07243500 | 96°W, 36°N | 5227 | 935 | 1303 | 160 |
SA | 02414500 | 86°W, 33°N | 4338 | 1371 | 976 | 542 |
SA | 02472000 | 89°W, 31°N | 1924 | 1442 | 1059 | 509 |
SA | 11025500 | 117°W, 33°N | 290 | 522 | 1407 | 34 |
SA | 11532500 | 124°W, 42°N | 1577 | 2748 | 751 | 2212 |
SA | 12459000 | 121°W, 48°N | 2590 | 1613 | 681 | 1105 |
SA | 13337000 | 116°W, 46°N | 3056 | 1287 | 775 | 872 |
SA | 14359000 | 123°W, 42°N | 5317 | 1052 | 851 | 510 |
SS | 05418500 | 91°W, 42°N | 4022 | 854 | 1017 | 254 |
SS | 05454500 | 91°W, 41°N | 8472 | 839 | 984 | 224 |
SS | 05484500 | 94°W, 41°N | 8912 | 794 | 998 | 117 |
SS | 06810000 | 96°W, 40°N | 7268 | 808 | 1027 | 173 |
SS | 06892000 | 95°W, 39°N | 1052 | 941 | 1110 | 228 |
SS | 06914000 | 95°W, 38°N | 865 | 950 | 1186 | 236 |
SS | 07183000 | 96°W, 37°N | 9889 | 877 | 1250 | 187 |
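The classification rule described in this section can be sketched as follows. Several details are assumptions made for illustration: the sine curve is fitted as the first annual harmonic by least squares, the amplitude is expressed relative to the mean so that the 0.45 threshold is dimensionless, and rainfall is treated as asynchronous when its phase differs from that of PE by more than a quarter of a year. The function names and the synthetic series are hypothetical.

```python
import math


def first_harmonic(mean_daily):
    """Fit mean * (1 + A * cos(2*pi*d/N - phase)) to one average year.

    For evenly spaced days over a full year, the Fourier projection used
    here is the least-squares fit. Returns (relative amplitude A, phase).
    """
    n = len(mean_daily)
    mean = sum(mean_daily) / n
    a = 2.0 / n * sum(x * math.cos(2 * math.pi * d / n)
                      for d, x in enumerate(mean_daily))
    b = 2.0 / n * sum(x * math.sin(2 * math.pi * d / n)
                      for d, x in enumerate(mean_daily))
    return math.hypot(a, b) / mean, math.atan2(b, a)


def classify(p_daily, pe_daily, threshold=0.45):
    """Return 'WS', 'WA', 'SS', or 'SA' for one catchment."""
    amplitude, phase_p = first_harmonic(p_daily)
    _, phase_pe = first_harmonic(pe_daily)
    # Wrap the phase difference into [-pi, pi).
    diff = (phase_p - phase_pe + math.pi) % (2 * math.pi) - math.pi
    strength = "W" if amplitude < threshold else "S"
    timing = "A" if abs(diff) > math.pi / 2 else "S"  # assumed cutoff
    return strength + timing


def synthetic_year(mean, amplitude, peak_day, n=365):
    """Hypothetical average year with a single sinusoidal seasonal cycle."""
    return [mean * (1 + amplitude * math.cos(2 * math.pi * (d - peak_day) / n))
            for d in range(n)]


# Example: winter-dominated rainfall against summer-peaking PE.
label = classify(synthetic_year(3.0, 0.6, 20), synthetic_year(2.5, 0.8, 190))
```

With these synthetic inputs the rainfall peak falls about half a year away from the PE peak, so the catchment would be labelled as strong seasonality with asynchronous distribution.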
4. Aleatory and Epistemic Uncertainty across Temporal Scales
4.1. Aleatory Uncertainty
Given this baseline, the estimated Aleatory Uncertainty was shown as follows.
In each subgraph above, the abscissa represents the input steps; for example, number n denotes that the current and (n − 1) lagging steps’ input observations were used to decrease the uncertainty of runoff estimation. The ordinate represents the estimating temporal scale, which varied from 10 days to a year.
The detailed analysis of each term’s information contribution for different catchments is discussed in the next section.
4.2. Epistemic Uncertainty
The estimated Epistemic Uncertainty across temporal scales is shown in Figure 3.
For the TPWB model, the maximum EU appears at temporal scales from 2 months to half a year. This shows that, at the seasonal temporal scale, the model cannot effectively distill the information provided by the data.
The EU difference between TPWB and the Budyko model was related to the catchment’s seasonality. In 11 out of 14 asynchronous seasonality catchments, EU differed significantly at small temporal scales, and the difference diminished as the scale expanded. In the remaining 3 asynchronous catchments and 14 synchronous catchments, the difference stayed relatively constant across temporal scales.
5. Specific Information Analysis
5.1. Information Contribution of Included Input Terms
Including new information sources can decrease simulation uncertainty. The specific information contribution of including the energy provision PE and the observed previous runoff Qp was obtained by subtracting the right column graphs from the left column ones (Figure 4). For instance, AU(Q; P) − AU(Q; P, PE) denotes the information contribution of considering PE in the simulation.
For all 10 weak seasonality catchments and 5 out of 14 strong seasonality catchments, the information contribution of PE was more significant at temporal scales of less than half a year. It was distributed more uniformly across temporal scales in the remaining 9 strong seasonality catchments.
The prominent information contribution of previous runoff at small scales in some catchments was attributed to the influence of runoff concentration.
5.2. Information Contribution of Soil Moisture Memory
Previous hydrologic behaviour influences the current hydrologic response through the storage capacity of soil moisture. Here this influence is defined as soil moisture memory and is represented by the difference between splines in each subgraph of Figure 2.



The second dissection scheme checks the information contribution of including lagged inputs in the mutual information estimation. This is implemented by taking differences between the mutual information estimated with different numbers of input steps; for instance, the nth spline in each graph of Figure 5 equals the difference between the (n + 1)th spline and the nth spline in the corresponding graph of Figure 6.


It can be seen that the input variables of the first lagging step provide most of the information contribution across all the temporal scales estimated here. As shown in the first column of Figure 5, these lagging effects were not significant when only the water provision was considered. The consideration of energy provision is of key importance in estimating the length of soil moisture memory.
5.3. Dissection of Model’s Information Distilling Capacity
As stated above, the simulation capacity of models with an iterative structure depends on their capacity to distill information about lagging effects and to process such information. The information distilling and processing capacities are distinguished in the following graphs.
The ordinate of each graph denotes mutual information. I(Q; Ic, S) represents the mutual information between runoff and current input together with current state variable. It denotes model’s capacity to distill lagging effects from previous hydrologic behaviours. The difference between I(Q; Ic, S) and I(Q; Qs) denotes model’s capacity to process the information it distilled.
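The ordering of these quantities follows from the data processing inequality, which can be written out as a short worked relation. It is a sketch under two assumptions stated here rather than in the text: the simulated runoff is a deterministic function of the current input and state, $Q_s = f(I_c, S)$, and the state is a deterministic function of former inputs $I_{\mathrm{former}}$ (given a fixed initial state).

```latex
\begin{aligned}
I(Q; Q_s) &\le I(Q; I_c, S) \le I(Q; I_c, I_{\mathrm{former}}), \\
\underbrace{I(Q; I_c, I_{\mathrm{former}}) - I(Q; I_c, S)}_{\text{information not captured by the state variable}} &\ge 0, \\
\underbrace{I(Q; I_c, S) - I(Q; Q_s)}_{\text{information lost by the constitutive functions}} &\ge 0.
\end{aligned}
```

These inequalities hold for the exact quantities; finite-sample estimates may violate them slightly because of estimation error.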
It can be seen that, in synchronous climate catchments, the information distilled by TPWB and the Budyko model increases as the temporal scale expands, while, in asynchronous climate catchments, the information distilling capacity of TPWB does not change monotonically with temporal scale.
6. Discussion and Conclusion
The aggregation of event-scale hydrologic processes yields the large temporal scale water-energy correlation pattern. The temporal scale transition was examined in the extended Aleatory Epistemic Uncertainty Estimation framework.
The Aleatory Uncertainty quantified the uncertainty caused by inaccurate and insufficient observation. At large temporal scales, since the daily observations were aggregated, the law of large numbers guaranteed that the accumulated error tended to 0 when there was no systematic observation bias. Thus, AU was mainly attributed to the insufficiency of data. The aggregated variables could exert a certain control on the total water balance. The significance of this control was closely related to the seasonality type, as quantified in the previous section.
The Epistemic Uncertainty of a monthly and a mean annual water balance model was estimated. The performance of TPWB was evaluated by quantifying its information distilling capacity and data processing efficiency. Results showed that the information distilled by the models applied here did not correspond to the information provided by input observations at temporal scales from two months to half a year. This calls for a better understanding of seasonal hydrologic mechanisms. The difference in information distilling capacity between TPWB and the Budyko model was related to the intra-annual distribution of water and energy. In asynchronous catchments, the difference converged to 0 at the half year scale, which suggested a closed hydrologic cycle.
The evaluations also revealed some counterintuitive phenomena that need to be stressed and explained. The meaning of soil storage capacity from a large temporal scale perspective was not as physically clear as it is at the event scale. The state variable S is influenced by the distribution of hydrologic processes and by soil properties. A strict definition is required to explain the uncertainty differences among catchments of different seasonality types.
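The role of aggregation in suppressing random observation error can be made explicit with a short worked relation. It is a sketch that assumes the daily errors $\varepsilon_i$ are independent, have zero mean (no systematic bias), and share a common variance $\sigma^2$; these assumptions and symbols are introduced here for illustration.

```latex
\bar{\varepsilon}_n = \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i, \qquad
\mathbb{E}\left[\bar{\varepsilon}_n\right] = 0, \qquad
\operatorname{Var}\left(\bar{\varepsilon}_n\right) = \frac{\sigma^2}{n} \;\longrightarrow\; 0
\quad \text{as } n \to \infty .
```

Under these assumptions the standard deviation of the error in an n-day mean shrinks as $1/\sqrt{n}$, so at a 360-day scale it is roughly one nineteenth of the daily value.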
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors are grateful for the financial support provided by the National Science Foundation of China (51479088, 51179083, and 91225302). For the detailed algorithm and estimation results, please refer to the following Github repository: https://github.com/emmore/Code/tree/master/My_Lib/Matlab_Lib. Finally, the authors would like to express their special thanks to Hoshin V. Gupta from the University of Arizona for his insights and kindness.