A test of The Ecological Limits of Hydrologic Alteration (ELOHA) method for determining environmental flows in the Potomac River basin, U.S.A.
Summary
- The Ecological Limits of Hydrologic Alteration (ELOHA) method described in Poff et al. (2010) was applied to streams and small rivers in a large central region of the Potomac River basin in the U.S.A. The area, which is topographically complex, has karst geology, is increasingly urban and has few flow-altering impoundments, allows a test of the flexibility and applicability of the ELOHA method's four steps: build a hydrological foundation, calculate flow alteration, classify streams and develop flow alteration–ecology (FA–E) relationships.
- A hydrological foundation of baseline (undisturbed) and current (existing) hydrographs was simulated for 747 catchments using the Chesapeake Bay Program Hydrologic Simulation Program-FORTRAN (HSPF) model and the Virginia Department of Environmental Quality Online Object Oriented Meta-Model (WOOOMM) routing module. The outlet of each catchment was associated with one, and sometimes two or more, stream macroinvertebrate sampling sites. Pairing each catchment's simulated current flow with its own simulated baseline flow produced estimates of flow alteration that reflect the combination of natural and anthropogenic factors controlling streamflow in individual catchments.
- Flow metrics from the baseline and current simulations were compared with observed values from gauged streams in undisturbed and disturbed catchments. The model may have failed to simulate streamflow well in small urbanised catchments on or near karst geology, but observed data were insufficient to fully evaluate model behaviour in these units. Elsewhere, simulated and observed values of 13 of the 15 tested flow metrics generally agreed well.
- A stream hydrological classification system to account for natural biological variability was not feasible in the study area for two reasons. First, the natural landscape features that most strongly govern undisturbed streamflows (catchment size and karst geology) do not greatly influence undisturbed macroinvertebrate communities. Second, the study area's complex topography ensures that many streams crossed physiographic boundaries or flowed through karst geology before reaching the macroinvertebrate sampling sites.
- Stream macroinvertebrates responded strongly to alteration in the duration and frequency of both high and low flow events, rise rate, flashiness and magnitude of high flow events. They did not respond to the alteration in middle- and low-magnitude flow metrics, fall rate or extreme low flow frequency. Flow alteration–ecological relationships were developed for combinations of six flow metrics and seven macroinvertebrate metrics using quantile regression and conditional probability methods. Of the seven macroinvertebrate metrics, % scrapers, % clingers and the Chessie BIBI were most affected by flow alteration.
- Degraded habitat and water quality conditions modify and, if strong enough, conceal the flow alteration–ecological relationships. Water quality and habitat improvements can potentially ameliorate the impacts of flow alteration. Resource managers need to view each stream system holistically and consider all anthropogenic stressors before the impact of existing or future flow alteration can be determined.
- Overall, the ELOHA approach appears to have worked well in a large river basin with complex topography, karst geology, few flow-altering dams, many urban areas and macroinvertebrates as the ecological response variables.
Introduction
The Ecological Limits of Hydrologic Alteration (ELOHA) method recently proposed by Poff et al. (2010) is a ‘new framework for assessing environmental flow needs for many streams and rivers simultaneously’. It is a flexible 4-step process of analysing and synthesising scientific information about streamflow and the flow-related needs of riverine ecosystems. The framework incorporates earlier recommendations by Arthington et al. (2006) and represents the consensus of a group of international scientists. ELOHA results can guide the societal process of developing and implementing environmental flow standards to protect biological diversity and ecosystem functioning as well as the goods and services that people derive from them.
This study describes an application of the ELOHA method to streams and small tributary rivers in an area defined as the Middle Potomac. The study is one of several ELOHA applications performed in the U.S.A. (Kendy, Apse & Blann 2012). It was part of a larger Middle Potomac River Watershed Assessment (MPRWA) that also examined future water demands, climate change impacts and large river environmental flow needs in the region [U.S. Army Corps of Engineers (USACE), The Nature Conservancy and Interstate Commission on the Potomac River Basin, 2013]. The study is unusual in that it is one of the first ELOHA studies to use routinely collected monitoring data for stream macroinvertebrates to quantify ecological responses to flow alteration.
The Potomac River basin is atypical of the eastern U.S.A. in that few dams regulate streamflow. Most impoundments are run of river. The region offers an opportunity to examine flow alteration due primarily to anthropogenic factors such as withdrawals, discharges and accelerated run-off caused by development. Several features of the region present analytical challenges. A complex topography of ridges, valleys and plateaus makes stream classification difficult. More than 80% of streams (1st–4th Strahler order) in the study area flow across two physiographic regions by the time their catchments exceed 500 km². Carbonate rock underlies c. 28% of the study area. Dissolution of carbonate layers at or near the surface creates karst landscapes with sinkholes, caves and underground drainage systems and makes streamflow simulations difficult in some catchments. Finally, urban and suburban centres in the region are growing rapidly. Biological responses to development-related flow alteration are likely to be confounded by the additional impacts of degraded water quality and stream habitats.
The study objective was to investigate the flexibility and applicability of the ELOHA method in a large and rapidly changing section of the Potomac River basin. An existing flow model was adapted to create a hydrological foundation from which streamflow alteration could be estimated. Benthic macroinvertebrates were used as the ecological response variables because they are the only stream taxa monitored in a consistent manner across all federal, state and local jurisdictions in the study area.
Methods
Study area
The Middle Potomac study area encompasses 79% of the Potomac River basin, a large (37 995 km²) interstate river system on the U.S. Mid-Atlantic seaboard (Fig. 1). The Potomac River mainstem originates in the Appalachian Mountains and flows 616 km south-eastward before joining Chesapeake Bay on the coastal plain. The river basin encompasses Washington, District of Columbia, and parts of Pennsylvania, West Virginia, Maryland and Virginia. Based on the national 2010 census, the river basin has a population of approximately 6.11 million people, most of whom are concentrated in the Washington metropolitan area (www.potomacriver.org).

Four physiographic regions underlie the study area. The mountainous Ridges region is characterised by high gradient, cool, trellised streams with many riffles and active down cutting. The Valleys region, interspersed between the mountain ridges, has warmer, lower-gradient streams. Large portions of this region have karst geology, causing a low density of surface streams. The Piedmont region has low-to-moderate gradient streams with falls, islands and rapids. The coastal plain region, representing 3.5% of the Middle Potomac study area, has very low gradient streams on poorly drained, alluvial sediments; these streams are poorly incised and lack a defined channel.
Large portions of the Potomac River basin have returned to forest since the early 1900s, a period when roughly two-thirds of the landscape was logged, farmed intensely or burned. Urban centres have begun to expand rapidly, especially near the Washington metropolitan area, and about 13% of the Middle Potomac study area now supports urban and suburban populations. Growth is projected to convert more forest and farmland into developed and hardened landscapes and to increase the demand for water (Ahmed, Bencala & Schultz, 2010; USACE et al., 2013).
Located in a temperate climate, the Potomac River basin experiences an annual average precipitation of 99 cm distributed almost equally across the year. Precipitation is distributed unevenly across the basin due in part to rain shadows created by parallel mountain ridges. The National Inventory of Dams (http://geo.usace.army.mil/pgis/f?p=397:1:0) lists 481 impoundments in the river basin. Of these, 396 are located on streams in catchments <13 km² and are typically used for farm, recreational or flood control and storm management purposes. Of the 85 remaining impoundments in catchments >13 km², only 47 have a maximum storage capacity relative to annual flow volume of more than 10%. Total water withdrawals in the study area and the upstream North Branch Potomac River were 8.522 × 10⁶ m³ day⁻¹ in 2005 (USACE et al., 2013; Appendix B) or about 29% of the long-term mean freshwater flow at the study area's downstream point. Model scenarios estimated total withdrawals in this part of the river basin will increase between 29% and 107% by 2030. Approximately 82.5% of withdrawals presently return to the river system.
The ELOHA method
The four major ELOHA steps described in Poff et al. (2010) are as follows: build a hydrological foundation, compute flow alteration, classify streams and develop flow alteration–ecological response (FA–E) relationships, herein referred to as ELOHA steps 1, 2, 3 and 4.
Build a hydrological foundation
A hydrological foundation consists of information about catchments from which estimates of streamflow alteration can be calculated and verified. For the Middle Potomac study, the hydrological foundation included (i) simulated time series of daily mean flow representing the existing (current) hydrology and an undisturbed (baseline) hydrology in 747 catchments, (ii) daily mean flows for 105 United States Geological Survey (USGS) gauges in the Potomac and neighbouring Susquehanna river basins and (iii) the existing land and water uses in all catchments. Analysis nodes at the pour points of the catchments were stream macroinvertebrate monitoring sites. Flows were simulated for the 21-year period of 1 October 1984–30 September 2005. Flow metrics calculated from the simulated flows were matched with macroinvertebrate data for 2000–2008, a period assumed to reflect the cumulative effects of flow alteration between 1984 and 2005. Differences between the simulated baseline and current hydrologies are due solely to the simulated influences of land-use changes, withdrawals, discharges and impoundments. Observed daily flows from the gauge records were used to calibrate the flow model, establish baseline conditions for the flow model to simulate, and verify the flow metrics derived from the simulated hydrologies.
The flow model for the Middle Potomac study was built using the Chesapeake Bay Program (CBP) Hydrologic Simulation Program-FORTRAN (HSPF) mass balance watershed model (U.S. Environmental Protection Agency, 2010). Land uses for 2000 were obtained from the Chesapeake Bay Watershed Land Cover, version 1.05, developed by the University of Maryland's Mid-Atlantic Regional Earth Science Application Center (RESAC). Dam operations such as pass-by requirements and white water releases were included in the model where that information could be obtained. Two modifications improved the model's usefulness in the study. First, 12 model river segments were resegmented, so all major impoundments had upstream and downstream river segments. Major impoundments (i) have a normal storage capacity >10% of mean annual flow volume or (ii) support hydroelectric facilities. Second, a nonlinear groundwater recession algorithm (Schultz et al., in press) was incorporated to better simulate low flows.
The Virginia Department of Environmental Quality's Online Object Oriented Meta-Model (WOOOMM) routing module was used to reapportion the simulated flows for large river segments in the HSPF model into flows for smaller catchments located above each biological sampling site. The module is an online tool accessible at http://deq1.bse.vt.edu/wooomm/login.php. It uses a channel morphology created by the USGS and inputs specific to each catchment regarding area, dominant physiographic province, local channel slope and length, proportions of current and baseline land uses, dam operations, withdrawals and discharges. In this way, each biological sampling site is paired with simulated flows representative of the land and water uses in its catchment.
Analysis nodes were selected systematically from a large, existing database of stream monitoring data compiled by the CBP. The database contains stream macroinvertebrate, water quality and habitat data collected by 23 federal, state and local monitoring programmes in the Chesapeake Bay basin and can be accessed at www.chesapeakebay.net/data. Sites selected for the Middle Potomac study (i) represent a range of human land and water uses; (ii) are spatially distributed across the study area and represent a range of stream sizes, channel slopes and karst geology; (iii) are located within 61 metres (200 feet) of a USGS National Hydrography Dataset stream (http://nhd.usgs.gov/index.html), indicating relatively accurate location coordinates; and (iv) were sampled between 2000 and 2008. The 747 catchments draining to the sampling sites were outlined with two Geographic Information System (GIS) methods: the ‘NHDPlus’ catchment delineation tool and the Utah State University Multi-Watershed Delineation Tool (available at http://cnr.usu.edu/wmc/htm/predictive-models/usingandbuildingmodels). Associated with each catchment was information derived from multiple sources about size, slope, karst geology, forest cover and the locations and amounts of anthropogenic land and water uses. The delineated catchments tended to be small, reflecting the fact that macroinvertebrate monitoring programmes usually collect samples from streams and wadeable rivers. Catchments ranged in size from 0.7 to 7899 km² with a median of 9.7 km².
Catchments in the Potomac–Susquehanna data set of 105 gauge records ranged from 7 to 70 189 km² with a median of 728 km². Associated with the flow records was information derived from multiple sources about the catchment above the gauge, including size, slope, karst geology, forest cover, average precipitation and anthropogenic land and water uses. All physiographic regions were represented in the data set. Catchment condition ranged from nearly all forested to highly urban or highly agricultural.
Compute flow alteration
A suite of 24 flow metrics representing different, ecologically important parts of the hydrograph was selected for potential use in developing FA–E relationships (Table 1). The selected metrics were responsive to flow alteration, not redundant with other flow metrics, most efficiently modelled and easily understood and could be related to biological behaviours and condition.
Metric name | Description |
---|---|
1-day maximum | The average of each year's highest daily mean flow divided by catchment area (IHA) |
3-day maximum | The average of each year's highest 3-day moving average of daily mean flow divided by catchment area (IHA) |
Annual mean | The average of all the annual means of daily mean flows divided by catchment area. The average of each year's mean daily flows is calculated, and then, the means of each year are averaged (IHA) |
Median | The median of all the daily mean flows divided by catchment area (ICPRB) |
Q85Seas | Calculate the 15th percentile flow for each month (July–October) during the study period, and then, average all the monthly values and divide by catchment area. Adaptation of the Maryland method (ICPRB) |
August median | The median of the August median flow for each year divided by catchment area (IHA) |
Base flow index | The median of each year's 7-day minimum flow divided by the mean annual flow (IHA) |
1-day minimum | The average of each year's minimum daily mean flow divided by catchment area (IHA) |
3-day minimum | The average of each year's lowest 3-day moving average of daily flow divided by catchment area (IHA) |
7Q10 | The lowest streamflow for seven consecutive days that would be expected to occur once in 10 years. The 7-day moving average is calculated for the study period; the flow volume of the event that recurs every 10 years is the 7Q10 value (DFLOW) |
High pulse duration | The median of the annual average number of consecutive days per year that daily flow is above the 90th percentile of the 1984–2005 period of record (IHA) |
High flow volume index MH21 | The average volume of high flow events (above a threshold equal to the median flow of the entire record) divided by the median daily flow for the entire record. Units are days (HIT) |
High flow duration DH17 | The average duration of flow events with flows above the median flow for the entire period of record (HIT) |
Flood-free season | The length of the longest period common to all water years in the study period where flows are at or below the high pulse threshold (usually less than the 90th percentile) in every year (IHA) |
Low pulse duration | The median of the annual average number of consecutive days per year that daily flow is below the 10th percentile of the 1984–2005 period of record (IHA) |
Extreme low flow duration | Mean of extreme low flow event duration. An extreme low flow event is the occurrence of flow in the lowest 10th percentile of the low flows, which are the lowest 10th percentile of all flows over the study period (IHA) |
High pulse count | The median of the annual average of each year's number of times the daily mean flow is above the 90th percentile of all flows for the study period (IHA) |
High flow frequency | Average of the number of events per year when the daily mean flow exceeds the 90th percentile of all flows in the study period (IHA) |
Number of reversals | The average number of times in a year that daily mean flow switches from rising to falling and vice versa (IHA) |
Low pulse count | The median of the annual average of each year's number of times the daily mean flow is below the 10th percentile of all flows for the study period (IHA) |
Extreme low flow frequency | The frequency of extreme low flow events in a year, where daily flow is in the lowest 10th percentile of all the low flows (or below the 2.5th percentile of all flows in the 1984–2005 period of record) (IHA) |
Flashiness | (Richards–Baker index) Sum of the absolute values of day-to-day changes in the daily mean flow divided by the sum of the daily mean flows (ICPRB) |
Rise rate | The average of all positive differences in daily mean flow during ‘rising periods’, or consecutive days for which change in daily flow is positive, in a year divided by catchment area (IHA) |
Fall rate | The average of all negative differences in daily mean flow during ‘falling periods’, or consecutive days for which change in daily flow is negative, in a year divided by catchment area (IHA) |
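Several of the metrics in Table 1 can be computed directly from a series of daily mean flows. The following is a minimal sketch, not the IHA, HIT or ICPRB implementations used in the study. The function names, the input layout (one list of daily flows per year) and the single-series pulse counter are illustrative assumptions; in particular, the pulse counter only counts threshold exceedance events and does not reproduce the annual summaries described in the table.

```python
from statistics import mean


def flashiness(flows):
    """Richards-Baker index: sum of absolute day-to-day changes in daily
    mean flow divided by the sum of the daily mean flows."""
    changes = sum(abs(b - a) for a, b in zip(flows, flows[1:]))
    return changes / sum(flows)


def three_day_maximum(flows_by_year, area_km2):
    """Average of each year's highest 3-day moving average of daily mean
    flow, divided by catchment area."""
    annual_max = []
    for flows in flows_by_year:
        moving = [mean(flows[i:i + 3]) for i in range(len(flows) - 2)]
        annual_max.append(max(moving))
    return mean(annual_max) / area_km2


def count_high_pulses(flows, threshold):
    """Number of separate events in which flow rises above the threshold
    (for example, the 90th percentile of the period of record)."""
    events, above = 0, False
    for q in flows:
        if q > threshold and not above:
            events += 1  # a new event starts when flow first exceeds the threshold
        above = q > threshold
    return events
```

Dividing magnitude metrics by catchment area, as in `three_day_maximum`, is what allows values from catchments of very different sizes to be compared.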
A recursive partitioning analysis was used to derive land cover and water use criteria to identify catchments representative of undisturbed flow regimes. Classification and Regression Tree (CART) analysis in S-Plus 6.2 (Insightful Corporation, Seattle, Washington, U.S.A.) was performed on the Potomac–Susquehanna data set to test the responses of eleven of the flow metrics to urban, agriculture and forest land cover, imperviousness, impoundments, total and surface-only withdrawals, consumptive use and karst geology (USACE et al., 2013, Appendix D). CART is a nonparametric decision tree technique that sequentially splits, or partitions, the values of a dependent variable into increasingly homogeneous groups. At each split, CART identifies the independent variable accounting for the greatest amount of variance in the dependent variable and the value of the independent variable where the split occurs (threshold). Two thresholds from the CART results were used as baseline criteria: ≥78% forest cover and ≤0.35% impervious surface. These thresholds correspond to some of the first discernible changes in flow metrics as land and water uses increase. Catchments were considered undisturbed if they met these baseline criteria and did not have any significant withdrawals, discharges or impoundments.
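The core of a regression tree split can be illustrated with a single predictor. The sketch below is a simplified stand-in for the S-Plus analysis, not a reproduction of it: it searches one independent variable for the threshold that minimises the summed squared error of the dependent variable around the two group means. The function name `best_split` is hypothetical, and a full tree would repeat the search over all predictors and recurse into each group.

```python
def best_split(x, y):
    """Return (threshold, sse) for the single split of predictor x that
    minimises the summed squared error of y around the two group means.
    Returns None if x has fewer than two distinct values."""
    def sse(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # equal predictor values cannot be separated
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best
```

Applied to, say, % forest cover as `x` and a flow metric as `y`, the returned threshold plays the role of criteria such as the ≥78% forest cover value reported above.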
Most catchments delineated for the study did not meet the baseline criteria. To simulate baseline conditions in these catchments, existing land uses were reapportioned to meet the forest and impervious surface criteria, and all withdrawals, discharges and impoundments were removed. Since urban land use correlates with imperviousness, agriculture comprises most of the non-forest land use (<22%) in the resulting baseline scenarios. No distinction was made between agricultural types.
Simulated and observed values of the various flow metrics were rigorously tested and compared to build confidence in the model's ability to represent flow alteration. Baseline values were plotted against three natural features, catchment area, channel slope and % karst, to identify catchments whose natural features might be poorly simulated. Relationships between baseline flow metrics and the natural features should be consistent and explainable. Catchments with outliers in one or more flow metrics, especially outliers that cluster, raised questions about model performance. Current scenario flow metric values from catchments with well-simulated baseline scenarios were then compared with observed flow metrics from Potomac–Susquehanna gauged catchments. These comparisons tested the model's ability to simulate multiple anthropogenic influences on flow. Only catchments in the Ridges, Valleys and Piedmont physiographic regions of the Potomac River basin were analysed. Coastal plain catchments in the Middle Potomac study area were too few to consider. Only simulated data from modelled catchments >26 km² (n = 242) were included in the comparisons, to better match the larger sizes of the 98 gauged, non-coastal catchments.
Classify streams
Stream classification is recommended by Poff et al. (2010) for ELOHA studies if ecological responses to flow alteration are expected to vary naturally by stream class. Two analyses that avoid the confounding influences of anthropogenic stressors were considered in deciding if stream classification would help minimise natural variability in the FA–E relationships. The first analysis determined which natural landscape features in the Middle Potomac study catchments explained most of the variation in simulated baseline flows. An analysis performed with Recursive Partitioning and Regression Tree (RPART) software tested the baseline scenario values of 22 flow metrics against four independent variables: physiographic region at the analysis node and size, % karst and average channel slope in the catchment. RPART is a decision tree analysis tool in R software (Venables, Smith and the R Development Core Team, 2013) that performs almost identically to the proprietary Classification and Regression Tree analysis tool (Insightful Corporation).
The second analysis, reported in Buchanan et al. (2011), determined which natural landscape features explain most of the variation in macroinvertebrate metrics at 184 ‘reference’ sampling sites in the larger Chesapeake Bay basin. Reference sites have excellent local habitat conditions and no identified water quality problems. All of the following criteria are met: 5.5 < pH < 9, conductivity <500 μS cm−1, DO >5 mg L−1 and Rapid Bioassessment Protocol (RBP) habitat scores of 16 or more of a possible 20 points. The protocols for evaluating stream habitat conditions are described by Barbour et al. (1999). They are presently used by most monitoring programmes in the Middle Potomac study area. Conditions in reference site catchments are roughly comparable to those in baseline scenarios. Level 4 ecoregions in the Mid-Atlantic States are described in Wood, Omernik & Brown (1999), and the spatial coverage files are available through www.epa.gov.
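The reference-site criteria listed above amount to a simple conjunction of tests. The sketch below shows one way to encode them; the function and parameter names are hypothetical, and treating the habitat criterion as "every supplied RBP habitat score is 16 or more" is an assumption about how the scores were applied.

```python
def is_reference_site(ph, conductivity_us_cm, do_mg_l, rbp_habitat_scores):
    """True when a site meets all of the water quality and habitat criteria:
    5.5 < pH < 9, conductivity < 500 uS/cm, DO > 5 mg/L and RBP habitat
    scores of 16 or more (assumed here to apply to every supplied score)."""
    return (
        5.5 < ph < 9
        and conductivity_us_cm < 500
        and do_mg_l > 5
        and all(score >= 16 for score in rbp_habitat_scores)
    )
```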
Develop flow alteration–ecological response relationships
A large suite of family-level, stream macroinvertebrate metrics was calculated from a CBP database with standardised programmes. The metric values and their numeric scores are available at www.chesapeakebay.net. The scoring approach takes into consideration differences due to bioregion, and if needed stream order, season and karst geology, and converts a sample's biometric values to scores on a common scale of low to high status (Buchanan et al., 2011). Scoring is based on the distribution of each metric's values in populations found at reference sites in each bioregion. The highest scores are assigned to high values of metrics that respond negatively to degradation (e.g. % EPT, % of macroinvertebrates that are Ephemeroptera, Plecoptera or Trichoptera) and low values of metrics that respond positively to degradation (e.g. % tolerants). Between 70% and 80% of metric values observed in each bioregion's reference population receive high scores, which are considered acceptable and indicative of ‘fair’ or better biological conditions. Numeric scores for the composite Chessie BIBI index were divided into three high-ranking categories (‘fair’, ‘good’ and ‘excellent’) and two low-ranking categories (‘poor’ and ‘very poor’).
Twenty macroinvertebrate metrics were considered in this study (Table 2). Pearson correlation (Microsoft Excel 2007) was used to evaluate the strength of the linear relationships between the 20 macroinvertebrate metrics and alteration in the 24 flow metrics. From this matrix, six flow metrics and seven macroinvertebrate metrics were chosen for development of the FA–E curves. The six selected flow metrics were accurately simulated, responsive to impervious surface area, easily understood, exhibited a broad range of either negative or positive alteration and/or are commonly used by water resource agencies: high flow volume index MH21, high flow duration DH17, high pulse count, low pulse duration, flashiness and the 3-day maximum. The seven macroinvertebrate metrics were diverse measures of community composition and function: % EPT, the Hilsenhoff family-level biotic index, % Chironomidae, a family-level Shannon–Wiener diversity index, % scrapers, % clingers and the multimetric Chessie BIBI. Many of the region's state and local monitoring programmes calculate the first three metrics and use them to evaluate stream health. The Chessie BIBI is used by CBP to report stream health (www.chesapeakebay.net).
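For readers who wish to reproduce the screening step outside a spreadsheet, the Pearson coefficient for one pair of variables (alteration in one flow metric against one macroinvertebrate metric) can be computed as in the sketch below. The function name is illustrative; the study itself used Microsoft Excel 2007.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

Repeating the calculation for every pairing of the 20 macroinvertebrate metrics with the 24 flow metrics yields the correlation matrix from which the candidate metrics were drawn.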
Metric name | Type | Description |
---|---|---|
ASPT modified index | T | Average of the family-level tolerance score of each family present in sample |
Beck's index | T | Index is [(3 × #families with tolerance value of 0) + (2 × #families with tolerance value of 1) + (1 × #families with tolerance value of 2)] |
Chessie BIBI | Index | Chesapeake Bay basinwide Benthic index of biological integrity; also called ‘Chessie BIBI’; average of scores of five bioregion-specific, family-level metrics |
Gold index | C | Index is 1 minus the proportional abundances (percents) of gastropods (snails), oligochaetes (segmented worms) and Diptera (true flies) individuals |
Hilsenhoff family biotic index | T | Also called FBI; average of the family-level tolerance score of each individual |
# Ephemeroptera families | R | Number of Ephemeroptera (mayflies) families present |
# Sensitive taxa | T | Number of families with family-level tolerance values less than or equal to 3 |
% Chironomidae | C | Per cent of individuals that are chironomids (non-biting midges) |
% Clingers | H | Per cent of individuals present that are adapted for clinging to hard surfaces |
% Collectors | FG | Per cent of individuals that are collectors (filterers + gatherers) and not predatory |
% Dominant3 | T | Per cent of individuals in the three most common families |
% Ephemeroptera | C | Per cent of individuals that are Ephemeroptera (mayflies) |
% EPT | C | Per cent of individuals that are Ephemeroptera (mayflies), Plecoptera (stoneflies) and Trichoptera (caddisflies) |
% Filterers | FG | Per cent of individuals that are adapted for filtering fine particles from the water column |
% Gatherers | FG | Per cent of individuals that are adapted for gathering a range of food particle sizes |
% Scrapers | FG | Per cent of individuals that are adapted for scraping periphyton (algae, bacteria) from hard surfaces |
% Swimmers | H | Per cent of individuals that are adapted for swimming |
% Tolerants | T | Per cent of individuals with family-level tolerance values greater than or equal to 7 |
Shannon–Wiener index | R | Index is a measure of taxonomic diversity; it is the negative of the sum, over all families, of the proportion of each family times the log of its proportion |
Taxa richness | R | The number of family-level taxa in the sample |
- Calculations are derived from raw counts by a Chesapeake Bay Program (CBP) software program. Metric type: C, composition; T, tolerance; R, richness; FG, feeding group; H, habit; index, multimetric index. Family-level pollution tolerance values were originally developed by Hilsenhoff (1988) and refined by CBP.
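Three of the metrics in Table 2 are sketched below from family-level counts. This is not the CBP software: the sample layout, the function names and the tolerance values in the usage note are illustrative assumptions, and the Shannon–Wiener function assumes the natural logarithm with the conventional negative sign.

```python
from math import log

EPT_ORDERS = {"Ephemeroptera", "Plecoptera", "Trichoptera"}


def percent_ept(sample):
    """Per cent of individuals belonging to the three EPT orders."""
    total = sum(f["count"] for f in sample.values())
    ept = sum(f["count"] for f in sample.values() if f["order"] in EPT_ORDERS)
    return 100.0 * ept / total


def hilsenhoff_fbi(sample):
    """Family biotic index: family-level tolerance averaged over individuals."""
    total = sum(f["count"] for f in sample.values())
    return sum(f["count"] * f["tolerance"] for f in sample.values()) / total


def shannon_wiener(sample):
    """Family-level diversity: -sum(p * ln p) over families present."""
    total = sum(f["count"] for f in sample.values())
    proportions = [f["count"] / total for f in sample.values() if f["count"] > 0]
    return -sum(p * log(p) for p in proportions)
```

A sample is passed as a dictionary keyed by family, for example `{"Perlidae": {"count": 10, "order": "Plecoptera", "tolerance": 1}}`, where the tolerance value shown is a placeholder rather than a CBP-refined value.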
FA–E curves were constructed using quantile regression in R software (quantreg package by R. Koenker, 27 February 2011, available at www.r-project.org) and conditional probability [custom Microsoft Excel 2007 script by Interstate Commission on the Potomac River Basin (ICPRB)]. The quantile regressions were applied to macroinvertebrate metric values for both negative and positive flow alterations. The regressions tracked the 90th percentile of macroinvertebrate metric values that respond negatively to increasing degradation (e.g. % EPT) and the 10th percentile of the metric values that respond positively to increasing degradation (e.g. % tolerants). The conditional probability method was applied to macroinvertebrate metric scores. The frequencies of acceptable (high) scores were calculated in increments of 20 or more observations as flow alteration diverged from the baseline, or 0% alteration, in the predominant direction of change. A locally weighted regression (LOESS) curve with confidence intervals was then generated from the probability values (see Figure S1 in Supporting Information).
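The conditional probability step can be sketched as follows. This is one possible reading of the procedure, not the ICPRB Excel script: observations are ordered by their distance from 0% alteration, grouped into consecutive increments of at least 20, and the frequency of acceptable scores is reported for each increment. The function name, the fixed bin size and the rule for folding a short final group into the previous one are assumptions.

```python
def acceptable_frequency_by_bin(alteration, acceptable, bin_size=20):
    """Group observations by increasing distance from 0% alteration and
    return (mean alteration, fraction of acceptable scores) per group.
    Every group holds at least bin_size observations when enough exist."""
    pairs = sorted(zip(alteration, acceptable), key=lambda p: abs(p[0]))
    chunks = [pairs[i:i + bin_size] for i in range(0, len(pairs), bin_size)]
    if len(chunks) > 1 and len(chunks[-1]) < bin_size:
        tail = chunks.pop()
        chunks[-1].extend(tail)  # keep each increment at 20 or more observations
    result = []
    for chunk in chunks:
        mean_alteration = sum(a for a, _ in chunk) / len(chunk)
        frequency = sum(1 for _, ok in chunk if ok) / len(chunk)
        result.append((mean_alteration, frequency))
    return result
```

The returned pairs are the kind of probability values to which a smoothing curve such as LOESS could then be fitted.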
Results
Flow simulations
The HSPF model was calibrated with the CBP autocalibration routine (U.S. Environmental Protection Agency, 2010). Simulated and observed time series of daily flows under existing conditions were evaluated with the Nash–Sutcliffe efficiency (NSE) coefficient (Nash & Sutcliffe, 1970) and the coefficient of determination (R²). (The NSE coefficient assesses deviation from the 1 : 1 relationship in pairs of simulated and observed discharge. Decreasing positive values indicate poorer model efficiency; negative values indicate that the model is not a reliable predictor of discharge.) Forty-three USGS stream gauges in the Potomac River basin provided calibration data. NSE coefficients ranged from 0.33 to 0.82, and R² values ranged from 0.39 to 0.82. These results indicate an acceptable range of model error consistent with the ‘weight of evidence’ model evaluation approach (Lumb, McCammon & Kittle, 1994; Donigian, 2002).
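The two goodness-of-fit measures can be computed from paired daily flows as in the sketch below, which uses the standard definitions: NSE is one minus the ratio of the squared simulation error to the variance of the observations about their mean, and the coefficient of determination is taken as the squared Pearson correlation. The function names are illustrative.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the model is
    no better than the observed mean, negative values are worse than the mean."""
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / variance


def r_squared(observed, simulated):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(observed)
    mean_o, mean_s = sum(observed) / n, sum(simulated) / n
    sos = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    soo = sum((o - mean_o) ** 2 for o in observed)
    sss = sum((s - mean_s) ** 2 for s in simulated)
    return sos ** 2 / (soo * sss)
```

A simulation that is offset from the observations by a constant scores a perfect correlation but a reduced NSE, which is why the two statistics are usefully reported together.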
Comparisons between simulated and observed flow metrics show that the model represents the influences of natural and anthropogenic factors on Middle Potomac streamflows fairly well. Most of the current and baseline scenarios generated by the model can be used to build a hydrological foundation from which to calculate flow alteration (ELOHA step 1). Results of the comparisons are summarised here and presented in greater detail in the study by USACE et al. (2013).
Outliers in scatter plots of the baseline scenario flow metrics versus catchment area and slope indicated that the model may have difficulty in simulating undisturbed hydrographs for 59 catchments. These catchments were located chiefly in urban areas on or near karst geology in the Valleys bioregion. A final decision as to whether the catchments were accurately simulated could not be made because observed flow data in these areas were insufficient. Including these problematic catchments in the FA–E analysis could potentially introduce artefacts related to flawed simulations, so the 59 catchments were removed from the analysis data set.
With one exception (reversals), flow metrics derived from simulated baselines in the study's 242 larger catchments and from observed flows in the eleven comparably sized, undisturbed Potomac–Susquehanna catchments responded similarly to both catchment size and slope. Baseline values of most of the flow metrics overlaid observed values from the eleven undisturbed catchments. Baseline values of four flow metrics (fall rate, magnitude of the annual 3-day maximum, high pulse duration and magnitude of the August median) were slightly biased. These biases proved relatively unimportant when current scenario values, which have the same biases, were related to their baseline values to calculate percentage flow alteration.
With one exception (high pulse duration), responses of the current scenario and observed flow metric values to imperviousness mirrored each other and in most cases overlaid each other. Simulated responses to impervious surface as well as to forest cover were strong and consistent with the observed responses. An RPART analysis of 31 disturbed catchments with both simulated and observed flow data further established the model's ability to represent multiple anthropogenic effects on flow. The same land- or water-use variables split the simulated and observed flow metric values at or near the same threshold. Impoundments were identified as a primary driver of change for just one flow metric, fall rate.
Flow alteration
Normalisation of current scenario flow metrics to catchment-specific baseline scenarios (ELOHA step 2) brought into sharper focus the effect of flow alteration on macroinvertebrate metrics. Scatter plots in Fig. 2a,b illustrate how normalisation helps characterise the response of the Chessie BIBI, a stream macroinvertebrate community index, to flow alteration in one flow metric: flashiness. Streams with higher values of the Richards–Baker flashiness index (Baker et al., 2004) have greater day-to-day changes in daily mean flow relative to the overall mean of the daily flows. The scatter plot of Chessie BIBI against flashiness index value in existing conditions (current scenario) does not account for natural, regional differences in flashiness, and a distinct biological response to flashiness was not evident (Fig. 2a). When flashiness was normalised to its catchment-specific baseline and percentage alteration was plotted, a zone with no observed biological values became apparent (Fig. 2b, grey area). The edge of the zone indicates the limit of maximum possible biological status with increasing flow alteration. For many stream macroinvertebrate metrics, this zone appears when flow metrics representing the following characteristics were altered: high flow magnitude, high and low flow event duration, high and low flow event frequency and day-to-day changes in flow. The zone is small or does not appear when middle- and low-magnitude flow metrics are altered (i.e. median, August median, summer Q85Seas, base flow index, 7Q10 and average annual 1- and 3-day minima).
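The flashiness index can be sketched in a few lines of Python. The function follows the usual form of the Richards–Baker index, the sum of absolute day-to-day changes in daily mean flow divided by the sum of the daily flows. The function name and the short flow series are hypothetical, and the exact handling of the first day in the study's calculation is not stated here.

```python
def rb_flashiness(daily_flows):
    """Richards-Baker flashiness: sum of |q[i] - q[i-1]| over sum of q."""
    path_length = sum(abs(later - earlier)
                      for earlier, later in zip(daily_flows, daily_flows[1:]))
    return path_length / sum(daily_flows)


# Hypothetical daily mean flows for two streams with the same total flow.
steady_stream = [2.0, 2.0, 2.0, 2.0]
flashy_stream = [1.0, 3.0, 1.0, 3.0]
print(rb_flashiness(steady_stream))  # 0.0
print(rb_flashiness(flashy_stream))  # 0.75
```

Both streams carry the same volume of water, but only the second shows day-to-day oscillation, which is what the index isolates.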

Factors unrelated to flow alteration affect stream macroinvertebrate communities in the Potomac River basin. Eutrophication, toxic chemical pollution, acidity, sediment, temperature and pathogens are often implicated in the list of impaired and threatened Potomac waters submitted to the United States Environmental Protection Agency (USEPA) by the region's jurisdictions (http://www.epa.gov/waters/ir/index.html). The influence of non-flow factors can weaken and confound FA–E relationships. In Fig. 2b, two groups of streams of contrasting quality are identified and given separate least-squares regressions. The regression line for the reference streams (described above) has a high y-axis intercept and is short in length because only a few sites have measurable flow alteration. It is noteworthy that some reference sites have poor (17–<30) and even very poor (<17) Chessie BIBI scores, indicating the presence of stressors other than pH, DO, conductivity and habitat condition. The ‘degraded’ streams in Fig. 2b experience one or more of the following: pH <5.5, pH >9, DO <5 mg L⁻¹, conductivity >500 μS cm⁻¹ or stream habitat scores of 8 or less of a possible 20 points. The least-squares regression line for this group has a lower y-axis intercept and a longer, shallower slope. Individual degraded sites can approach the limit of maximum possible biological status (grey zone edge), which suggests biota can tolerate exceedances of one or two of the degradation criteria if most non-flow factors are not stressful.
In the Middle Potomac study data set, alteration in the flow metrics tended to be either positive or negative. Alteration in flashiness was usually positive (i.e. flashier) except in a few catchments with large discharges and/or withdrawals, which showed negative alteration (Fig. 2b,c). Alteration in frequency and rate-of-change flow metrics also occurred most often in the positive direction (e.g. more frequent; faster). Alteration in the 1- and 3-day annual maxima, annual mean, base flow index, 7Q10 and Q85Seas tended to be positive, while the median, August median and 1- and 3-day annual minima showed roughly equal proportions of negative and positive alterations and relatively small amounts of change. Alteration in the duration of high and low flow events occurred most often in the negative direction (i.e. shorter). The intensity of flow alteration also differed by flow metric. For example, in some streams, positive alteration exceeded 600% in rise rate and 280% in high pulse count, but was <83% in 3-day maximum. In other streams, negative alteration in low pulse count and duration exceeded −100%, indicating the streams no longer experienced the low pulses found in their baseline flow time series, but alteration in the mid-range flow metrics of these streams was relatively small (e.g. annual mean, median, August median, flood-free season).
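Percentage alteration, as used throughout these results, can be written as the change in a flow metric relative to its catchment-specific baseline value. The Python sketch below assumes the conventional formula 100 × (current − baseline) / baseline; the function name and the metric values are hypothetical.

```python
def percent_alteration(current, baseline):
    """Percentage change of a flow metric relative to its baseline value."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to express a percentage")
    return 100.0 * (current - baseline) / baseline


# Hypothetical metric values (baseline scenario, current scenario).
low_pulse_count = percent_alteration(current=0, baseline=4)
rise_rate = percent_alteration(current=7.0, baseline=1.0)
print(low_pulse_count)  # -100.0: low pulses present at baseline no longer occur
print(rise_rate)        # 600.0: rise rate is seven times its baseline value
```

A value of 0% marks a stream whose current metric equals its baseline, and the sign gives the direction of alteration discussed above.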
Natural variability
Catchment size and karst geology govern much of the natural hydrological regime in the Middle Potomac study area and can be used to create hydrological classes (ELOHA step 3). In the RPART analysis of well-simulated, undisturbed catchments, catchment area was the factor that initially split the values of flow metrics representing high magnitude, duration and frequency of high and low flow events and rate of change (primary split). Larger catchments have fewer, longer high flow events with proportionally lower annual maxima. They have slower rise rates, are less flashy and tend to have somewhat fewer and longer low pulses and higher 7Q10 values. Karst appears to be an important factor affecting low flow metrics. Baseline scenarios with relatively high karst percentages have proportionally higher annual minima, base flows, Q85Seas and August medians; fewer extreme low flow events; and slower fall rates. Annual mean, median, duration of extreme low flow, high flow duration DH17 and number of reversals did not split strongly at the first RPART node for any of the landscape variables (<5% reduction in variability). Physiographic region was the primary splitting factor for just one flow metric, high flow volume index MH21, and it accounted for only 11.4% of the variability.
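The idea of a primary split and its reduction in variability can be sketched as follows. This is not the RPART implementation used in the study; it is a single-split regression tree in Python that searches one predictor for the threshold giving the largest proportional drop in the sum of squared deviations. The function name and the six data points are hypothetical.

```python
def best_primary_split(predictor, response):
    """Return (threshold, fraction of variability removed) for one predictor."""

    def sum_sq(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    pairs = sorted(zip(predictor, response))
    responses = [y for _, y in pairs]
    total = sum_sq(responses)
    best_threshold, best_reduction = None, 0.0
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical predictor values
        within = sum_sq(responses[:k]) + sum_sq(responses[k:])
        reduction = (total - within) / total if total > 0 else 0.0
        if reduction > best_reduction:
            # Place the threshold midway between neighbouring predictor values.
            best_threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best_reduction = reduction
    return best_threshold, best_reduction


# Hypothetical catchment areas (km2) and a baseline flow metric.
area = [1, 2, 3, 10, 11, 12]
metric = [5, 5, 5, 1, 1, 1]
print(best_primary_split(area, metric))  # (6.5, 1.0)
```

Running the same search for each landscape variable and keeping the one with the largest reduction would identify the primary splitting factor for a flow metric, and a best reduction below 0.05 would correspond to the "did not split strongly" case described above.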
The CART analysis of macroinvertebrate communities from reference streams in and around the Middle Potomac study area indicated that hydrological classes based on catchment size and karst geology do not minimise natural variability in macroinvertebrates. Macroinvertebrate metric values in high-quality streams are more affected by physiographic region. Values of eleven of fifteen macroinvertebrate metrics split primarily and strongly on USEPA level 4 ecoregions, indicating that ecoregion explains much of the natural variation in macroinvertebrates (Buchanan et al., 2011). The four remaining macroinvertebrate metrics split first on elevation or latitude and subsequently on ecoregion. Responses to Strahler order, which corresponds to catchment size in the Middle Potomac catchment, and to hydrogeomorphology, which identifies carbonate rock layers underlying karst geology, were few and secondary. Level 4 ecoregions are areas in which natural features such as soils, vegetation, climate, topography and physiography are relatively homogeneous (Bryce, Omernik & Larsen, 1999; Omernik, 2004). Ecoregions are distinct from each other and reflect elevation and latitude differences. They can be aggregated into larger physiographic regions called bioregions (shown in Fig. 1) and still maintain discriminatory power for stream macroinvertebrates (Astin, 2006; Buchanan et al., 2011). More than 80% of streams in the Middle Potomac study area flow across two bioregions by the time their catchments exceed 500 km². Nevertheless, a Kruskal–Wallis one-way analysis of variance test of 42 macroinvertebrate metrics from a large sample of reference sites found Ridges, Valleys and Piedmont differences in 38 (90%) of the metrics (P < 0.05).
Flow alteration–ecological response relationships
FA–E relationships (ELOHA step 4) were developed from 1,155 macroinvertebrate samples associated with 656 of the original 747 catchments in the Piedmont, Ridges and Valleys bioregions of the Potomac river system (median size 11 km²). Flows in these catchments were well simulated. Relationships between flow alteration and macroinvertebrates were explored with three statistical methods. The first method, Pearson correlation, tested linear relationships between 20 macroinvertebrate metrics and percentage alteration in 24 flow metrics (Table 3). The largest coefficients (|r| > 0.25) were associated with magnitudes of high flow events, durations and frequencies of both high and low flow events and day-to-day change in daily mean flow. The strength of these relationships is roughly the same for many macroinvertebrate metrics. Correlation coefficients for the middle- and low-magnitude flow metrics approached zero (|r| < 0.23). They include the median, August median, Q85Seas, base flow index, 7Q10 and average annual 1- and 3-day minima.
Pearson correlation coefficients | Response to degradation | 1-day maximum | 3-day maximum | Annual mean | Median | Q85Seas | August median | Base flow index | 1-day minimum | 3-day minimum | 7Q10 | High pulse duration | High flow index MH21 | High flow duration DH17 | Flood-free season | Low pulse duration | Extreme low flow duration | High pulse count | High flow frequency | Number reversals | Low pulse count | Extreme low flow frequency | Flashiness | Rise rate | Fall rate | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Chessie BIBI | Neg | −0.38 | −0.40 | −0.30 | 0.09 | 0.09 | 0.21 | 0.10 | 0.07 | 0.07 | 0.03 | 0.31 | 0.47 | 0.47 | 0.42 | 0.37 | 0.27 | −0.37 | −0.33 | −0.45 | −0.33 | −0.23 | −0.43 | −0.40 | −0.23 | |
% EPT | Neg | −0.36 | −0.36 | −0.32 | 0.04 | 0.05 | 0.15 | 0.08 | 0.04 | 0.04 | 0.02 | 0.27 | 0.40 | 0.42 | 0.40 | 0.33 | 0.35 | −0.34 | −0.25 | −0.40 | −0.26 | −0.19 | −0.38 | −0.37 | −0.17 | |
% Ephemeroptera | Neg | −0.33 | −0.33 | −0.26 | 0.06 | 0.10 | 0.21 | 0.10 | 0.07 | 0.07 | 0.08 | 0.26 | 0.40 | 0.40 | 0.35 | 0.32 | 0.29 | −0.31 | −0.28 | −0.37 | −0.31 | −0.19 | −0.36 | −0.33 | −0.15 | |
# Ephemeroptera families | Neg | −0.42 | −0.40 | −0.31 | 0.11 | 0.12 | 0.23 | 0.11 | 0.07 | 0.07 | 0.05 | 0.34 | 0.43 | 0.42 | 0.40 | 0.34 | 0.25 | −0.39 | −0.32 | −0.42 | −0.33 | −0.19 | −0.43 | −0.41 | −0.16 | |
# Sensitive taxa | Neg | −0.38 | −0.36 | −0.30 | 0.04 | −0.05 | 0.05 | −0.02 | −0.06 | −0.07 | −0.10 | 0.27 | 0.39 | 0.40 | 0.37 | 0.34 | 0.21 | −0.37 | −0.30 | −0.41 | −0.23 | −0.11 | −0.39 | −0.38 | −0.22 | |
Hilsenhoff family index | Pos | 0.36 | 0.34 | 0.31 | −0.02 | −0.01 | −0.10 | −0.05 | −0.01 | −0.01 | 0.01 | −0.26 | −0.32 | −0.36 | −0.37 | −0.28 | −0.26 | 0.33 | 0.22 | 0.35 | 0.21 | 0.14 | 0.35 | 0.35 | 0.17 | |
ASPT modified index | Pos | 0.41 | 0.39 | 0.33 | −0.09 | −0.01 | −0.12 | −0.03 | 0.02 | 0.02 | 0.04 | −0.29 | −0.39 | −0.42 | −0.41 | −0.33 | −0.23 | 0.39 | 0.28 | 0.41 | 0.27 | 0.18 | 0.42 | 0.41 | 0.20 | |
% Tolerant | Pos | 0.33 | 0.32 | 0.29 | 0.01 | −0.07 | −0.14 | −0.11 | −0.09 | −0.09 | −0.06 | −0.23 | −0.32 | −0.34 | −0.36 | −0.27 | −0.28 | 0.28 | 0.21 | 0.32 | 0.20 | 0.12 | 0.31 | 0.31 | 0.21 | |
% Dominant3 | Pos | 0.31 | 0.31 | 0.24 | −0.05 | −0.07 | −0.14 | −0.10 | −0.08 | −0.08 | −0.03 | −0.24 | −0.34 | −0.34 | −0.30 | −0.28 | −0.21 | 0.29 | 0.26 | 0.32 | 0.22 | 0.10 | 0.32 | 0.30 | 0.18 | |
% Chironomidae | Pos | 0.38 | 0.40 | 0.37 | 0.01 | −0.13 | −0.22 | −0.17 | −0.15 | −0.14 | −0.11 | −0.29 | −0.41 | −0.43 | −0.42 | −0.33 | −0.37 | 0.35 | 0.24 | 0.42 | 0.32 | 0.25 | 0.40 | 0.39 | 0.20 | |
Beck's index | Neg | −0.38 | −0.37 | −0.29 | 0.08 | −0.03 | 0.08 | −0.01 | −0.05 | −0.06 | −0.10 | 0.27 | 0.42 | 0.42 | 0.37 | 0.34 | 0.25 | −0.37 | −0.30 | −0.42 | −0.26 | −0.15 | −0.40 | −0.39 | −0.21 | |
Gold index | Neg | −0.36 | −0.36 | −0.32 | 0.07 | 0.13 | 0.22 | 0.16 | 0.13 | 0.13 | 0.08 | 0.28 | 0.38 | 0.40 | 0.39 | 0.30 | 0.34 | −0.32 | −0.20 | −0.38 | −0.29 | −0.25 | −0.38 | −0.37 | −0.14 | |
% Gatherers | Pos | 0.28 | 0.31 | 0.32 | 0.09 | 0.00 | −0.07 | −0.08 | −0.06 | −0.05 | −0.01 | −0.18 | −0.30 | −0.33 | −0.33 | −0.27 | −0.30 | 0.26 | 0.18 | 0.34 | 0.19 | 0.16 | 0.30 | 0.30 | 0.22 | |
% Collectors | Pos | 0.23 | 0.20 | 0.21 | 0.03 | 0.07 | 0.03 | 0.03 | 0.05 | 0.05 | 0.10 | −0.16 | −0.14 | −0.16 | −0.20 | −0.11 | −0.09 | 0.20 | 0.10 | 0.19 | 0.07 | 0.07 | 0.20 | 0.21 | 0.07 | |
% Scrapers | Neg | −0.28 | −0.32 | −0.24 | 0.05 | 0.12 | 0.21 | 0.13 | 0.11 | 0.11 | 0.12 | 0.23 | 0.37 | 0.35 | 0.33 | 0.24 | 0.20 | −0.27 | −0.24 | −0.32 | −0.32 | −0.24 | −0.33 | −0.30 | −0.17 | |
% Filterers | Neg | −0.09 | −0.16 | −0.16 | −0.09 | 0.07 | 0.11 | 0.12 | 0.12 | 0.11 | 0.10 | 0.06 | 0.21 | 0.23 | 0.19 | 0.21 | 0.27 | −0.11 | −0.11 | −0.21 | −0.16 | −0.13 | −0.15 | −0.14 | −0.20 | |
% Swimmers | Neg | −0.28 | −0.30 | −0.26 | 0.00 | 0.05 | 0.13 | 0.07 | 0.05 | 0.04 | 0.03 | 0.23 | 0.33 | 0.35 | 0.29 | 0.32 | 0.28 | −0.26 | −0.22 | −0.35 | −0.25 | −0.18 | −0.31 | −0.29 | −0.18 | |
% Clingers | Neg | −0.23 | −0.22 | −0.22 | −0.04 | 0.05 | 0.08 | 0.08 | 0.07 | 0.07 | 0.07 | 0.15 | 0.18 | 0.20 | 0.26 | 0.18 | 0.14 | −0.19 | −0.12 | −0.18 | −0.11 | −0.05 | −0.20 | −0.21 | −0.16 | |
Shannon–Wiener index | Neg | −0.33 | −0.33 | −0.26 | 0.04 | 0.05 | 0.12 | 0.07 | 0.05 | 0.05 | 0.00 | 0.26 | 0.33 | 0.35 | 0.31 | 0.28 | 0.22 | −0.31 | −0.26 | −0.33 | −0.21 | −0.08 | −0.33 | −0.32 | −0.20 | |
Taxa richness | Neg | −0.29 | −0.27 | −0.22 | 0.00 | −0.02 | 0.03 | 0.01 | −0.02 | −0.03 | −0.07 | 0.21 | 0.26 | 0.27 | 0.25 | 0.24 | 0.13 | −0.27 | −0.23 | −0.28 | −0.15 | −0.03 | −0.27 | −0.28 | −0.19 |
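Each coefficient in Table 3 is a Pearson correlation between one macroinvertebrate metric and percentage alteration in one flow metric. A minimal Python sketch of that calculation is given below, together with a hypothetical helper that applies the |r| > 0.25 cutoff used in the text. The function names are assumptions; the two coefficients passed to the helper are the Chessie BIBI values for flashiness and median from the table.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_notable(r, cutoff=0.25):
    """True when the coefficient exceeds the cutoff in absolute value."""
    return abs(r) > cutoff


# Chessie BIBI coefficients taken from the table above.
print(is_notable(-0.43))  # flashiness: True
print(is_notable(0.09))   # median: False
```

The sign of the coefficient is read against the "Response to degradation" column: a metric that declines with degradation is expected to correlate negatively with alteration in, for example, flashiness.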
The tendency of flow alteration to occur predominantly in one direction in the study area often leads to discernible biological responses in one direction, but not in the other (e.g. Fig. 2c). The FA–E relationships of six flow metrics and seven macroinvertebrate metrics were developed for the dominant direction of alteration in each flow metric.
The quantile regression method for delineating FA–E relationships overcomes much of the confounding effects of poor water quality and stream habitat on the macroinvertebrate metric values. It does not account for underlying biological differences relating to bioregion. The strength of each quantile regression FA–E relationship is shown in Table 4 for the predominant direction of flow alteration. Responsiveness of the seven macroinvertebrate metrics to flow alteration varied. The Chessie BIBI and % scrapers quantile regressions with all six flow metrics were significant at P < 0.05, whereas only three of the six % clingers quantile regressions were significant at P < 0.05.
Biometric / flow metric (dominant direction of alteration) | 3-day maximum (+) | High flow volume index MH21 (−) | High flow duration DH17 (−) | Low pulse duration (−) | High pulse count (+) | Flashiness (+) |
---|---|---|---|---|---|---|
Chessie BIBI | *** | *** | *** | * | *** | *** |
Shannon–Wiener | ** | *** | ** | ns | * | ** |
Hilsenhoff FBI | ** | ns | ns | * | *** | ** |
% EPT | ns | * | *** | *** | ns | ** |
% Chironomidae | ns | ** | *** | ns | ns | ** |
% Scrapers | *** | *** | *** | ** | ** | *** |
% Clingers | ns | * | ** | ns | ns | * |
- ***P < 0.000, **P ≤ 0.01, *P ≤ 0.05; ns, P > 0.05.
The conditional probability method for delineating FA–E relationships accounts for natural, bioregional differences in macroinvertebrate metrics by scoring their values in a common fashion against bioregion-specific reference values. Metric scores are thus comparable across bioregions. The method does not overcome the confounding effects of poor water quality and stream habitat. When the method is applied to the Fig. 2c scatter plot of Chessie BIBI versus flashiness, Fig. 2d is the result. The considerable scatter in the probability points is evidence that non-flow factors are influencing biological communities. Almost all of the LOESS curves showed a strong tendency to decrease as alteration intensified in each of the six flow metrics (see Figure S2a-ap in Supporting Information). In general, the probability of a healthy biological community at any location decreased, and the probability of an unhealthy biological community increased as flow alteration increased. Of the seven biological metrics, % scrapers, % clingers and the Chessie BIBI demonstrated the most abrupt declines with flow metric alteration; % EPT and the Hilsenhoff family biotic index demonstrated the least abrupt declines.
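The conditional probability calculation can be sketched as follows. The study used a custom Excel script whose details are not given here, so the Python code below rests on stated assumptions: samples are ordered by distance from 0% alteration, grouped into consecutive bins of at least 20 observations, and the proportion of acceptable scores is reported for each bin. The function name, the bin-merging rule and the data are hypothetical, and the LOESS smoothing step is omitted.

```python
def acceptable_frequency(alterations, scores, acceptable_score, bin_size=20):
    """Return (mean alteration, proportion acceptable) for each bin.

    Inputs are assumed to cover one direction of alteration only.
    """
    pairs = sorted(zip(alterations, scores), key=lambda pair: abs(pair[0]))
    bins = [pairs[i:i + bin_size] for i in range(0, len(pairs), bin_size)]
    # Merge a short final bin into its neighbour so every bin has
    # at least bin_size observations.
    if len(bins) > 1 and len(bins[-1]) < bin_size:
        tail = bins.pop()
        bins[-1].extend(tail)
    results = []
    for group in bins:
        mean_alteration = sum(a for a, _ in group) / len(group)
        acceptable = sum(1 for _, s in group if s >= acceptable_score)
        results.append((mean_alteration, acceptable / len(group)))
    return results


# Hypothetical data: 45 sites with alteration 0-44%; the 20 least altered
# sites score 60 and the remaining 25 score 10.
site_alteration = list(range(45))
site_scores = [60] * 20 + [10] * 25
print(acceptable_frequency(site_alteration, site_scores, acceptable_score=50))
# [(9.5, 1.0), (32.0, 0.0)]
```

Plotting the second element of each pair against the first would give the probability points to which a smoother could then be fitted.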
Discussion
Assembling detailed land- and water-use information for the Middle Potomac study area and simulating the hydrological foundation of baseline and current scenarios for the study's 747 catchments (ELOHA step 1) were difficult and time-consuming, but proved to be a crucial step. The hydrological foundation overcame to varying degrees the analytical challenges of a complex topography and karst geology. Extensive examination of the relationships between natural features and flow metrics from the simulated baseline scenario and comparisons between current and observed flow metrics for comparably sized catchments built confidence in the CBP HSPF watershed model and VADEQ WOOOMM routing module. Uncertainty about streamflow simulations in 59 karst-affected catchments led to their removal from the final analysis data set. However, 111 other catchments with high levels of karst (>44%) appeared to be well simulated and were kept in the analysis. The 32 coastal plain catchments were removed due to their low number and a lack of observed data to check their simulated flow metrics. In the 656 remaining catchments, baseline and current values of eight of the fifteen tested flow metrics and their relationships with natural and urban landscape features were accurately represented. The model had small, but surmountable difficulties representing the values of five other flow metrics while accurately representing their relationships to anthropogenic factors. Overall, flow simulations in the 656 catchments were sufficient to allow a choice of flow metrics with which to develop flow alteration–ecology (FA–E) relationships.
The Middle Potomac hydrological foundation accomplished three important purposes outlined by Poff et al. (2010). It facilitated the use of biological monitoring data, in this case macroinvertebrates, from ungauged 1st–4th order streams; it provided a consistent benchmark (baseline condition) from which flow alteration could be quantified; it characterised streamflow relationships with the natural landscape and with human uses of land and water.
Pairing each catchment's current flows with its own baseline flows produced estimates of flow alteration (ELOHA step 2) that reflected the particular combination of natural and anthropogenic factors controlling streamflow in the catchment. The intensity and extent of flow alteration caused by human activities on the land (urbanisation, agriculture) and human manipulations of streamflow (withdrawals, discharges and impoundments) could be examined and accounted for with more confidence. This approach, recommended by Poff et al. (2010), also helped distinguish the effects of flow alteration on macroinvertebrates from the impacts of other stressors, namely poor stream water quality and habitat. Figure 2b illustrates how stream water quality and local habitat conditions can alter an FA–E relationship. With no change in the level of flow alteration, the biological status of a stream site can fall when local stream water quality and habitat are degraded, or it can rise if restoration efforts improve conditions. Biological status can rise to a maximum limit dictated by the site's percentage flow alteration.
Poff et al. (2010) recommend classifying rivers and streams according to the hydrological characteristics of undisturbed catchments (ELOHA step 3). They reason that biological populations in the same hydrological class, and by extension their FA–E relationships, will be similar. Within a class, or river type, communities should show less variability due to natural factors, and extrapolations to unmodelled or unmonitored rivers of the same type can be made with more confidence. The results of the Middle Potomac study suggest that at least in this region, the natural factors governing flow metrics are not the same as those governing macroinvertebrate populations. Catchment size and karst geology, the two natural factors that most strongly influence undisturbed streamflows, did not greatly influence macroinvertebrate communities at undisturbed (reference) stream sites. Bioregion proved to be the strongest factor influencing macroinvertebrate metrics. This correspondence between stream macroinvertebrates and the physiographic features of bioregions and their underlying level 4 ecoregions was shown previously (e.g. Kennan, 1999; Feminella, 2000; Hawkins et al., 2000). A classification system for 1st–4th order streams that reflects the influence of all three factors, catchment size, karst geology and bioregion, would be untenable in the study area's complex topography of ridges, valleys and plateaus. Below their headwaters, Potomac streams have a high probability of crossing into another bioregion or flowing in and out of karst geology before they reach the macroinvertebrate sampling sites.
FA–E relationships (ELOHA step 4) were developed for stream macroinvertebrates with both the quantile regression and conditional probability methods. The quantile regression method overcomes the impacts of non-flow stressors, but does not account for natural, bioregional differences in the macroinvertebrate populations. Quantile regression results could be stronger if the method is applied to larger data sets coming from a single bioregion. For general flow management purposes, quantile regression results for the entire Middle Potomac region with no consideration for bioregion are probably adequate. Differentiating bioregional effects could become important if protective biological criteria are being developed for high-quality streams.
The conditional probability method accounts for the bioregional differences in the region's macroinvertebrate populations. However, the probability of acceptable biological status at each flow alteration level is affected by the proportion of samples degraded by non-flow factors. The conditional probability method will be most successful in streams with similar water quality and habitat conditions at all levels of flow alteration.
Macroinvertebrate responses to alteration in a given flow metric vary in strength. Of the seven macroinvertebrate metrics used to construct FA–E curves, % scrapers and the Chessie BIBI demonstrated the overall strongest declines with alteration in the duration, frequency and change flow metrics. The heightened sensitivity of % scrapers can be explained by that group's reliance on periphyton food, which is washed downstream by scouring high flows. Scrapers have demonstrated greater sensitivity to changes in both flow duration and magnitude in other studies (e.g. Kennan, Riva-Murray & Beaulieu, 2010). The Chessie BIBI index, which is bioregion specific, contains several macroinvertebrate metrics that are comparatively sensitive to flow alteration, including % scrapers in Piedmont and Ridges and the Shannon–Wiener Index in all three bioregions. The weakest FA–E relationships were seen in % clingers (quantile regression method) and % EPT (conditional probability method). The weak % EPT response possibly reflects a replacement of flow-specialist taxa by more tolerant, flow-generalist taxa. For example, the net-spinning caddis fly Macrostemum sp. that constructs firm refugia cemented to bottom substrate is more tolerant of peak flows, while free-living EPT taxa clinging to epibenthic surfaces are susceptible to dislodgement (Holomuzki & Biggs, 2000). Given the variability in macroinvertebrate responses to flow alteration, the choice of biological metric(s) will affect the criteria a jurisdiction uses to establish environmental flow standards.
The confounding influences of poor water quality, degraded stream habitat and other stressful non-flow factors on the FA–E relationships cannot be overstated. Figure 3 provides a generalised illustration of how this can affect the societal choice of an impairment threshold. Hypothetical quantile regressions for excellent quality streams (solid) and severely degraded streams (dashed) are shown. Point (a) indicates where increasing flow alteration lowers biological status below an acceptable threshold in an excellent quality stream. Point (b) indicates where the threshold is crossed in a stream degraded by poor water quality and habitat conditions. Point (b) occurs at a much lower percentage flow alteration. To accurately forecast when an unacceptable change in a biological community will occur relative to flow alteration, projections must account for all present and future environmental factors expected to impact the biological community.
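The contrast between points (a) and (b) can be made concrete with a small calculation. The Python sketch below assumes straight-line FA–E relationships with the same slope and different intercepts; the function name, intercepts, slope and threshold are invented for illustration and are not taken from Figure 3.

```python
def alteration_at_threshold(intercept, slope, threshold):
    """Flow alteration (%) at which the line intercept + slope * x meets threshold."""
    if slope == 0:
        raise ValueError("a flat line never crosses the threshold")
    return (threshold - intercept) / slope


# Hypothetical biological status lines (status falls 0.25 units per 1% alteration).
threshold = 50.0
point_a = alteration_at_threshold(intercept=90.0, slope=-0.25, threshold=threshold)
point_b = alteration_at_threshold(intercept=60.0, slope=-0.25, threshold=threshold)
print(point_a)  # 160.0: excellent quality stream
print(point_b)  # 40.0: stream degraded by non-flow stressors
```

With these invented numbers, the degraded stream reaches the impairment threshold at one quarter of the flow alteration tolerated by the excellent quality stream, which is the pattern the figure is described as showing.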

Unambiguous FA–E relationships could be demonstrated in streams with no water quality or local habitat problems if those streams experience a range of flow alteration. This analysis was not carried out in the Middle Potomac study because reference-quality streams with large amounts of flow alteration were not identified in the study's 656 well-simulated catchments. Habitat conditions important to macroinvertebrates eventually deteriorate as flow alteration increases beyond a stream's intrinsic limits. Quantile regressions on the scores of several habitat metrics in the analysis data set suggest the highest habitat scores (90th percentile) begin to drop when flow alteration exceeds roughly +100% or falls below −50% in streams with no known water quality problem (unpublished analysis). Within these approximate limits, an ELOHA analysis of high-quality streams would theoretically result in a narrow band of data points parallel to and below the edge of the grey zone in Fig. 2b (e.g. Cade, Terrell & Schroeder, 1999; Cade & Noon, 2003). Regression lines through these data would define FA–E relationships unequivocally. Experimental manipulations of flow alteration in otherwise high-quality streams could achieve the same results.
In conclusion, the ELOHA approach appears to work well for streams located in a large river basin with complex topography, karst geology, few flow-altering dams, many urban areas and macroinvertebrates as the ecological response variables. The study results clearly show degradation occurring in macroinvertebrate communities as flow diverges from baseline conditions. The results could be the basis for developing and implementing specific flow alteration limits for streams in the Potomac River basin. Simulated current and baseline scenarios in each hydrological unit would be needed to quantify and manage flow alteration since stream classification proved untenable in this region. Resource managers would need to view each stream system holistically and consider all anthropogenic stressors before the relative impact of existing or future flow alteration on aquatic communities could be determined. Improvements in water quality and habitat conditions potentially could be used to ameliorate some of the biological degradation that is related to existing flow alteration. Finally, consensus on what constitutes an unacceptable level of stream biological degradation is needed before basinwide limits to flow alteration can be established. The basin's jurisdictions, including Pennsylvania, Maryland, Virginia, West Virginia and the District of Columbia, presently have their own, often differing, biological indices and criteria for determining stream impairment.
Acknowledgments
This effort was part of a larger Middle Potomac River Watershed Assessment project supported by the Army Corps of Engineers and The Nature Conservancy. Colleagues acknowledged in the project report (USACE et al., 2013) were involved in different stages of the project. We owe the following individuals special thanks for their efforts supporting the part of the project described here: Robert Burgholzer (VADEQ) for his assistance in generating flow time series for ungauged streams; Andrew Roach and Claire O'Neill of the Army Corps of Engineers, Baltimore, and Tara Moberg, Michele DePhilip, Julie Zimmerman, Kathleen Boomer, Eloise Kendy, Colin Apse, Doug Sampson and especially Stephanie Flack of The Nature Conservancy for their guidance and suggestions; Jan Ducnuigeen and Andrea Nagel (ICPRB) for their technical assistance; and two anonymous reviewers for their very constructive edits and recommendations. We thank the monitoring programmes of federal, state, interstate and local agencies and academic institutions in Maryland, Pennsylvania, Virginia and West Virginia who collected, processed and made their stream data available; the Chesapeake Bay Program who provided macroinvertebrate metrics and their HSPF model; and the United States Geological Survey and its partners who collected surface flow data at gauging stations in the Potomac and Susquehanna river basins.