Volume 20, Issue 4 pp. 924-951
Critical Review
Open Access

Recommended updates to the USEPA Framework for Metals Risk Assessment: Aquatic ecosystems

William J. Adams

Corresponding Author


Red Cap Consulting, Lake Point, Utah, USA

Address correspondence to [email protected]

Emily R. Garman

NiPERA Inc., Durham, North Carolina, USA

First published: 14 August 2023
Citations: 4

Abstract

In 2007, the USEPA issued its “Framework for Metals Risk Assessment.” The framework provides technical guidance to risk assessors and regulators when performing human health and environmental risk assessments of metals. This article focuses on advances in the science including assessing bioavailability in aquatic ecosystems, short- and long-term fate of metals in aquatic ecosystems, and advances in risk assessment of metals in sediments. Notable advances have occurred in the development of bioavailability models for assessing toxicity as a function of water chemistry in freshwater ecosystems. The biotic ligand model (BLM), the multiple linear regression model, and multimetal BLM now exist for most of the common mono- and divalent metals. Species sensitivity distributions for many metals exist, making it possible for many jurisdictions to develop or update their water quality criteria or guidelines. The understanding of the fate of metals in the environment has undergone significant scrutiny over the past 20 years. Transport and toxicity models have evolved including the Unit World Model allowing for estimation of concentrations of metals in various compartments as a function of loading and time. There has been significant focus on the transformation of metals in sediments into forms that are less bioavailable and on understanding conditions that result in resolubilization or redistribution of metals in and from sediments. Methods for spiking sediments have advanced such that the resulting chemistry in the laboratory mimics that in natural systems. Sediment bioavailability models are emerging including models that allow for prediction of toxicity in sediments for copper and nickel. Biodynamic models have been developed for several organisms and many metals. The models allow for estimates of transport of metals from sediments to organisms via their diet as well as their water exposure. All these advances expand the tool set available to risk assessors. 
Integr Environ Assess Manag 2024;20:924–951. © 2023 The Authors. Integrated Environmental Assessment and Management published by Wiley Periodicals LLC on behalf of Society of Environmental Toxicology & Chemistry (SETAC).

INTRODUCTION

In 2007, the USEPA issued its “Framework for Metals Risk Assessment” (Fairbrother et al., 2007; USEPA, 2007a). The document is intended to provide technical guidance to risk assessors and regulators when performing human health and environmental risk assessments of metals. The guidance pertains to a broad suite of assessments performed by the Environmental Protection Agency (USEPA) including site assessments under Superfund and chemical reviews under the Toxic Substances Control Act (TSCA) for new and existing substances. The USEPA Metals Framework document has not been updated since 2007, and there have been several advances in the science related to aquatic assessments that are now available for risk assessors in all fields of assessment. Metals possess properties that are uniquely different from organic compounds. Many metals are essential for the survival and growth of animals and/or plants, and are mainly regulated by active transport systems; moreover, their accumulation by biota as measured by bioaccumulation or bioconcentration factors is inversely related to exposure concentration (McGeer et al., 2003; Menzie et al., 2009). Since 2007, important advances in the science of risk assessment for metals have occurred, and the USEPA document no longer presents the latest scientific findings and modeling approaches in some key areas relevant to metals.

The goal of this article is to review the recent scientific findings related to metals risk assessment and suggest areas where the USEPA Metals Framework could be updated to reflect the latest science. We focus on four areas that are primarily associated with freshwater ecosystems:
  (1) bioavailability of metals in aquatic systems;
  (2) biodynamic models for assessing bioaccumulation;
  (3) fate of metals in aquatic ecosystems; and
  (4) evaluation of metal toxicity in sediments.

METHODS

The approach used to develop the findings of this article was to review the existing information in the USEPA Metals Framework Document and to supplement this by conducting literature searches from 2000 to present, evaluating review articles and interviews with key authors about their publications, and by evaluating aquatic toxicity models through hands-on use.
  (1) Bioavailability of metals in aquatic systems: Our literature review focused on available bioavailability models for predicting toxicity of metals to aquatic organisms in freshwater ecosystems. The review includes extensive assessment of modifying factors of toxicity as a function of water chemistry and types of models currently available.
  (2) Biodynamic models for assessing bioaccumulation: These models have been used to study the relative contribution of metals in water and food to different types of organisms and describe the kinetics of metal bioaccumulation processes over time.
  (3) Fate of metals in aquatic ecosystems: Here, we focused on advances in understanding and modeling fate processes for metals. Advances in transformation and/or dissolution, loss from the water column, application of the Unit World Model (UWM) to estimate metal fate, and fate following resuspension are discussed.
  (4) Evaluation of metal toxicity in sediments: Significant advances in the science of assessing metal toxicity in sediments have occurred with new models, sediment toxicity testing methods with spiked sediments, and interpretation of results.

BIOAVAILABILITY OF METALS IN AQUATIC SYSTEMS

Metal toxicity to aquatic life is well documented and it is known that toxicity varies as a function of water chemistry, as much as 50–100-fold due to differences in testing laboratory conditions related to pH, hardness, dissolved organic carbon (DOC), and other ions in solution. Metal bioavailability is defined as an index of the rate and extent to which the metal reaches the site of toxic action (Adams et al., 2020). The science of metal bioavailability has advanced to the point where several types of bioavailability models are available with varying levels of complexity and sophistication. This has led to renewed emphasis on nuances between protectiveness versus predictiveness and ease of use (Adams et al., 2020).

In the last two decades, regulatory use of bioavailability models in the determination of water quality standards and guidelines has grown in acceptance and application (Australian and New Zealand Governments and Australian State and Territory Governments, 2018; European Union, 2013; USEPA, 2007b).
  • To date, the USEPA (2007b) has developed its water quality criteria for copper using the biotic ligand model (BLM) (Paquin et al., 2002); Canada has, likewise, used the BLM for its water quality guideline for copper (Environment and Climate Change Canada [ECCC], 2021). The USEPA used acute data with an acute to chronic ratio, whereas Canada used only chronic data and a BLM version based on chronic data.

  • Recently, the USEPA has used a multiple linear regression (MLR) model for calculating water quality criteria for aluminum (USEPA, 2018).

  • Canada has adopted water quality guidelines for lead and zinc (ECCC, 2020) using an MLR approach.

Empirically based MLR models have mechanistic foundations similar to BLMs but utilize simple equations to predict metal toxicity on a species-specific basis (Brix et al., 2017; Brix, DeForest, et al., 2020). This is the result of renewed interest in the development of easy-to-use empirical models to supplement the BLMs to predict metal bioavailability and derive protective values for aquatic life. The predictive capacity of empirical bioavailability models is based on toxicity-modifying factors (TMFs) such as pH, DOC, and hardness.

Key considerations in the design and selection of a model include ease of use, model fit to the range of TMFs for the ecosystem of interest, goodness-of-fit to existing toxicity data sets, and robust agreement of the model to measured toxicity values. When developing a new toxicity model, it should be able to detect nonlinear responses to TMFs, should demonstrate inclusion of the key TMFs, and be applicable to a broad array of species (Garman et al., 2020).

Models specifically used for regulatory purposes should be selected based on their ability to meet performance criteria as specified through a model-validation process (Brix, 2019, Brix, DeForest, et al., 2020; Garman et al., 2020; Van Genderen et al., 2020; Schlekat et al., 2020). Model validation generally involves an assessment of a model's appropriateness, relevance, and accuracy (Garman et al., 2020). Models should be validated to a level to fit a specific purpose. Understanding the advantages and limitations by considering the tradeoffs between ease of use versus complexity allows for selection of the most appropriate model.

Toxicity-modifying factors

Toxicity of metals depends on many factors including the intrinsic properties of the individual metal and its chemical speciation, and the duration, magnitude, and route of exposure. The toxic responses to metals by aquatic organisms are poorly predicted using total metal concentrations. Noted exceptions to this are aluminum and iron, where laboratory toxicity studies for these two metals indicate that toxicity correlates with total metal and not dissolved metal because the formation of the metal hydroxides and their precipitates limits the amount of metal in the dissolved form. However, the metal hydroxide complexes that form in solution and eventually precipitate are bioavailable, and their toxicity is affected by TMFs, thus allowing their toxicity to be modeled. For most other metals, while dissolved metal concentration is a better predictor of toxicity than total metal concentration, bioavailable metal concentration is the most accurate predictor of toxicity. Metal bioavailability is a function of many TMFs that affect the speciation, bioavailability, and toxicity of metals. These factors include pH, water hardness (primarily Ca and Mg ions), alkalinity, temperature, sodium, chloride, fluoride, suspended solids, and DOC. However, the TMFs that have been shown to have the strongest influence on toxicity in terms of empirical bioavailability models are pH, hardness, and DOC (Adams et al., 2020).

Toxicity-modifying factors can affect the bioavailability of metals in different ways, subsequently affecting the physiological responses of aquatic organisms (Meyer et al., 2007). One way is by metal ions competing for binding sites on organism respiratory membranes with other key ligands (e.g., competition from H+ and hardness cations, particularly Ca2+ and Mg2+) and binding to DOC. Different metals respond differently to the effects of various TMFs. This is due to metal-specific intrinsic properties such as speciation and is dependent on the type and strength of bonds (ionic or covalent) formed with the binding sites.

One of the factors for which the influence on toxicity can vary greatly from metal to metal is pH. The mechanisms by which pH can affect metal bioavailability include changes in speciation and solubility and competitive interactions between H+ ions and the metal at biotic ligands. When pH declines below 7, many metals become more soluble and dissociate, releasing free metal ions to a greater extent. For example, aluminum (Al) at pH 5 occurs almost exclusively as Al3+. As pH increases above pH 7, many metals form complexes with hydroxides and carbonates. These forms are frequently less soluble and therefore less toxic. However, increasing the pH can increase the toxicity of metals to algae due to reduced proton competition. In marine waters, which typically have a pH of ~8, metals such as Al, iron (Fe), and lead (Pb) form hydroxides and carbonates and rapidly precipitate, resulting in very low concentrations of the metals in marine waters. Key TMFs by metal are summarized in Table 1, adapted and expanded from the US Environmental Protection Agency (2022).

Table 1. Metal toxicity-modifying factors and their importance
Metal Importance of modifying factor References Important modifying factors
Hardness pH DOC
Aluminum Hardness moderately modifies Al toxicity; pH is critical; toxicity is greatest at low pH (5) and elevated pH (>8.5); DOC consistently reduces Al toxicity. Solubility minimum around pH 6.0. DeForest et al. (2018), DeForest, Brix, et al. (2020) X X X X X
Cadmium Hardness regressions predict acute and chronic toxicity in natural waters; pH effects are weak; and the threshold for a DOC effect appears to be >5 mg/L. Clifford and McGeer (2010), Mebane et al. (2008), Niyogi et al. (2008), USEPA (2016) X X X X X
Cobalt Hardness is important and pH moderately affects gill uptake, with uptake increasing with increasing pH up to 8.7. DOC has a small effect. Borgmann et al. (2005), Diamond et al. (1990), Stubblefield et al. (2020) X X X X
Copper freshwater DOC has a strong binding affinity to Cu and significantly reduces Cu toxicity. A pH shift from acid to alkaline reduces the toxicity of Cu (larger LC50); the opposite is also true. Hardness is a small factor in natural waters on a chronic basis but important on an acute basis. Cusimano et al. (1986), Erickson et al. (1996), Markich et al. (2005), Welsh et al. (2000) X X X X X X
Copper marine Increasing DOC and salinity tend to reduce Cu toxicity in marine and estuarine waters. In marine waters, pH is stable (~8.0) and Ca and other ions are constant; hence, DOC is the driving factor. Arnold (2009), Grosell et al. (2007) X X X
Iron DOC is the most important variable for fish, Daphnia, and algae. A DOC increase in low-pH waters (5.9–6.4) results in a 10-fold reduction in algal toxicity. Ceriodaphnia dubia was influenced by DOC, but not by pH. Algae were not affected by hardness. Algae are affected by phosphorus (P) and iron precipitation of P. Arbildua et al. (2016), Brix et al. (2023), Cardwell et al. (2023) X X X X X
Lead DOC has a strong effect on the bioavailability and toxicity of Pb; pH has a moderate effect. Hardness becomes more important when DOC is low (Mebane et al., 2012). DeForest et al. (2017) X X X X X
Nickel Ni toxicity tends to moderately decrease as hardness and DOC increase; pH has an inconsistent influence on toxicity (Croteau et al., 2021; Santore et al., 2021). Croteau et al. (2021), Santore et al. (2021) X X
Silver Ag binds strongly to particles, sediments, and reduced sulfides in solution. DOC significantly reduces toxicity, but pH and hardness influences may be inconsistent. The Ag BLM is an acute BLM. Naddy et al. (2018) X
Zinc Hardness has a strong influence on Zn toxicity, decreasing toxicity with increasing hardness; fish toxicity increases with increasing pH, but the relationship may be inconsistent in other taxa. DOC reduces Zn toxicity, but some studies suggest a nonlinear relationship, with a threshold of 10 mg/L DOC required to substantially reduce toxicity. Bringolf et al. (2006), Canadian Council of Ministers of the Environment (CCME) (2018), Clifford and McGeer (2009), De Schamphelaere and Janssen (2004), Hyne et al. (2005), Ivey et al. (2019), Mebane et al. (2012) X X X X X X
  • Abbreviations: BLM, biotic ligand model; DOC, dissolved organic carbon.
  • Source: Adapted from the USEPA (2022).

Toxicity-modifying factors other than DOC, pH, and hardness have been evaluated and incorporated, to some extent, in BLMs. Temperature is a factor that has been evaluated for aluminum, and the kinetics underlying aluminum bioavailability have a strong dependency on temperature (Santore et al., 2018). Existing data for other metals such as nickel, copper, and zinc do not show the same magnitude of correlations between temperature and chronic toxicity (Pereira et al., 2017). However, temperature effects on toxicity have not been thoroughly evaluated. To some extent, this is the result of standardized tests that prescribe a given temperature for a test species. Aluminum also appears to be influenced to some degree by fluoride (Santore et al., 2018). Particulate organic carbon (POC) in suspended solids is also an important TMF due to the sorption of metals to POC. When considered, POC is frequently treated like DOC, or the binding strength to solids is estimated by the soil to particle distribution coefficient (Kd). Sodium is another important TMF due to competition at metal binding sites on the respiratory membrane (i.e., gill) of the organism and because of the need for freshwater aquatic organisms to conserve sodium. While not the focus of this article, anions such as phosphate, nitrate, and sulfate are important TMFs for the oxyanions (arsenate, selenate, molybdate, vanadate, uranate).

Bioavailability models

A variety of methods to account for the relative bioavailability of metals in aquatic systems have been developed, including hardness adjustment, water–effect ratio, Free Ion Activity Model, BLM (Paquin et al., 2002), and MLRs (Adams et al., 2020). At the time of the issuance of the USEPA Metals Framework in 2007, the acute copper and silver BLMs were available. The USEPA adopted the BLM for its water quality criteria document for copper (USEPA, 2007b). The approach was based on using an acute copper BLM with an extensive amount of acute data and a calculation of chronic effects using an acute to chronic ratio for copper. This is the only US water quality criteria document in which the BLM approach is applied. Subsequently, Canada adopted a chronic BLM for its national copper guideline (Canadian Council of Ministers of the Environment [CCME], 2021), and the European Union (EU) has adopted a bioavailability-based nickel Annual Average Environmental Quality Standard (EQS) under the Water Framework Directive (European Commission [EC], 2011). Finally, the updated USEPA aluminum water quality criteria document has adopted a chronic MLR approach in the derivation of the criteria (USEPA, 2018) based on DeForest et al. (2018) and DeForest, Brix, et al. (2021).

Biotic ligand models are based on the principle that toxicity occurs through accumulation of the metal bound to the exchange surface of the organism (the biotic ligand). MacRae (1994) first demonstrated this approach by measuring copper accumulation on fish gills in acute studies as a function of water chemistry. The BLM approach successfully combines the influences of speciation (e.g., free metal ion, DOC complexation) and cationic competition (e.g., K+, Na+, Ca2+, Mg2+) to predict metal accumulation and toxicity to aquatic organisms (Adams et al., 2020).

Most BLMs are models with a single set of binding constants and other parameters that are applied in a consistent manner to the acute and/or chronic toxicity data (Brix, DeForest, et al., 2020). The model, in theory, is only adjusted for differences in metal sensitivity among species for a given endpoint and exposure duration. Consequently, adjustments are made to the sensitivity parameter (i.e., the LA50, the lethal accumulation for 50% of organisms), rather than using two independent models, to deal with both acute and chronic toxicity.
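The competitive-binding idea behind the BLM can be illustrated with a minimal sketch. It assumes a single type of binding site at equilibrium; the function name, the binding constants, and the ion activities are hypothetical placeholders chosen for illustration and are not taken from any published BLM.

```python
def biotic_ligand_occupancy(free_metal, log_k_metal, competitors, log_k_competitors):
    """Fraction of biotic ligand sites occupied by the metal at equilibrium.

    free_metal        : free metal ion activity (mol/L)
    log_k_metal       : log10 binding constant of the metal to the ligand
    competitors       : dict of competing cation name -> activity (mol/L)
    log_k_competitors : dict of competing cation name -> log10 binding constant
    """
    metal_term = 10 ** log_k_metal * free_metal
    competition = sum(
        10 ** log_k_competitors[name] * activity
        for name, activity in competitors.items()
    )
    # Single-site competitive binding: occupied / (free + all occupied)
    return metal_term / (1.0 + metal_term + competition)


# Hypothetical constants and activities, for illustration only
LOG_K = {"Ca": 3.6, "H": 5.4}
soft_water = biotic_ligand_occupancy(1e-8, 7.4, {"Ca": 1e-4, "H": 1e-7}, LOG_K)
hard_water = biotic_ligand_occupancy(1e-8, 7.4, {"Ca": 1e-3, "H": 1e-7}, LOG_K)
print(f"occupancy in soft water: {soft_water:.3f}")
print(f"occupancy in hard water: {hard_water:.3f}")
```

In this sketch, raising the calcium activity lowers the fraction of sites occupied by the metal, which is the competition effect described above. A full BLM would also compute the free metal ion activity from a speciation model (including DOC complexation) and compare the predicted accumulation with the species-specific sensitivity parameter.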

A common misconception is that there is one BLM for each metal that can be used for all species to account for either bioavailability differences in water chemistry in which the organisms were tested or to develop a threshold for effects in each water body. However, experience has demonstrated that not all organisms respond the same way to water chemistry changes, especially pH. For example, algae often respond differently than fish and invertebrates, with toxicity increasing (lower EC values) at higher pH values rather than at lower pH levels. This has necessitated the development of BLMs specific to a given group(s) of organisms (Mebane et al., 2020). To normalize a given data set to a standard set of water chemistry conditions, separate BLM calculations may be needed for different groups of organisms. A list of BLMs by metal and the organism groups to which they apply is provided in Table 2 (expanded from the USEPA, 2022). To simplify things, the various software packages typically have multiple models in one computer package, allowing for calculations for algae, invertebrates, fish, and occasionally plants and amphibians.

Table 2. MLRs for predicting metal toxicity to aquatic organisms
Metal MLR model Equation
Aluminum The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) were used to determine which combination of terms resulted in the most parsimonious models for predicting EC20s for each species. Pseudokirchneriella subcapitata and Pimephales promelas EC20s were based on the AIC and BIC; Ceriodaphnia dubia was based on AIC. All three models included interaction terms. The USEPA (2018) approach did not use the interaction terms. No acute model was developed.

P. subcapitata: Al ln(EC20) = −61.952 + 1.678 × ln[DOC] + 17.019 × pH + 4.007 × ln[H] − 1.020 × pH^2 − 0.204 × (ln[DOC] × pH) − 0.556 × (ln[H] × pH).

C. dubia: Al ln(EC20) = −41.026 + 0.525 × ln[DOC] + 11.282 × pH + 2.201 × ln[H] − 0.663 × pH^2 − 0.264 × (ln[H] × pH).

P. promelas: Al ln(EC20) = −14.029 + 0.503 × ln[DOC] + 3.131 × pH + 3.443 × ln[H] − 0.494 × (ln[H] × pH).

DeForest et al. (2018).

Copper Pooled model. Following USEPA guidelines for WQC development, species data were then combined to develop a linear model with pooled slopes for each independent parameter (i.e., DOC, pH, and hardness) and species-specific intercepts using analysis of covariance.

Acute: Exp(0.700 × ln(DOC) + 0.579 × ln(Hard) + 0.778 × pH − 6.738).

Chronic: Exp(0.855 × ln(DOC) + 0.221 × ln(Hard) + 0.216 × pH − 1.402).

Brix et al. (2021).

Lead A pooled chronic model was developed using C. dubia and L. stagnalis to represent invertebrates and a species-specific P. promelas model to represent fish. A separate chronic MLR model was also developed for an alga (Raphidocelis subcapitata).

Acute: CMC (μg Pb/L) = exp(0.776 × ln(DOC) + 0.437 × ln(Hard) + 0.445 × pH − 1.150); DeForest, Tear, et al. (2020).

Chronic: C. dubia = exp(0.865 × ln(DOC) + 0.131 × ln(Hard) + 0.206 × pH + 0.921).

Chronic: L. stagnalis = exp(0.865 × ln(DOC) + 0.131 × ln(Hard) + 0.206 × pH – 1.109).

Chronic: P. promelas = exp(0.433 × ln(DOC) + 1.428 × ln(Hard) + 0.146); DeForest, Tear, et al. (2020).

A pooled MLR lead model based on data for five different species, with DOC and hardness as variables and an R2 value of 0.72, is currently used in Canada (ECCC, 2020): FWQG = exp(0.514[ln(DOC)] + 0.214[ln(Hardness)] + 0.4152).
Nickel

Acute model = pooled stepwise regression model with only DOC and hardness.

Chronic model: Pooled all group stepwise regression that produced the same model results whether interactions were considered and whether AIC and BIC were used. The model selected included only hardness and DOC.

Acute: ECx (μg/L) = exp(0.454 × ln(hardness) + 0.084 × ln(DOC) + intercept).

Chronic: ECx (μg/L) = exp(0.471 × ln(hardness) + 0.147 × ln(DOC) + intercept).

Croteau et al. (2021).

Pooled Daphnia model used to standardize toxicity values for all species with acute data to common water chemistry. The Oncorhynchus mykiss model (hardness and pH) was used to standardize chronic toxicity values for all species to common water chemistry.

Acute: EC50 = exp[ln(EC50meas) – 0.240(ln[DOCmeas] − ln[DOCtarget]) − 0.833(ln[hardmeas] − ln[hardtarget])].

Chronic: EC10 = exp[ln(EC10meas) − 0.995(ln[hardnessmeas] − ln[hardnesstarget]) + 0.847(pHmeas − pHtarget)] CCME (2018).

Zinc Pooled model. Following USEPA guidelines for WQC development, the species data were combined to develop a linear model with pooled slopes for each independent parameter (i.e., DOC, pH, and hardness) and species-specific intercepts using analysis of covariance.

CWQG = exp(0.995[ln(hardness)] − 0.847[pH] + 4.932).

De Schamphelaere and Janssen (2004).

  • Abbreviations: BLM, biotic ligand model; DOC, dissolved organic carbon; MLR, multiple linear regression; WQC, water quality criteria.
  • Source: Expanded from the USEPA (2022).

Recently, “user-friendly” bioavailability tools and/or models have been developed. These models generally only require DOC, pH, and hardness as input parameters compared with BLMs, require less training, and allow for rapid assessment of large data sets (Peters et al., 2011). These models have been used for rapid assessments and to determine compliance with EU EQS.

Examples of user-friendly models include Bio-met, PNEC-pro, and M-BAT:
  (1) Bio-met bioavailability tool, Ver 2.3: a “look-up table” tool developed by Arche Consulting, WCA Environment (2014).
  (2) PNEC-pro, Ver 6: an “algorithm”-based tool in MS Excel format developed by Deltares in the Netherlands.
  (3) M-BAT, Ver 31: an “algorithm”-based tool in MS Excel developed by WCA Environment (Water Framework Directive–UK Technical Advisory Group, 2014).

One barrier to the use of BLMs is the technical complexity of the approach, together with the limited transparency of the algorithms and the large number of water chemistry parameters required (USEPA, 2022). An alternative approach, the MLR model approach, has recently been developed for many cationic metals. These models have begun to replace BLMs due to their ease of use and availability in Excel format (Peters et al., 2011; USEPA, 2018). The approach is similar to existing toxicity adjustments for hardness that pool data across species (USEPA, 1985). Multiple linear regression models fit toxicity responses to several variables, usually pH, DOC, and hardness, but other variables can be modeled. Multiple linear regression models frequently pool the toxicity data to provide a single model for multiple species; however, they may be developed for a single species. For example, Brix et al. (2017) indicated that for copper, they followed USEPA (1985) guidelines for calculating a hardness adjustment equation, where the species data were combined to develop a linear model with pooled slopes for each independent parameter (i.e., DOC, pH, and hardness) and species-specific intercepts using analysis of covariance.
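As an illustration of how little input a pooled MLR needs, the sketch below evaluates the pooled copper equations listed in Table 2 (Brix et al., 2021). The function name is hypothetical, and the units (DOC in mg/L, hardness in mg/L as CaCO3, output in μg/L) are assumptions because the table does not state them.

```python
import math

# Coefficients copied from the pooled copper equations in Table 2:
# (DOC slope, hardness slope, pH slope, intercept)
COPPER_MLR = {
    "acute": (0.700, 0.579, 0.778, -6.738),
    "chronic": (0.855, 0.221, 0.216, -1.402),
}


def copper_mlr(doc, hardness, ph, duration="acute"):
    """Evaluate the pooled copper MLR for one water chemistry.

    doc      : dissolved organic carbon (assumed mg/L)
    hardness : water hardness (assumed mg/L as CaCO3)
    ph       : pH of the water
    """
    b_doc, b_hard, b_ph, intercept = COPPER_MLR[duration]
    return math.exp(
        b_doc * math.log(doc) + b_hard * math.log(hardness) + b_ph * ph + intercept
    )


# Hypothetical site chemistry, for illustration only
for duration in ("acute", "chronic"):
    value = copper_mlr(doc=2.0, hardness=100.0, ph=7.5, duration=duration)
    print(f"{duration}: {value:.2f}")
```

Because every slope is positive, the predicted value rises (toxicity falls) as DOC, hardness, or pH increases, which matches the direction of the copper TMF effects described in Table 1.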

Early use of MLRs for describing the effects of water chemistry on metal toxicity occurred in the 1980s (Erickson et al., 1987a, 1987b). More recently, MLRs were developed for lead (Esbaugh et al., 2011, 2012) and copper (Brix et al., 2017), the latter using a single equation based on the pooling of data across species. The Brix et al. (2017) paper helped establish the fundamental principles used in the development of MLRs for subsequent models as they published the first multispecies MLR for copper in a framework suitable for use in setting environmental quality standards. Subsequently, MLR models for aluminum, lead, nickel, and zinc have been developed (Croteau et al., 2021; DeForest, Tear, et al., 2020; Santore et al., 2018; Peters et al., 2011; CCME, 2020; DeForest et al., 2023, respectively).

Ambient water quality criteria for metals, and most chemicals, are developed from species sensitivity distributions (SSDs), with the 5th percentile of the SSD typically used to define the criterion. Multiple linear regression models that predict metals toxicity as a function of key TMFs (namely, DOC, hardness, and pH) can be used to normalize the SSD and the resulting criterion to site-specific conditions. Multiple linear regression-based criteria may be derived from a single pooled model that is applied to all species in the SSD or the criteria may be derived from two or more MLR models (e.g., an invertebrate MLR model may be applied to invertebrates and a fish MLR model may be applied to fish). If a pooled MLR model is selected for criteria development, the MLR-based criteria can be expressed as a single equation (analogous to the USEPA's hardness-based criteria). If two or more MLR models are selected for criteria development, the TMF slope coefficients are applied to their respective species in the SSD. For the latter case, the MLR-based criterion cannot be reduced to a single MLR equation. Rather, the MLR models are used to update the SSD, and the associated 5th percentile, for the TMF conditions of interest. The use of two or more MLR models to develop criteria is conceptually more complex, but simple Excel-based tools can be developed to automate the process.
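The normalization step can be sketched as follows. The slope-adjustment form mirrors the normalization equations in Table 2, and the slopes used are the pooled chronic copper slopes from that table; the species names, toxicity values, water chemistries, and function names are hypothetical and serve only to show the mechanics.

```python
import math

# Pooled chronic copper TMF slopes from Table 2 (Brix et al., 2021)
SLOPES = {"ln_doc": 0.855, "ln_hardness": 0.221, "ph": 0.216}


def tmf_vector(doc, hardness, ph):
    """Transform water chemistry into the terms used by the MLR."""
    return {"ln_doc": math.log(doc), "ln_hardness": math.log(hardness), "ph": ph}


def normalize_ec(ec_measured, measured, target, slopes=SLOPES):
    """Adjust a measured effect concentration to target TMF conditions."""
    ln_ec = math.log(ec_measured)
    for name, slope in slopes.items():
        # Only the slopes are needed; the species intercept cancels out.
        ln_ec -= slope * (measured[name] - target[name])
    return math.exp(ln_ec)


# Hypothetical test results: (species, EC in ug/L, test water chemistry)
tests = [
    ("species A", 12.0, tmf_vector(doc=2.0, hardness=80.0, ph=7.8)),
    ("species B", 30.0, tmf_vector(doc=0.5, hardness=40.0, ph=6.9)),
    ("species C", 55.0, tmf_vector(doc=4.0, hardness=160.0, ph=8.1)),
]
site = tmf_vector(doc=5.0, hardness=150.0, ph=7.2)  # hypothetical site water

normalized = {name: normalize_ec(ec, chem, site) for name, ec, chem in tests}
for name, value in sorted(normalized.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.1f}")
```

The sorted, normalized values form the SSD for the site conditions. When separate invertebrate and fish MLR models are used, the same function would be called with a different `slopes` dictionary for each group.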

A mixture of pooled MLR criteria equations and species-specific MLR equations is presented in Table 2. A single pooled equation is presented when it exists for a given metal (see copper, lead [acute only], nickel, and zinc equations). For aluminum and lead (chronic only), species-specific equations are provided for normalizing the invertebrate and fish toxicity values to a common TMF condition. In these cases, the TMF slope coefficients in the equations would be used to adjust the toxicity values for other species in the SSD. For the chronic lead model, note that a pooled Ceriodaphnia dubia and Lymnaea stagnalis MLR model was recommended for invertebrates. Multiple linear regression models for both species are shown in Table 2, but note that the TMF slope coefficients are the same and only the intercept differs. The intercepts reflect differences in the sensitivity of these two species to lead; however, to be clear, only the TMF slope coefficients from the pooled model are necessary for adjusting the SSD to the TMF conditions of interest (it is not necessary to calculate the intercept for each species in the SSD).
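A short check makes the role of the intercept concrete. Using the chronic lead equations for C. dubia and L. stagnalis from Table 2, the sketch below shows that the ratio between the two species' predicted values is the same in any water chemistry, because the slopes are shared. The function name and the example chemistries are hypothetical.

```python
import math

# Intercepts of the chronic lead MLRs in Table 2 (shared slopes, different intercepts)
INTERCEPTS = {"C. dubia": 0.921, "L. stagnalis": -1.109}


def lead_chronic_ec(doc, hardness, ph, intercept):
    """Chronic lead MLR for invertebrates with the pooled slopes from Table 2."""
    return math.exp(
        0.865 * math.log(doc) + 0.131 * math.log(hardness) + 0.206 * ph + intercept
    )


# Two hypothetical water chemistries: (DOC, hardness, pH)
for doc, hardness, ph in [(1.0, 50.0, 6.5), (8.0, 200.0, 8.0)]:
    dubia = lead_chronic_ec(doc, hardness, ph, INTERCEPTS["C. dubia"])
    lymnaea = lead_chronic_ec(doc, hardness, ph, INTERCEPTS["L. stagnalis"])
    print(f"DOC={doc}, hardness={hardness}, pH={ph}: ratio = {dubia / lymnaea:.3f}")
```

The ratio equals exp(0.921 + 1.109) in both cases, so adjusting an SSD to new TMF conditions shifts each species by the same factor and leaves their relative sensitivity unchanged.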

As noted, once a single pooled MLR equation or multiple MLR equations are used to adjust the SSD to the TMF conditions of interest, the 5th percentile of the SSD is typically used to derive the criterion. In the United States, the statistical approach of Erickson and Stephan (1988) is used to calculate the 5th percentile. The regression analysis is typically driven by the four most sensitive genera in the sensitivity distribution. Depending on the sample size, the 5th percentile may be interpolated among the four most sensitive genera or extrapolated below the most sensitive genera. The approach uses a triangular distribution that represents a censored statistical approach that improves estimation of the lower tail with the assumption that the shape of the whole distribution is uncertain. User-friendly Excel programs are used to calculate the 5th percentile or site-specific criterion with only the need for site-specific DOC, pH, and hardness data.
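The four-genera calculation can be sketched as below. The formula follows the calculation as it is commonly written from the USEPA (1985) guidelines; it is given here as an assumption to be checked against that source rather than as the authors' implementation, and the function name and genus mean values are hypothetical.

```python
import math


def fifth_percentile_four_genera(genus_mean_values):
    """Estimate the 5th percentile of an SSD from its four most sensitive genera.

    genus_mean_values : genus mean effect concentrations, one per genus
    """
    n = len(genus_mean_values)
    if n < 4:
        raise ValueError("at least four genus mean values are required")

    lowest = sorted(genus_mean_values)[:4]
    ln_values = [math.log(v) for v in lowest]
    # Cumulative probability of each of the four lowest ranks: P = rank / (N + 1)
    probs = [rank / (n + 1) for rank in range(1, 5)]
    sqrt_probs = [math.sqrt(p) for p in probs]

    numerator = sum(x * x for x in ln_values) - sum(ln_values) ** 2 / 4
    denominator = sum(probs) - sum(sqrt_probs) ** 2 / 4
    slope = math.sqrt(max(0.0, numerator) / denominator)  # guard against rounding
    intercept = (sum(ln_values) - slope * sum(sqrt_probs)) / 4
    return math.exp(slope * math.sqrt(0.05) + intercept)


# Hypothetical genus mean values (ug/L) already normalized to site conditions
genus_means = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0]
print(f"5th percentile: {fifth_percentile_four_genera(genus_means):.3f}")
```

With eight genera the lowest rank has a cumulative probability of 1/9, so the 5th percentile is extrapolated below the most sensitive genus, as described above. With larger data sets the estimate is interpolated among the four lowest genera instead.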

The main disadvantage of MLRs is that they do not explicitly model metals speciation or the binding affinity of the metal for the biotic ligand receptor within the model, but instead are based on empirical observations. Biotic ligand models can help to inform the mechanistic underpinnings of the MLRs.

Model validation

The use of bioavailability models for regulatory purposes should be contingent on meeting performance criteria as specified through a model-validation process (Garman et al., 2020). Garman et al. (2020) identified three types of validation: (1) auto-validation, (2) independent validation, and (3) cross-species validation. Auto-validation is an evaluation of how well the model describes the toxicity data set from which the model was calibrated. Independent validation is an evaluation of how well the model developed from one toxicity data set predicts toxicity in another data set. Cross-species validation evaluates the ability of a model developed for one or more species to predict toxicity for a species that is not in the model.

Brix, DeForest, et al. (2020) and Garman et al. (2020) provide a general approach for model validation. Based on this general approach and on that of Brix et al. (2021), who compared the copper BLM with the copper MLR model, the following four performance evaluations are recommended: (1) Develop a 1:1 plot of observed versus predicted effect concentrations (ECx). Include a diagonal trend line denoting a 1:1 agreement, with corresponding lines indicating a factor of ±2 agreement, and provide the model (adjusted) R² (Figure 1). (2) Evaluate model residuals of observed versus predicted values to identify systematic biases in the model predictions (e.g., underprediction of toxicity at lower observed ECx values and overprediction of toxicity at higher ECx values). Residuals should be evaluated for all TMFs considered in the model (DOC, pH, hardness) to identify the parameter(s) responsible for any observed patterns in residuals. (3) Develop cumulative probability distributions (CPDs) of observed and/or predicted ECx values and factor-of-agreement plots to expand on information in the 1:1 plot. (4) Develop a comparative CPD using the geometric mean of all toxicity values for a species without bioavailability adjustment. These plots provide a visual comparison of the two data sets, characterizing the percentage of data that are over- or underpredicted by the models, as well as a simple way to evaluate the percentage of toxicity predictions that are within a given factor of agreement.

Comparative 1:1 plots of the pooled acute Cu multiple linear regression model and the biotic ligand model using auto-validation data sets for Daphnia magna. Solid line = line of perfect agreement between observed and predicted median effect concentrations. Dashed lines indicate a factor of ±2. BLM, biotic ligand model; LC50, median lethal concentration; MLR, multiple linear regression; RFx,2.0, percentage of data within a factor of 2 using the slope rating formula. Source: Adapted from Brix et al. (2021)
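The numerical side of these performance evaluations can be sketched in a few lines. The helper names below are illustrative assumptions; the functions compute the factor of agreement for each observed and predicted pair, the percentage of predictions within a chosen factor (2 by default), and log-scale residuals for bias checks.

```python
import math

def factor_of_agreement(observed, predicted):
    """Return the fold difference (always >= 1) for each observed/predicted pair."""
    return [max(o / p, p / o) for o, p in zip(observed, predicted)]

def percent_within_factor(observed, predicted, factor=2.0):
    """Percentage of predictions within the given factor of the observations."""
    factors = factor_of_agreement(observed, predicted)
    return 100.0 * sum(f <= factor for f in factors) / len(factors)

def log_residuals(observed, predicted):
    """log10(observed) - log10(predicted); positive means toxicity is overpredicted."""
    return [math.log10(o) - math.log10(p) for o, p in zip(observed, predicted)]
```

Plotting the log residuals against each TMF (DOC, pH, hardness) is one way to look for the systematic patterns described in evaluation (2).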

Comparative performance of BLM and MLR models

The USEPA signed a Cooperative Research and Development Agreement (CRADA) with eight metals associations (Aluminum Association, Aluminum REACH Consortium, Cobalt Institute, International Copper Association, Copper Development Association, International Lead Association, International Zinc Association, NiPERA Inc.) in December 2017. The Agency's goal was to engage external technical experts from the metals associations to develop a modeling approach to predict the bioavailability and toxicity of metals in aquatic environments under a range of common water chemistry conditions. The results of the CRADA efforts are summarized here and by the USEPA (2022).

A key part of the initial phase of the CRADA was to make a head-to-head comparison of the BLM with the MLR model for two metals with chronic data (i.e., copper and aluminum), with subsequent comparisons of BLMs with MLR models for lead and nickel. Copper and aluminum were selected because there were existing bioavailability models for both metals and chronic data. Copper represents a metal with a large toxicity data set and aluminum with a much smaller toxicity data set. Brix, Tear, et al. (2020) and Brix et al. (2021) compared the BLM and MLR models applied to the aluminum and copper data sets. The goal of the comparison was to evaluate the bioavailability model performance and to determine how the BLM and MLR model results compare.

As a general conclusion, both models worked well for both metals. For aluminum, the authors observed that the chronic Al MLR model performed substantially better than the Al BLM across a range of metrics. However, there were substantial differences between models in the resulting water quality criteria as a function of water chemistry. It was to be expected that the Al MLR would fit the data set somewhat better than the BLM because the MLR model was built using the same data set used to establish the USEPA water quality criteria (USEPA, 2018). For copper, Brix et al. (2021) found that the acute Cu MLR and BLM performances were quite comparable, although there were differences in performance on a species-specific basis and in the resulting water quality criteria as a function of water chemistry. The chronic Cu MLR performed distinctly better than the chronic BLM. Observed differences in performance were due to the smaller effects of hardness and pH on chronic Cu toxicity compared with acute Cu toxicity. These differences were captured in the chronic MLR model but not the chronic BLM, which adjusts only for differences in organism sensitivity. A question that usually arises is the following: when two models predict different water quality criteria, which is more accurate? One way to assess this is to examine the fit of each model to the toxicity data at the lower end of the SSD. As new models are developed, Brix et al. (2021) recommend development of both modeling approaches because they provide useful comparative insights into the strengths, limitations, and predictive capabilities of each model.

Multimetal BLMs (mBLM)

The recognition that metals in aquatic ecosystems usually occur as metal mixtures has led to the development of bioavailability models that are designed to predict the toxicity of the mixture as a function of water chemistry. The same concepts built into the BLM lend themselves to the development of models that predict toxicity of multiple metals acting upon receptor sites. In general, based on interactions between the dissolved metal and various ligands in solution, BLMs calculate metal concentration at a biotic ligand (e.g., fish gill) and predict toxicity. In the past, approaches to assessing metal mixtures have focused on evaluating whether the metals are additive, less than additive, or more than additive (Meyer et al., 2015; Vijver et al., 2011). To assist with the development of models for metal mixtures, a modeling evaluation project was initiated (Van Genderen et al., 2015). Four separate research groups developed multimetal models and utilized the model to predict toxicity for several data sets (Farley & Meyer, 2015a).

The four models developed for the project were used to develop a single streamlined multimetal model (Farley & Meyer, 2015b), demonstrating that the toxicity of metal mixtures can be predicted. Santore and Ryan (2015) developed a multimetal, multiple binding site version of the BLM (i.e., mBLM). The model can predict toxicity using dissolved metal as well as mBLM normalized bioavailable metal. Individual BLMs for cadmium, copper, lead, and zinc were combined to develop the mBLM to predict the toxicities of mixtures of these four metals. The mBLM is an advancement from single-metal BLMs in that it considers the interactions of cadmium, copper, lead, and zinc, as well as aluminum and iron, with various aqueous ligands.

For given data sets, the toxicity predictions indicate that mixtures are slightly more than additive, whereas on an mBLM normalized basis, the mixtures are additive or slightly less than additive. The difference is because the mBLM accounts for the interaction of the metals with the ligands in solution and adjusts for bioavailability.
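One common way to express additivity, offered here as an assumption rather than as the method used by the mBLM, is the toxic unit approach: each metal's concentration is divided by its single-metal EC50, and the toxic units are summed. The sketch below uses hypothetical function names and an arbitrary tolerance.

```python
def sum_toxic_units(concentrations, ec50s):
    """Sum of concentration / EC50 over all metals present in the mixture."""
    return sum(concentrations[metal] / ec50s[metal] for metal in concentrations)

def classify_mixture(tu_at_mixture_ec50, tolerance=0.1):
    """Classify a mixture from the toxic unit sum observed at its 50% effect level.

    Under strict additivity the sum equals 1. The tolerance is an arbitrary
    illustrative choice, not a regulatory value.
    """
    if abs(tu_at_mixture_ec50 - 1.0) <= tolerance:
        return "additive"
    if tu_at_mixture_ec50 < 1.0:
        return "more than additive"
    return "less than additive"
```

The same calculation can be run on dissolved concentrations or on bioavailability-normalized concentrations, which is where the two bases described above can lead to different conclusions.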

While multimetal models now exist, they have primarily been used in risk assessments or to assist in research projects as opposed to regulatory applications such as water quality standards. The mBLM has recently been used to assess toxicity of metal mixtures in sediment porewaters (Santore et al., 2022). The authors used this model to help determine the suitability of metal porewater measurements obtained by dialysis (peepers) versus centrifugation.

Use of bioavailability models in risk assessment and other applications

The BLM and MLR models are most frequently associated with the development of water quality criteria or guidelines; however, there are other potential uses including risk assessments, effluent permitting, evaluation of impaired water bodies, and assessment of stormwaters. Risk assessment, either retrospective or prospective, frequently requires the assessment of metal concentrations in water bodies associated with contaminated sites, acid mine drainage, or pesticide runoff. For example, arsenic (cotton) and copper (grapes) have been used as pesticides with unintentional runoff to local waterbodies. Metals are associated with multiple Superfund sites, producing surface and groundwater contributions to nearby streams. These models are useful in assisting with risk determinations.

Under the US Clean Water Act (2002), Section 303(d), each of the states is required to identify water bodies that do not meet the established water quality standards for the respective state. A mass loading study is then required, and steps are to be taken to bring the water body into compliance. The 303(d) determinations are frequently based on total metal or dissolved metal in-stream concentrations. Bioavailability models assist in the determination of the extent, if any, of the exceedance of the water quality standard and in the mass loading allocation. These models have also found use in assessing metals in stormwater and receiving waters with the potential to impact fish populations. For example, this has been demonstrated in assessing the potential for Cu to impair olfactory sensory detection (Meyer & Adams, 2010; Meyer & DeForest, 2018) in surface waters. Another example is the use of the BLM to identify concentrations of Cu in Panther Creek, Idaho, below the Blackbird mine that would allow for the recovery of fish populations in the creek after installation of a lime treatment system to control acid mine drainage (Mebane et al., 2015).

BIODYNAMIC MODELS FOR ASSESSING BIOACCUMULATION

Biodynamic (bioaccumulation) modeling was only briefly covered in the USEPA Metals Framework in 2007. The approach has evolved, many more metals and organisms have been tested, methods have been developed to identify cellular locations of metals, and efforts have been made to use the approach to assess toxicity in addition to bioaccumulation. Biodynamic modeling describes the uptake, distribution, and elimination of a substance over time (Ashauer & Escher, 2010). This is in contrast to equilibrium-based models, such as the BLM, which assume that reaction rates or conditions remain constant over time, and that assume the exposure route is primarily water. Biodynamic modeling provides a means of assessing metal accumulation via various pathways including both water and diet. The biodynamic approach also provides a means to assess the metal variability in tissue among species (Luoma & Rainbow, 2005). This approach utilizes data on assimilation efficiencies of metals from the diet as well as species-specific accumulation and elimination kinetics. Biodynamic modeling allows for identification of the key route(s) of exposure and for comparison among species (Stewart et al., 2004).

In general, the biodynamic modeling approach has not been used to predict toxicity. However, there has been an effort to link the biodynamic modeling approach to toxicity by Buchwalter et al. (2007) and Khan et al. (2015). The models have also been used to compare the bioaccumulation of several metals in freshwater and marine invertebrates using a one-compartment approach and two uptake routes, that is, water and diet (Luoma & Rainbow, 2005). Their results demonstrated the applicability of the approach and explained the large differences in metal tissue concentrations in terms of differences in uptake and elimination rates and the relative importance of water and diet as sources of metal. The biodynamic modeling approach has also been used to model metal transfer from one trophic level to another and to predict metal accumulation in food webs (Van Campenhout et al., 2009; Rainbow et al., 2009).
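A one-compartment model with water and dietary uptake can be sketched as follows. The parameterization (uptake rate constant, assimilation efficiency, ingestion rate, efflux rate constant, growth dilution) follows the form commonly used in the biodynamic literature; the function names, symbols, and units are illustrative assumptions.

```python
import math

def steady_state_concentration(ku, cw, ae, ir, cf, ke, g=0.0):
    """Steady-state tissue concentration (e.g., ug/g).

    ku: uptake rate constant from water (L/g/day)
    cw: dissolved metal concentration (ug/L)
    ae: assimilation efficiency from food (fraction, 0-1)
    ir: ingestion rate (g food/g tissue/day)
    cf: metal concentration in food (ug/g)
    ke: efflux rate constant (1/day); g: growth rate constant (1/day)
    """
    return (ku * cw + ae * ir * cf) / (ke + g)

def diet_fraction(ku, cw, ae, ir, cf):
    """Fraction of total metal influx that comes from the diet."""
    diet_influx = ae * ir * cf
    return diet_influx / (ku * cw + diet_influx)

def concentration_at_time(t, css, ke, g=0.0, c0=0.0):
    """Tissue concentration after t days, approaching the steady-state value css."""
    return css + (c0 - css) * math.exp(-(ke + g) * t)
```

Comparing `diet_fraction` across species or exposure scenarios is one way to identify the key route of exposure.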

While kinetics of uptake and bioaccumulation are key features in metal risk assessment, whole-body or specific organ concentrations are not necessarily predictive of metal toxicity. For instance, the critical body residue concept works for some species but not for others and is most applicable to mercury and selenium (Adams et al., 2011). The general lack of correlation of tissue accumulation with toxicity is explained by the fact that once inside the organism, a metal enters a biological system in which metals bind with different biological ligands and can be stored in inert pools (Gao et al., 2015). Additionally, toxicity is rarely observed in biodynamic studies because they are short term (e.g., on the order of hours, not days) and are designed to assess the kinetics of uptake (and depuration) via dietary and water exposure; they are not designed to elicit toxic responses. Techniques have been developed to identify metals in biologically active or metabolically available pools and inactive or detoxified pools using tissue and subcellular fractionation techniques (Liao et al., 2011; Wallace & Lopez, 1997). This information on accumulation of metals in sensitive and less sensitive pools can be incorporated into biodynamic models to account for the overall disposition of the metal in an organism. Metal toxicity occurs when the binding or accumulation of a metal to one or more key biotic ligands reaches a critical threshold. Selenium is the classic example because fish and amphibians preferentially deposit selenium in their ovaries, resulting in deformed embryos when the concentrations are sufficiently high (Chapman et al., 2010).

As mentioned, there are limited examples of toxic effects reported with the development of biodynamic models. One example where the experimental design allowed for toxicity as well as bioaccumulation assessment was conducted with copper sorbed to hydrous iron oxide (Cain et al., 2016). The experimental design used copper particle concentrations that spanned nearly a factor of 1000. Results showed that copper influx and ingestion rates decreased as copper concentrations increased above 104 nmol/g. This was accompanied by lysosomal destabilization and lysosomal membrane damage.

Other examples of biodynamic modeling of metals include the bioaccumulation of arsenic and zinc by polychaetes (Casado-Martinez et al., 2009; Rainbow & Luoma, 2011, respectively); bioaccumulation of lead by a freshwater amphipod (Urien et al., 2015); uptake of cadmium by a freshwater bivalve (Pyganodon grandis) (Cooper et al., 2010); biodynamic understanding of mercury accumulation (Wang, 2012); and bioaccumulation of arsenic and silver by caddisfly larvae (Awrahman et al., 2015).

Kinetic approaches can also be used when performing measurements of metal concentrations and potential bioavailability. For instance, diffusive gradients in thin films (DGT) and Chelex columns can be used to assess metals in water and sediments when the rate-determining step is diffusion of the metal from the complex at the membrane, rather than uptake into the organism, as may be the case with chronic exposure to low metals concentrations (Gillmore et al., 2020).

Overall, biodynamic modeling can be a useful approach to understand and predict the uptake rates and accumulation of metals in aquatic organisms and inform the exposure assessment. Compared with aqueous bioavailability models (BLM/MLR), it requires more detailed information for parameterization. The coupling of metal uptake and accumulation to toxicity requires further research to fully exploit the power of the biodynamic modeling approach and potentially adapt it to the development of toxicity and water quality criteria.

FATE OF METALS IN AQUATIC ECOSYSTEMS

The USEPA Framework for Metals Risk Assessment discussed in detail factors associated with metal speciation, sorption, complexation, precipitation, and the importance of pH and redox. In the years since the issuance of the USEPA Framework, there have been advances in the science related to the fate of metals in aquatic environments, particularly with regard to measuring metal dissolution and/or transformation and modeling the fate, transport, and resuspension of metals. Models have emerged that allow for integrating the processes that determine metal loss from the water column and the ultimate fate in sediments. Here, we focus on these advances in understanding and modeling fate processes for metals. In the context of risk assessment, understanding the fate of metals in aquatic ecosystems allows for improved exposure assessments in the water and sediment compartments and can be used to inform the linkages to be selected in the conceptual model.

Transformation–dissolution

For proper exposure assessment, an understanding of the solubility of metal substances is needed. A common misunderstanding in assessing the risk of metals is to assume that they are infinitely soluble and/or 100% bioavailable. Toxicity tests usually use soluble metal salts to determine the toxicity of the metal. However, metals released into the environment are rarely in that form. This is especially true for contaminated sites near mining, smelting, and refining operations. The transformation–dissolution protocol (TDP) was developed to help determine whether a metal of interest will go into solution and the rate and extent to which that will occur. In aquatic environments, massive metals and sparingly soluble metal compounds dissolve at different rates depending on the intrinsic properties of the metal and the environmental water chemistry. The rate and extent of dissolution are critical to understanding the rate of metal ion release into solution. Data generated from the TDP tests (concentration of metal in solution) can be compared with ecotoxicity reference values to determine the short- and long-term potential aquatic hazard of the metal-bearing substance for both hazard classification and risk assessment purposes.

Efforts began in the late 1990s as part of the Organization for Economic Co-Operation and Development (OECD, 2001) to develop a globally harmonized system of classification and labeling and to develop a reliable method for measuring metal dissolution. This led to the development of a TDP (Skeaff & King, 1997). The protocol was adopted by OECD and included in Annex 10 of the United Nations Globally Harmonized System of Classification and Labeling of Chemicals (United Nations [UN], 2005). The method is now used worldwide for assessing the transformation of metals and metal alloys. With the passage of the EU regulations dealing with the Registration, Evaluation, Authorization and Restrictions of Chemicals (REACH) in 2006, there was a need for risk assessment to be performed for substances produced and/or imported over 1 ton. Hence, most metals have been subjected to a TDP evaluation, and data on common metal substances are available in the REACH metal dossiers. The TDP approach requires that a standard means of assessing the extent of dissolution be developed and related to the mass of the material placed in solution. The testing approach utilizes an aqueous medium simulating natural surface water and testing at pH of 6, 7, and 8 for periods of 1–28 days, with the testing solution being constantly mixed at 100 rpm. It is now recognized that the best approach to interpreting the dissolution data is to base the calculation on the surface area of the metal being tested rather than on the mass of material placed in solution (Skeaff et al., 2000). Surface area is an intrinsic property of a metal and has a much greater influence on the resulting dissolution than the mass of the metal being tested.

The outcome of the TDP testing is frequently used to determine the environmental hazard classification of a metal substance according to the UNGHS, where the concentration in solution after 7 or 28 days is used to determine whether the solution will be acutely or chronically toxic to fish, invertebrates, or algae. In Figure 2, it can be seen that the nickel alloy being tested does not dissolve sufficiently to cause acute toxicity to aquatic organisms. In this example, one can also calculate the critical surface area needed to release sufficient nickel to cause acute toxicity. This allows for a calculation of the surface area and/or particle size required to result in sufficient ions in solution to cause toxicity.

Net Ni(aq) as a function of the total Inconel alloy surface area loading at pH 6 after seven days. Source: Adapted from Skeaff et al. (2000)

Another example of measuring dissolution as a function of surface area using a copper wire was demonstrated by Rodriguez et al. (2007) (Supporting Information: Figure S-1). It was observed that abrasion needed to be avoided in the test to avoid elevated concentrations of copper in solution; therefore, plastic wheels were used that suspended the copper wire. The surface area of the wire was measured and used to calculate the extent of dissolution (Figure 3), resulting in a linear relationship. In contrast, if one were to plot dissolution as a function of loading, the relationship would be curvilinear, indicating that loading by itself cannot account for dissolution. Thus, it is possible to maintain the loading (mass) constant and achieve different amounts of metal in solution if the surface areas are different. This is critical to understanding the dissolution of sparingly soluble metals.

Linear relationship between dissolved released copper at pH 6.0 for seven days and the surface loading. Source: Adapted from Rodriguez et al. (2007)
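The surface-area reasoning above lends itself to a short calculation. The sketch below assumes a linear, zero-intercept relationship between dissolved metal and surface area loading, which is a simplification; the function names and units are illustrative and no measured values are implied.

```python
def release_per_unit_area(surface_loadings, dissolved):
    """Least-squares slope through the origin.

    surface_loadings: surface area per volume of test medium (e.g., mm2/L)
    dissolved: dissolved metal at the end of the test (e.g., ug/L)
    Returns dissolved metal released per unit of surface loading (ug/mm2).
    """
    sxy = sum(x * y for x, y in zip(surface_loadings, dissolved))
    sxx = sum(x * x for x in surface_loadings)
    return sxy / sxx

def critical_surface_loading(slope, toxicity_reference):
    """Surface loading at which dissolved metal reaches the ecotoxicity reference value."""
    return toxicity_reference / slope
```

For a given mass loading, the critical surface loading can then be translated into a specific surface area or particle size below which the substance would release enough metal to reach the reference value.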

Loss from the water column

The principal advancements in the science since USEPA published its Framework for Metal Risk Assessment in 2007 have been in areas of understanding of rate processes in aquatic ecosystems affecting metal bioavailability, speciation changes, sorption kinetics, and the ability to integrate these processes in a predictive model that includes both the water column and the sediments. In this section, a brief review of recent developments regarding metal loss from the water column is provided.

Metal loss from the water column is a natural process that reflects transport to sediments via sorption to particulate and/or biological matter. Metals are continually being lost from the water column by transport to the sediments in all aquatic ecosystems (Adams, 2020). This process has been depicted by calculating the loss of metals in the ocean (Figure 4). Metals are continually carried to the ocean, and one might expect the concentrations to increase over time. However, it has been shown that metals other than sodium are continuously transported to the sediments, and the resulting concentrations are lower than would be predicted if the metals were infinitely soluble, by a factor of 10 000 or more (Di Toro, Allen, et al., 2001).

Predicted versus actual metal concentrations in the ocean. Source: Adapted from Di Toro, Allen, et al. (2001)

The rate of transport of metals to the ocean floor is slow in relation to the depth of the ocean, on the order of a few meters per day. However, in shallow marine ecosystems such as estuaries or in freshwater ecosystems, a transport rate of 1–2 m/day can be significant in terms of a removal mechanism and reduction in exposure for water column organisms. Detailed studies with several metals (e.g., Cu, Ni, Zn, Pb, Ag) indicate that they are very particle reactive, with strong affinities for suspended solids, Fe oxides, sulfides, organic ligands, and algae (Danner et al., 2015; Moffett & DuPont, 2007; Rossi & Jamet, 2008; Stumm & Morgan, 1996).
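The importance of water depth can be illustrated with a simple first-order settling calculation. This is a minimal sketch under strong assumptions that are not taken from the source: a well-mixed water column, a constant particle-bound fraction, and no resuspension or other inputs. All names are hypothetical.

```python
import math

def loss_rate_constant(settling_velocity, depth, particulate_fraction):
    """First-order loss rate constant (1/day).

    settling_velocity: particle settling velocity (m/day)
    depth: water column depth (m)
    particulate_fraction: fraction of total metal bound to settling particles (0-1)
    """
    return particulate_fraction * settling_velocity / depth

def half_life(settling_velocity, depth, particulate_fraction):
    """Time (days) for the water column concentration to fall by half."""
    k = loss_rate_constant(settling_velocity, depth, particulate_fraction)
    return math.log(2) / k

def fraction_remaining(t, settling_velocity, depth, particulate_fraction):
    """Fraction of the initial water column concentration remaining after t days."""
    k = loss_rate_constant(settling_velocity, depth, particulate_fraction)
    return math.exp(-k * t)
```

Because the rate constant scales inversely with depth, the same settling velocity produces a much shorter half-life in a shallow lake or estuary than in the open ocean.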

Multiple processes are interacting to reduce exposure in both the water column and the sediment compartments. These include sorption to particles as well as iron and manganese oxyhydroxides and subsequent settling, binding with acid-volatile sulfides (AVS), changes in speciation to less reactive forms, covalent bonding with other elements, burial, resuspension, and re-oxidation. Each of these processes is rate limited by the intrinsic properties of the specific metal, and ecosystem-specific water and sediment chemistry. These factors, and the fact that it is known that free metal ions (toxic form of common metals) are very reactive (seconds to minutes) with various inorganic and organic ligands, have led to the development of the UWM. This model integrates the above-mentioned fate processes and allows for estimation of exposure concentrations in various environmental compartments in aquatic ecosystems.

The UWM concept (Adams & Chapman, 2006) was developed to model these rate processes and allow for estimates of exposure both in the water column and in the sediments (Supporting Information: Figure S-2). This is somewhat analogous to the development of the BLM, in that both models allow for assessment of competing binding phases (MFs) and ultimate exposure of the metal to various aquatic organisms. The Tableau Input Coupled Kinetics Equilibrium Transport (Ticket) UWM (Farley et al., 2011) “is based on a computation that allows for simultaneous consideration of dissolved and particulate phase transport; metal complexation to organic matter and inorganic ligands; precipitation of metal hydroxides, carbonates, and sulfides and competitive interactions of metals and major cations with biotic ligands. It includes a simplified description of biogeochemical cycling of organic carbon and sulfur; and dissolution kinetics for metal powders, massives, and other solid forms.” The model has been applied in several lake scenarios and a generalized lake in the Sudbury area of the Canadian Shield to demonstrate the overall cycling of metals in lakes and the nonlinear effects of chemical speciation on metal responses. The model can be used in conjunction with the BLM (overlying water and porewater) and be used to calculate a loading rate (critical load) to a water body that will exceed a threshold for effects.

The UWM applied to the Canadian Shield Lake calculated critical loads for Cu, Ni, Pb, and Zn ranging from 2.5 to 39.0 g metal/m²-year. These critical loads were a factor of 10 or more higher than those for organic compounds (e.g., lindane, DDT, and PAHs), thus indicating a lower degree of concern for the metals as opposed to the selected organic compounds. Farley et al. (2009) have also used the UWM to calculate response to loading in Lake Coeur d'Alene as a verification of the model (Supporting Information: Figure S-3). The results of the estimated surface water concentrations are in good agreement with the measured values (Figure 5).

Unit World Model results compared with measured in-lake metal concentrations for cadmium (Cd), lead (Pb), and zinc (Zn)

The UWM has been applied in risk assessment scenarios for surface waters in Europe and considered as a possible addition to or replacement for the EUSES model currently used for REACH risk assessments (EC, 2004). In support of the use of the UWM for European risk assessments, a project was completed to define 16 of the parameters used in the UWM (e.g., pH, DOC, Ca, Mn, etc.) and their relevant ranges representing their natural occurrence in EU surface waters along with median, average, 10th, and 90th percentiles (Vercaigne et al., 2013).

The two models are available at the following web addresses:

Rate of loss

The rate at which metals are lost from the water column in aquatic ecosystems has been the focus of several recent publications by Burton et al. (2019), Huntsman et al. (2019), Rader et al. (2019), and Adams (2020). Classification of substances for protection of the aquatic environment under the UNGHS or EU Classification, Labeling and Packaging regulations is based on aquatic chronic toxicity and can be modified by biodegradability. Substances that are rapidly degraded (rapid loss) present a low risk to the aquatic environment. Rapid loss for organic compounds is measured using a standard ready biodegradability test. For metals, it has been argued that rapid loss from the water column, sediment burial, transformation into insoluble forms, followed by mineralization is equivalent, that is, exposure is reduced or eliminated. The debate has centered on the rate of loss from the water column and the extent to which the metals are permanently lost to sediments. The relevance to risk assessment is that loss from the water column reduces the exposure concentration to organisms living in this compartment.

To evaluate the potential for a metal to undergo rapid loss from the water column, we consider the following: (1) the intrinsic properties of the metal that determine its fate in aquatic systems; (2) available field and/or laboratory studies that assess the fate of dissolved metal ions in aquatic systems; and (3) application of a laboratory test protocol to assess metal removal and/or remobilization under standardized conditions (i.e., ETRP) to gain insight into the physicochemical processes that influence metal fate.

Intrinsic properties and metal fate

Intrinsic properties of a chemical substance are those that influence its behavior and that are typically independent of concentration. Classic examples are number of protons and neutrons, valence states, chemical speciation, and sorption to ligands. The intrinsic properties of metals such as copper, lead, silver, and zinc cause them to react in the environment in ways that influence their bioavailability and fate. Copper, mercury, and silver, for example, interact strongly with various functional groups present in dissolved and POC as well as iron and manganese oxyhydroxides and sulfide. Since metal ions are electron acceptors, they can bind to electron-donating groups within various ligands. Metal ions are Lewis acids because they accept electron pairs donated by the ligand (the Lewis base) in coordinating with the central metal. This is an intrinsic property of the metal (Burton et al., 2019).

Changes in speciation can increase or reduce bioavailability (Di Toro, Allen, et al., 2001). Key changes in speciation occur through metal complexation, sorption, and precipitation reactions (Rader et al., 2019). These changes require the formation or breaking of bonds. This contrasts with organic molecules, which sorb to particles or DOC by means of weak van der Waals forces (Schwarzenbach et al., 1993). Hence, the organic molecule remains unchanged.

Metal precipitation can be an important factor in controlling environmental exposures. Aluminum, iron, manganese, and tin are examples of metals that rapidly transform into metal hydroxides in freshwater ecosystems (in a pH range of 6–8.5), rapidly precipitate from the water column, and are transported to sediments, where they react with other ligands. These precipitation reactions can occur in minutes to hours depending upon pH, temperature, and other factors. In marine systems, it is well known that concentrations of lead in the water column as well as aluminum and iron are controlled by precipitation reactions.

Metals are frequently categorized as hard or soft metals, indicating their reaction with oxygen and nitrogen (hard metals, such as aluminum, cobalt, chromium, iron, manganese, strontium) or with sulfur (soft metals, i.e., copper, cadmium, mercury, silver). The soft metals are those that readily react with sulfur and are associated with AVS found in sediments. “Soft” (e.g., Cu, Cd) and “borderline” metal ions (e.g., Zn, Ni, Co) exchange oxygen-containing ligands for sulfur, where they are sequestered via various mechanisms including adsorption, inclusion, metal-exchange, and co-precipitation (Rickard, 2012). Over time, these metals become incorporated into pyritic minerals, which are resistant to further transformation (Huerta-Diaz et al., 1993, 1998; Morse & Luther, 1999). “Hard” metal ions remain bonded to oxygen-containing ligands and remain adsorbed to mineral surfaces or particulate organic matter, or precipitate as amorphous oxides, hydroxides, and carbonates. Over time, these metals age into even more insoluble forms or become incorporated into mineral crystalline structures and are frequently associated with insoluble iron and manganese oxyhydroxides along with aluminum and iron silicates.

Understanding the chemistry of metal ions in solution requires a detailed analysis of the chemical speciation of each metal and its affinity for the ligands present. Knowledge of speciation is necessary to understand the chemical form present with respect to valence state(s) and activity of the metal and the potential of the free metal ion to interact with various ligands in solution or suspension. Likewise, stability constants are fundamental to understanding and predicting the behavior of metal ions in the environment. The equilibrium constant for the formation of a metal complex from the aquo-metal ion and a simple ligand is a standard measure of the effectiveness of the ligand in coordinating (combining) with the metal ion. Stability constants are frequently expressed as logarithms and indicate the strength of the metal–ligand interaction.
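As a reminder of the notation, the stability constant for a simple 1:1 complex can be sketched as follows. The symbols M, L, and K_ML are generic and illustrative (charges are omitted) and are not taken from the studies cited here.

```latex
\mathrm{M} + \mathrm{L} \rightleftharpoons \mathrm{ML}, \qquad
K_{\mathrm{ML}} = \frac{[\mathrm{ML}]}{[\mathrm{M}]\,[\mathrm{L}]}, \qquad
\log_{10} K_{\mathrm{ML}} = \log_{10}[\mathrm{ML}] - \log_{10}[\mathrm{M}] - \log_{10}[\mathrm{L}]
```

A larger log K corresponds to a greater share of the metal being held in the complex at a given free ligand concentration. Strictly, the expression is written in terms of activities; concentrations are a common approximation for dilute solutions.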

Carbonaro et al. (2007), Carbonaro et al. (2011), and Atalay et al. (2013) have developed linear free-energy relationships to describe bonding of metal ions to ligands containing oxygen donor atoms, the same groups that are responsible for metal partitioning in the water column. A single metal-specific parameter, αO (the Irving–Rossotti slope), indicates the extent to which a given metal bonds preferentially to negatively charged oxygen donor atoms relative to the proton (Irving & Rossotti, 1956). The magnitude of metal loss from the water column is strongly correlated to the magnitude of the Irving–Rossotti slope. This is because metals bind with select components of particulate matter in lakes (e.g., humic acids, fulvic acids, and metal oxides) through negatively charged oxygen donor atoms (Dzombak & Morel, 1990; Tipping & Hurley, 2002). This relationship was assessed experimentally using an adaptation of the TDP with several metals in a 4-day experiment (Huntsman et al., 2019). This is discussed further in the section on resuspension. The data show that there is a linear relationship between the Irving–Rossotti slope (αO) and the extent of loss from the water column (Figure 6). Silver appears to be an outlier and may be explained by its strong affinity for ligands other than oxygen such as sulfur including reduced sulfur groups (Bell & Kramer, 1999; Smith et al., 2002). The relationship in Figure 6 indicates that metal removal is related to the relative preference of a metal for oxygen donor atoms (i.e., the Irving–Rossotti slope).

Figure 6. Relationship between removal of metal observed at Day 4 of the Natural Resources Canada Canmet mining experiments and the Irving–Rossotti slope (αO). The data point for Ag is excluded from the regression.

Field and laboratory data on rate of loss

Assessment of exposure is fundamental to assessing the risk of toxicity to exposed organisms. Reactions of metals with their environment result in both loss of the metal from the water column and resolubilization under some circumstances. Understanding the dominant vector and metal fate processes controlling exposure is central to exposure assessment. Recent reviews of papers reporting loss of metals from aquatic ecosystems (Burton et al., 2019; Huntsman et al., 2019; Mebane et al., 2015; Rader et al., 2019) all indicate that there is rapid and substantial loss of metals from the water column following releases.

Numerous studies have demonstrated metal removal in surface waters for most of the common metals. Burton et al. (2019) and Rader et al. (2019) reviewed 20+ publications related to metal loss from the water column. Here, we briefly summarize the literature pertaining to rate and loss of metals in various ecosystems as well as estimates using the UWM with supporting field data for select examples. The experiments performed by Diamond et al. (1990) using mesocosms placed in a lake assessed the behavior of several metals in the same mesocosms. They observed the following half-times (i.e., the time at which the concentration has decreased by 50%, average of the two mesocosms, in days): cobalt (4.35), tin (9.45), iron (10.8), mercury (14.45), zinc (17.6), arsenic (19.2), and cesium (22.65). The rate of loss was correlated with sorption to particles and settling (Supporting Information: Figure S-4).

A review of removal rates for copper in several lakes and mesocosms (Rader et al., 2019) indicates that removal half-lives for copper are typically on the order of 2–15 days (Table 3). In one experiment in which the addition of copper was continuous, the half-life was calculated as 76.2 days. This experiment is interesting in that, with the continuous addition of copper (i.e., a flow-through system), approximately 50% of the copper is lost from the water column at steady state (Di Toro, Kavvadas, et al., 2001; Rader et al., 2019; Supporting Information: Figure S-5).

Table 3. Experimental measures of removal rates of copper from natural lakes and ponds
| Ecosystem | Measurement | % Removal (days) | 50% removal time (days) | 70% removal time (days) | Author |
| --- | --- | --- | --- | --- | --- |
| Cazenovia Lake | Total copper | 34%–68% (23) | 2.0–9.6 | 14 to >23 | Effler et al. (1980) |
| Lake Matthews | Total copper | 62% (9.4) | 15.6 | 27.2 | Haughey et al. (2000) |
| Lake Courtille | Dissolved copper | 80% (22) | 9.0 | 15.6 | van Hullebusch et al. (2002) |
| Saint Germain les Belles | Dissolved copper | 75% (9) | 4.1 | 7.1 | van Hullebusch, Chatenet, Deluchat, Chazal, Froissard, Botineau, et al. (2003); van Hullebusch, Chatenet, Deluchat, Chazal, Froissard, Lens, et al. (2003) |
| Catfish ponds | Total copper | | 0.48 | 0.84 | Liu et al. (2006) |
| Microcosms–Fraunhofer | Dissolved copper | | 1.4–3.7 | 2.4–6.3 | Schäfers (2001) |
| Novosibirskoye Mesocosms | Total copper | 86%–95% (19) | 4.5–8.7 | 7.8–15.1 | Smolyakov, Ryzhikh, Bortnikova, et al. (2010); Smolyakov, Ryzhikh, and Romanov (2010) |
| Lake Baldegg Limno-corrals | Dissolved copper | | 76.2 | 130 | Gächter (1979) |
  • Source: Adapted from Rader et al. (2019) and Burton et al. (2019).
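The 50% and 70% removal times in Table 3 can be related to one another with a simple first-order loss model. The sketch below assumes that loss from the water column follows first-order kinetics, which the source does not state explicitly; under that assumption the 70% removal time is ln(1/0.3)/ln 2, or about 1.74 times the half-time, which is close to the tabulated pairs. The function names are illustrative.

```python
import math


def rate_constant_from_half_time(t50_days: float) -> float:
    """First-order removal rate constant k (1/day) from a 50% removal time."""
    if t50_days <= 0:
        raise ValueError("t50_days must be positive")
    return math.log(2.0) / t50_days


def fraction_remaining(t50_days: float, t_days: float) -> float:
    """Fraction of the initial concentration remaining after t_days."""
    k = rate_constant_from_half_time(t50_days)
    return math.exp(-k * t_days)


def time_to_removal(t50_days: float, fraction_removed: float) -> float:
    """Days needed to remove a given fraction (0-1) under first-order loss."""
    if not 0.0 < fraction_removed < 1.0:
        raise ValueError("fraction_removed must be between 0 and 1")
    k = rate_constant_from_half_time(t50_days)
    return -math.log(1.0 - fraction_removed) / k


if __name__ == "__main__":
    # Half-times (days) taken from Table 3; the 70% times are recomputed here.
    for site, t50 in [
        ("Lake Matthews", 15.6),
        ("Lake Courtille", 9.0),
        ("Saint Germain les Belles", 4.1),
    ]:
        t70 = time_to_removal(t50, 0.70)
        print(f"{site}: 70% removal in about {t70:.1f} days")
```

Small differences from the tabulated 70% removal times would be expected wherever the original studies fitted the two removal times independently or the loss was not strictly first order.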

The UWM has been used to demonstrate metal removal with remobilization potential in a lake ecosystem simulation for copper (Cu), zinc (Zn), lead (Pb), nickel (Ni), cobalt (Co), cadmium (Cd), and aluminum (Al) (Mutch Associates, 2012). The model calculations were based on use of the same model parameters (e.g., DOC, pH, Ca, Mg) as the EUSES model and with initial metal concentrations of 29, 76.3, 120, and 136 µg/L. Starting concentrations were based on acute or chronic toxicity thresholds (Mutch, 2013). The data are presented as the fraction of the original concentration remaining [C(t)/CT(0)], plotted on the y-axis as a function of time. A fraction remaining of 0.3 corresponds to 30% remaining, or 70% removal. Time to 70% removal following a single spike to the model ecosystem for Cu, Ni, Pb, and Zn ranges from 2 to 5.5 days (Supporting Information: Figure S-6). In contrast, the data for barium (not shown) indicate almost no loss from the water column over 28 days unless the starting concentration exceeded the saturation point and the metal precipitated.

While single-metal releases into the environment are rare, the data shown here indicate that the ability now exists to model metal exposure concentrations and predict exposure outcomes under a variety of natural conditions by changing the input parameters to the model to simulate metal fate at a site of interest. This approach provides a step change in the science related to exposure assessment for metals.

Assessment of metals following sediment resuspension (remobilization)

The potential for metal remobilization from sediments is a key issue to be considered in risk assessments, over both short and longer time periods. In the short term, several factors may influence exposure including the fact that some small fraction of particulate phase-bound metals may not be permanently sequestered in the bottom sediments (Kalnejais et al., 2007). The burrowing activity of organisms, hydrologically related events (e.g., increased flow rate, storm events), or occasional human activities (e.g., dredging, seabed mining) are all processes that could cause sediment resuspension. During these processes, previously redox-stratified sediments will mix with oxygenated overlying water, thereby altering metal sediment–water partitioning and speciation (Simpson & Batley, 2003). Additionally, seasonally induced changes in redox zones can have a pronounced influence on the chemical phase distribution. For example, in winter, oxygen penetration in the sediment is deemed to affect deeper layers, while lake turnover in the spring and fall will influence the oxygen content at the sediment surface layer. However, following these events, there is a redistribution of metals to various binding phases and a reset of steady-state conditions. The amount of metal mass that is perturbed is always a small fraction of the total. Therefore, a short-term increase in exposure occurs that may be significant depending upon the mass of metal in the sediments and the extent of the perturbation. However, toxicity does not appear to be the norm following short-term resuspension events as the metals are rapidly sorbed to oxyhydroxide surfaces.

Huntsman et al. (2019) developed a method for determining the loss of metals from the water column using a modification and/or extension to the TDP known as the Environmental Transformation and Removal Protocol (ETRP). This method measures both the removal of metals from the water column, through interactions with dissolved and suspended particles that convert them to nonavailable forms, and the potential for metals to reenter the water column following a resuspension event. The ETRP quantifies the rate of environmental transformation of dissolved metal species into nonavailable forms and the transport of metals into sediments. The ETRP follows as an extension to the TDP, which calls for the addition of three loadings of a metal-bearing substance into an aquatic medium, with agitation, followed by the evaluation of metal concentrations in solution over a 28-day period. To evaluate removal potential, 10 g of sediment is placed in 1-L Schott–Duran flasks with an aquatic solution spiked with dissolved metal, followed by a period of rapid mixing and then settling. The rapid mixing is meant to mimic a resuspension event, although this vigorous mixing in the laboratory would likely produce a worst-case scenario. The rate of loss is calculated by comparing metal concentrations at the beginning and end of the test. In Huntsman et al. (2019), three different sediment substrates were tested at pH values of 6, 7, and 8 to evaluate the effect of pH and sediment characteristics on metal loss for several metals including silver, copper, lead, zinc, cobalt, nickel, and strontium. The results show that silver, copper, lead, zinc, and, to a lesser extent, nickel and cobalt were rapidly and substantially reduced in the water column. Conversely, strontium concentrations declined at a slow rate.

Several studies have investigated the chemical composition and changes that occur when sedimentary metal phases are suspended in oxygenated water (Fetters et al., 2016; Kalnejais et al., 2007, 2010, 2015). Sulfides are thermodynamically unstable in oxidizing environments, leading to a potential initial release of metals to the dissolved phase (De Jonge et al., 2012; Kalnejais et al., 2007; Simpson et al., 2012). However, the released trace metals are in turn strongly scavenged by iron and manganese oxide phases so that precipitation of these oxides close to the sediment–water interface can lead to an enriched layer of trace metals close to the sediment–water interface (Sutherland et al., 2007). Costello et al. (2015, 2016) indicated that copper and nickel released during CuS and NiS oxidation were not lost from the sediment but were instead retained by other solid-phase ligands, likely organic matter and Fe and Mn oxides. The relative importance of these opposing processes will ultimately determine the net effect on trace metal bioavailability.

In summary, Fetters et al. (2016) showed that short-term resuspension of sediments resulted in limited metal mobilization that was sediment and metal specific. Xie et al. (2015, 2019) showed that zinc was moderately released only after two days of vigorous resuspension. Burton et al. (2015) concluded that while increases in toxicity were observed in some laboratory studies, most resuspension events, laboratory or field, are nontoxic due to the short duration of the exposures.

The binding of metals to sediments over the long term is typically not reversible, because metals are sequestered to stable forms or buried at depth, which means that they will not be released into the porewater of the active biological zone (Cappuyns & Swennen, 2006). In this regard, adsorption time is a factor to be considered in the distribution and partitioning of metals. The prolonged aging of metals in sediment or soils has been demonstrated to be a major factor in determining their availability: the exchangeable and carbonate fractions decrease, while the refractory fractions (organic and mineral phases) increase (Guo et al., 2011; Jones et al., 2008; Peng et al., 2009; Zhong et al., 2012). For example, aging (fixation) can redistribute the adsorbed metals to the interior of sorption sites of organic and mineral substrates and as such increase retention of metals (Cappuyns & Swennen, 2006). Fixation of metals takes place by the slow diffusion of metals into Fe hydroxides and hydrous oxides of Al and Mn (Trivedi & Axe, 2000), clay minerals (Ma & Uren, 1998), and by diffusion or coprecipitation in carbonates (Nakhone & Young, 1993).

At a steady state, the input of a metal via sedimentation and the output via resuspension will reach an equilibrium. As metals are being transported to deeper sediment layers, they will be less prone to resuspension. The burial process and the increasing irreversible binding of the metals into the sediment matrix control metal exposure in the long term. The conditions in which organisms are most vulnerable to metal exposure are as follows:
  • (1) recent release (spill) resulting in a significant increase in concentration in the water and sediment;
  • (2) continuous release (effluent) that exceeds the assimilative capacity of the ecosystem, such as has occurred at mining sites due to elevated concentrations in the effluent or acid mine drainage; and
  • (3) substantial change in the pH of the ecosystem such as might occur following an acid spill or changes following prolonged acid rain deposition.

In examples 1 and 2 above, the key issue is that the mass loading has exceeded the ability of the ecosystem to respond on a short-term basis. Once management of the release occurs, the ecosystem will reach a steady state and most of the metal will be sequestered and become nonavailable to organisms. Regarding example 3 above, Schindler et al. (1980) evaluated the effects of acidification on mobilization of metals from sediments in 10-m-diameter enclosures in a freshwater lake (#223) in Ontario. The pH was reduced from 6.7–6.8 to 5.7 and 5.1 (one enclosure each). Aluminum, iron, manganese, and zinc were released from the sediments, with subsequent redistribution back to the sediments. Radiotracers showed that the acidification slowed the loss of manganese and zinc, while losses of barium, cesium, selenium, and vanadium to sediments were more rapid under acid conditions. While acidification results in an initial release of metals, over a longer period, the metals are transported to sediments through partitioning to suspended solids.

An example of system recovery was reported by Mebane et al. (2015) for Panther Creek, Idaho, following massive releases of copper, arsenic, and cobalt due to mining operations in the 1930s–1960s era. Efforts to restore water quality began in 1995, and by 2002, copper levels had been reduced by about 90%. Full recovery of salmonid populations occurred within about 10 years after the onset of restoration efforts and about four years after the USEPA chronic copper criteria had mostly been met. Shorthead Sculpin (Cottus confusus) numbers recovered within four years after their first arrival at a site. Benthic macroinvertebrate biomass increased, reaching about 70%–90% of the reference. The key to success was the implementation of a treatment plant to remove metals from acid mine drainage and hence, mass load reduction. This occurred despite the significant concentrations of metals remaining in the sediments.

EVALUATION OF METAL TOXICITY IN SEDIMENTS

Sediment assessment tools have advanced since the issuance of the Metals Framework in 2007. There has been considerable effort to improve sediment spiking procedures, identify appropriate aging periods, and develop models to assess or predict metals toxicity in sediments. These improvements are reviewed along with background information leading to these developments.

Numerous studies have demonstrated that dry weight concentrations of total recoverable metals in sediments cannot be used to predict toxicity across sediments of differing characteristics. Considering that most sediment risk assessments begin with such measurements, this is disconcerting and has led to various approaches for assessing metals in sediments. One of the early methods for assessing sediments was the Apparent Effects Threshold (AET) method (Barrick et al., 1988). Under the AET approach, empirical field and laboratory data are used to identify concentrations of chemicals above which effects are always observed. The AET method, while providing an approach for many organics and metals, was limited because it could not attribute observed effects to a specific chemical, since the toxicity tests were based on field samples that included many chemicals. This led to the development of weight of evidence approaches evaluating metal chemistry, toxicity, and field evidence using benthic community studies, referred to as the “Triad Approach” (Chapman, 1996). Weight of evidence approaches were later improved with a better definition of lines of evidence and decision criteria (Hope & Clarkson, 2014; Hull & Swanson, 2006). Several approaches were developed from laboratory toxicity studies performed with field-collected sediments with the goal of developing sediment quality guidelines (SQGs) as indicators of benthic effects in the field and for remedial action decisions for sediments. These SQGs are usually defined as concentrations of single contaminants in sediments (mg/kg dry weight) below which toxicity is rarely observed and above which toxicity is frequently observed (Long & MacDonald, 1998; Long et al., 1995; MacDonald et al., 1996, 2000). These methods, like the AET method, lack definitive evidence of causality due to co-occurring contaminants, but have been widely used. Additionally, logistic regression modeling has been used to match sediment toxicity and chemistry data (Field et al., 1999, 2002), and field-based SSDs have been proposed as a tool for sediment risk assessment (Kwok et al., 2008; Roman et al., 2005).

While these approaches have merit, there has been considerable effort to develop an approach that defines a metal concentration that is either nontoxic or toxic (safe–not safe). In the mid-1990s, studies of AVS in anoxic sediments demonstrated its strong ability to control the bioavailability of common divalent metals. Numerous studies (>100) demonstrated a lack of toxicity when AVS exceeded the sum of the simultaneously extracted metals (SEMs) (Ankley, 1996; Ankley et al., 1996; Di Toro, Allen, et al., 2001). This approach was improved upon using sediment interstitial water measurements compared with toxicity results from water-only exposure toxicity tests with benthic organisms (Berry et al., 1996). This was based on a long-standing premise that toxicity of contaminants to benthic organisms is primarily due to desorption into interstitial water (Adams et al., 1984). Di Toro et al. (2005) developed a sediment version of the aqueous BLM in which sediment porewater toxicity could be predicted for cadmium, copper, nickel, lead, and zinc using POC as a replacement for DOC. The model requires pH measurement of the porewater, uses the SEM procedure for metal measurements that are normalized to the sediment OC, and uses the BLM to assess possible toxicity of the sediment porewater.

This addition to the AVS-SEM model allowed for estimates of when sediment would be toxic as opposed to identifying nontoxic sediments. This was the state of the science at the time the USEPA's Metals Framework document was issued (USEPA, 2007a). These concepts have been widely used and remain valid; however, these methods (1) are not predictive and (2) do not address sediments that are oxic or where AVS is low and less than SEM. Since the issuance of the USEPA Framework for Metals Risk Assessment (USEPA, 2007a), approaches have been developed for spiking and aging sediments, for consideration of phases other than AVS including oxyhydroxides and suspended solids, for development of species sensitivity distribution (SSD) curves for benthic invertebrates allowing for predictive models, for sediment toxicity assessment, and for application of the mBLM in assessing sediment porewater metal concentrations.
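The AVS versus SEM comparison described above reduces to simple arithmetic, sketched here in Python. This is a minimal illustration and not the published model: the class and function names are hypothetical, the inputs are assumed to be in µmol/g dry weight, and dividing the excess SEM by the organic carbon fraction is one assumed reading of "normalized to the sediment OC". No toxicity thresholds are encoded.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SedimentSample:
    """Hypothetical container for one sediment sample (names are illustrative)."""
    sem_umol_per_g: Dict[str, float]  # simultaneously extracted metals, µmol/g dry wt
    avs_umol_per_g: float             # acid-volatile sulfide, µmol/g dry wt
    oc_fraction: float                # g organic carbon per g dry sediment


def excess_sem(sample: SedimentSample) -> float:
    """Sum of SEM minus AVS (µmol/g dry wt); negative means AVS is in excess."""
    return sum(sample.sem_umol_per_g.values()) - sample.avs_umol_per_g


def oc_normalized_excess_sem(sample: SedimentSample) -> float:
    """Excess SEM per gram of organic carbon (µmol/g OC); assumed normalization."""
    if sample.oc_fraction <= 0:
        raise ValueError("oc_fraction must be positive")
    return excess_sem(sample) / sample.oc_fraction


def avs_screen(sample: SedimentSample) -> str:
    """Screening statement based only on the AVS versus SEM comparison."""
    if excess_sem(sample) < 0:
        return "AVS exceeds SEM: toxicity from these metals is not expected"
    return "AVS does not exceed SEM: toxicity cannot be ruled out"


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    sample = SedimentSample({"Cu": 0.5, "Zn": 1.5}, avs_umol_per_g=3.0, oc_fraction=0.02)
    print(excess_sem(sample), avs_screen(sample))
```

The second branch deliberately says only that toxicity cannot be ruled out, which mirrors the limitation noted in the text: the AVS comparison identifies nontoxic sediments but does not by itself predict toxicity.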

Sediment toxicity models: Copper

The toxicity of copper to benthic organisms has been evaluated in 106 chronic toxicity tests for six different organisms, that is, the amphipods Hyalella azteca (25 individual no-observed-effect concentration [NOEC] values) and Gammarus pulex (six individual NOEC values), the oligochaetes Tubifex tubifex (39 individual NOEC values) and Lumbriculus variegatus (three individual NOEC values), and the insects Chironomus riparius (27 individual NOEC values) and Hexagenia (six NOEC values) (Cu-VRAR, 2008). A total of 11 sediment types were tested, grouped into artificial sediments with a range of OC and AVS concentrations, natural sediments with low AVS concentrations, and natural sediments with high AVS concentrations.

Copper was spiked into the sediment–water system as CuCl2·2H2O, and the spiked sediments were placed into test vessels and stabilized and/or equilibrated for 8–11 days before test initiation. According to Simpson et al. (2004), equilibration of Cu-spiked sediments occurs within 10–15 days. The water was renewed two to three times a week, and porewater and overlying water were sampled to assess dissolved copper. The observed NOEC values ranged between 18.3 and >3158 mg/kg dry weight (min–max value) without correcting for bioavailability. The AVS and OC in the selected toxicity data ranged from 0.05 to 58.6 mmol/kg dry weight and from 0.5% to 24.8%, respectively (Supporting Information: Figure S-7).

A high degree of variability was observed in the effect levels, which is attributed to the variation in the AVS and OC characteristics of the sediments presented above. To assess the data, an MLR model was used. The resulting model had an adjusted R2 of 0.81, and both OC and AVS had a significant positive effect on the NOEC. Organic carbon has the largest effect on the log10(NOEC) value, which is apparent from its slope (i.e., 0.112). The AVS slope (0.044) indicated a less pronounced effect, which is not surprising since AVS is more suited as a model to predict the absence of toxicity and not the onset of toxicity. The interaction term between OC and AVS is negative, indicating that the positive effect of each on the log10(NOEC) was lower if the other was also high. The model parameters are shown in Table 4, and the model output (observed versus predicted values) is shown in Supporting Information: Figure S-8. Most of the toxicity thresholds for all species showed a clear relationship with OC and/or AVS/OC and the model predicts the relationship quite well.

Table 4. Results of linear regression analysis (endpoint: growth/biomass) for the pooled copper sediment data set for Tubifex tubifex, Hyalella azteca, Lumbriculus variegatus, Chironomus riparius, and Gammarus pulex (n = 37)
| Term | Estimate | Std. error | t Value | p Value |
| --- | --- | --- | --- | --- |
| Intercept | 1.543 | 0.075 | 20.565 | <2e−16*** |
| OC | 0.112 | 0.015 | 7.632 | 8.74e−09*** |
| AVS | 0.044 | 0.010 | 4.615 | 5.71e−05*** |
| OC:AVS | −0.003 | 0.000 | −6.301 | 3.99e−07*** |
  • Abbreviations: AVS, acid-volatile sulfides; OC, organic carbon.
  • ***p < 0.05.
  • Source: Adapted from Cu-VRAR (2008).
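The coefficients in Table 4 can be applied directly to estimate a NOEC for a sediment of known chemistry. The following is a minimal sketch rather than the assessment's own implementation. It assumes OC is entered in percent, AVS in mmol/kg dry weight, and the NOEC is in mg Cu/kg dry weight; these are the units used in the surrounding text, but Table 4 itself does not state them. The coefficients are the rounded values from the table, so predictions are approximate, and the function names are illustrative.

```python
# Rounded coefficients from Table 4 (log10 NOEC scale).
INTERCEPT = 1.543
SLOPE_OC = 0.112       # per % organic carbon (assumed unit)
SLOPE_AVS = 0.044      # per mmol AVS/kg dry wt (assumed unit)
SLOPE_OC_AVS = -0.003  # OC:AVS interaction term


def predict_log10_noec(oc_percent: float, avs_mmol_per_kg: float) -> float:
    """Predicted log10(NOEC) from the regression with an interaction term."""
    return (
        INTERCEPT
        + SLOPE_OC * oc_percent
        + SLOPE_AVS * avs_mmol_per_kg
        + SLOPE_OC_AVS * oc_percent * avs_mmol_per_kg
    )


def predict_noec(oc_percent: float, avs_mmol_per_kg: float) -> float:
    """Predicted NOEC (assumed mg Cu/kg dry wt) for the given sediment chemistry."""
    return 10.0 ** predict_log10_noec(oc_percent, avs_mmol_per_kg)


def oc_effect_at(avs_mmol_per_kg: float) -> float:
    """Change in log10(NOEC) per 1% OC at a fixed AVS level."""
    return SLOPE_OC + SLOPE_OC_AVS * avs_mmol_per_kg


if __name__ == "__main__":
    # Hypothetical sediment: 2% OC and 1 mmol AVS/kg dry wt.
    print(f"log10(NOEC) = {predict_log10_noec(2.0, 1.0):.3f}")
    print(f"NOEC        = {predict_noec(2.0, 1.0):.1f}")
    print(f"OC effect at AVS = 10: {oc_effect_at(10.0):.3f}")
```

The helper `oc_effect_at` makes the negative interaction explicit: the benefit of each additional percent of OC shrinks as AVS rises. The sketch does not check that inputs fall within the tested ranges (AVS 0.05–58.6 mmol/kg dry weight, OC 0.5%–24.8%), and extrapolation beyond them would not be supported by the data.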

A separate analysis was undertaken to investigate only the relationship between NOEC values and OC, based on sediments in tests that contained no AVS, but that contained a gradient in OC. This analysis confirmed the positive relationship between the toxicity values and the OC content of the sediment (Supporting Information: Figure S-9) for all species (n = 14). Most sediments in this set of tests had a narrow OC range (i.e., 1%–3%). For each species, only one sediment was tested at a significantly higher OC concentration (i.e., 9.8%). This approach was undertaken to accommodate assessments in which the sediment AVS is low and/or where regulatory bodies recommend the use of a worst case (most conservative) scenario.

The above analysis clearly shows that both AVS and OC have a significant effect on the observed copper toxicity levels for all test species in freshwater sediments. Taking these bioavailability factors into account reduces intraspecies variability to a large extent, resulting in more robust sediment quality assessment values and the ability to predict toxicity as a function of sediment chemistry.

In addition to the above models for copper, Simpson et al. (2011) developed a copper model to predict toxicity to marine sediment organisms based on binding to OC in the <63 µm particle size category. They concluded that adequate protection for all benthic organisms is expected for an OC-normalized copper concentration of 3.5 mg Cu/g OC in the <63 µm sediment fraction. For short-term exposures, the equivalent acute guideline is 11 mg Cu/g OC.
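The OC normalization behind these guideline values is a unit conversion, sketched below. The sketch assumes copper is reported in mg/kg dry weight and OC in percent dry weight, both measured in the <63 µm fraction; the function names are illustrative, and the two constants are the guideline values quoted above.

```python
CHRONIC_GUIDELINE_MG_CU_PER_G_OC = 3.5  # long-term guideline quoted in the text
ACUTE_GUIDELINE_MG_CU_PER_G_OC = 11.0   # short-term guideline quoted in the text


def oc_normalized_copper(cu_mg_per_kg: float, oc_percent: float) -> float:
    """Copper per unit organic carbon (mg Cu/g OC) in the fine sediment fraction."""
    if oc_percent <= 0:
        raise ValueError("oc_percent must be positive")
    oc_g_per_kg = oc_percent * 10.0  # 1% OC equals 10 g OC per kg dry sediment
    return cu_mg_per_kg / oc_g_per_kg


def exceeds_guideline(cu_mg_per_kg: float, oc_percent: float, acute: bool = False) -> bool:
    """True if the OC-normalized copper concentration is above the chosen guideline."""
    limit = ACUTE_GUIDELINE_MG_CU_PER_G_OC if acute else CHRONIC_GUIDELINE_MG_CU_PER_G_OC
    return oc_normalized_copper(cu_mg_per_kg, oc_percent) > limit


if __name__ == "__main__":
    # Hypothetical fine-fraction sample: 50 mg Cu/kg dry wt and 2% OC.
    value = oc_normalized_copper(50.0, 2.0)
    print(f"{value:.2f} mg Cu/g OC, exceeds chronic guideline: {exceeds_guideline(50.0, 2.0)}")
```

For the same copper concentration, a sediment with less OC in the fine fraction yields a higher normalized value, which is the sense in which low-OC sediments are treated as the more sensitive case.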

Sediment toxicity models: Nickel

Bioavailability-based models have been published for nickel in sediments (Schlekat et al., 2015; Vangheluwe et al., 2013). The approaches applied are based upon laboratory toxicity studies with a large number of species. This allows for the creation of SSDs and corrections in the species responses observed for different sediments based on the development of equilibrium partitioning-based bioavailability models.

A sediment testing program was established that utilized extensive laboratory testing, followed by field verification of laboratory findings regarding nickel toxicity to benthic organisms. The laboratory program was designed to address three objectives: (1) evaluate various sediment spiking methods to ensure that the laboratory sediments reflect real-world chemistry conditions; (2) generate a reliable and expanded benthic data set using 10 benthic species in sediments with low and high nickel binding capacity; and (3) examine sediment bioavailability relationships for OC, AVS, Fe, Mn, and cation exchange capacity (CEC) (Schlekat et al., 2015; Vangheluwe et al., 2013). Additionally, six nickel-spiked sediments were deployed in the field to examine benthic colonization. The test results were used to develop SSDs and bioavailability models for sediments with differing chemistries.

Vangheluwe et al. (2013) utilized eight field-collected sediments with a range of chemical parameters for nickel sediment toxicity tests. Ten benthic species were tested using spiked sediments following the spiking and equilibration procedure described by Brumbaugh et al. (2013) (see the next section). The data were used to develop bioavailability models based on EC20 values and chronic SSDs for the different sediment chemistries (Supporting Information: Figure S-10). The natural sediment (Spring River, MO) with the lowest AVS and total organic carbon (TOC) was used to derive a reasonable worst-case PNEC for nickel.

The bioavailability models developed by Vangheluwe et al. (2013) used stepwise MLR analysis. The models were based on AVS, TOC, pH, Fe, and Mn (mg/kg dry wt), CEC (meq/100 g), sand (%), silt (%), and clay (%) (Table 5). The approach allowed for the development of toxicity thresholds (EC20 and HC5 values, that is, hazard concentration at the 5th percentile species) using AVS, total recoverable Fe, TOC, and CEC.

Table 5. Nickel sediment bioavailability models developed by species and sediment parameter
| Species | Model | R2 | Intercept (SE) | Slope (SE) |
| --- | --- | --- | --- | --- |
| AVS based | | | | |
| Hyalella azteca | Log EC20 total Ni (mg/kg dry wt) = 2.65 + 0.492 log AVS (µmol/g dry wt) | 0.74 | 2.65 (0.11) | 0.492 (0.11) |
| Gammarus pseudolimnaeus | Log EC20 total Ni (mg/kg dry wt) = 2.8 + 0.358 log AVS (µmol/g dry wt) | 0.62 | 2.8 (0.13) | 0.358 (0.13) |
| Hexagenia sp. | Log EC20 total Ni (mg/kg dry wt) = 2.35 + 0.175 log AVS (µmol/g dry wt) | 0.59 (p = 0.07) | 2.35 (0.06) | 0.175 (0.07) |
| TOC based | | | | |
| H. azteca | Log EC20 total Ni (mg/kg dry wt) = 2.81 + 0.513 log OC (%) | 0.59 | 2.81 (0.11) | 0.513 (0.17) |
| G. pseudolimnaeus | Log EC20 total Ni (mg/kg dry wt) = 2.81 + 0.557 log OC (%) | 0.79 | 2.81 (0.09) | 0.557 (0.13) |
| Hexagenia sp. | Log EC20 total Ni (mg/kg dry wt) = 2.40 + 0.164 log OC (%) | 0.29 (p = 0.26) | 2.40 (0.07) | 0.164 (0.13) |
| Fe based | | | | |
| H. azteca | Log EC20 total Ni (mg/kg dry wt) = −0.54 + 0.854 log Fe (mg/kg dry wt) | 0.62 | −0.54 (1.15) | 0.854 (0.27) |
| G. pseudolimnaeus | Log EC20 total Ni (mg/kg dry wt) = 0.31 + 0.666 log Fe (mg/kg dry wt) | 0.68 | 0.31 (0.87) | 0.666 (0.20) |
| Hexagenia sp. | Log EC20 total Ni (mg/kg dry wt) = 0.75 + 0.418 log Fe (mg/kg dry wt) | 0.79 | 0.75 (0.45) | 0.418 (0.11) |
| CEC based | | | | |
| H. azteca | Log EC20 total Ni (mg/kg dry wt) = 2.11 + 0.783 log CEC (meq/100 g) | 0.59 | 2.11 (0.32) | 0.783 (0.26) |
| G. pseudolimnaeus | Log EC20 total Ni (mg/kg dry wt) = 2.28 + 0.679 log CEC (meq/100 g) | 0.68 | 2.28 (0.26) | 0.679 (0.26) |
| Hexagenia sp. | Log EC20 total Ni (mg/kg dry wt) = 2.20 + 0.244 log CEC (meq/100 g) | 0.36 (p = 0.21) | 2.2 (0.20) | 0.244 (0.16) |
  • Abbreviations: AVS, acid-volatile sulfides; CEC, cation exchange capacity; OC, organic carbon; TOC, total organic carbon.
  • a Nonsignificant.
  • Source: Adapted from Vangheluwe et al. (2013).
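Each model in Table 5 has the same log-log form, so a single lookup table and function are enough to evaluate them. The sketch below is illustrative only: the dictionary layout and function name are assumptions, logarithms are taken as base 10, and the intercepts and slopes are the rounded values from the table. The Hexagenia sp. models flagged with p > 0.05 are included for completeness and should be read with that caveat.

```python
import math

# (species, sediment parameter) -> (intercept, slope), from Table 5.
# Parameter units: AVS µmol/g dry wt, OC %, Fe mg/kg dry wt, CEC meq/100 g.
NI_EC20_MODELS = {
    ("Hyalella azteca", "AVS"): (2.65, 0.492),
    ("Gammarus pseudolimnaeus", "AVS"): (2.8, 0.358),
    ("Hexagenia sp.", "AVS"): (2.35, 0.175),
    ("Hyalella azteca", "OC"): (2.81, 0.513),
    ("Gammarus pseudolimnaeus", "OC"): (2.81, 0.557),
    ("Hexagenia sp.", "OC"): (2.40, 0.164),
    ("Hyalella azteca", "Fe"): (-0.54, 0.854),
    ("Gammarus pseudolimnaeus", "Fe"): (0.31, 0.666),
    ("Hexagenia sp.", "Fe"): (0.75, 0.418),
    ("Hyalella azteca", "CEC"): (2.11, 0.783),
    ("Gammarus pseudolimnaeus", "CEC"): (2.28, 0.679),
    ("Hexagenia sp.", "CEC"): (2.20, 0.244),
}


def predict_ni_ec20(species: str, parameter: str, value: float) -> float:
    """Predicted EC20 for total Ni (mg/kg dry wt) from one sediment parameter."""
    if value <= 0:
        raise ValueError("the sediment parameter must be positive to take its log")
    intercept, slope = NI_EC20_MODELS[(species, parameter)]
    return 10.0 ** (intercept + slope * math.log10(value))


if __name__ == "__main__":
    # Hypothetical sediment with 1 and 10 µmol AVS/g dry wt.
    for avs in (1.0, 10.0):
        ec20 = predict_ni_ec20("Hyalella azteca", "AVS", avs)
        print(f"AVS = {avs:g} µmol/g: EC20 about {ec20:.0f} mg Ni/kg dry wt")
```

Because every slope is positive, the predicted EC20 rises with the binding phase, so the sediment with the lowest AVS and TOC gives the most conservative threshold, consistent with its use for the reasonable worst-case PNEC.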

The effects data set for nickel with 10 benthic species is the largest sediment data set for any metal and is representative of several feeding strategies and exposure routes. However, there is a lack of guidance on what is an adequate sediment database of benthic invertebrates for the development of an SSD and PNEC. The USEPA (1985) recommends eight families for water column species. This is currently not possible for sediments due to a lack of suitable test methods. Vangheluwe et al. (2013) recommended that different exposure conditions and feeding behavior of sediment organisms be the guiding principle related to organism selection.

Based on the experience of the aforementioned nickel testing program, the authors proposed an overarching approach to assessing risk associated with metals in sediment based on (1) laboratory-based sediment toxicity testing; (2) development of bioavailability models; (3) determination of effects thresholds; and (4) field validation (Supporting Information: Figure S-11).

In summary, the nickel and copper sediment assessment programs discussed have advanced the science of sediment risk assessments for freshwater organisms by developing regression models that can be used to predict toxicity as a function of AVS and OC (copper) and AVS and Fe (nickel). The models have been applied to SSDs for benthic organisms and to field-collected sediments. It has been demonstrated that variability in toxicity responses can be reduced by applying bioavailability corrections to the sediment toxicity data. This is analogous to the approach that USEPA uses for water quality criteria for copper and aluminum (USEPA, 2007b, 2018). While this represents an advancement in predicting metals toxicity in sediments, the use of MLR models has not advanced to the same level for sediments as for the aqueous environment. For example, the models for copper and nickel have not yet integrated AVS, OC, Fe, and Mn into a single regression model. However, it is fair to point out that the UWM accomplishes that integration and allows for calculation of a concentration in sediment or the water column that is toxic or nontoxic. It is not, however, a single equation that can be evaluated in a spreadsheet and thus has not been widely used.

Spiking and aging

Borgmann et al. (2001), Vandegehuchte et al. (2007), and Costello et al. (2011) have shown that sediment samples spiked with metals for laboratory toxicity tests produce significantly different results (greater toxicity) when the tests are conducted shortly after spiking versus aging the samples for several weeks. It is now widely recognized that toxicity tests with unaged sediments provide overly conservative results due to increased metal concentration in the porewater. Costello et al. (2016) showed that when sediments high in nickel (Raisin River, MI) were delivered to the laboratory, they were initially toxic to Hyalella azteca due to oxidation of AVS caused by oxygen ingress to the sediments during sediment transport, but over 10 days, they became nontoxic as the porewater nickel concentrations declined. The goal in laboratory testing is to achieve metal concentrations in sediments with distribution between dissolved and solid phases that resemble those measured in field-collected sediments (Brumbaugh et al., 2013; Costello et al., 2016; Simpson & Batley, 2003; Simpson et al., 2004). Thus, there is a need to use sediments in toxicity tests that reflect the bioavailability of those in natural waters. The literature is replete with studies in which sediments are collected, delivered to the laboratory, mixed and spiked (or not), and evaluated for toxicity by placing organisms in the test chambers without proper characterization of the sediment and without determining the redox and AVS of the test system relative to the natural conditions.

Relative to natural sediment that has an oxidized surface layer, homogenized and equilibrated sediment placed into test chambers presumably has greater concentrations of reduced ligands (e.g., AVS) and decreased oxidized ligand concentrations (e.g., Fe oxides) near the sediment–water interface (Costello et al., 2016). Reduced sulfur (measured as AVS) is unstable in the presence of O2, and in homogenized sediment, AVS is rapidly oxidized within the surface 1 cm (Costello et al., 2015; De Jonge et al., 2012; Simpson et al., 2012). A decline in AVS concentration in surface sediment during aging will change metal bioavailability. Oxidation of reduced sulfur and/or metal sulfides results in a nonequilibrium condition for metal binding and a change to sorption to other ligands, with a potential release of metal into the dissolved phase. There can be a loss of oxidized ligands (e.g., Fe and Mn oxides) during homogenization of sediment, especially if the pH has been lowered due to addition of the metal.

For Ni, there is evidence that toxicity and speciation are strongly correlated with Fe and Mn oxide concentrations (Costello et al., 2011). Simpson and Batley (2003) reported that Zn release rates from sediments increased rather than decreased upon exposure to more oxygenated waters, indicating that oxygen penetration into the sediments did not increase sufficiently to cause oxidation of porewater Fe(II) to Fe(III) with formation of the hydroxide and sorption of released Zn. The increased rate of Zn release from the sediments was attributed to the slow oxidation of Zn sulfide phases present in the surface sediments (0–2 cm). The competing rates of Zn sulfide dissolution and Zn sorption to ferric hydroxide control the concentration of Zn in the porewater. Thermodynamics predicts that sulfide oxidation should occur in preference to Fe(II) oxidation.

Simpson et al. (2004) demonstrated that sediments spiked with nickel required a relatively long time for equilibration (i.e., 70 days) as compared with 15, 40, and 45 days for Cu, Zn, and Cd, respectively. Brumbaugh et al. (2013) confirmed the need for an extended aging period (10 weeks) for nickel, and Hutchins et al. (2007) demonstrated this for copper. Zhong et al. (2012) demonstrated the need for a two-month aging period by extracting 65Cu-spiked sediments with digestive fluid mimicking that of deposit feeders. An additional refinement in sediment spiking procedures is to neutralize the sediments with NaOH after spiking to offset the downward pH shift that occurs (Simpson et al., 2004). The pH shift results from metal hydrolysis and the fact that the stock solutions used for spiking are typically acidic to keep the metal in solution. Hydrolysis lowers pH, resulting in competition for solid-phase binding sites between H+ and Me+, allowing for the release of increased concentrations of metals into the porewater. A similar approach has been developed for spiking soils and has been the basis for developing laboratory to field correction factors for several metals (Smolders et al., 2009). Sediment aging after pH adjustment allows for a redistribution of metal between binding phases, incorporation into the particle matrix, and a change in metal speciation to less labile forms of the metal. It also allows for reestablishment of preexisting redox conditions, formation of AVS, and achievement of a steady state of metal concentrations between the porewater and the sediment. The latter is critical to enable comparison with effects occurring in natural sediments at a given site.

Brumbaugh et al. (2013) recommend the following spiking procedure: Use of an indirect (two-step) spiking approach as proposed by Hutchins et al. (2007), which utilizes adjustment of pH via the addition of NaOH at a 2:1 ratio to the spiked metal during preparation of “super-spikes.” This is followed by a four-week equilibration period in the spiking jars, after which the spiked sediment is mixed and equilibrated with incremental portions of unspiked sediment to achieve the desired test concentrations. The procedures are performed under nitrogen to prevent oxygen ingress. This approach was effective in producing consistent pH values and other chemical characteristics across spiking levels. When spiking Ni into sediment having high AVS and high organic matter, a total equilibration period of at least 10 weeks is recommended because of the slow reaction kinetics of Ni with AVS.
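The arithmetic behind the two-step approach can be sketched as a simple mass balance. This is an illustrative sketch only: the function names are hypothetical, the 2:1 NaOH-to-metal ratio is assumed here to be a molar ratio, and concentrations are assumed to be on a common dry-mass basis.

```python
def naoh_for_spike(metal_mol, naoh_to_metal_ratio=2.0):
    """Moles of NaOH to add with the metal spike (ratio assumed to be molar)."""
    return metal_mol * naoh_to_metal_ratio


def superspike_dilution(target_conc, super_conc, base_conc, total_mass):
    """Masses of super-spiked and unspiked sediment that blend to a target concentration.

    Mass balance:
        target_conc * total_mass = spiked_mass * super_conc + unspiked_mass * base_conc
    Concentrations share one unit (e.g., mg/kg dry weight); masses share another.
    """
    if super_conc <= base_conc:
        raise ValueError("super-spike must be more concentrated than unspiked sediment")
    if not base_conc <= target_conc <= super_conc:
        raise ValueError("target must lie between unspiked and super-spike concentrations")
    fraction = (target_conc - base_conc) / (super_conc - base_conc)
    spiked_mass = fraction * total_mass
    return spiked_mass, total_mass - spiked_mass


if __name__ == "__main__":
    # Hypothetical dilution series: 10,000 mg/kg super-spike, 50 mg/kg background.
    for target in (250.0, 500.0, 1000.0):
        spiked, unspiked = superspike_dilution(target, 10000.0, 50.0, 1000.0)
        print(f"{target:7.0f} mg/kg: {spiked:6.1f} g super-spike + {unspiked:6.1f} g unspiked")
```

The sketch covers only the blending step; the equilibration periods and the nitrogen atmosphere described above are procedural requirements that a calculation cannot capture.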

Additional studies that summarize methods for spiking sediments with metals have been published by Simpson et al. (2004), Zhong et al. (2012), Besser et al. (2011, 2013), Simpson and Batley (2016), and Hartford et al. (2022).

Sediment porewater measurement and assessment (mBLM)

There has been considerable discussion since the 1980s regarding how to reliably collect sediment porewater (interstitial water) for metals analysis. Four approaches that have frequently been used are dialysis, centrifugation, pressurized squeezing, and vacuum extraction. Dialysis (collection by a passive sampler across a membrane, often referred to as a peeper) is often thought of as the gold standard. This method is the least susceptible of the methods to cross-contamination with suspended solids and eliminates oxygen ingression into the sediment from sample handling. It frequently provides the lowest metal concentrations. Samples collected with peepers require a minimum of a week of equilibration and provide a very small amount of porewater (i.e., <5 mL), which limits the ability to conduct toxicity tests, and only some analytical methods and equipment can analyze metals in volumes this small. Centrifugation has the advantage that the samples can be processed within minutes after collection, large volumes of water can be produced, and many samples can be processed rapidly. The disadvantage is that there can be a change in speciation and loss of metal when the samples are exposed to oxygen.

Judd et al. (2021) and Santore et al. (2022) reviewed several large data sets in which toxicity studies were conducted and porewater metal concentrations were measured using peepers and centrifugation. The results indicate that metal concentrations are generally higher in centrifugation-collected porewater, that these concentrations are more variable than those measured by peepers, and that toxicity correlated best with the peeper metal concentrations (Supporting Information: Figure S-12).

Santore et al. (2022), a companion paper to Judd et al. (2021), used the multimetal BLM (mBLM) to predict toxicity of porewater metals to Hyalella azteca in 184 samples with paired survival and excess SEM (i.e., SEM-AVS) data. The aim of the paper was to use toxicity results to support a conclusion as to the use of centrifugation versus peepers for sediment assessment. The prediction of toxicity to H. azteca was significantly more accurate using peeper porewater measurements than using centrifugation measurements (Supporting Information: Figure S-13). Their analysis showed that when SEM exceeds the AVS, toxicity is generally expected and becomes more prevalent as the excess SEM increases. Their data also demonstrate a reduction in toxicity when the porewater is normalized to organic carbon content. The mBLM assessment using peeper data provided the most accurate prediction of toxicity for the highest number of samples. Santore et al. (2022) concluded that peepers provide a better measure of toxicity potential, but that centrifugation is acceptable for collecting samples for DOC, pH, and soluble metal ions such as calcium and sodium, and the two methods could be used in tandem.
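The excess SEM metric referred to above is a simple difference, sketched below for illustration. The function names are hypothetical, and the organic carbon normalization of excess SEM shown here is the form commonly used in sediment benchmark work, included as an assumption rather than as the calculation performed by Santore et al. (2022).

```python
def excess_sem(sem, avs):
    """Excess SEM (SEM-AVS), with both terms in umol/g dry weight."""
    return sem - avs


def oc_normalized_excess_sem(sem, avs, f_oc):
    """Excess SEM per unit organic carbon, (SEM-AVS)/fOC, in umol/g OC.

    f_oc is the mass fraction of organic carbon (e.g., 0.02 for 2% OC).
    """
    if not 0.0 < f_oc <= 1.0:
        raise ValueError("f_oc must be a mass fraction in (0, 1]")
    return (sem - avs) / f_oc


def sem_exceeds_avs(sem, avs):
    """True when SEM exceeds AVS, the condition under which toxicity becomes possible."""
    return excess_sem(sem, avs) > 0.0


if __name__ == "__main__":
    # Hypothetical sample: SEM = 5 umol/g, AVS = 2 umol/g, 2% organic carbon.
    print(excess_sem(5.0, 2.0), oc_normalized_excess_sem(5.0, 2.0, 0.02))
```

A positive excess indicates only that sulfide binding capacity has been exceeded; whether toxicity occurs still depends on the other binding phases and on the porewater chemistry, which is where the mBLM enters.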

Interpreting sediment toxicity tests

A workshop sponsored by the European Chemicals Agency (ECHA) identified several recent scientific advances as well as areas where guidance in assessing sediment toxicity data is lacking (European Chemicals Agency, 2014). One such area relates to how to interpret the results of laboratory sediment toxicity tests in terms of determining whether the toxicity is due to the sediment, the porewater, or overlying water in the test chamber. This problem exists for tests with spiked sediments and field-collected sediments when the tests are conducted using a static design versus tests conducted where the water exchange in the test chambers is insufficient to maintain the contaminant in the overlying water below the aqueous toxic effect concentration. In this circumstance, there is a high probability that the porewater contaminant concentration is the same as the overlying water concentration. The problem is further compounded by the fact that, unless the sediments were properly aged before introducing the organisms, the porewater concentration will be higher than what is observed in the field. This leads to the conclusion that the laboratory tests are biased on the high side and are often not representative of field conditions.

The solution to this problem is the following: (1) Age the sediment appropriately such that the porewater reflects the field conditions. This usually takes a period of several weeks, or it requires core samples to be collected and kept under nitrogen until they are tested. (2) Perform the test using a flow-through test system (or multiple water replacements) so that the concentration of the contaminant(s) is maintained below the concentration causing toxicity due to aqueous exposure (no sediment). The approach used by Adams et al. (1984) demonstrated this for kepone by conducting static and flow-through tests along with water-only exposure tests. They demonstrated that toxic effects could be expected to occur only if the chemical concentration is high enough in the sediments such that the equilibrium interstitial water concentration reached by desorption is equal to or higher than the concentration demonstrated to cause effects in a water-only exposure test. This allowed for calculation of the kepone sediment concentration that produced a porewater concentration that was toxic. Adams et al. (1984) also demonstrated that kepone sorbed to food (i.e., no kepone in water or sediment) did not cause toxicity at the highest level tested. If the concern is that the chemical may exert toxicity via ingestion, it is critical to maintain the overlying water below the aqueous effect level and to measure the porewater concentrations during the experiment. Without these additional measures, it cannot be determined which route of exposure is responsible and what sediment metal concentration causes effects. Lack of proper aging for the test sediments and improper spiking procedures also lead to results that are not comparable to field conditions (see the section on sediment spiking and aging).
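The reasoning above can be sketched numerically under a strong simplifying assumption: that sediment and porewater are at equilibrium and related by a single linear partition coefficient (Kd). The function names, the Kd value, and the concentrations are hypothetical and are not taken from Adams et al. (1984).

```python
def porewater_conc(sediment_conc, kd):
    """Equilibrium porewater concentration (ug/L) from sediment concentration (ug/kg).

    Assumes linear partitioning with kd in L/kg.
    """
    if kd <= 0:
        raise ValueError("kd must be positive")
    return sediment_conc / kd


def sediment_effect_threshold(water_only_effect_conc, kd):
    """Sediment concentration (ug/kg) at which porewater reaches the water-only effect level."""
    if kd <= 0:
        raise ValueError("kd must be positive")
    return water_only_effect_conc * kd


def overlying_water_is_controlled(overlying_conc, water_only_effect_conc):
    """True when overlying water stays below the aqueous effect concentration.

    If False, effects cannot be attributed to the sediment rather than the water column.
    """
    return overlying_conc < water_only_effect_conc


if __name__ == "__main__":
    # Hypothetical values: water-only effect level of 10 ug/L and Kd of 500 L/kg.
    print(sediment_effect_threshold(10.0, 500.0))   # sediment threshold in ug/kg
    print(overlying_water_is_controlled(12.0, 10.0))
```

In a static test the last check frequently fails, which is the practical argument for flow-through designs or repeated water replacement.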

CONCLUSION

The science associated with assessing metals in the environment has advanced over the past 15 years since the USEPA Metals Framework was published in 2007. Notable advances have been made in the development of bioavailability models for assessing toxicity as a function of water chemistry in freshwater ecosystems. Biotic ligand model and MLR models now exist for most of the common mono- and divalent metals. Additionally, with the aid of the EU REACH regulations, aquatic toxicity databases were developed or significantly expanded for most of the common metals. This has allowed for the development of SSDs for these metals and made it possible for many jurisdictions to develop or update their water quality criteria or guidelines.

The understanding of the fate of metals in the environment has undergone significant scrutiny over the past 20 years or more. It is now clear that the ultimate fate of metals in aquatic ecosystems is the sediment compartment. The rate and extent to which metals are transported to sediments have been extensively studied both in the laboratory and in natural systems. Transport and toxicity models have evolved, and models such as the UWM allow for estimating concentrations of metals in various compartments as a function of loading and time. Additionally, there has been significant focus on the transformation of metals in sediments into forms that are less bioavailable and on understanding conditions that result in re-solubilization or redistribution of metals in and from sediments.

Assessment of the bioavailability of metals in sediments has been extensive. Methods for spiking sediments have been developed such that the resulting chemistry in the laboratory mimics that in natural systems. Sediment bioavailability models are emerging. The complexity of the sediment chemistry and the lack of toxicity test methods for sediment have limited the scope of the sediment bioavailability models. Nevertheless, models that allow for prediction of when sediments are nontoxic have advanced, and models that allow for prediction of sediment toxicity are emerging. Models for copper and nickel are two examples. These models take advantage of MLR approaches analogous to those used for the aqueous environment, making use of chemistry parameters including AVS, OC, Fe, Mn, and CEC. This is an area that still needs further development.

Biodynamic models have been developed for several organisms and many metals. The key parameters such as uptake and depuration rate constants as well as bioavailability of the metal in the food have been developed for several species. This allows for estimates of transport of metals from sediments to organisms via their diet in addition to water exposure. Sediments that are judged to be nontoxic using toxicity tests or infauna studies may still allow for contaminants to be transported through the food chain. This concern is particularly true for substances that are highly bioaccumulative (typically organics and metallo-organics such as methyl mercury and selenium in some cases).

As the science continues to develop, the tool set available to environmental risk assessors has expanded significantly over the past 15 years. Emphasis on usability and fit for purpose continues to drive the development of models and approaches for use in both the scientific and regulatory arenas.

AUTHOR CONTRIBUTION

William J. Adams: Conceptualization; data curation; formal analysis; funding acquisition; investigation; project administration; writing—original draft. Emily R. Garman: Data curation; investigation; methodology; writing—review and editing.

ACKNOWLEDGMENT

William J. Adams acknowledges financial support to develop this work from the North American Metals Council (NAMC). Emily R. Garman received no compensation for her efforts in developing the manuscript.

CONFLICT OF INTEREST

Both authors acknowledge that they work for the metals industry, but indicate that the views expressed here are those of the authors and not the organizations for which they work.

DATA AVAILABILITY STATEMENT

The supplemental data and all data in the article are available from the corresponding author William J. Adams at [email protected].
