Linking field-based metabolomics and chemical analyses to prioritize contaminants of emerging concern in the Great Lakes basin
Abstract
The ability to focus on the most biologically relevant contaminants affecting aquatic ecosystems can be challenging because toxicity-assessment programs have not kept pace with the growing number of contaminants requiring testing. Because it has proven effective at assessing the biological impacts of potentially toxic contaminants, profiling of endogenous metabolites (metabolomics) may help screen out contaminants with a lower likelihood of eliciting biological impacts, thereby prioritizing the most biologically important contaminants. The authors present results from a study that utilized cage-deployed fathead minnows (Pimephales promelas) at 18 sites across the Great Lakes basin. They measured water temperature and contaminant concentrations in water samples (132 contaminants targeted, 86 detected) and used 1H-nuclear magnetic resonance spectroscopy to measure endogenous metabolites in polar extracts of livers. They used partial least-squares regression to compare relative abundances of endogenous metabolites with contaminant concentrations and temperature. The results indicated that profiles of endogenous polar metabolites covaried with at most 49 contaminants. The authors identified up to 52% of detected contaminants as not significantly covarying with changes in endogenous metabolites, suggesting they likely were not eliciting measurable impacts at these sites. This represents a first step in screening for the biological relevance of detected contaminants by shortening lists of contaminants potentially affecting these sites. Such information may allow risk assessors to prioritize contaminants and focus toxicity testing on the most biologically relevant contaminants. Environ Toxicol Chem 2016;35:2493–2502. Published 2016 Wiley Periodicals Inc. on behalf of SETAC. This article is a US Government work and, as such, is in the public domain in the United States of America.
INTRODUCTION
Ecological risk assessors are challenged by the growing number of emerging contaminants (e.g., pharmaceuticals, personal care products, flame retardants, synthetic musks, and perfluorinated compounds) that are released into aquatic ecosystems 1-3. Although technological advances have increased the ability to detect these compounds at low concentrations and demonstrated their ubiquity in the environment 4, techniques for assessing the biological impacts and toxicity of contaminants have not kept pace. From a hazard-assessment standpoint, it is impractical to conduct in vivo toxicity testing of such a large number of chemicals using single-contaminant exposures because resources for such testing are limited. Results from single-contaminant exposures can also have limited applicability for predicting responses in aquatic ecosystems, where the vast majority of environmental exposures involve complex mixtures of contaminants 5. For instance, predicting overall impacts in the environment can be difficult because little or no information exists on the biological impacts of many contaminants in these mixtures 2, 6 or because contaminants can exhibit interactive effects when found in combination 5, 7. This can leave a large list of potentially toxic contaminants to measure but very little biological knowledge to inform risk assessments 6.
Because it is untenable to identify all contaminants at increasingly lower concentrations and then assess each of them individually with in vivo toxicity tests, alternative approaches are needed to provide relevant biological information in a reasonable time frame and at a manageable resource level. Effects-based monitoring methods were developed to help meet these needs because they can detect the presence of biologically active contaminants and mixtures of these contaminants in the environment 8, 9. One advantage of these approaches is that they can help identify biological impacts of contaminants and provide information about their modes of action without prior knowledge of the toxicity of the specific contaminants eliciting biological responses 8. Methods that assess and identify those components of contaminant mixtures responsible for biological effects are not a new concept (e.g., toxicity identification and evaluation methods) 10. However, the emergence of “omic” techniques (e.g., transcriptomics, proteomics, and metabolomics) represents a considerable methodological advancement in measuring those potential impacts in a high-throughput manner that can assess impacts across a range of biological endpoints and without bias for any specific mode of action 9, 11, 12. Furthermore, these tools can measure endpoints at lower levels of biological organization (e.g., molecular), enabling detection of effects before adverse impacts manifest at higher levels of response (e.g., population growth rates) 9. Thus, effects-based monitoring tools, including “omic” techniques, show great promise for environmental surveillance where there is a need for early-warning detection prior to higher-order impacts and where uncertainty exists with regard to which stressors or biological responses to measure 8, 13, 14.
Although effects-based methods such as “omic” techniques have provided useful information for in situ contaminant hazard assessments, such methods are not, in and of themselves, a panacea for reducing chemical contaminant risk or informing remediation efforts in threatened ecosystems. It is advantageous that these techniques can provide relevant biological information regarding contaminant exposure and attendant mode of action without prior knowledge of contaminant occurrences 8, 9. Nevertheless, it would be beneficial if these tools also provided a means to evaluate which specific contaminants within a complex mixture contributed to observed biological impacts 8. For example, changes in fish metabolite profiles were used to detect overall exposure to pulp mill effluent in a previous field study, but without data on the chemical composition of this complex mixture, those changes could not be linked to specific contaminants comprising the effluent 15 or inform decisions regarding additional toxicity testing and remediation. However, this general goal of linking exposure responses to specific contaminants comprising complex mixtures may be achieved by approaches that couple endogenous metabolite analysis with targeted analysis of surface-water contaminants. In particular, statistical analyses, such as partial least-squares regression, that can assess covariance in contaminant measurements and metabolomic profiles may be able to distinguish among contaminants that exhibit greater biological activity and screen out those that may not be eliciting a significant biological response. This screening approach could help distill contaminant lists down to those with the greatest likelihood of causing adverse outcomes, allowing risk assessors to focus limited resources on further testing of those contaminants that are most likely eliciting a biological response.
To demonstrate the ability of partial least-squares regression to achieve the overall goal of prioritizing contaminants for toxicity testing, we applied this approach for coupling conventional contaminant monitoring with high-throughput untargeted metabolomics. We utilized data from a large field-based study that was conducted at several sites across the Great Lakes in the United States. Specifically, we cage-deployed fathead minnows (Pimephales promelas) at 18 sampling sites across the Great Lakes basin and used 1H nuclear magnetic resonance (NMR) spectroscopy to measure relative changes in the profiles of endogenous polar metabolites in liver tissue. By coupling those measurements with contaminant analyses of water samples that were collected in parallel (132 contaminants targeted), we evaluated to what extent endogenous metabolite profiles of fish differed across sampling sites. We then utilized a partial least-squares regression analysis to assess whether those differences were related to contaminant concentrations in surface water at those sites and to distinguish among those contaminants that were unlikely to be eliciting a biological impact versus those that were likely to be significantly impacting fish deployed at these sites. As described in the Materials and Methods section, this helped prioritize contaminants based on their potential biological activity by ranking them according to the strength of their covariance with endogenous metabolite changes, thereby identifying candidates for more targeted toxicity evaluations and screening out those that could be given lower priority (i.e., weaker correlation with biological impacts as measured by metabolomics). Such prioritization could constitute a critical first step toward chemical screening, by reducing the long list of candidate contaminants while focusing attention and resources for toxicity assessments, monitoring, and remediation on contaminants most likely to be biologically active.
MATERIALS AND METHODS
Field deployment and water-collection methods
Two cages with 12 fathead minnows per cage (6 males and 6 females) were deployed at multiple locations in the Great Lakes basin (18 sites spanning 5 watersheds: St. Louis River estuary [MN, USA], Detroit River [MI, USA], Maumee River [OH, USA], Milwaukee River estuary [WI, USA], and Fox River [WI, USA]; Supplemental Data, Table S1) 16. These sites are characterized by a range of biological impairments associated with point and nonpoint contamination 17 and designated by Environment Canada and the US Environmental Protection Agency (USEPA) as Great Lakes areas of concern 18. The main objective of the present study was to assess the utility of metabolomics for screening detected contaminants for probable biological activity rather than surveying the spatial distribution of such contaminants. Therefore, we focused on locations likely to provide an intersite gradient of contaminant concentrations. This increased the probability of detecting changes in concentrations of targeted contaminants and assessing whether they covaried with changes in endogenous metabolite profiles. Although some sites were not collocated with wastewater-treatment plant outflows, deployments were typically located near such outflows because they can be important sources of a wide variety of personal care products, pharmaceuticals, and emerging industrial contaminants, including endocrine-active contaminants 4. We focused on a range of emerging contaminants because some possess properties that increase their likelihood of impacting aquatic ecosystems. For instance, they can include compounds (e.g., pharmaceuticals) that have been engineered to elicit biological effects at very low concentrations 2, may be poorly removed by effluent treatment 19, or have physicochemical properties (e.g., high polarity) that increase their likelihood of being present in surface waters 2.
The field-deployment methods applied in the present study have been used previously to successfully measure relative changes in endogenous metabolites in response to exposure to complex mixtures of aquatic contaminants 15. More detailed descriptions of fish deployment protocols are reported elsewhere 16 (see Supplemental Data). Briefly, fish were deployed for 4 d across 5 watersheds from May to August 2011. At the termination of each exposure, field-deployed fish were transported to a nearby laboratory in water collected from the respective field sites and then euthanized in a buffered solution of tricaine methanesulfonate (MS-222). After weighing the fish, livers were removed, flash-frozen, and stored at –80 °C until they were processed for metabolite profiling. Because in situ water temperature has been found to alter relative abundances of endogenous metabolites 20 in other species, water temperatures (range, 10.4–25.0 °C) were recorded every 30 min during the deployments (HOBO; Onset Computer) and included in the analyses.
Although additional tissues (e.g., gonads) were collected as part of the present study, we focused the present analysis on liver tissue. Because of the essential role of the liver in many biological processes, including biotransformation of exogenous contaminants, it represents a system-wide integrator of biological responses and would likely reflect perturbations of a range of metabolic pathways and tissues 21, 22. Therefore, comparing relative changes in endogenous metabolites in livers and contaminant concentrations in water samples could provide valuable insight into system-wide biological responses to contaminant exposure.
In parallel with the deployment of fish that were used to measure changes in endogenous metabolite profiles, we collected a depth-integrated water sample (1 L) at the beginning and again at the end of the 4-d exposure to assess contaminant concentrations for each exposure period. The water sample was collected at a comparable water depth to the caged fish and transported to the laboratory on ice. Samples were shipped overnight to the US Geological Survey's National Water Quality Laboratory (Denver, CO), where they were analyzed for 132 unique organic compounds that included pesticides, industrial and domestic contaminants, and a suite of pharmaceuticals and personal care products. The list of measured contaminants, specific collection and analytical methods, quality assurance procedures, and detailed analyses of contaminant composition are described elsewhere 23. The Supplemental Data also describe how contaminant concentrations were estimated and averaged when measurements were less than laboratory reporting levels.
Processing of metabolomic samples
To profile changes in relative abundances of endogenous metabolites, liver samples were extracted in a 96-well plate format using a dual-phase extraction process 15 (Supplemental Data). Polar extracts were vacuum-dried and reconstituted in 200 μL of 0.1 M sodium phosphate-buffered deuterium oxide (pH 7.4) that contained 50 μM sodium 3-(trimethylsilyl)propionate-2,2,3,3-d4. Samples were analyzed with 1H-NMR spectroscopy based on an automated push-through direct-injection technique 24 using the acquisition parameters reported in the Supplemental Data.
NMR data analysis
Acquired spectra were zero-filled, line-broadened at 0.3 Hz, and Fourier-transformed (ACD/Spectrus Processor; Advanced Chemistry Development). Spectra were phase-corrected and baseline-corrected, referenced to sodium 3-(trimethylsilyl)propionate-2,2,3,3-d4, and binned at a width of 0.005 ppm. Several regions of binned data were excluded to eliminate a residual water peak (4.70–5.10 ppm), a residual methanol peak (3.35–3.37 ppm), and a large resonance from betaine exhibiting high leverage and variability (3.26–3.28 ppm). Remaining bins were then normalized to unit total integrated intensity. To compare metabolite profiles across multiple sampling locations, principal component analysis was applied to male and female fish separately (SIMCA-13.0; Umetrics). Cross-validation was used to determine the number of principal component analysis components, and cumulative Q2 (Q2cum) was used to assess model overfitting. Outliers were identified with Hotelling's T2 test, which calculates the distance of each observation's score value from the origin in the principal component analysis plot. Observations outside the 95% confidence interval (Hotelling's T2) were considered outliers 25. A one-way analysis of variance (ANOVA) with a post hoc Tukey's test (R, Ver 2.15.1) assessed whether average principal component analysis score values differed among sites; separate analyses were run for first and second component scores. When necessary, data were log10-transformed to meet statistical assumptions. The suitability and interpretation of these statistical analyses have been discussed elsewhere 25.
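The bin exclusion and normalization steps described above can be illustrated with a minimal Python sketch. It is not the ACD/Spectrus or SIMCA workflow used in the study; the function names, the toy spectrum, and the choice to treat window boundaries as inclusive are assumptions made for illustration only.

```python
# Chemical-shift windows (ppm) excluded in the text: residual water,
# residual methanol, and the high-leverage betaine resonance.
EXCLUDED_PPM = [(4.70, 5.10), (3.35, 3.37), (3.26, 3.28)]


def drop_excluded_bins(ppm, intensities, excluded=EXCLUDED_PPM):
    """Keep only bins whose centre lies outside every excluded window.

    Boundaries are treated as inclusive here, which is an assumption.
    """
    kept = [
        (p, x)
        for p, x in zip(ppm, intensities)
        if not any(lo <= p <= hi for lo, hi in excluded)
    ]
    return [p for p, _ in kept], [x for _, x in kept]


def normalize_total_intensity(intensities):
    """Scale one spectrum so that its remaining bins sum to 1."""
    total = sum(intensities)
    if total == 0:
        raise ValueError("spectrum has zero total intensity")
    return [x / total for x in intensities]


# Toy spectrum: bin centres (ppm) and made-up integrated intensities.
ppm = [1.00, 3.27, 3.36, 3.50, 4.90, 6.00]
raw = [2.0, 50.0, 9.0, 3.0, 80.0, 5.0]

kept_ppm, kept_raw = drop_excluded_bins(ppm, raw)
normalized = normalize_total_intensity(kept_raw)
```

Normalizing after exclusion, as in the text, means the large water and betaine signals do not influence the relative intensities of the remaining bins.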
Partial least-squares regression comparing metabolite changes and contaminant composition
To compare changes in endogenous metabolite profiles and contaminant composition and to screen out those contaminants exhibiting concentrations that did not significantly covary with metabolite changes, we applied partial least-squares regression (SIMCA-13.0). This statistical framework is appropriate for comparing multiple x and multiple y variables when the number of variables exceeds the number of replicates or the variables exhibit collinearity 25 and has successfully identified environmental stressors that have a significant biological impact on endogenous metabolites 12, 26. Although these methods cannot assign causal relationships, they can identify contaminants and endogenous metabolites exhibiting significant covariance. Because we focused on sites that would likely provide a gradient of contaminant detections and concentrations, this helped discern those contaminants that did not covary with endogenous metabolite shifts.
The number of NMR spectral bins (2230 bins) exceeded the number of measured contaminants (132 contaminants analyzed); therefore, average contaminant concentrations and binned spectral data were defined as the y and x variable matrices, respectively. Temperature measurements were averaged over the deployment period for each site and were included as a y variable. Spectral data were Pareto-scaled and mean-centered, and contaminant concentrations and temperature were scaled to unit variance and mean-centered. To find the best-fitting model, we used a “backward elimination” fitting routine that is described in more detail in the Supplemental Data. In general, application of the partial least-squares regression allowed us to find the best-fitting model that maximized the covariance between changes in endogenous metabolite profiles and contaminant measurements. We started by building a model that included temperature and all 86 contaminants detected in at least 1 of the water samples (hereafter referred to as the “global” model). The optimal number of model components was determined by cross-validation, and the overall model fit was assessed using Q2cum. Using this global model, we then used analysis of variance testing of cross-validated predictive residuals (CV-ANOVA) and Q2Y to assess how well the model explained concentrations of individual contaminants comprising the model 27. Contaminants with CV-ANOVA p values <0.05 are those that are better predicted by the model than by random chance, with the strength of that relationship indicated by Q2Y. Thus, we used Q2cum to assess the predictive power of the overall model and CV-ANOVA and Q2Y to quantify how well individual contaminants were predicted by the partial least-squares regression model.
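As a rough illustration of the scaling choices and the Q2 metric named above, the following Python sketch uses their commonly cited textbook definitions. It is not the SIMCA implementation: the function names and toy values are hypothetical, the sample (n − 1) standard deviation is an assumption, and Q2cum (which accumulates across model components) is not reproduced here.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (n - 1 denominator; an assumption)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pareto_scale(column):
    """Mean-centre, then divide by the square root of the standard deviation."""
    m, sd = mean(column), sample_sd(column)
    return [(v - m) / math.sqrt(sd) for v in column]


def unit_variance_scale(column):
    """Mean-centre, then divide by the standard deviation."""
    m, sd = mean(column), sample_sd(column)
    return [(v - m) / sd for v in column]


def q2(observed, cv_predicted):
    """Q2 = 1 - PRESS / SS, where predictions come from held-out samples."""
    m = mean(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    ss_total = sum((o - m) ** 2 for o in observed)
    return 1.0 - press / ss_total


# Toy values standing in for one spectral bin and one contaminant.
toy_bin = [1.0, 3.0, 5.0, 7.0]
toy_contaminant = [1.0, 2.0, 3.0, 4.0]

bin_pareto = pareto_scale(toy_bin)
contaminant_uv = unit_variance_scale(toy_contaminant)
perfect_q2 = q2(toy_contaminant, toy_contaminant)
mean_only_q2 = q2(toy_contaminant, [2.5, 2.5, 2.5, 2.5])
```

Pareto scaling leaves a variable with variance equal to its original standard deviation, so intense bins are down-weighted without giving noisy low-intensity bins the same weight as they would receive under unit-variance scaling. A Q2 of 0 corresponds to predicting the mean for every held-out sample.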
We also used CV-ANOVA in the backward elimination steps to identify contaminants that were not significantly predicted by the model and could be removed from subsequent models. For that assessment, we used a threshold of CV-ANOVA >0.10 to be conservative about which contaminants to exclude. Using an iterative process, we excluded those y variables that had the highest CV-ANOVA values (i.e., exhibiting the lowest predictive power), then rebuilt and revalidated the partial least-squares regression. For subsequent models, we again used Q2cum, Q2Y, and CV-ANOVA to identify poorly predicted variables and repeated this process until all y variables remaining in the final model possessed a CV-ANOVA <0.10. This process was conducted separately for males and females. To further confirm that informative contaminants were not spuriously excluded, at the end of the model selection routine we conducted a final step whereby all excluded contaminants were individually reintroduced into the final model. An excluded contaminant was reincluded in the final model if its CV-ANOVA was <0.10 and its inclusion increased the model's predictive power (Q2cum).
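The elimination and reintroduction logic described above can be sketched in Python as follows. The full routine is documented in the Supplemental Data, so this is only an outline under stated assumptions: `fit` is a hypothetical callable standing in for refitting and cross-validating the model, `drop_per_step` is a hypothetical parameter (the text does not state how many variables were removed per iteration), and the toy p-values are invented.

```python
def backward_eliminate(y_names, fit, threshold=0.10, drop_per_step=5):
    """Iteratively drop the worst-predicted y variables.

    `fit(names)` is a hypothetical callable that refits the model on the
    given y variables and returns (q2cum, {name: cv_anova_p}).
    """
    kept = list(y_names)
    removed = []
    while kept:
        _, p_values = fit(kept)
        # Variables that fail the threshold, worst (highest p) first.
        failing = sorted(
            (n for n in kept if p_values[n] >= threshold),
            key=lambda n: p_values[n],
            reverse=True,
        )
        if not failing:
            break
        for name in failing[:drop_per_step]:
            kept.remove(name)
            removed.append(name)
    return kept, removed


def reintroduce(kept, removed, fit, threshold=0.10):
    """Re-test each excluded variable on its own against the final model."""
    base_q2, _ = fit(kept)
    restored = []
    for name in removed:
        q2cum, p_values = fit(kept + [name])
        if p_values[name] < threshold and q2cum > base_q2:
            restored.append(name)
    return kept + restored


# Toy stand-in for a real model fit: fixed p-values, and a Q2cum that
# drops by 0.01 for every poorly predicted variable still in the model.
TOY_P = {"temperature": 0.001, "A": 0.02, "B": 0.50, "C": 0.30, "D": 0.08}


def toy_fit(names):
    q2cum = 0.40 - 0.01 * sum(TOY_P[n] >= 0.10 for n in names)
    return q2cum, {n: TOY_P[n] for n in names}


kept, removed = backward_eliminate(list(TOY_P), toy_fit, drop_per_step=1)
final = reintroduce(kept, removed, toy_fit)
```

In a real analysis the CV-ANOVA values change each time the model is refitted, which is why the loop recomputes them on every pass rather than ranking variables once.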
The contaminants contained in the final model were those that were significantly covarying with endogenous metabolite changes at these sampling sites. We then used Q2Y to assess the relative strength of that relationship. On the other hand, the list of excluded contaminants indicated those that were not significantly covarying with endogenous metabolite changes and were less likely to be having a significant biological impact at these sites, as indicated by this approach. Based on variable importance on the projection values for each metabolite bin in the final partial least-squares model 25, we ascertained which endogenous metabolites explained a substantial proportion of the variability in contaminant concentrations (i.e., variable importance on the projection ≥1.0). We also generated loading plots for the first and second components in each partial least-squares model to assess the direction of the endogenous metabolite and contaminant relationship. The appropriateness of these model fit metrics has been described elsewhere 25, 27. For those spectral bins exhibiting high variable importance on the projection values, we identified their associated endogenous metabolite peaks using Chenomx NMR Suite 7.6 (Chenomx) and previously published values 28-30.
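The variable importance on the projection (VIP) screen can be illustrated with the definition commonly given for partial least-squares models. The Python sketch below assumes that definition; it is not taken from the study or from the software used, and the weights and explained sums of squares are invented toy numbers.

```python
import math


def vip_scores(weights, ssy_explained):
    """VIP for each x variable under the commonly cited definition.

    weights[a][j]    : PLS weight of x variable j on component a
    ssy_explained[a] : sum of squares of Y explained by component a
    """
    n_vars = len(weights[0])
    total_ssy = sum(ssy_explained)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ssy_a in zip(weights, ssy_explained):
            norm_sq = sum(w * w for w in w_a)  # normalise each weight vector
            acc += ssy_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_vars * acc / total_ssy))
    return scores


# Toy model: 3 spectral bins, 2 components (values are made up).
toy_weights = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
toy_ssy = [3.0, 1.0]

vip = vip_scores(toy_weights, toy_ssy)
important_bins = [j for j, v in enumerate(vip) if v >= 1.0]
```

Under this definition the squared VIP values average to 1 across all x variables, which is why 1.0 is a natural cutoff for bins that contribute more than average.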
RESULTS
Comparing metabolite profiles across sites
Prior to processing, liver samples from 1 of the 18 sites (Point Hennepin) were lost, leaving only 17 sites in all analyses. In addition, we used Hotelling's T2 tests (95% confidence interval) to identify several individual fish (9 males and 7 females) that were statistical outliers, which were also removed from all subsequent analyses. After removal of these outliers and loss of individual samples in the field, the statistical analyses included 191 males and 191 females. Outlier removal and loss of individual samples were not systematic and were distributed across multiple sampling sites.
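For reference, the Hotelling's T2 screen applied here can be written in the form commonly given for score-based outlier limits. This is a sketch under assumptions rather than a statement of the exact computation performed by the software: the scaling constant in the critical value differs between implementations, and the symbols below are introduced only for illustration.

```latex
T_i^2 = \sum_{a=1}^{A} \frac{t_{ia}^2}{s_{t_a}^2},
\qquad
T_{\mathrm{crit}}^2 = \frac{A\,(N^2 - 1)}{N\,(N - A)}\; F_{0.05}(A,\; N - A)
```

Here t_{ia} is the score of observation i on component a, s_{t_a}^2 is the variance of the scores on that component, A is the number of components, N is the number of observations, and F_{0.05} is the upper 5% critical value of the F distribution. An observation with T_i^2 above the critical value lies outside the 95% limit and is treated as an outlier.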
Based on principal component analysis and subsequent ANOVA comparisons, we observed distinct differences in endogenous metabolite profiles of male and female fathead minnows deployed across 17 sampling sites. For instance, the principal component analysis model comparing endogenous metabolite profiles in male fish exhibited significant differences (p < 0.001) among sites along principal component 1 (PC1) and principal component 2 (PC2) (PC1, F16,174 = 4.69; PC2, F16,174 = 8.19; Supplemental Data, Figure S1A and Table S2). Females exhibited considerably more overlap than males, but sites were still significantly different (p < 0.001) from each other along PC1 and PC2 (PC1, F16,174 = 6.11; PC2, F16,174 = 14.10; Supplemental Data, Figure S1B and Table S2).
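The degrees of freedom reported above follow directly from the design (17 sites and 191 fish per sex), and the F statistic itself is a ratio of between-site to within-site mean squares. The Python sketch below shows the calculation on invented score values; it is not the R code used in the study and omits the post hoc Tukey's test.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy principal component scores for three hypothetical sites.
toy_scores = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
toy_f, toy_df_between, toy_df_within = one_way_anova(toy_scores)

# Degrees of freedom implied by the study design: 17 sites, 191 fish.
study_df = (17 - 1, 191 - 17)
```

The study-design calculation gives (16, 174), matching the subscripts on the reported F values.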
Relating changes in metabolite profiles with chemical composition
Based on the iterative model selection routine, we reduced the number of contaminants contained within the partial least-squares regression while improving overall model fit, as measured by Q2cum (Table 1). Of the 132 contaminants that were analyzed, 46 were not detected at any site (Supplemental Data, Table S3) and were not considered further. As a result, the global model for males included water temperature and 86 contaminants. The model selection routine, based on CV-ANOVA values, removed 45 of 86 contaminants over 8 iterations (Table 2 and Table 3). Although excluded contaminants were individually reintroduced into the male model as a final validation step, none improved overall model fit or were significantly related to metabolite changes. This iterative process ultimately resulted in water temperature and 41 contaminants being included in the final model (Tables 1 and 2). The final model for males represented a 52% reduction in the number of potential biologically relevant contaminants and exhibited greater predictive power compared to the global model (Table 1).
Model | No. of variables | No. of contaminants removed | No. of model components | Q2cum |
---|---|---|---|---|
Male candidate models | ||||
Global male model | 87 | N/A | 12 | 0.226 |
Final male model | 42 | 45 | 11 | 0.340 |
Female candidate models | ||||
Global female model | 87 | N/A | 18 | 0.290 |
Female model without 1 additional chemical | 49 | 38 | 18 | 0.427 |
Final female model with 1 additional chemical | 50 | 37 | 18 | 0.428 |
- a Separate models were built for males and females. The overall predictive power of the model is indicated by the cumulative Q2 (Q2cum). The global model included temperature and all contaminants that were detected for at least 1 sampling site during the fathead minnow deployments, so no chemicals were excluded from that model (indicated by N/A).
Name | Units | Range | Males: Q2Y | Males: CV-ANOVA | Females: Q2Y | Females: CV-ANOVA |
---|---|---|---|---|---|---|
Chemicals in both models | ||||||
β-Sitosterol | μg/L | ND–1.063 | 0.544 | 3.72E-18 | 0.687 | 8.21E-25 |
DEET | μg/L | 0.019–0.902 | 0.610 | 8.00E-23 | 0.681 | 3.63E-20 |
Benzophenone | μg/L | ND–0.737 | 0.550 | 3.84E-18 | 0.619 | 4.70E-15 |
Camphor | μg/L | ND–0.033 | 0.516 | 6.50E-17 | 0.594 | 8.24E-16 |
Cotinine | μg/L | ND–0.148 | 0.500 | 5.70E-16 | 0.579 | 5.04E-12 |
3,4-Dichlorophenyl isocyanate | μg/L | ND–0.090 | 0.524 | 2.83E-06 | 0.537 | 8.12E-05 |
β-Stigmastanol | μg/L | ND–0.181 | 0.236 | 1.53E-03 | 0.536 | 9.96E-11 |
Diethyl phthalate | μg/L | ND–1.250 | 0.485 | 6.74E-14 | 0.513 | 3.88E-09 |
Metolachlor | μg/L | ND–0.516 | 0.263 | 4.45E-04 | 0.493 | 4.75E-07 |
1-Methylnaphthalene | μg/L | ND–0.020 | 0.414 | 1.29E-09 | 0.489 | 7.57E-10 |
Dichlorvos | μg/L | ND–0.050 | 0.278 | 1.71E-03 | 0.479 | 7.29E-06 |
Bisphenol A | μg/L | ND–0.296 | 0.412 | 1.24E-10 | 0.451 | 2.93E-06 |
Caffeine | μg/L | ND–0.769 | 0.273 | 1.65E-03 | 0.446 | 4.41E-05 |
Isophorone | μg/L | ND–0.051 | 0.362 | 6.57E-07 | 0.443 | 1.91E-07 |
Metaxalone | μg/L | ND–0.010 | 0.252 | 3.85E-04 | 0.432 | 8.08E-08 |
Venlafaxine | μg/L | ND–0.049 | 0.327 | 5.26E-07 | 0.423 | 2.63E-06 |
Atrazine | μg/L | ND–0.552 | 0.207 | 8.47E-03 | 0.422 | 4.74E-05 |
Tris(2-chloroethyl) phosphate | μg/L | ND–0.061 | 0.403 | 1.14E-10 | 0.417 | 4.24E-05 |
Triclosan | μg/L | ND–0.202 | 0.265 | 3.61E-04 | 0.415 | 2.01E-03 |
Piperonyl butoxide | μg/L | ND–0.028 | 0.229 | 4.10E-03 | 0.411 | 1.18E-06 |
Phenol | μg/L | ND–0.045 | 0.345 | 1.41E-06 | 0.406 | 5.62E-05 |
2-Methylnaphthalene | μg/L | ND–0.027 | 0.341 | 2.91E-06 | 0.392 | 5.25E-06 |
Tris(2-butoxyethyl) phosphate | μg/L | ND–1.710 | 0.303 | 4.04E-06 | 0.375 | 1.10E-03 |
2,6-Dimethylnaphthalene | μg/L | ND–0.012 | 0.390 | 1.73E-07 | 0.370 | 1.32E-04 |
Dihydrotestosterone | ng/L | ND–2.138 | 0.360 | 1.05E-07 | 0.369 | 1.30E-05 |
Indole | μg/L | ND–0.003 | 0.360 | 1.05E-07 | 0.369 | 1.30E-05 |
Pentobarbital | μg/L | ND–0.009 | 0.360 | 1.05E-07 | 0.369 | 1.30E-05 |
17β-Estradiol | ng/L | ND–1.139 | 0.352 | 1.86E-07 | 0.364 | 1.82E-05 |
Chloroxylenol | μg/L | ND–0.047 | 0.242 | 7.36E-04 | 0.350 | 4.85E-03 |
Tribromomethane | μg/L | ND–0.096 | 0.252 | 4.82E-04 | 0.318 | 3.27E-03 |
Prometon | μg/L | ND–0.034 | 0.384 | 9.44E-09 | 0.310 | 1.52E-03 |
4-Cumylphenol | μg/L | ND–0.009 | 0.229 | 4.98E-03 | 0.277 | 1.70E-02 |
Chemicals unique to males | ||||||
Estrone | ng/L | ND–6.210 | 0.311 | 2.44E-05 | ||
Pentachlorophenol | μg/L | ND–0.172 | 0.272 | 1.52E-03 | ||
Carbazole | μg/L | ND–0.174 | 0.251 | 2.82E-04 | ||
Triethyl citrate | μg/L | ND–0.255 | 0.241 | 4.52E-04 | ||
3β-Coprostanol | ng/L | ND–3306.2 | 0.238 | 1.15E-02 | ||
Propofol | μg/L | ND–0.007 | 0.226 | 1.77E-03 | ||
4-tert-Octylphenol | μg/L | ND–0.061 | 0.219 | 7.70E-03 | ||
9,10-Anthraquinone | μg/L | ND–0.277 | 0.204 | 1.67E-02 | ||
p-Cresol | μg/L | ND–0.075 | 0.195 | 9.14E-02 | ||
Chemicals unique to females | ||||||
Phenobarbital | μg/L | ND–0.024 | | | 0.472 | 3.23E-09 |
cis-Androsterone | ng/L | ND–5.868 | | | 0.458 | 3.52E-05 |
4-tert-Octylphenol diethoxylate | μg/L | ND–0.073 | | | 0.444 | 2.57E-08 |
Celecoxib | μg/L | ND–0.083 | | | 0.434 | 5.83E-06 |
Ibuprofen | μg/L | ND–0.567 | | | 0.416 | 1.55E-05 |
Acetyl hexamethyl tetrahydronaphthalene | μg/L | ND–0.031 | | | 0.387 | 4.62E-05 |
Cholesterol | ng/L | 884.0–5798.1 | | | 0.385 | 7.43E-05 |
Estriol | ng/L | ND–1.123 | | | 0.382 | 5.32E-03 |
4-tert-Octylphenol monoethoxylate | μg/L | ND–0.054 | | | 0.355 | 1.43E-02 |
Naphthalene | μg/L | ND–0.066 | | | 0.352 | 1.77E-02 |
Tetrachloroethene | μg/L | ND–0.071 | | | 0.345 | 1.02E-02 |
Carbaryl | μg/L | ND–0.028 | | | 0.325 | 9.61E-02 |
Tributyl phosphate | μg/L | ND–0.100 | | | 0.291 | 3.28E-02 |
Fluconazole | μg/L | ND–0.039 | | | 0.284 | 1.03E-02 |
4-Nonylphenol diethoxylate | μg/L | ND–0.998 | | | 0.284 | 1.18E-02 |
Hexahydrohexamethyl cyclopentabenzopyran | μg/L | ND–0.215 | | | 0.281 | 1.95E-02 |
Oxycodone | μg/L | ND–0.176 | | | 0.266 | 1.96E-02 |
- a The strength of the relationship between contaminants and endogenous metabolites is indicated by Q2Y and analysis of variance testing of cross-validated predictive residuals. The list is sorted based on Q2Y calculations from models for females. The best-fitting models also include temperature (see Results).
- CV-ANOVA = analysis of variance testing of cross-validated predictive residuals; DEET = N,N-diethyl-m-toluamide; ND = not detected.
Excluded from both models | Excluded from males | Excluded from females |
---|---|---|
1,4-Dichlorobenzene | 4-Nonylphenol diethoxylate | 3β-Coprostanol |
3-Methyl-1H-indole | 4-tert-Octylphenol diethoxylate | 4-tert-Octylphenol |
4-Androstene-3,17-dione | 4-tert-Octylphenol monoethoxylate | 9,10-Anthraquinone |
4-Nonylphenol | Acetyl hexamethyl tetrahydronaphthalene | Carbazole |
4-Nonylphenol monoethoxylate | Carbaryl | Estrone |
5-Methyl-1H-benzotriazole | Celecoxib | p-Cresol |
Anthracene | Cholesterol | Pentachlorophenol |
Benzo[a]pyrene | cis-Androsterone | Propofol |
Bromacil | Estriol | Triethyl citrate |
Carbamazepine | Fluconazole | |
Citalopram | Hexahydrohexamethyl cyclopentabenzopyran | |
Codeine | Ibuprofen | |
Diltiazem | Naphthalene | |
Diphenhydramine | Oxycodone | |
Fluoranthene | Phenobarbital | |
Hydrocodone | Tetrachloroethene | |
Iminostilbene | Tributyl phosphate | |
Lidocaine | ||
Menthol | ||
Methocarbamol | ||
Oxcarbazepine | ||
Phenanthrene | ||
Phenytoin | ||
Pyrene | ||
Tramadol | ||
Triphenyl phosphate | ||
Tris(dichloroisopropyl) phosphate | ||
Verapamil |
- a Separate modeling routines were run for male and female fathead minnows. Excluded contaminants had individual analysis of variance testing of cross-validated predictive residuals values >0.10, indicating that they were not significantly covarying with changes in metabolites. See Supplemental Data (Table S3) for additional contaminants that were not detected at any of the sampling sites.
The partial least-squares regression focused on endogenous metabolite changes in female fathead minnows similarly reduced the number of potential biologically relevant contaminants. As with males, the global model initially included temperature and 86 contaminants, and the model-selection routine removed 38 of those contaminants over 9 iterations (Table 1). However, when each excluded contaminant was reintroduced into the model, 4-cumylphenol was significantly related to endogenous metabolite changes (CV-ANOVA <0.10) and marginally improved overall model fit. Because 4-cumylphenol was reincorporated into the final model for females, the best-fitting model contained temperature and 49 contaminants (Table 2). The final model eliminated 37 contaminants, a 43% reduction in the list of potentially biologically relevant contaminants, and improved the overall predictive power compared to the global model (Tables 1 and 3).
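The counts and percentages reported for the male and female models can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers stated in the text and Table 1; the function name is hypothetical.

```python
ANALYZED = 132      # contaminants targeted
NOT_DETECTED = 46   # contaminants not detected at any site
DETECTED = ANALYZED - NOT_DETECTED


def summarize(removed, detected=DETECTED):
    """Return (contaminants retained, model variables, percent reduction)."""
    retained = detected - removed
    variables = retained + 1  # temperature is also a y variable
    percent = round(100 * removed / detected)
    return retained, variables, percent


male_summary = summarize(45)
female_summary = summarize(37)
```

The results, (41, 42, 52) for males and (49, 50, 43) for females, agree with the variable counts in Table 1 and with the 52% and 43% reductions reported in the text.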
Contaminants that were significantly related to metabolite changes
The best-fitting partial least-squares regression models identified a variety of contaminants as being significantly related to endogenous metabolite changes in livers. There was considerable overlap in the identity of contaminants that were significantly related to male and female endogenous metabolite changes (32 overlapping contaminants), but some contaminants were unique to either male or female models (9 and 17, respectively; Table 2). By comparing the Q2Y of various contaminants that comprised these partial least-squares regressions, we assessed the relative strength of the relationship between metabolite changes and individual contaminants (Table 2). In the final models, Q2Y for the contaminants ranged from 0.195 to 0.610 for males and from 0.266 to 0.687 for females. In the partial least-squares regression relating male fathead minnows and contaminant composition of water samples, the 5 variables with the strongest relationship to endogenous metabolite changes were temperature (Q2Y = 0.612), N,N-diethyl-m-toluamide (DEET; 0.610), benzophenone (0.550), β-sitosterol (0.544), and 3,4-dichlorophenyl isocyanate (0.524; Table 2). In the best-fitting female model, the 5 variables with the strongest relationship to endogenous metabolite shifts were temperature (Q2Y = 0.758), β-sitosterol (0.687), DEET (0.681), benzophenone (0.619), and camphor (0.594; Table 2).
Endogenous metabolites that were significantly related to contaminant changes
Based on the partial least-squares regression, we identified several endogenous metabolites in liver tissue exhibiting significant covariance with contaminant concentrations at these sampling sites (Supplemental Data, Figures S2–S4). For instance, lactate, alanine, betaine, glucose, and taurine in male livers were positively related to the first component of the best-fitting partial least-squares regression; but creatine, phosphocholine, glycogen, leucine, aminoisobutyrate, and taurocholic acid were negatively related to the first component (Supplemental Data, Figure S3). Most of these and other metabolites (e.g., dimethylglycine) were also significantly related to the second component of the male partial least-squares model. In the best-fitting model relating endogenous metabolites from livers of females and contaminant concentrations, glycerophosphocholine, adenosine mono-/di-/tri-phosphate (AXP), creatine, phosphocholine, glutamate, and taurocholic acid were positively related to the first component of the partial least-squares regression (Supplemental Data, Figure S4). Glycogen, glucose, leucine, and nicotinamide adenine dinucleotide were negatively related to the first component (Supplemental Data, Figure S4). For the second component, we found that betaine, taurine, lactate, choline, and phosphocholine were positively related; but AXP, glucose, and taurocholic acid were negatively related (Supplemental Data, Figure S4).
DISCUSSION
By comparing relative differences in endogenous metabolites in fathead minnow livers with differences in concentrations of surface-water contaminants and temperature, we were able to screen and prioritize detected contaminants based on their apparent biological impacts. Specifically, we identified contaminants that did not significantly covary with endogenous metabolite changes in livers and thus were not likely to significantly influence biological responses in fathead minnows deployed at our study sites. The analyses also highlighted the potential biological relevance of many contaminants detected at our study sites. Despite these sites receiving complex contaminant mixtures, this approach resulted in a 43% to 52% reduction in the number of potential biologically relevant contaminants that were originally detected at these sites and may represent a first step in screening contaminants for contaminant-monitoring and toxicity-testing programs. In contrast, a decision-making process focused primarily on chemical analysis and contaminant detections in surface water would have resulted in a substantially larger number of contaminants being identified as potentially important and requiring additional testing. Limited resources for toxicity testing can frequently preclude testing of all contaminants detected with these traditional chemical analyses. However, the combined chemical and effects-based approach described in the present study helped screen out detected contaminants that had a lower probability of eliciting a biological response and could be given a lower priority for toxicity testing, while identifying higher-priority contaminants that may elicit biological impacts. The latter could be subsequently targeted for more in-depth evaluation, individually or as part of a mixture. Thus, the employment of an effects-based tool in this approach generated valuable biological information that could be leveraged by risk assessors during this decision-making process and provided a biological basis for deciding which contaminants to screen out. Although the biologically relevant contaminants identified in the present study are most applicable to our study sites and the particular suite of measured contaminants, the general screening approach we have employed has relevance and broad applicability in other aquatic ecosystems receiving complex mixtures.
Our approach was focused on contaminant relationships, but the analyses also identified water temperature as significantly covarying with endogenous metabolite changes. Although temperature has been found to affect relative abundances of endogenous metabolites in some species 20, comparable temperature effects have not been reported for fathead minnows in other studies. For example, a similar field-based study that measured effects of pulp mill effluent on caged fathead minnows found that endogenous metabolite differences were significantly related to effluent exposure and could not be explained by changes in water temperature (e.g., metabolite differences were greater when temperatures were more similar) 15. Furthermore, a controlled 4-d laboratory study with fathead minnows that focused solely on impacts of temperature changes comparable to the present study (10 °C, 15 °C, 20 °C, and 25 °C) did not find significant temperature effects on polar endogenous metabolites from livers in either sex (as measured by principal component analysis; J. Davis, unpublished data). This suggests that over the time period and temperature range studied, endogenous metabolite profiles of fathead minnows can exhibit considerable thermal tolerances. Thus, we expect that the strength of the relationship between temperature and endogenous metabolite changes observed in the present study may be the result of temperature covarying with other biologically impactful components of wastewater effluent, especially because sites receiving such effluent can typically exhibit elevated temperatures relative to surrounding water temperatures.
In a similar fashion, the partial least-squares regressions highlighted some contaminants (caffeine, cotinine, and DEET) as significantly covarying with endogenous metabolite changes despite little previous evidence of their biological relevance 31-33. Some of these contaminants exhibited relatively strong relationships. These particular contaminants may be co-occurring with other, perhaps unmeasured, contaminants that are eliciting biological responses. Indeed, these contaminants have been employed as markers for the presence of wastewater-treatment effluent 31, 32, 34. Therefore, to assess whether these contaminants and temperature had undue influence on the models and on contaminant classification, we excluded them and repeated the model-fitting procedure (Supplemental Data). The models fit to this truncated data set overlapped strongly with the best-fitting models: they identified 95% and 91% of the biologically relevant contaminants from the best-fitting models as being relevant for males and females, respectively. This, in combination with previous results, suggests that our contaminant prioritization was relatively robust against the influence of these particular variables and that the strength of these variables in the best-fitting models may be related to their simply covarying with other aspects of wastewater effluent.
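The overlap percentages reported above amount to asking what share of the contaminants in a best-fitting model are still classified as relevant after refitting. A minimal sketch of that comparison follows. It assumes each model's result is available as a collection of contaminant names; the function name and placeholder labels are hypothetical and are not the contaminants or results in Table 2.

```python
def percent_retained(best_fitting, truncated):
    """Percentage of contaminants in `best_fitting` that also appear in `truncated`."""
    best_fitting, truncated = set(best_fitting), set(truncated)
    if not best_fitting:
        raise ValueError("best_fitting must not be empty")
    return 100.0 * len(best_fitting & truncated) / len(best_fitting)

# Placeholder labels only; these are not contaminants or results from the study.
best_fitting = {"A", "B", "C", "D", "E"}
truncated = {"A", "B", "C", "D", "F"}

print(percent_retained(best_fitting, truncated))  # 80.0
```

In practice, the deliberately excluded variables would presumably be dropped from the reference set before the comparison so that they do not count against the overlap.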
Generating hypotheses for biologically active contaminants
The multivariate approach is most robust for screening out contaminants that do not appear to elicit biological responses at our sites (i.e., not significantly covarying with endogenous metabolite changes); however, analysis of contaminants that did covary also offers useful insights. In particular, our results suggest how this approach may help generate testable hypotheses of whether similarly responding contaminants may possess similar modes of action and lead to interactive effects when combined in mixtures. For instance, certain contaminants with similar modes of action had similar influences on the best-fitting model, as indicated by clustering in the partial least-squares loading plots. For example, we found that estrone, benzophenone, and bisphenol-A, which are contaminants that have been shown to exhibit estrogenic properties 35-37, were clustered together and negatively associated with the first component of the partial least-squares regression for males (Supplemental Data, Figure S3). Thus, these contaminants with similar modes of action were found to exhibit comparable influence on the model and on endogenous metabolites. This illustrates how this approach may leverage similar information to generate hypotheses of how other contaminants with unknown modes of action may interact or, more generally, identify contaminant combinations that could be utilized in mixture studies.
Comparisons of other contaminants suggest limitations and the need for further study to explore the utility of this approach for hypothesis generation. For example, the loadings plot for females showed that cis-androsterone and dichlorvos were in close proximity and negatively related to the second component (Supplemental Data, Figure S4), despite different modes of action. Dichlorvos possesses greater potency in terms of acetylcholinesterase inhibition and glucocorticoid receptor agonism 38, but it can be antiandrogenic in mammals 39. Conversely, cis-androsterone is a known androgen. Although other contaminants can have contrasting androgenic properties for mammals versus fish (e.g., the steroid spironolactone) 40, it is also possible that the similar effects of these 2 contaminants on the model are related to their concentrations covarying across sites, rather than to their eliciting similar metabolomic responses. Such covariance can limit the ability of the partial least-squares regression to distinguish among covarying contaminants when comparing them to endogenous metabolite changes and may help explain some of these incongruences. Additional studies are necessary to elucidate apparent discrepancies, evaluate hypotheses generated in this fashion, and, in general, assess how well this overall approach may be applied for hypothesis generation.
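One simple way to check whether 2 contaminants might be indistinguishable to the regression is to compute the correlation of their concentrations across sites before interpreting their proximity in a loadings plot. The sketch below is an illustration of that check, not part of the study's analysis. The concentrations are made-up toy values, the contaminant labels are placeholders, and the 0.9 threshold is an arbitrary assumption.

```python
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation between two equal-length concentration series."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_a * var_b)

def collinear_pairs(concentrations, threshold=0.9):
    """List contaminant pairs whose across-site |r| meets the (arbitrary) threshold."""
    names = sorted(concentrations)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(concentrations[first], concentrations[second])
            if abs(r) >= threshold:
                pairs.append((first, second, round(r, 3)))
    return pairs

# Toy concentrations at five hypothetical sites; not data from the present study.
toy = {
    "contaminant_1": [0.1, 0.4, 0.2, 0.8, 0.5],
    "contaminant_2": [0.2, 0.8, 0.4, 1.6, 1.0],  # exactly twice contaminant_1
    "contaminant_3": [0.9, 0.1, 0.5, 0.3, 0.7],
}

print(collinear_pairs(toy))
```

Pairs flagged in this way would be candidates for the explanation offered above, namely that similar positions in the loadings plot reflect covarying concentrations rather than similar metabolomic responses.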
Response of individual endogenous metabolites
The present study demonstrates the potential utility of field-based metabolomics as a discovery tool for assessing specific biological impacts of complex mixtures. For example, a variety of biologically active compounds were detected in surface water, and we were able to relate changes in those contaminant concentrations to several physiological changes related to exposure. A full accounting of endogenous metabolite responses observed at these sites is beyond the scope of the present study, but metabolite changes noted in previous studies as reflective of possible adverse impacts on fathead minnows were similarly identified as changing in response to contaminant exposures at the Great Lakes areas of concern examined herein 15, 41. For instance, we identified significant changes in endogenous metabolites important for creatine, taurine, and glucose metabolism. Increased abundances of creatine and taurine have also been linked to liver toxicity 42.
Assumptions of the approach
We identified contaminants of concern by linking measurements of endogenous metabolites and contaminant concentrations, but the analysis is based on several assumptions. Because we applied partial least-squares regression, which maximizes covariance between data sets, the list of biologically relevant contaminants included in the best-fitting model is limited to the particular contaminants measured and cannot indicate whether other, unmeasured contaminants may be significantly affecting fish deployed at these sites. Nonetheless, this would likely have little impact on the list of contaminants that were excluded during the model-fitting routine because they were measured and did not covary with changes in endogenous metabolite profiles. Care also must be taken in generalizing specific results to other regions because contaminants identified as not having a biological impact at our sites may reach biologically relevant levels elsewhere. In addition, contaminant analyses were based on depth-integrated grab water samples, which may not be fully representative of contaminants exhibiting temporally fluctuating concentrations and may have affected our ability to accurately classify such contaminants. Because of the statistical nature of these comparisons, this approach also cannot assign true causal relationships between endogenous metabolite and contaminant changes. However, if a measured contaminant was eliciting biological impacts at these sites, those changes would have covaried with changes in contaminant concentrations and exhibited a high probability of being elucidated by this procedure, even as other variables were similarly affecting endogenous metabolite profiles. Therefore, we expect that the inability to assign causality would also have little to no bearing on the approach's ability to screen out contaminants that are not significantly impacting endogenous metabolites and could be deprioritized for toxicity testing; in this respect, the exclusion results are the more robust output of the analysis. The generation of such an exclusion list could still inform toxicity assessment programs because it would reduce the number of contaminants that may need to be considered for toxicity testing at a given study site.
Because we focused on endogenous metabolites in livers, assessments of the biological importance of contaminants were restricted to those contaminants affecting liver processes. Although other tissues or species may respond differently, metabolomic-based analyses of livers may provide considerable insight into a variety of contaminants and their associated impacts on biological processes. Because it is an important site for biotransformation of exogenous contaminants 21, 22, the liver can be highly susceptible to contaminant exposure; and many contaminants are hepatotoxins 43. It also performs essential roles in many biological processes (e.g., immune response, energy metabolism, and hormone production); thus, perturbations affecting other organs can also alter liver structure and function 44. Therefore, a range of biological impacts associated with contaminant exposures would have a high probability of manifesting in the liver and being detected by these analyses. Application of similar approaches to other tissues (e.g., gonad) or biological measurements (e.g., transcriptomics) could further highlight additional biologically active contaminants and provide a more complete assessment of potential toxicity.
CONCLUSION
The present study and similar biological effects-based approaches applied in other regions may begin to help guide management decisions and risk-management programs 8, 9, 14. For instance, the 43% to 52% reduction in the number of potential biologically relevant contaminants at these sites suggests that such information could be an important first step in eventually increasing the efficiency of monitoring and toxicity-assessment programs by diverting resources away from those contaminants unlikely to be eliciting significant biological impacts at a particular site. The potential for this approach to help meet such a goal has also been demonstrated by a recent study in a mid-sized river in the western United States. That study excluded approximately 46% of detected surface-water contaminants as not being biologically active at sites affected by wastewater-treatment plant effluent (T. Collette, unpublished data), indicating that these screening methods can be effective in other ecosystem types (e.g., river networks). As additional studies are developed based on more sites and tissues (e.g., gonads), the robustness of the relationships detailed in the present study and the overall promise of this approach for contaminant screening can be more fully evaluated, especially if those studies help reinforce evidence of the biological activity detailed in the present study. More generally, results from the present study demonstrate how the approach could be utilized for hypothesis generation and to highlight contaminants that could be prioritized for additional in-depth analyses to identify toxic modes of action, more targeted biological responses using traditional biomarker measurements (e.g., vitellogenin induction), or population-level responses based on adverse outcome pathways 9, 13, 14. The combination of such information could help risk-assessment and toxicity-assessment programs keep pace with an ever-increasing number of possible contaminants, identify unknown sources of contamination, and allow for more rapid assessment of ecosystem recovery when known sources of contamination decrease.
Supplemental Data
The Supplemental Data are available on the Wiley Online Library at DOI: 10.1002/etc.3409.
Acknowledgment
We thank J. Banda, S. Choy, E. Durhan, D. Gefell, C. LaLone, S. Langer, E. Makynen, M. Menheer, J. Moore, M. Pearson, M. Severson, and K. Stevens for technical assistance and T. Smith for research support. M. Berntsson and L. Eriksson (Umetrics) and S. Wenger provided guidance on statistical analyses. J.M. Davis was supported by the Great Lakes National Program Office and an appointment to the Postdoctoral Research Program at the National Exposure Research Laboratory, administered by Oak Ridge Institute for Science and Education through interagency agreement between the US Department of Energy and the USEPA. J.E. Cavallin was supported by an Oak Ridge Institute for Science and Education fellowship.
Disclaimer
The views expressed in the present study are those of the authors and do not necessarily represent the views or policies of the USEPA. The findings and conclusions in the present study are those of the authors and do not necessarily represent the views of the US Fish and Wildlife Service. The present study has been peer-reviewed and approved for publication consistent with US Geological Survey's Fundamental Science Practices (http://pubs.usgs.gov/circ/1367/). Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US government.
Data availability
For all data requests, please contact the corresponding authors.