A Simulation Study of Quantitative Risk Assessment for Bivariate Continuous Outcomes

Zi-Fan Yu (Corresponding Author)

Statistics Collaborative, Inc., Washington, DC, USA.

*Address correspondence to Zi-Fan Yu, Statistics Collaborative, Inc., 1650 Massachusetts Ave. NW, Washington, DC 20036, USA; [email protected].

Paul J. Catalano

Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA.

Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA.

First published: 20 September 2008

https://doi.org/10.1111/j.1539-6924.2008.01082.x

Citations: 1

Abstract

The neurotoxic effects of chemical agents are often investigated in controlled studies on rodents, with binary and continuous multiple endpoints routinely collected. One goal is to conduct quantitative risk assessment to determine safe dose levels. Yu and Catalano (2005) describe a method for quantitative risk assessment for bivariate continuous outcomes by extending a univariate method of percentile regression. The model is likelihood based and allows for separate dose-response models for each outcome while accounting for the bivariate correlation. The approach to benchmark dose (BMD) estimation is analogous to that for quantal data without having to specify arbitrary cutoff values. In this article, we evaluate the behavior of the BMD relative to background rates, sample size, level of bivariate correlation, dose-response trend, and distributional assumptions. Using simulations, we explore the effects of these factors on the resulting BMD and BMDL distributions. In addition, we illustrate our method with data from a neurotoxicity study of parathion exposure in rats.

1. INTRODUCTION

In toxicology studies, both binary and continuous endpoints are routinely collected. Analyzing such endpoints is an important part of risk assessment in order to find a safe dose level of exposure. Because exposures tend to affect multiple outcomes simultaneously, risk assessment procedures should account for all of the adverse events at once. While methods for quantitative risk assessment of multiple binary and even mixed outcomes have been explored, those for multiple continuous outcomes are not as widely developed.

Yu and Catalano (2005) propose a method for modeling and quantitative risk assessment via benchmark dose (BMD) estimation for bivariate continuous outcomes in dose-response studies. The purpose here is to study the properties of that approach via simulations. Motivation stems from neurotoxicology studies that assess the effects of a chemical agent on the structure and function of the nervous system. In addition, topical literature advocates risk assessment for joint as well as individual outcomes, as there may be greater overall sensitivity or ability to detect generalized effects by looking at more than one outcome (Ryan, 1992). Such studies routinely collect multiple endpoints, and risk assessment for the correlated continuous endpoints is of interest.

To estimate dose-response parameters and bivariate correlation, we assume the responses (or transformations of them) are distributed bivariate normal and use likelihood-based methods for estimation. Quantitative risk assessment is accomplished by choosing two cutoffs, one for each outcome, that together define the area of adversity. In the control group, this area of adversity equals a prespecified background rate. Because the cutoffs are functions of parameters from the dose-response model, they are treated as variable rather than fixed, as in previous risk assessment methods for a single continuous outcome. This background percentage then determines a BMD. One may find a lower bound on the BMD (BMDL) using a variety of methods.

In formulating this method, the authors identified several issues deserving further attention. These issues are the basis for the simulations in this article. Analogous to the issues of cutoffs to use for defining adversity are the related issues of appropriate background percentiles for risk assessment and effects of varying background rates on resulting BMD distributions. We also sought to examine the following effects on the resulting BMD distributions: sample size, level of bivariate correlation, misspecification of dose-response trend (i.e., linear versus quadratic trend), and misspecification of distribution in the parametric normal model.

Another area of interest is the method of BMDL calculation. Several advantages to the BMDL based on a likelihood-ratio method may exist over a BMDL based on the delta method (Chen & Kodell, 1989). Delta-method confidence limits may produce negative values (Crump & Howe, 1985), while the likelihood-ratio method tends to maintain better coverage and restricts the lower bound within the range of doses. For each simulation scenario, we calculated both types of BMDLs to evaluate their behaviors.

Our data come from an evaluation of the effects of parathion in rats from a study sponsored by the World Health Organization and the International Programme on Chemical Safety (Moser, 1997). One type of study in laboratory animals is the Functional Observational Battery (FOB), a series of tests where roughly 30 measures of sensory, motor, and autonomic function are collected on each animal. Each test may be grouped into a domain of neurological function (autonomic, convulsive, neuromuscular, sensorimotor, central nervous system (CNS) excitability, and CNS activity domains) (Moser, 1997). Dosing may be acute or chronic, with 8–10 rats per dose group and five dose groups. Two continuous responses of particular interest in the neuromuscular domain are the forelimb and hindlimb grip strengths. We use these outcomes in our example.

Section 2 reviews the method for modeling and risk assessment for two continuous bivariate outcomes. Section 3 provides a data example of the method for both the bivariate and the univariate settings on a neurotoxicity study of parathion in rats. Section 4 describes the procedures and parameters selected for the simulation study, with results reported in Section 5. A discussion follows in Section 6.

2. METHODOLOGY FOR CONTINUOUS BIVARIATE OUTCOMES

For N independent individuals, let F and H denote continuous forelimb grip strength and hindlimb grip strength, respectively, of the bivariate response.

2.1. Dose-Response Modeling

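We assume F and H (or some monotonic transformation of them) are distributed bivariate normal with means (μ_F, μ_H), variances (σ²_F, σ²_H), and bivariate correlation parameter ρ:

$$\begin{pmatrix} F \\ H \end{pmatrix} \sim N_2\left( \begin{pmatrix} \mu_F \\ \mu_H \end{pmatrix}, \begin{pmatrix} \sigma_F^2 & \rho\,\sigma_F\sigma_H \\ \rho\,\sigma_F\sigma_H & \sigma_H^2 \end{pmatrix} \right)$$
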
Dose-response models may be specified for all parameters. The means μ_F and μ_H may be modeled as functions of dose and other covariates. For dose-response and adverse event data in particular, responses at different doses are likely to have changes in scale and, for bivariate data, changes in correlation. Thus, it may be appropriate to model the variances σ²_F and σ²_H and bivariate correlation ρ as functions of dose, choices that a likelihood model easily accommodates. We choose a log link to ensure positivity of the estimated variances and a Fisher's Z transformation for the correlation parameter to restrict it to lie between −1 and 1. Thus, we may express the dose-response parameters as:

$$\mu_F = X_a a, \quad \mu_H = X_b b, \quad \ln\sigma_F^2 = X_e e, \quad \ln\sigma_H^2 = X_f f, \quad z(\rho) = X_g g,$$

where z(·) denotes the Fisher's Z transformation and X_a, X_b, X_e, X_f, X_g are N×p_a, N×p_b, N×p_e, N×p_f, N×p_g matrices of covariates corresponding to parameter vectors {a, b, e, f, g}, respectively. In controlled dose-response studies, {X_a, X_b, X_e, X_f, X_g} often will be simply {[1 d], …, [1 d]}, where d is the N×1 vector of doses or some transformation of the dose metric.

The method of maximum likelihood allows straightforward estimation of parameters. The score equations are obtained by taking the derivatives of the log likelihood of the above bivariate distribution. The final regression parameter estimates are easily obtained using a quasi-Newton algorithm. We used the nlminb function in Splus for this purpose.
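As a rough illustration of this estimation step, the sketch below minimizes the negative bivariate normal log likelihood with a quasi-Newton (BFGS) routine. It is written in Python with SciPy rather than S-PLUS, and it assumes that all five parameters are linear in dose on their link scales. The function names, the least-squares starting values, and the tanh form of the inverse correlation link (including its scaling) are assumptions made for the example, not a description of our actual implementation.

```python
import numpy as np
from scipy.optimize import minimize


def unpack(theta, dose):
    """Map 10 coefficients to per-animal means, SDs, and correlation."""
    a0, a1, b0, b1, e0, e1, f0, f1, g0, g1 = theta
    mu_f = a0 + a1 * dose
    mu_h = b0 + b1 * dose
    sd_f = np.exp(0.5 * (e0 + e1 * dose))  # log link on the variance
    sd_h = np.exp(0.5 * (f0 + f1 * dose))
    # Inverse Fisher-type link (assumed scaling); clipped for numerical safety.
    rho = np.clip(np.tanh(g0 + g1 * dose), -0.999999, 0.999999)
    return mu_f, mu_h, sd_f, sd_h, rho


def neg_loglik(theta, dose, f, h):
    """Negative bivariate normal log likelihood, summed over animals."""
    mu_f, mu_h, sd_f, sd_h, rho = unpack(theta, dose)
    z_f = (f - mu_f) / sd_f
    z_h = (h - mu_h) / sd_h
    one_minus_r2 = 1.0 - rho ** 2
    loglik = (-np.log(2.0 * np.pi) - np.log(sd_f) - np.log(sd_h)
              - 0.5 * np.log(one_minus_r2)
              - (z_f ** 2 - 2.0 * rho * z_f * z_h + z_h ** 2)
              / (2.0 * one_minus_r2))
    return -np.sum(loglik)


def fit_bivariate_normal(dose, f, h):
    """Fit by BFGS, starting from separate least-squares lines."""
    a1, a0 = np.polyfit(dose, f, 1)  # polyfit returns slope first
    b1, b0 = np.polyfit(dose, h, 1)
    var_f = np.var(f - (a0 + a1 * dose))
    var_h = np.var(h - (b0 + b1 * dose))
    start = np.array([a0, a1, b0, b1,
                      np.log(var_f), 0.0, np.log(var_h), 0.0, 0.0, 0.0])
    return minimize(neg_loglik, start, args=(dose, f, h), method="BFGS")
```

The returned object carries the estimates in `x`; an approximate covariance matrix for the delta method could be taken from `hess_inv`, although a numerically differentiated Hessian is usually a safer choice.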

2.2. BMD Calculations

While quantitative risk assessment for bivariate data could proceed by finding BMDs for each univariate outcome separately using the above model, a joint BMD may be preferred. Topical literature has emphasized the importance of considering multiple outcomes when assessing risk (Ryan, 1992). A BMD calculated from joint risk may encompass greater overall sensitivity than computing BMDs from each individual outcome alone. One method of joint risk assessment for two outcomes, as described in Regan and Catalano (1999a, 1999b), is to consider separate fixed cutpoints for adversity for each of the two responses, and to define the probability of an adverse outcome using those cutpoints according to the methods just described.

We combine elements of their approach and the percentile regression method of Yu (2002) for the bivariate continuous responses of hindlimb and forelimb grip strengths. To establish a definition of adversity, we propose letting c_F and c_H be cutoffs for forelimb grip strength and hindlimb grip strength, respectively, where c_F and c_H are variable (estimated) rather than fixed. That is, these cutoffs are not determined by the investigator, but by the correlation in unexposed subjects and the selected value of u, as one will see later in this section. More specifically, we find the values c_F and c_H that define the lower u × 100% (for example, the lower 5%) of the modeled bivariate distribution in the control group, such that P(F < c_F or H < c_H) = u (Fig. 1), with the added constraint of:

$$P(F < c_F) = P(H < c_H)$$

(i.e., equal marginal probabilities). Under the normal model, this constraint is equivalent to a common standardized cutoff in the control group, c = (c_F − μ_F(0))/σ_F(0) = (c_H − μ_H(0))/σ_H(0). The constraint provides equal weighting of F and H in the risk assessment. Without it, many combinations of (c_F, c_H) satisfy u = P(F < c_F or H < c_H); a high value for c_F and a low value for c_H, or vice versa, may produce the same u and be arbitrarily (or differentially) weighted.

Fig. 1. An illustration of bivariate dose response and risk assessment. Shaded areas indicate the area of adversity for each dose as defined by the cutoffs c_F and c_H, where P(F < c_F or H < c_H) equals a prespecified background rate, u.

The determination of c was best kept as a direct analogy to the univariate case. In general, the probability of the union of two events is greater than that of each marginal event; the corresponding adjustments to the cutoffs to adjust the overall probability of adversity result in the definition of c. The notion of a percentile in a bivariate distribution usually refers to contours, which contain a certain percentage of the probability under the bivariate normal surface. However, the lower probabilities defined as such are not consistent with the definitions of adversity in the univariate setting; adverse forelimb strength and adverse hindlimb strength in our example (for example, “low” forelimb strength and “low” hindlimb strength) are lower tail values in the respective univariate distributions. For a bivariate distribution, the same or similar values are located on the extreme left and bottom sides of the distribution, as indicated by the shaded regions in Fig. 1.

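Given our definition of c, the probability of an adverse event at a given dose d_i is therefore

$$P(F < c_F \text{ or } H < c_H \mid d_i) = \Phi(z_{F,i}) + \Phi(z_{H,i}) - \Phi_2\left(z_{F,i}, z_{H,i}; \rho(d_i)\right), \qquad (1)$$

where z_{F,i} = (c_F − μ_F(d_i))/σ_F(d_i), z_{H,i} = (c_H − μ_H(d_i))/σ_H(d_i), Φ is the standard normal distribution function, and Φ₂ is the standard bivariate normal distribution function with the given correlation. This probability is expressed conveniently as P(d_i). For the control group at d = 0, the probability of an adverse outcome reduces to 2Φ(c) − Φ₂(c), where Φ₂(c) is shorthand for Φ₂(c, c; ρ(0)). Note that c is dependent only upon ρ(0), the correlation in unexposed subjects, and the selected value of u. In our example, P(d) is the probability of either low forelimb strength or low hindlimb strength at dose d.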

The risk function, r(d), may be written as r(d) = P(d) − P(0) for additional risk, or r(d) = [P(d) − P(0)]/[1 − P(0)] for extra risk. The BMD_q is the dose satisfying r(d) = q, where P(0) = u.

Fig. 1 illustrates a bivariate dose response. Each oval represents a contour of a three-dimensional bivariate normal surface at the respective mean response and correlation ρ. The mean responses decrease with dose while the correlation increases with dose, as shown by the “narrowing” of each contour with increasing dose. Shaded areas generally represent the region of adversity defined by the cutoffs c_F and c_H; note that the region of adversity includes points beyond the contour's boundary, as adversity is defined by the cutoff values (dotted lines). For a 5% background risk and a 5% additional risk, for example, the BMD is the dose where this area of adversity equals 0.10.
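The calculation can be sketched in a few lines of Python with SciPy. The example below is a minimal illustration, not our production code: it solves 2Φ(c) − Φ₂(c, c; ρ(0)) = u for the common standardized cutoff, converts it to cutoffs on the response scale, and then finds the additional-risk BMD by root finding. The function names and the choice of `brentq` as the root finder are assumptions of the sketch, and the dose-response functions are passed in as hypothetical callables of dose.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal, norm


def p_union(z_f, z_h, rho):
    """P(Z1 < z_f or Z2 < z_h) for a standard bivariate normal pair."""
    joint = multivariate_normal.cdf([z_f, z_h], mean=[0.0, 0.0],
                                    cov=[[1.0, rho], [rho, 1.0]])
    return norm.cdf(z_f) + norm.cdf(z_h) - joint


def standardized_cutoff(u, rho0):
    """Common cutoff c solving 2*Phi(c) - Phi2(c, c; rho0) = u."""
    return brentq(lambda c: p_union(c, c, rho0) - u, -8.0, 0.0)


def bmd_additional(q, u, mean_f, mean_h, sd_f, sd_h, rho, d_max):
    """Dose in (0, d_max) at which P(d) - P(0) = q, with P(0) = u.

    mean_f, mean_h, sd_f, sd_h, and rho are callables of dose.
    """
    c = standardized_cutoff(u, rho(0.0))
    c_f = mean_f(0.0) + c * sd_f(0.0)  # cutoffs on the response scale
    c_h = mean_h(0.0) + c * sd_h(0.0)

    def excess(d):
        z_f = (c_f - mean_f(d)) / sd_f(d)
        z_h = (c_h - mean_h(d)) / sd_h(d)
        return p_union(z_f, z_h, rho(d)) - u - q

    return brentq(excess, 0.0, d_max)
```

For instance, with means 2 − 0.05d for both outcomes, unit variances, ρ = 0, u = 0.01, and q = 0.05, the call `bmd_additional(0.05, 0.01, lin, lin, one, one, zero, 30.0)` (with `lin`, `one`, and `zero` defined as the corresponding callables) should return a value close to 14.0, in line with the true BMD listed for that scenario in Table IV.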

We present the case for two positively correlated outcomes where adversity is defined by decreasing values. When adversity is defined by increasing levels—for example, lead levels in blood—the region(s) of adversity and calculation(s) for risk would differ accordingly.

2.3. Lower Limit Calculations

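A 95% lower bound on the BMD based on asymptotic normality using the delta method is

$$\widehat{BMDL}_q = \widehat{BMD}_q - 1.645\sqrt{\widehat{\mathrm{Var}}(\widehat{BMD}_q)},$$

where $\widehat{\mathrm{Var}}(\widehat{BMD}_q) = (\partial BMD_q/\partial\beta)^{T}\,\widehat{\mathrm{Var}}(\hat\beta)\,(\partial BMD_q/\partial\beta)$, evaluated at β̂, and β = {a, b, e, f, g} is the vector of dose-response parameters. Because the BMD_q is obtained numerically, ∂BMD_q/∂β may be more conveniently obtained as −(∂r/∂β)/(∂r/∂d), by implicit differentiation of r(BMD_q; β) = q. Since β̂ contains the estimated parameters used to calculate the cutoffs c_F and c_H, the BMDL calculation accounts for the variability of the background percentage. This feature is an advantage over the alternative of using a fixed cutoff or estimating the cutpoint but not taking the estimate into account in the BMDL, as in previous methods.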
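A generic numerical version of the delta-method bound can be sketched as follows. The example assumes that a function returning the BMD for a given parameter vector is available (here the hypothetical `bmd_func`) together with an estimated covariance matrix of the parameter estimates; the gradient is approximated by central differences rather than by the implicit-differentiation shortcut, which is a substitution made only to keep the sketch short.

```python
import numpy as np


def numerical_gradient(func, beta, rel_step=1e-6):
    """Central-difference gradient of a scalar function of beta."""
    beta = np.asarray(beta, dtype=float)
    grad = np.empty_like(beta)
    for j in range(beta.size):
        step = rel_step * max(1.0, abs(beta[j]))
        upper = beta.copy()
        lower = beta.copy()
        upper[j] += step
        lower[j] -= step
        grad[j] = (func(upper) - func(lower)) / (2.0 * step)
    return grad


def bmdl_delta(bmd_func, beta_hat, cov_beta, z=1.645):
    """One-sided lower bound: BMD - z * sqrt(grad' Cov grad)."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    grad = numerical_gradient(bmd_func, beta_hat)
    var_bmd = grad @ np.asarray(cov_beta, dtype=float) @ grad
    return bmd_func(beta_hat) - z * np.sqrt(var_bmd)
```

Nothing in this bound prevents a negative result when the estimated variance is large relative to the BMD, which is the drawback of the delta method noted in the text.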

Crump and Howe (1985) point out that confidence limits based on the delta method may possess certain drawbacks such as negative values or failure to maintain nominal coverage. An alternative, more attractive approach is to construct a confidence interval based on the asymptotic distribution of the likelihood-ratio statistic (Chen & Kodell, 1989). Using this method, a BMDL is determined by finding alternative estimates of the parameters, β̃, that satisfy r(d) = q and minimize the BMD, subject to the constraint that the likelihood-ratio statistic comparing β̃ with the maximum likelihood estimate β̂ does not exceed the critical value of its asymptotic chi-square distribution. The method is implemented by modifying a Nelder-Mead simplex algorithm to incorporate the constraint. Although more computationally intensive, the advantages of the likelihood-ratio method make its use preferable to the delta method. The confidence limits are invariant under parameter transformations and maintain better nominal coverage. Its limitation is that it requires a likelihood. While one could use GEE methods, for example, for estimation if a likelihood were not present or did not have justification, calculations for a corresponding, non-delta method BMDL are not straightforward.

3. EXAMPLE

3.1. Parathion Data

Our example comes from a study evaluating the effects of parathion in rats sponsored by the World Health Organization and the International Programme on Chemical Safety (Moser, Tilson, & MacPhail, 1997). Parathion, a neurotoxic agent used as an insecticide, inhibits cholinesterase, an enzyme that is necessary for proper functioning of the nervous system in humans. For this example, we focus on the neuromuscular outcomes of forelimb and hindlimb grip strength.

As for typical neurotoxicity studies examining acute effects, the measurements for this study were taken at baseline, time of expected peak effect, one day, and one week. We are interested primarily in the responses at the time of peak effect. The data are pooled from seven different labs conducting the same experimental protocol. All laboratories had a single control group and four dosed groups, where each lab separately determined the maximum tolerated dose (MTD) from which it chose the three other lower doses. Thus, the dosed groups differed between labs. To control for intraanimal and interlab variability, we consider percent change in grip strength from baseline by subtracting the response at baseline from the response at time of peak effect, dividing by the baseline measurement, and multiplying by 100. Table I displays a summary of grip strength changes from baseline by dose group.

Table I. Grip Strengths from a Neurotoxicity Study of Parathion in Rats.

Dose (mg/kg)	Rats (N)	Forelimb Mean	Forelimb SD	Hindlimb Mean	Hindlimb SD
0	74	11.7	22.8	1.0	21.3
0.38	8	12.3	19.5	24.0	19.2
0.56	17	4.2	22.8	−4.8	14.3
0.60	10	1.8	5.6	−0.7	4.7
0.75	8	11.3	24.8	−7.3	18.2
0.84	18	9.2	33.0	0.1	27.5
0.85	10	9.0	15.8	0.1	25.1
1.10	18	1.9	6.1	−0.3	5.7
1.13	10	5.4	11.6	2.9	16.3
1.27	10	19.5	19.5	1.3	15.7
1.50	8	14.0	28.0	4.0	16.4
1.69	28	9.0	27.6	−0.1	24.2
2.25	9	4.4	9.4	−7.0	10.8
2.30	18	2.1	4.6	1.6	9.3
2.53	10	4.4	16.3	−1.3	13.5
3.00	8	−2.3	16.0	−26.5	18.8
3.38	28	2.9	22.6	−3.9	22.3
4.50	26	−6.4	26.3	−15.3	18.0
5.07	10	0.9	17.8	7.7	22.2
6.75	22	−27.1	34.4	−15.8	23.0
10.13	6	−25.6	41.6	−33.5	14.2

Units for grip strengths are in kilograms-to-release (percent changes from baseline).

3.2. Dose-Response Modeling

The means for both types of grip strength changes generally decrease with dose; that is, as dose increases, grip strengths at peak times become weaker than those at baseline (Table I). To demonstrate the flexibility of the model and to account for possible heteroscedasticity, we allow the variances to vary linearly with dose on the log scale. We choose to model the following dose-response trends:

$$\mu_{F,i} = a_0 + a_1 d_i, \quad \mu_{H,i} = b_0 + b_1 d_i, \quad \ln\sigma_{F,i}^2 = e_0 + e_1 d_i, \quad \ln\sigma_{H,i}^2 = f_0 + f_1 d_i, \quad z(\rho_i) = g_0 + g_1 d_i,$$

where d_i is the dose administered to animal i and z(·) is the Fisher's Z transformation. The parameter estimates, listed in Table II, are consistent with the raw data seen in Table I. Fitted mean percent changes in forelimb grip strength decrease with dose (12% in controls and −29% at the highest dose), and those for hindlimb grip strength also decrease (3% in controls and −26% at the highest dose). The fitted standard deviations for forelimb grip percent change increase slightly with dose (20% in controls to 34% at the highest dose), while those for hindlimb grip percent change do not vary significantly with dose (20% in controls to 21% at the highest dose). The bivariate correlation ρ is positive, as expected, and stays relatively constant (0.10 in controls to 0.20 at the highest dose).

Table II. Parameter Estimates for the Parathion Data, Percent Changes from Baseline for Forelimb and Hindlimb Grip Strength Outcomes

Model Parameter	Estimate	Standard Error	Z
Forelimb grip strength mean (μ_F)
Intercept	12.3	1.59	7.7
Dose	−4.1	0.64	−6.4
Hindlimb grip strength mean (μ_H)
Intercept	2.73	1.46	1.9
Dose	−2.75	0.50	−5.5
Forelimb grip strength variance (ln (σ²_F))
Intercept	6.02	0.099	60.9
Dose	0.101	0.031	3.2
Hindlimb grip strength variance (ln (σ²_H))
Intercept	5.96	0.104	57.6
Dose	0.014	0.035	0.4
Bivariate correlation
Intercept	0.194	0.140	1.4
Dose	0.021	0.045	0.5

3.3. Quantitative Risk Assessment

We used our methods to compute bivariate BMDs for 5% and 10% additional risks at different background rates of 1%, 5%, and 10%. BMDLs were computed using the delta method and the likelihood-ratio approach (Table III).

Table III. BMDs and BMDLs for Parathion Data, Percent Changes in Grip Strength Outcomes, with Dose-Varying Variances and Bivariate Correlations

Additional Risk	Background Rate	Forelimb BMD	Forelimb BMDL (dm)	Forelimb BMDL (lr)	Hindlimb BMD	Hindlimb BMDL (dm)	Hindlimb BMDL (lr)	Bivariate BMD	Bivariate BMDL (dm)	Bivariate BMDL (lr)
5%	1%	2.68	2.10	2.12	5.04	3.11	3.46	2.94	2.38	2.37
5%	5%	1.34	1.05	1.06	2.41	1.56	1.70	1.46	1.18	1.18
5%	10%	0.95	0.75	0.76	1.65	1.11	1.19	1.01	0.82	0.82
10%	1%	4.04	3.17	3.20	7.30	4.75	5.14	4.31	3.50	3.49
10%	5%	2.35	1.84	1.87	4.09	2.74	2.93	2.48	2.02	2.02
10%	10%	1.77	1.39	1.41	2.99	2.05	2.18	1.84	1.49	1.50

Forelimb and Hindlimb columns are univariate results.
dm = delta method; lr = likelihood ratio method.
Units are in mg/kg.
BMDLs are 95% lower bounds.

To define the joint probability of an adverse effect (large negative change in forelimb or hindlimb grip strength), we specify a background rate from which to derive cutoffs. For example, an overall background rate of 5% defines the cutoffs for percent changes in forelimb and hindlimb grip strength as −27 and −36, respectively, where the area below these cutoffs in the bivariate distribution equals 5%. Values with a percent change in forelimb grip strength less than −27 or a percent change in hindlimb grip strength less than −36 are adverse. The corresponding value of ĉ is −1.95, with a marginal probability of 0.026 for each outcome; the estimated correlation in controls is approximately 0.1. The cutoffs for each outcome vary if we choose different background rates. For example, at 1%, the cutoffs are (−40, −48), with marginal probabilities of 0.0051; at 10%, the cutoffs are (−21, −29), with marginal probabilities of 0.052.

While the region of adversity is 100u% of the modeled bivariate surface, the actual percentage of points in the area of adversity is dependent on the fit of the model. A rough goodness-of-fit test may be implemented by finding the percentage of points lying in the modeled region of adversity. For our example, using a nominal 5% background rate, 6.8% of the actual points lie in the area of adversity as defined by our model. For a 10% background rate, 12.2% of the points lie in the adverse area; for a 1% background rate, none of the points lie in the adverse area. The simulations and discussion further address issues of appropriate background rates.
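This rough check amounts to counting observations that fall below either cutoff. A minimal sketch follows, assuming the comparison is made among control animals and that the estimated cutoffs are already available; the function name is hypothetical.

```python
import numpy as np


def observed_adverse_fraction(f, h, c_f, c_h):
    """Fraction of observations with F below c_f or H below c_h."""
    f = np.asarray(f, dtype=float)
    h = np.asarray(h, dtype=float)
    return float(np.mean((f < c_f) | (h < c_h)))
```

If the check is indeed applied to the 74 control animals, the reported 6.8% and 12.2% would correspond to 5 and 9 animals, respectively.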

Continuing with our illustration using a 5% additional risk (r(d) = 0.05), we may estimate a BMD₀₅. We specified the risk of an adverse effect in the control group to be 5%, which we denote as P_u(0) = 0.05. The dose satisfying the equation r(d) = P_v(d) − P_u(0) = 0.05 is the dose at which P_v(d) = 0.10. As reported in Column 9 of Table III, this dose is the estimated BMD₀₅ = 1.46 mg/kg. The corresponding BMDL₀₅ values based on the delta method (dm) and likelihood-ratio (lr) methods are equal, 1.18 mg/kg.

We wish to compare the bivariate results to the univariate cases by computing separate BMDs and 95% BMDLs for changes in forelimb and hindlimb strengths. To do so, we fit linear mean percentile regression dose-response models for both outcomes, allowing the variances for each outcome to vary linearly by dose to account for possible heterogeneity.

As for the bivariate case, we computed univariate BMDs for 5% and 10% additional risks, at background rates of 1%, 5%, and 10% for both outcomes. Fig. 2 depicts an additional risk BMD₀₅ with background percentile of 5%. For changes in forelimb grip strength, the estimated BMD₀₅ is 1.34 mg/kg with BMDL₀₅(dm) = 1.05 mg/kg and BMDL₀₅(lr) = 1.06 mg/kg. The results for changes in hindlimb grip strength are an estimated BMD₀₅ of 2.41 mg/kg with BMDL₀₅(dm) = 1.56 mg/kg and BMDL₀₅(lr) = 1.70 mg/kg (Table III).

The bivariate BMD estimates fall between the two univariate BMD estimates but are closest to the more conservative univariate estimate (in this case, that for the forelimb grip strength change). This result is consistent with the explanation in Yu and Catalano (2005) for a bivariate BMD calculated with a jointly defined background rate as described.

4. SIMULATION STUDY PROCEDURES

Our simulation focuses on the properties of the bivariate method. Each simulation scenario uses 1,000 BMDs and corresponding BMDLs computed from n_i= 10, 20, or 50 bivariate samples at each dose group (total N= 40, 80, or 200), with a single control group and three dosed groups set arbitrarily at d= 10, 20, 30. At each dose, we generated n_i bivariate normal observations about means corresponding to a dose-response model with specified intercept, slope, and variances. We modeled the bivariate correlation ρ as constant at values of ρ= 0 or ρ= 0.35 and as varying by dose at equally spaced values from 0.2 to 0.65. The background percent also varied; we used 1%, 5%, and 10% to reflect background rates typically used in bioassays where BMDs are of interest to compute. True BMDs were calculated under all possible simulation scenarios. For a linear assumption, we used a dose-response model with intercept (2, 2), slope (−0.05, −0.05), and constant variance (1, 1).
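The data-generation step for the linear scenarios can be sketched as below. The sketch uses NumPy in place of the software used for the study; the function name and argument layout are assumptions, while the default doses, intercept, slope, and variance are the values stated above.

```python
import numpy as np


def simulate_study(n_per_dose, rho_by_dose, rng,
                   doses=(0.0, 10.0, 20.0, 30.0),
                   intercept=2.0, slope=-0.05, variance=1.0):
    """Return (dose, F, H) arrays for one simulated linear-trend study."""
    dose_parts, resp_parts = [], []
    for d, rho in zip(doses, rho_by_dose):
        mean = intercept + slope * d  # same mean model for both outcomes
        cov = variance * np.array([[1.0, rho], [rho, 1.0]])
        resp_parts.append(
            rng.multivariate_normal([mean, mean], cov, size=n_per_dose))
        dose_parts.append(np.full(n_per_dose, d))
    dose = np.concatenate(dose_parts)
    resp = np.vstack(resp_parts)
    return dose, resp[:, 0], resp[:, 1]


# Dose-varying correlation: equally spaced values from 0.2 to 0.65.
rho_varying = np.linspace(0.2, 0.65, 4)
dose, f, h = simulate_study(10, rho_varying, np.random.default_rng(2008))
```

A constant-correlation scenario is obtained by passing, for example, `[0.35] * 4` as `rho_by_dose`; repeating the call 1,000 times and fitting the model to each data set would give one simulation scenario.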

To examine the same properties for a quadratic trend, we generated bivariate normal observations as described above about means corresponding to a quadratic dose-response model with intercept (2, 2), dose coefficients (−0.078, −0.078), and quadratic coefficients (0.0013, 0.0013). To examine model misspecification, we modeled the quadratic data under (1) a linear assumption with the original four groups and (2) using only the first three dose groups to see how the curvature affects the BMD. Because of our choice of sampling distributions for both linear and quadratic trends (Fig. 3), the first three groups of the quadratic trend have means that are similar to the linear trend. The last dose group is considerably different and thus defines the curvature of quadratic dose-response trend. By eliminating the last dose group and fitting a linear trend to the remaining three groups (a common practice among dose-response modelers when the highest dose appears different from the rest of the study), we may examine the degree to which the curvature affects the BMD.
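The similarity of the two trends over the first three dose groups can be checked directly from the stated coefficients. The short Python calculation below evaluates both mean functions at the four doses; it is only an arithmetic illustration, and the variable names are not part of the study.

```python
import numpy as np

doses = np.array([0.0, 10.0, 20.0, 30.0])

# Mean response under each trend, using the coefficients given above.
linear_mean = 2.0 - 0.05 * doses
quadratic_mean = 2.0 - 0.078 * doses + 0.0013 * doses ** 2

for d, lin, quad in zip(doses, linear_mean, quadratic_mean):
    print(f"dose {d:4.0f}: linear {lin:.2f}, quadratic {quad:.2f}")
```

The two means differ by at most 0.15 at doses 0, 10, and 20 (2.00, 1.50, 1.00 versus 2.00, 1.35, 0.96) but by 0.33 at dose 30 (0.50 versus 0.83), which is where the curvature shows.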

To examine misspecification of the underlying distribution, we generated data from a bivariate lognormal distribution and analyzed them under the normality assumption. The purpose was to keep similar distributions and bivariate correlations at each dose but generate a skewed rather than a symmetric distribution about each mean. We chose this distribution because dose-response data often appear lognormal.

We used n_i = 10 and 20 per dose group to reflect the sample sizes occurring in typical noncancer animal studies and used n_i = 50 to examine properties resulting from a larger sample size. The choice of dose-response parameters for both linear and quadratic trends reflects good study design where the true BMD lies near the first dose. For decreasing dose-response trends, the BMD is smaller with increasing background rates; therefore, we designed the simulation study such that the BMD lies near the first dose at the 1% background rate. The quadratic trend chosen is similar to the linear trend; the main difference is the response in the last dose group, which provides the curvature in the quadratic trend (Fig. 3).

We calculate two types of 95% lower confidence bounds on the BMD: one is based on the delta method, denoted BMDL(dm); the other, on the likelihood-ratio statistic, denoted BMDL(lr). We evaluate the behavior of the BMDLs for each scenario by comparing them to the empirical 5th percentile of the estimated BMD distribution and by computing coverage probabilities. We define the coverage probability as the proportion of BMDLs lying at or below the true BMD, which is calculated from the true underlying parameters used for sampling.
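Both summaries are simple to compute once the simulated estimates are in hand. The following sketch, with hypothetical function names, shows the two calculations for one scenario.

```python
import numpy as np


def coverage_probability(bmdls, true_bmd):
    """Proportion of simulated BMDLs lying at or below the true BMD."""
    return float(np.mean(np.asarray(bmdls, dtype=float) <= true_bmd))


def empirical_fifth_percentile(bmd_estimates):
    """Empirical 5th percentile of the simulated BMD estimates."""
    return float(np.percentile(np.asarray(bmd_estimates, dtype=float), 5))
```

For a well-calibrated 95% lower bound, the coverage probability should be near 0.95 and the BMDLs should sit near the empirical 5th percentile of the estimated BMDs.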

5. SIMULATION RESULTS

5.1. Linear Normal Results

Table IV presents summary statistics for estimated BMD and BMDL distributions of generated normal linear data and linear fit for true ρ= 0. Corresponding results for ρ= 0.35 and dose-varying ρ appear in Tables V and VI, respectively.

Table IV. Summary Statistics for Generated Normal Linear Data, Linear Fit, and True ρ= 0

Background Rate	N per Dose	True BMD	BMD Mean	BMD Median	BMD SD	Empirical 5%	BMDL(dm) Mean	BMDL(dm) Median	BMDL(dm) SD	BMDL(dm) C.P.^†	BMDL(lr) Mean	BMDL(lr) Median	BMDL(lr) SD	BMDL(lr) C.P.^†
1%	10^‡	14.02	13.74	13.15	3.28	9.5	7.91	7.92	1.00	1	10.05	9.76	1.83	0.972
	20		13.79	13.57	2.18	10.76	9.75	9.68	1.09	1	11.00	10.85	1.42	0.968
	50		13.93	13.79	1.42	11.89	11.37	11.28	0.98	0.958	12.02	11.93	1.07	0.958
5%	10	6.45	6.44	6.13	1.60	4.44	3.81	3.81	0.47	1	4.71	4.56	0.86	0.967
	20		6.40	6.28	1.02	4.97	4.61	4.58	0.51	1	5.11	5.05	0.66	0.962
	50		6.67	6.48	0.66	5.48	5.30	5.26	0.45	0.984	5.55	5.51	0.50	0.953
10%	10	4.28	4.30	4.09	1.08	2.95	2.66	2.66	0.33	1	3.14	3.05	0.57	0.964
	20		4.26	4.18	0.68	3.30	3.15	3.14	0.34	0.998	3.40	3.36	0.44	0.961
	50		4.27	4.22	0.44	3.64	3.58	3.55	0.30	0.979	3.69	3.65	0.33	0.951

^†Coverage probability, defined as the proportion of BMDLs lying at or below the true BMD.
^‡Scenario where simulated BMDs exceeding the maximum dose of 30 were set as equal to the maximum dose.

Table V. Summary Statistics for Generated Normal Linear Data, Linear Fit, and True ρ= 0.35

Background Rate	N per Dose	True BMD	BMD Mean	BMD Median	BMD SD	Empirical 5%	BMDL(dm) Mean	BMDL(dm) Median	BMDL(dm) SD	BMDL(dm) C.P.^†	BMDL(lr) Mean	BMDL(lr) Median	BMDL(lr) SD	BMDL(lr) C.P.^†
1%	10^‡	14.32	14.35	13.52	4.02	9.37	7.47	7.64	1.41	1	10.08	9.77	2.11	0.969
	20^‡		14.23	13.84	2.67	10.67	9.62	9.58	1.08	1	11.01	10.83	1.60	0.965
	50		14.29	14.08	1.70	11.96	11.40	11.29	1.08	0.984	12.08	11.95	1.22	0.951
5%	10^‡	6.67	6.81	6.36	0.66	4.38	3.62	3.73	1.01	1	4.74	4.59	0.98	0.963
	20		6.67	6.48	1.27	4.97	4.60	4.58	0.51	1	5.16	5.07	0.76	0.961
	50		6.67	6.58	0.80	5.58	5.39	5.34	0.51	0.983	5.64	5.57	0.57	0.951
10%	10	4.47	4.58	4.27	1.46	2.95	2.51	2.59	0.72	1	3.18	3.07	0.66	0.962
	20		4.48	4.35	0.85	3.33	3.15	3.14	0.35	1	3.46	3.40	0.51	0.961
	50		4.47	4.41	0.54	3.74	3.65	3.62	0.34	0.98	3.78	3.74	0.39	0.951

^†Coverage probability, defined as the proportion of BMDLs lying at or below the true BMD.
^‡Scenario where simulated BMDs exceeding the maximum dose of 30 were set as equal to the maximum dose.

Table VI. Summary Statistics for Generated Normal Linear Data, Linear Fit, and True ρ Varying from 0.2 to 0.65

Background Rate	N per Dose	True BMD	BMD Mean	BMD Median	BMD SD	Empirical 5%	BMDL(dm) Mean	BMDL(dm) Median	BMDL(dm) SD	BMDL(dm) C.P.^†	BMDL(lr) Mean	BMDL(lr) Median	BMDL(lr) SD	BMDL(lr) C.P.^†
1%	10^‡	14.58	14.84	13.83	4.45	9.37	7.15	7.43	1.69	1	10.23	9.78	2.37	0.960
	20^‡		14.61	14.13	2.95	10.71	9.55	9.53	1.07	1	11.13	10.93	1.71	0.966
	50		14.62	14.34	1.88	12.08	11.48	11.37	1.15	0.985	12.24	12.08	1.32	0.949
5%	10	6.78	7.04	6.47	2.48	4.37	3.39	3.56	1.22	1	4.78	4.58	1.05	0.958
	20		6.83	6.59	1.41	5.00	4.54	4.52	0.51	1	5.19	5.09	0.81	0.961
	50		6.82	6.69	0.89	5.63	5.40	5.34	0.54	0.982	5.70	5.62	0.62	0.947
10%	10	4.56	4.76	4.37	1.72	2.91	2.32	4.54	0.99	1	3.21	3.07	0.72	0.955
	20		4.61	4.44	0.97	3.37	3.10	3.10	0.35	1	3.49	3.41	0.55	0.956
	50		4.59	4.51	0.61	3.79	3.67	3.63	0.37	0.979	3.83	3.78	0.42	0.946

^†Coverage probability, defined as the proportion of BMDLs lying at or below the true BMD.
^‡Scenario where simulated BMDs exceeding the maximum dose of 30 were set as equal to the maximum dose.

In general, the estimated BMD distributions are affected in similar ways by increasing sample size and background rate: they become tighter over all independent and correlated simulation scenarios, and the mean and median of the distributions grow closer to the true BMD. With increasing background rate, the true BMD itself also becomes smaller and closer to zero. For example, at ρ = 0.35, at background rates of 1%, 5%, and 10%, the true BMDs are 14.32, 6.67, and 4.47, respectively.

These general trends are similar for ρ = 0 and for dose-varying ρ. In particular, estimated BMD distributions at the 1% background rate tend to be wider than those at the 5% and 10% background rates. The high variability is apparent when viewing density plots of the distributions (Fig. 4), where a vertical line indicates the true BMD for a specific scenario. An interesting feature on plots 4d and 4g, where the background rate is 1% and n_i = 10, is a small “spike” at the maximum dose of 30. The reason for the spike, which indicates greater spread, is that our algorithm for computing the BMD limits the BMD estimate to the given range of doses in the study. If the estimated BMD numerically exceeds the last dose, the algorithm sets it equal to the maximum dose; a consequence is that the means of the simulated distributions are less meaningful. This “bumping up” against the maximum dose disappears with increasing sample size and background percentile because both factors decrease the variability of the distribution.

As expected, increasing the bivariate correlation results in slightly larger BMDs. A heuristic argument is that less correlated outcomes are “more independent” and result in lower, more conservative BMDs to protect against both outcomes.

Fig. 5 displays boxplots of the estimated BMD, BMDL(dm), and BMDL(lr) distributions for two scenarios. A horizontal line represents the value of the true BMD. Both graphs are for dose-varying ρ and 10% background rate, but the sample sizes per group differ. The results are typical for the different scenarios in the simulation. The BMDL(dm) boxplot for n_i= 10 in Fig. 5a displays the known tendency of BMDLs computed using the delta method to produce negative values, while the BMDL(lr) distribution remains within the range of doses. As expected, the problem is less severe as the sample size increases above 10 animals per dose (Fig. 5b). However, the negative values indicate that the delta method may not be as reliable as the likelihood-based method for finding a BMDL. BMDLs based on the delta method are also more variable and have poor coverage.

Coverage probabilities indicate that the BMDL(dm) is extremely conservative, especially at the smaller background rates of 1% and 5%. Across all scenarios, the coverage probabilities for BMDL(dm) are farther from the nominal 95% than those for BMDL(lr). With increasing sample size, the BMDL(dm) grows closer to a 95% coverage probability, but the BMDL(lr) universally has coverage closer to the nominal 95%. In addition, the empirical 5% of the estimated BMD distribution is closer to the median of the BMDL distribution with increasing sample size and background rate (Tables IV–VI). The differences for BMDL(dm) are generally larger, while the corresponding differences for BMDL(lr) are fairly consistent.
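As a minimal sketch of how the coverage probability in the tables is defined (the proportion of BMDLs lying at or below the true BMD), consider the following. The function name `coverage_probability` and the five BMDL values are hypothetical; the true BMD of 4.46 is the value reported in Table VII for the 5% background rate.

```python
import numpy as np


def coverage_probability(bmdls, true_bmd):
    """Proportion of simulated BMDLs lying at or below the true BMD."""
    bmdls = np.asarray(bmdls, dtype=float)
    return float(np.mean(bmdls <= true_bmd))


# Hypothetical BMDLs from five simulated data sets, for illustration only.
simulated_bmdls = [2.1, 2.8, 3.3, 4.0, 4.9]
coverage = coverage_probability(simulated_bmdls, true_bmd=4.46)
print(coverage)
```

For a one-sided 95% lower limit, a well-calibrated method should give a proportion near 0.95 over many simulated data sets. Values near 1 indicate a conservative limit.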

The differences, especially over the three background rates, further suggest that the delta method is less adequate than the likelihood-based method for calculating the BMDL. The information also indicates that both higher sample sizes and background rates greater than 1% are more reliable for calculating the BMDL and that the BMDL(lr) is preferable over the BMDL(dm). Such results are especially interesting given that the data follow a normal distribution.

5.2. Quadratic Normal Results

Since dose-response trends often possess some degree of curvature, we examine the properties of the BMD for a quadratic dose-response trend. We generated quadratic data and looked at two main cases: one in which the model fit is quadratic and the other in which the model fit is misspecified as linear.

To examine the properties of the BMD for a true quadratic trend and quadratic fit, we generated bivariate normal observations about means corresponding to a quadratic dose-response model with intercept (2, 2), dose coefficients (−0.078, −0.078), and quadratic coefficients (0.0013, 0.0013). As stated above, this quadratic trend was chosen to be similar to the linear trend. The exception is in the last dose group, where the response chosen provides an obvious departure from linearity (Fig. 3). Tables VII–IX summarize statistics for estimated BMD and BMDL distributions of generated normal quadratic data and quadratic fit.
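The data-generating step for the quadratic case can be sketched as follows. The intercepts and the dose and quadratic coefficients are those stated above. The dose levels (0, 10, 20, 30), the unit standard deviations, and the constant correlation are assumptions made for illustration, since this section states only the maximum dose of 30. The names `quadratic_means` and `simulate_study` are hypothetical.

```python
import numpy as np

# Coefficients stated in the text, one entry per outcome.
BETA0 = np.array([2.0, 2.0])        # intercepts
BETA1 = np.array([-0.078, -0.078])  # dose coefficients
BETA2 = np.array([0.0013, 0.0013])  # quadratic coefficients


def quadratic_means(dose):
    """Mean vector of the two outcomes at a given dose."""
    return BETA0 + BETA1 * dose + BETA2 * dose**2


def simulate_study(doses, n_per_dose, rho, sd=(1.0, 1.0), seed=0):
    """Draw bivariate normal observations about the quadratic means.

    Assumed for illustration: standard deviations and correlation
    are the same at every dose.
    """
    rng = np.random.default_rng(seed)
    s1, s2 = sd
    cov = np.array([[s1**2, rho * s1 * s2],
                    [rho * s1 * s2, s2**2]])
    return {d: rng.multivariate_normal(quadratic_means(d), cov, size=n_per_dose)
            for d in doses}


ASSUMED_DOSES = (0.0, 10.0, 20.0, 30.0)  # assumption, not from the text
study = simulate_study(ASSUMED_DOSES, n_per_dose=10, rho=0.35)
for dose, obs in study.items():
    print(dose, quadratic_means(dose), obs.mean(axis=0))
```

Under these assumed doses the mean response falls from 2 at dose 0 to 0.83 at dose 30, with the flattening at the last dose producing the curvature discussed above.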

Table VII. Summary Statistics for Generated Normal Quadratic Data, Quadratic Fit, and True ρ= 0

Background Rate	N per Dose	True BMD	BMD Mean	BMD Median	BMD SD	Empirical 5%	BMDL(dm) Mean	BMDL(dm) Median	BMDL(dm) SD	BMDL(dm) C.P.^†	BMDL(lr) Mean	BMDL(lr) Median	BMDL(lr) SD	BMDL(lr) C.P.^†
1%	10^‡	11	10.98	9.33	5.55	5.12	1.23	2.48	14.25	0.977	6.04	5.49	2.59	0.966
	20^‡		10.93	9.93	3.92	6.32	4.31	4.20	5.70	0.983	6.94	6.51	1.90	0.966
	50		10.97	10.67	2.47	7.65	6.82	6.65	1.15	0.993	8.10	7.96	1.48	0.956
5%	10^‡	4.46	4.94	4.00	3.08	2.26	0.47	1.17	4.88	0.997	2.64	2.41	0.92	0.952
	20		4.66	4.17	1.81	2.74	1.89	1.91	0.35	0.999	2.98	2.80	0.77	0.951
	50		4.53	4.37	1.00	3.24	2.89	2.88	0.35	0.999	3.39	3.32	0.58	0.945
10%	10^‡	2.88	3.34	2.62	2.25	1.49	0.19	0.80	1.88	1	1.73	1.57	0.61	0.945
	20		3.07	2.72	1.26	1.79	1.20	1.28	0.34	1	1.94	1.82	0.50	0.947
	50		2.94	2.83	0.66	2.11	1.89	1.89	0.21	1	2.19	2.14	0.37	0.944

^†Coverage probability, defined as the proportion of BMDLs lying at or below the true BMD.
^‡Scenario where simulated BMDs exceeding the maximum dose of 30 were set as equal to the maximum dose.

Table VIII. Summary Statistics for Generated Normal Quadratic Data, Quadratic Fit, and True ρ= 0.35

Background Rate	N per Dose	True BMD	BMD Mean	BMD Median	BMD SD	Empirical 5%	BMDL(dm) Mean	BMDL(dm) Median	BMDL(dm) SD	BMDL(dm) C.P.^†	BMDL(lr) Mean	BMDL(lr) Median	BMDL(lr) SD	BMDL(lr) C.P.^†
1%	10^‡	11.31	11.91	9.73	6.51	4.99	−0.40	2.10	24.11	0.961	6.43	5.59	3.57	0.947
	20^‡		11.66	10.37	4.83	6.19	3.64	3.74	9.06	0.971	7.17	6.58	2.50	0.944
	50^‡		11.49	11.01	3.12	7.59	6.39	6.26	4.27	0.989	8.25	7.98	1.76	0.934
5%	10^‡	4.63	5.55	4.20	4.07	2.20	−0.04	0.89	5.90	0.986	2.77	2.41	1.47	0.933
	20^‡		5.04	4.36	2.38	2.68	1.54	1.70	0.87	0.997	3.06	2.84	0.95	0.940
	50		4.77	4.54	1.26	3.22	2.76	2.76	0.31	0.999	3.45	3.36	0.70	0.940
10%	10^‡	3.01	3.89	2.77	3.47	1.45	−0.54	0.58	7.88	0.995	1.81	1.57	0.83	0.926
	20		3.36	2.86	1.74	1.76	0.90	1.14	0.69	0.999	2.00	1.84	0.62	0.938
	50		3.12	2.97	0.84	2.1	1.81	1.83	0.17	1	2.25	2.19	0.44	0.933

^†Coverage probability, defined as the proportion of BMDLs lying at or below the true BMD.
^‡Scenario where simulated BMDs exceeding the maximum dose of 30 were set as equal to the maximum dose.

Table IX. Summary Statistics for Generated Normal Quadratic Data, Quadratic Fit, and True ρ Varying from 0.2 to 0.65

Background Rate	N per Dose	True BMD	BMD Mean	BMD Median	BMD SD	Empirical 5%	BMDL(dm) Mean	BMDL(dm) Median	BMDL(dm) SD	BMDL(dm) C.P.^†	BMDL(lr) Mean	BMDL(lr) Median	BMDL(lr) SD	BMDL(lr) C.P.^†
1%	10^‡	11.47	11.63	9.46	6.69	4.86	−4.88	1.45	57.08	0.980	6.57	5.54	3.99	0.941
	20^‡		11.97	10.06	5.88	6.11	−2.05	3.00	66.93	0.982	7.47	6.59	3.55	0.925
	50^‡		11.73	10.87	3.78	7.49	4.47	5.14	6.40	0.997	8.40	7.99	2.17	0.929
5%	10^‡	4.66	5.53	4.17	4.17	2.2	−0.85	0.64	8.28	0.993	2.90	2.46	1.91	0.919
	20^‡		5.13	4.37	2.70	2.68	0.93	1.51	2.55	1	3.14	2.87	1.09	0.918
	50		4.81	4.54	1.31	3.2	2.52	2.54	0.26	1	3.51	3.39	0.73	0.933
10%	10^‡	3.04	3.95	2.77	3.70	1.45	−1.03	0.42	8.31	0.995	1.95	1.63	1.40	0.912
	20^‡		3.46	2.89	2.11	1.79	0.57	1.03	1.29	1	2.07	1.91	0.69	0.919
	50		3.16	2.98	0.88	2.1	1.69	1.71	0.15	1	2.30	2.22	0.47	0.927

^†Coverage probability, defined as the proportion of BMDLs lying at or below the true BMD.
^‡Scenario where simulated BMDs exceeding the maximum dose of 30 were set as equal to the maximum dose.

The general trends for the effects of sample size and background rate on the BMD are similar to those from the normal linear results. The estimated BMD distributions are more variable at lower sample sizes and background rates and become tighter as both of these factors increase. Generally, the variability is higher for each scenario than for the corresponding linear trend scenario. The results are likely due to the sampling and estimation variability from the addition of a quadratic term. Similar to the linear trend results, the 1% background rate is inadequate for BMD estimation. The BMDs at the 1% background rate are highly variable for all specified values of ρ, with noticeable spikes at the maximum dose of 30, even for n_i = 50.

The quadratic simulation results suggest that the delta method is inadequate for calculating the BMDL, based on the large percentage of negative values for the BMDL(dm) for all sample sizes at the 1% background rate and for n_i= 10 and n_i= 20 at the 5% and 10% background rates. Some of the means of the BMDL(dm) distributions in Tables VIII and IX are negative, although the medians remain positive. As a result, the differences between the empirical 5% of the estimated BMD distribution and the median of the BMDL(dm) distribution are much larger than those for the linear results. Coverage probabilities are close to 1 and do not grow closer to 95% with increasing sample size. As for the linear case, coverage probabilities for the BMDL(dm) are far from nominal, whereas those for BMDL(lr) are much more consistent and closer to 95%; the BMDL(lr) is the preferable choice for a lower bound on the BMD.

Because most dose-response trends tend to have some curvature, it is valuable to examine how assumed quadratic data behave under different models. To assess possible model misspecification, we first modeled the quadratic data under a linear assumption. Fig. 6 displays distributions at true ρ= 0.35, with a horizontal line at the true BMD. The results suggest that fitting a linear trend to the generated quadratic data is not recommended because the resulting BMD estimates are clearly biased away from the true BMD. The mean and median tend to be higher than the true BMD.

We also examined how the curvature in the quadratic trend affects the BMD by carefully choosing sampling distributions for the linear and quadratic dose-response trends. For the quadratic trend, the choice of means for the first three doses was similar to that for the linear trend. The main difference is the last dose, where the mean response clearly defines the curvature of the quadratic dose-response trend. To see how the curvature affects the BMD, we removed the last dose of the quadratic trend and sampled only from means of the first three dose groups defined for the quadratic trend. These data were then fit with an assumed linear trend. These distributions, also shown in Fig. 6, show that there is a gain in reducing bias; the BMD estimates are considerably closer to the true BMD from generated quadratic data and quadratic fit than the mean or median estimated BMD resulting from fitting a linear trend to generated quadratic data. These results show that the curvature of a nonlinear trend may greatly affect the BMD calculation and should be incorporated if possible. Fortunately, such trends are easily incorporated into dose-response models.

5.3. Sensitivity to Distributional Shape

The small samples typical of dose-response data may result in skewed, rather than symmetric, distributions of responses within a dose group. A way to examine the behavior of skewed distributions is to generate data from a bivariate lognormal distribution and analyze them under a normal distribution assumption.

To evaluate the sensitivity of our method to changes in distributional shape, we kept the same dose-response trends and bivariate correlations but generated a lognormal distribution about each mean. We chose means at (5, 5) to avoid distributions whose observations were primarily distributed along the x and y axes and variance parameters (2, 2) to provide a clear skewness to the distribution. The lognormal data were treated as normal for modeling and subsequent BMD calculation. Fig. 7 displays some of the simulation results for ρ= 0.35.

The true BMD in this case is one for lognormally generated data calculated using a lognormal assumption. The simulations show that the resulting BMD estimates, calculated under a normal assumption, are clearly biased away from the true BMD. A few of the distributions at the 5% and 10% background rates do not even cover the true BMD for those scenarios.

Misspecifying a distribution should be avoided if possible. Nonetheless, under true lognormality, a simple fix is available. Skewed data are common in bioassays, and researchers often will use a transformation of the data to make them conform to a normal distribution. In our case, taking the log of the observations corrects the problem, as seen in Fig. 8.
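The effect of the log transformation can be illustrated with a small sketch. The log-scale mean, standard deviation, and sample size below are hypothetical values chosen only to produce visibly skewed data; they are not the simulation settings of this study. The helper `sample_skewness` is likewise an illustrative name.

```python
import numpy as np


def sample_skewness(x):
    """Moment-based sample skewness of each column of x."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    return (centered**3).mean(axis=0) / (centered**2).mean(axis=0) ** 1.5


# Hypothetical log-scale parameters, for illustration only.
LOG_MEAN = np.array([1.5, 1.5])
LOG_SD = 0.7
RHO = 0.35
cov = LOG_SD**2 * np.array([[1.0, RHO], [RHO, 1.0]])

rng = np.random.default_rng(1)
# Exponentiating bivariate normal draws gives bivariate lognormal data.
raw = np.exp(rng.multivariate_normal(LOG_MEAN, cov, size=5000))
logged = np.log(raw)

print("skewness, raw scale:", sample_skewness(raw))
print("skewness, log scale:", sample_skewness(logged))
```

On the raw scale both outcomes are strongly right-skewed, while on the log scale the skewness is close to zero, so a model that assumes normality is appropriate only after the transformation.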

6. DISCUSSION

We present a method that models two continuous bivariate outcomes and subsequently finds a BMD, where the BMD is a 100q% increase over a background rate. Calculations for lower bounds on the BMD (BMDLs) are possible through either the delta method or a likelihood-based model. Modeling the bivariate outcomes takes the bivariate correlation into account. The method is flexible as it allows variances and correlations to vary with dose. An advantage is that the definition of adversity and characterization of risk are not based on fixed and possibly arbitrary cutoffs, and the BMDL accounts for the variability of the estimated cutoff. The bivariate method presented is analogous to a univariate method using percentile regression for dose-response estimation (Yu, 2002).

One of the natural questions arising with the method is: What background rate is appropriate? From the simulations, it is clear that the 1% background rate is not precise for BMD estimation or BMDL(dm) estimation unless the sample size is quite large. The variability is large for this background rate and often results in BMDL(dm)s that are negative. The bias is also quite large when comparing the mean of the estimated BMD distribution to the true BMD for any scenario with a background rate of 1%. Larger sample sizes reduced overall variability; however, even at n_i= 50, the 1% background rate is not recommended.

One of the features of the combined laboratory data for our example is that each laboratory had a control group. Thus there were a substantial number of responses for defining adversity. For smaller data sets or sample sizes, fewer responses are present in the control from which to find the region of adversity. Typical neurotoxicity animal studies contain 8–10 rats per dose group and five dose groups. Unless the same experiment is conducted in different labs and the data are pooled (as for our example), one cannot realistically guarantee a larger sample size of 20 or 50 animals per group. Given the variability of the estimated BMD distributions at a background rate of 1%, a background rate of 5% to 10% is recommended for practical use. We note that higher background rates give slightly more conservative estimates of the BMD, a phenomenon consistent with findings from Budtz-Jørgensen, Keiding, and Grandjean (2001) for the univariate case.

The simulations also confirmed another property known in the corresponding univariate procedure: that the likelihood-ratio-based BMDL is superior to the delta method BMDL for lower bound estimation. It remains within the range of doses, is based on the same likelihood used for parameter estimation, maintains better coverage probability, and is less variable across all sample sizes, bivariate correlations, and background rates that were used in the simulations.

The method is not robust to model misspecification; fitting a truly quadratic trend using a linear dose response results in biased BMDs that may be nonconservative. In addition, we explored the sensitivity of our method to changes in distributional shape. To reflect bioassay data, we generated skewed rather than symmetric distributions about each mean while keeping similar bivariate correlations. We based our study on a lognormal distribution, which the method could accommodate by using a log transformation of the data. An area of future work is to explore the effects of other distributional assumptions on the method in more detail.

ACKNOWLEDGMENTS

This work was supported by Grants ES06900 and T32 ES07142-18 from the National Institute of Environmental Health Sciences and Grant CA48061 from the National Cancer Institute. The authors thank Drs. Lorenz Rhomberg, Louise Ryan, and Meredith Regan for helpful comments, and Dr. Virginia Moser at the U.S. EPA for the use of the example data.

REFERENCES

Budtz-Jørgensen, E., Keiding, N., & Grandjean, P. (2001). Benchmark dose calculation from epidemiological data. Biometrics, 57, 698–706. https://doi.org/10.1111/j.0006-341X.2001.00698.x
Chen, J. J., & Kodell, R. L. (1989). Quantitative risk assessment for teratologic effects. Journal of the American Statistical Association, 84, 966–971. https://doi.org/10.1080/01621459.1989.10478860
Crump, K. S., & Howe, R. L. (1985). A review of methods for calculating statistical confidence limits in low dose extrapolation. In D. B. Clayson, D. Krewski, & I. Munro (Eds.), Toxicological Risk Assessment, Vol. 1: Biological and Statistical Criteria (pp. 187–202). Boca Raton: CRC Press.
Moser, V. C. (1991). Applications of a neurobehavioral screening battery. Journal of the American College of Toxicology, 6, 661–669. https://doi.org/10.3109/10915819109078658
Moser, V. C., Tilson, H. A., MacPhail, R. C., Becking, G. C., Cuomo, V., Frantik, E., Kulig, B. M., & Winneke, G. (1997). The IPCS collaborative study on neurobehavioral screening methods: Protocol design and testing procedures. Neurotoxicology, 18(4), 929–938.
Regan, M. M., & Catalano, P. J. (1999a). Bivariate dose-response modeling and risk estimation in developmental toxicology. Journal of Agricultural, Biological, and Environmental Statistics, 4, 217–237. https://doi.org/10.2307/1400383
Regan, M. M., & Catalano, P. J. (1999b). Likelihood models for clustered binary and continuous outcomes: Application to developmental toxicology. Biometrics, 55, 760–768. https://doi.org/10.1111/j.0006-341X.1999.00760.x
Ryan, L. M. (1992). Quantitative risk assessment for developmental toxicity. Biometrics, 48, 163–174. https://doi.org/10.2307/2532747
Yu, Z. F. (2002). Using percentile regression for quantitative risk assessment in developmental toxicology. In Regression Methods for Quantitative Risk Assessment of Continuous Outcomes in Toxicology. Doctoral thesis, Cambridge, MA.
Yu, Z. F., & Catalano, P. J. (2005). Quantitative risk assessment for multivariate continuous outcomes with application to neurotoxicology: The bivariate case. Biometrics, 61, 757–766. https://doi.org/10.1111/j.1541-0420.2005.00350.x

Volume 28, Issue 5, October 2008, Pages 1415–1430

Correction(s) for this article

Erratum to “A Simulation Study of Quantitative Risk Assessment for Bivariate Continuous Outcomes,” by Zi-Fan Yu and Paul J. Catalano, in Risk Analysis, 28(5), 2008
