Collider Bias Is an Insufficient Explanation for the Inverse Obesity Paradox in Prostate Cancer
Funding: This study was funded by the Swedish Cancer Society (20 1033 PjF and CAN 2017/1019).
ABSTRACT
Background
Collider bias is often considered a potential explanation when the association between obesity and disease diagnosis differs from that with disease outcome, as seen in the “obesity paradox.” For prostate cancer (PCa), in particular localized PCa, an “inverse” obesity paradox has been observed, where body mass index (BMI) is negatively associated with diagnosis (hazard ratio [HR] ~0.9 per 5-kg/m2 increase), but positively associated with PCa-specific death (HR ~ 1.2). However, collider bias in this context remains unexplored.
Methods
We simulated binary disease diagnosis and outcome data, including the typically unmeasured/unknown background variable (U) that could introduce collider bias. We calculated U-unadjusted (biased) and U-adjusted (true) marginal odds ratios (OR) from a case-only analysis, and determined the bias percentage using $(\mathrm{OR}_{\text{biased}} - \mathrm{OR}_{\text{true}})/\mathrm{OR}_{\text{true}} \times 100\%$. Similar simulations were performed for classical confounding.
Results
Across a broad range of plausible parameter values for the PCa context, collider bias did not distort the OR of BMI on PCa death by more than 4%, equivalent to a ± 0.04 distortion in the OR estimate for continuous BMI. In comparison, classical confounding showed a higher potential for distorting BMI and PCa death associations than collider bias.
Conclusions
Collider bias alone is unlikely to explain the inverse obesity paradox in (localized) PCa, reinforcing some mechanistic evidence that the observed positive relationship between BMI and PCa death is real, and not a statistical artifact. This finding emphasizes the importance of exploring alternative mechanisms beyond collider bias to better understand the underlying factors driving this paradox.
1 Introduction
Obesity is associated with a lower risk of diagnosis of low-risk, commonly screen-detected prostate cancer (PCa), but is unassociated or even slightly positively associated with the risk of more advanced PCa [1-3]. In contrast, obesity has consistently been associated with worse PCa-specific survival, both in prospective cohort studies following the whole, initially cancer-free cohort from the time of study inclusion and in PCa case-only studies [2-6]. The results from the full cohort analyses reflect the association of obesity with a mix of time from study entry until PCa diagnosis and from PCa diagnosis to PCa-specific death. The outcome of primary interest, time from PCa diagnosis to PCa death, that is, survival from PCa, is more directly studied in analyses of PCa cases only. However, the inclusion of PCa cases only in these analyses might introduce collider stratification bias (short: collider bias), a special kind of selection bias inducing a noncausal association between exposure and outcome in the selected data. The causal diagram in Figure 1 illustrates the underlying conceptual framework and assumptions for the PCa scenario. Obesity affects PCa risk (the collider), and there are unmeasured and/or unknown variables that affect both PCa risk and death (e.g., genetic factors). The presence of these two associations is sufficient to introduce collider bias.

The potential impact of collider bias on analyses of cancer cases only has been well studied in the context of the “obesity paradox”, referring to settings where obesity is associated with an increased disease risk, but better disease survival (observed for, e.g., lymphoma, leukemia, colorectal, endometrial, thyroid, and renal cancers) [7-11]. For localized PCa, men with obesity have a decreased risk of diagnosis, but worse survival, which can be seen as an “inverse” form of the obesity paradox. Since the introduction of prostate-specific antigen (PSA) testing for early detection of PCa, localized PCa has made up the majority of PCa cases detected in developed countries [12, 13]. However, the potential impact of collider bias on the survival of patients with localized PCa has not been investigated, even though risk factors for PCa development and progression, such as genetic factors, which are usually not adjusted for in epidemiological research, have the potential to introduce such bias. In this study, we used simulation analyses to quantify the likely magnitude of collider bias for the association between obesity and PCa death observed in analyses of localized PCa cases only.
2 Methods
We generated simulated data of variables E (exposure; either normally distributed or binary), S (disease indicator; binary), Y (survival outcome; binary), and U (unmeasured/unknown risk factor for disease and survival; either normally distributed or binary), as depicted in Figure 1 [2].
Our simulation input parameters were based on previous results from Swedish cohort data [2], where 3.9% of the whole study population (N ~370,000; mean baseline age 37.5 years) developed a localized “low- or intermediate-risk” PCa (defined as T1-T2, Gleason score 2–7, PSA < 20 ng/mL, N0/NX, and M0/MX) during on average 28 years of follow-up, with a mean (standard deviation [SD]) age at diagnosis of 67.2 (7.4) years. Within the group of localized PCa cases, 4.1% subsequently died because of the cancer. A 5-kg/m2 increase in body mass index (BMI) was associated with an 11% decreased risk of localized PCa (hazard ratio [HR] = 0.89; translating to an HR of 0.92 per 1-SD BMI increase), whereas it was associated with an 18% increased risk of PCa death (HR = 1.18; HR per 1-SD BMI increase = 1.12), after stratification on cohort and birth decade, and adjustment for age, smoking status, healthcare region, country of birth, highest attained education level, and also for income, marital status, Charlson comorbidity index, and PCa treatment in case-only analyses.
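The rescaling of the HRs from a 5-kg/m2 increment to a 1-SD increment can be illustrated with a short sketch. It assumes a log-linear BMI effect and a BMI standard deviation of 3.5 kg/m2; this SD is a hypothetical value chosen for illustration and is not reported in the text above. The function name `rescale_hr` is likewise illustrative.

```python
import math


def rescale_hr(hr, from_units, to_units):
    """Rescale an HR from a `from_units` increment to a `to_units` increment,
    assuming the log-HR is linear in the exposure."""
    return math.exp(math.log(hr) * to_units / from_units)


# Hypothetical BMI standard deviation in kg/m2 (an assumption, not from the source).
BMI_SD = 3.5

hr_diagnosis_per_sd = rescale_hr(0.89, 5.0, BMI_SD)
hr_death_per_sd = rescale_hr(1.18, 5.0, BMI_SD)

print(round(hr_diagnosis_per_sd, 2), round(hr_death_per_sd, 2))
```

With this assumed SD, both rescaled values round to the per-SD HRs quoted above; a different SD would shift them slightly.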
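The data were generated under the following model:
- E and U independent,
- $\operatorname{logit} P(S = 1 \mid E, U) = \beta_{0,S} + \beta_{E \to S} E + \beta_{U \to S} U + \beta_{EU \to S} E U$, and
- $\operatorname{logit} P(Y = 1 \mid E, U, S = 1) = \beta_{0,Y} + \beta_{E \to Y} E + \beta_{U \to Y} U$.
Thus, $\beta_{E \to S}$, $\beta_{U \to S}$, $\beta_{EU \to S}$, $\beta_{E \to Y}$, and $\beta_{U \to Y}$ can be interpreted as logarithmized odds ratios (ORs). These were five of the seven input parameters of our simulations. We allowed for interaction between E and U on the odds of S because prior work has shown that the magnitude of the collider bias is substantially larger when such an interaction is present [14, 15]. The other two parameters, $\beta_{0,S}$ and $\beta_{0,Y}$, were implicitly determined by specifying the baseline probabilities of S = 1 and Y = 1 within the group S = 1. The sample size was chosen to be 500,000. As outcomes are rare (~4% each), the approximation of HRs derived from time-to-event data by ORs derived from binary data is justified.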
After creating the simulated dataset, we calculated beta estimates for associations between E, U, S, and Y from the regression models (logistic if the dependent variable was binary, linear otherwise). The magnitude of collider bias was quantified by the percentage of bias (“percentage bias”) using the formula $(\mathrm{OR}_{\text{biased}} - \mathrm{OR}_{\text{true}})/\mathrm{OR}_{\text{true}} \times 100\%$, where $\mathrm{OR}_{\text{biased}}$ is the unadjusted (“collider-biased”), and $\mathrm{OR}_{\text{true}}$ the (for U) adjusted (“unbiased”) marginal OR of E on Y within the subset S = 1. $\mathrm{OR}_{\text{true}}$, a marginal OR from analyses adjusted for U, was derived as outlined by Zhang [16]. A generic definition of $\mathrm{OR}_{\text{true}}$ using counterfactuals and further details were provided in Daniel et al. [17]. Marginalization is essential for a valid comparison of unadjusted versus adjusted ORs because of the non-collapsibility property of the OR [17].
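Classical confounding was simulated analogously, with U affecting E rather than S:
- $\operatorname{logit} P(E = 1 \mid U) = \beta_{0,E} + \beta_{U \to E} U$ for E binary; and $(E, U) \sim N_2(\mathbf{0}, \Sigma)$, with unit variances and off-diagonal element $\rho$ in $\Sigma$, for E continuous, where $N_2$ denotes a bivariate normal distribution, and $\rho$ is the Pearson correlation coefficient between E and U, and
- $\operatorname{logit} P(Y = 1 \mid E, U) = \beta_{0,Y} + \beta_{E \to Y} E + \beta_{U \to Y} U$.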
Simulations were conducted using R, version 4.0.5 (R Foundation) [18]. The analysis code is provided as Appendix S1.
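The collider-bias calculation described in this section can be sketched compactly for the special case of binary E and binary U. The sketch below is not the R code of Appendix S1: it replaces Monte Carlo simulation with exact enumeration over the four (E, U) cells, and it rests on two assumptions that are ours rather than confirmed by the text. First, the intercepts are calibrated so that the marginal P(S = 1) and P(Y = 1 | S = 1) equal the specified baseline probabilities. Second, the U-adjusted marginal OR is obtained by standardizing the outcome risk to the U distribution among all cases. All function and variable names are illustrative; the parameter values mimic scenario 9 of Table 1.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def odds(p):
    return p / (1.0 - p)


def solve_intercept(target, mean_prob):
    """Bisection for the intercept a such that mean_prob(a) == target."""
    lo, hi = -30.0, 30.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if mean_prob(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def collider_bias_binary(p_e, p_u, p_s, p_y, or_es, or_ey, or_us, or_uy, or_ia=1.0):
    """Exact collider-biased and U-standardized marginal ORs of E on Y among S = 1."""
    b_es, b_ey = math.log(or_es), math.log(or_ey)
    b_us, b_uy = math.log(or_us), math.log(or_uy)
    b_ia = math.log(or_ia)
    cells = [(e, u) for e in (0, 1) for u in (0, 1)]
    # E and U are independent in the whole population.
    prior = {(e, u): (p_e if e else 1 - p_e) * (p_u if u else 1 - p_u) for e, u in cells}

    def prob_s(a, e, u):
        return expit(a + b_es * e + b_us * u + b_ia * e * u)

    # Assumption: intercept chosen so that the marginal P(S = 1) equals p_s.
    a_s = solve_intercept(p_s, lambda a: sum(prior[c] * prob_s(a, *c) for c in cells))

    # Distribution of (E, U) among cases, by Bayes' rule.
    joint = {c: prior[c] * prob_s(a_s, *c) for c in cells}
    total = sum(joint.values())
    case = {c: w / total for c, w in joint.items()}

    def prob_y(a, e, u):
        return expit(a + b_ey * e + b_uy * u)

    # Assumption: intercept chosen so that P(Y = 1 | S = 1) equals p_y.
    a_y = solve_intercept(p_y, lambda a: sum(case[c] * prob_y(a, *c) for c in cells))

    def crude_risk(e):
        # Risk of Y among cases with E = e, ignoring U (collider-biased).
        weight = case[(e, 0)] + case[(e, 1)]
        return sum(case[(e, u)] * prob_y(a_y, e, u) for u in (0, 1)) / weight

    def standardized_risk(e):
        # Risk of Y if all cases had E = e, averaged over U among all cases.
        return sum((case[(0, u)] + case[(1, u)]) * prob_y(a_y, e, u) for u in (0, 1))

    or_biased = odds(crude_risk(1)) / odds(crude_risk(0))
    or_true = odds(standardized_risk(1)) / odds(standardized_risk(0))
    pct_bias = (or_biased - or_true) / or_true * 100.0
    return or_biased, or_true, pct_bias


or_biased, or_true, pct_bias = collider_bias_binary(
    p_e=0.10, p_u=0.25, p_s=0.04, p_y=0.04,
    or_es=0.65, or_ey=1.63, or_us=5.0, or_uy=5.0,
)
print(f"biased OR = {or_biased:.3f}, unbiased OR = {or_true:.3f}, bias = {pct_bias:.1f}%")
```

The printed values can be compared with the scenario 9 row of Table 1. Passing `or_ia` different from 1 adds the E–U interaction on S. Note that both marginal ORs lie below the conditional input OR of 1.63, which reflects the non-collapsibility of the OR rather than bias.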
3 Results
Scenarios 1–10 in Table 1 show the simulation results of plausible scenarios in the context of BMI and PCa [2]. In all scenarios, including those assuming extremely strong effects of U (scenarios 3 and 4) and those incorporating E–U interaction on S (scenarios 5–7), the difference between collider-biased and unbiased ORs of E on Y within S = 1 was small (generally ≤ 0.04), as was the percentage bias (in general < 4%). The bias was of similar size for continuous (scenarios 1 and 2) and binary (scenarios 8–10) exposures E.
Description | Scenario | E distribution^a | U distribution^a | p_S,BL | p_Y,BL | OR E→S | OR E→Y | OR U→S | OR U→Y | IA E,U→S | Collider-biased marginal OR of E on Y within S = 1^b,c | Unbiased marginal OR of E on Y within S = 1^b,d | Percentage bias^e | Corr. (ρ) between E and U in whole sample^b,f | Corr. (ρ) between E and U within S = 1^b,f |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Plausible BMI–PCa scenarios | | | | | | | | | | | | | | | |
Using continuous E^g | 1 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 1.12 | 3 | 3 | No | 1.124 | 1.116 | 0.8% (0.7%) | < 0.001 | 0.007 |
Using continuous E^g | 2 | N (0, 1) | B (1, 0.25) | 0.04 | 0.04 | 0.92 | 1.12 | 5 | 5 | No | 1.127 | 1.124 | 0.3% (0.4%) | 0.002 | 0.005 |
Using continuous E^g | 3 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 1.12 | 10 | 10 | No | 1.121 | 1.104 | 1.6% (0.7%) | < 0.001 | 0.016 |
Using continuous E^g | 4 | N (0, 1) | B (1, 0.25) | 0.04 | 0.04 | 0.92 | 1.12 | 25 | 25 | No | 1.124 | 1.121 | 0.2% (0.3%) | 0.001 | 0.006 |
Using continuous E^g | 5 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 1.12 | 3 | 3 | 1.04 | 1.150 | 1.114 | 3.2% (0.7%) | −0.003 | 0.034 |
Using continuous E^g | 6 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 1.12 | 3 | 3 | 0.96 | 1.107 | 1.119 | −1.1% (0.7%) | −0.002 | −0.02 |
Using continuous E^g | 7 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 1.12 | 10 | 10 | 1.04 | 1.139 | 1.103 | 3.2% (0.5%) | 0.001 | 0.02 |
Using binary E^g,h | 8 | B (1, 0.10) | N (0, 1) | 0.04 | 0.04 | 0.65 | 1.63 | 3 | 3 | No | 1.588 | 1.567 | 1.4% (2.7%) | < 0.001 | 0.005 |
Using binary E^g,h | 9 | B (1, 0.10) | B (1, 0.25) | 0.04 | 0.04 | 0.65 | 1.63 | 5 | 5 | No | 1.630 | 1.617 | 0.8% (1.7%) | 0.001 | 0.004 |
Using binary E^g,h | 10 | B (1, 0.10) | N (0, 1) | 0.04 | 0.04 | 0.65 | 1.63 | 3 | 3 | 1.10 | 1.724 | 1.599 | 7.8% (2.6%) | −0.001 | 0.023 |
Hypothetical scenarios | | | | | | | | | | | | | | | |
Modified effect of E on S from scenario 1 | 11 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.67 | 1.12 | 3 | 3 | No | 1.170 | 1.114 | 5.1% (0.9%) | −0.002 | 0.038 |
Modified effect of E on S from scenario 1 | 12 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.33 | 1.12 | 3 | 3 | No | 1.275 | 1.112 | 14.7% (1.1%) | −0.002 | 0.117 |
Strong interaction between E and U on S | 13 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 1.12 | 3 | 3 | 2 | 1.407 | 1.121 | 25.6% (1.5%) | < 0.001 | 0.400 |
Strong interaction between E and U on S | 14 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 1.12 | 3 | 3 | 0.5 | 0.886 | 1.120 | −20.8% (0.9%) | < 0.001 | −0.356 |
Strong interaction between E and U on S | 15 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 1.12 | 1 | 3 | 0.5 | 0.651 | 1.116 | −41.6% (1.1%) | < 0.001 | −0.568 |
Modified probabilities of S and Y from scenario 1 | 16 | N (0, 1) | N (0, 1) | 0.5 | 0.04 | 0.92 | 1.12 | 3 | 3 | No | 1.130 | 1.115 | 1.3% (0.1%) | −0.001 | 0.015 |
Modified probabilities of S and Y from scenario 1 | 17 | N (0, 1) | N (0, 1) | 0.5 | 0.5 | 0.92 | 1.12 | 3 | 3 | No | 1.116 | 1.098 | 1.6% (0.1%) | < 0.001 | 0.019 |
Modified effect of E on Y from scenario 1 | 18 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.92 | 2 | 3 | 3 | No | 1.926 | 1.940 | −0.7% (0.7%) | < 0.001 | 0.007 |
Several modified input parameters | 19 | N (0, 1) | N (0, 1) | 0.5 | 0.5 | 0.92 | 1.04 | 3 | 3 | No | 1.045 | 1.033 | 1.2% (0.1%) | −0.003 | 0.014 |
Several modified input parameters | 20 | N (0, 1) | N (0, 1) | 0.04 | 0.04 | 0.33 | 1.12 | 0.33 | 3 | No | 1.009 | 1.113 | −9.4% (0.9%) | −0.002 | −0.120 |
- Note: Each scenario was simulated 100 times with a sample size of 500,000 each.
- Abbreviations: BMI, body mass index; IA, interaction; OR, odds ratio; PCa, prostate cancer.
- a N (0, 1)—normally distributed with mean 0 and variance 1; B (1, p)—Bernoulli distributed with probability p for the value 1 and 1-p for the value 0.
- b Mean from the 100 simulations.
- c Unadjusted for U.
- d Adjusted for U.
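- e Mean (standard deviation) from the 100 simulations. The percentage bias was calculated as $(\mathrm{OR}_{\text{biased}} - \mathrm{OR}_{\text{true}})/\mathrm{OR}_{\text{true}} \times 100\%$, where $\mathrm{OR}_{\text{biased}}$ is the unadjusted (“collider-biased”), and $\mathrm{OR}_{\text{true}}$ the (for U) adjusted (“unbiased”) marginal OR of E on Y.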
- f Given as Pearson correlation coefficients ρ.
- g By selecting different ORs for normally distributed and binary U's, we account for the fact that an OR is raised to the power of after dichotomization [19], thus mimicking similar effects sizes.
- h A prevalence of 10% for a binary exposure E, OR of E on S of 0.65, and OR of E on Y of 1.63 mimic the numbers in the Swedish cohorts [2] for obesity versus normal weight.
Next, we investigated how changes in the input parameters affect the magnitude of the bias. Increasing the effect of E on S moderately increased the percentage bias (Table 1; scenarios 11 and 12 vs. 1). The interaction between E and U on S was a very potent factor in introducing collider bias (scenarios 13 and 14), even more so when the main effect of U on S was absent (scenario 15 vs. 14), and was able to reverse the direction of the biased versus the unbiased OR in scenarios 14 and 15. In contrast, the probabilities of disease ($p_{S,BL}$) and outcome ($p_{Y,BL}$), as well as the specific value of the OR of E on Y, only marginally affected the percentage bias (scenarios 16–18 vs. 1, scenario 19 vs. 17). It is noteworthy, however, that a bias of the same percentage size affects the interpretation of results much more when the effect of E on Y is small (scenario 19, $\mathrm{OR}_{E \to Y}$ = 1.04): even a small bias could turn a slightly positive effect into a slightly negative one (e.g., an OR from 1.04 to 0.99), whereas the interpretation of large effects is much less affected by such minor changes (e.g., 2.00 to 1.95). Finally, the direction of the bias was determined by the signs of the associations of U with S and Y (scenario 20). These findings are also visualized in Figures 2 and 3. In summary, the E–U interaction on S is the parameter with the strongest effect on the amount of collider bias, and the way in which the main effects of E and U on S combine with the interaction term to determine the amount of collider bias appears to be complex and unpredictable.


As seen in the last column of Table 1, the association between U and E induced by collider stratification was small for all plausible BMI–PCa scenarios (Pearson correlation coefficient |ρ| < 0.04), compared to the original association between U and S ($\mathrm{OR}_{U \to S}$ = 3 in most scenarios, corresponding to ρ ~0.2 for $p_{S,BL}$ = 0.04). Since for a confounder of the relationship between E and Y the correlation ρ can easily be larger than 0.04, the bias introduced by ignoring such a confounder might be substantially larger than the collider bias introduced by ignoring a confounder of the relationship between S and Y. Indeed, modifying U from an S–Y confounder to a classical E–Y confounder in scenario 5 of Table 1, while leaving all other input parameters unchanged, led to a confounder-induced percentage bias of 240%, compared to < 2% in the original collider bias setting. Figure 4 illustrates this observation.
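The correspondence between an OR of 3 for U on S and a correlation of roughly 0.2 at a disease probability of 0.04 can be checked numerically. The following is a minimal sketch under simplifying assumptions of ours: U is standard normal, S follows a logistic model with U as the only predictor (the comparatively small effect of E is ignored), and the intercept is calibrated so that the marginal P(S = 1) equals 0.04. The integration grid and function names are illustrative.

```python
import math
from statistics import NormalDist

STEP = 0.01
GRID = [-8.0 + STEP * i for i in range(1601)]          # u from -8 to 8
WEIGHTS = [NormalDist().pdf(u) * STEP for u in GRID]   # N(0, 1) mass per grid cell


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def moments(intercept, slope):
    """Return (P(S = 1), E[U * S]) by numerical integration over U."""
    p_s = 0.0
    e_us = 0.0
    for u, w in zip(GRID, WEIGHTS):
        prob = expit(intercept + slope * u)
        p_s += w * prob
        e_us += w * u * prob
    return p_s, e_us


def corr_u_s(or_us, prevalence):
    """Pearson correlation between normal U and binary S under the logistic model."""
    slope = math.log(or_us)
    lo, hi = -20.0, 20.0
    for _ in range(60):  # bisection on the intercept to match the prevalence
        mid = (lo + hi) / 2.0
        if moments(mid, slope)[0] < prevalence:
            lo = mid
        else:
            hi = mid
    p_s, e_us = moments((lo + hi) / 2.0, slope)
    # E[U] = 0 and Var(U) = 1, so Cov(U, S) = E[U * S] and SD(S) = sqrt(p(1 - p)).
    return e_us / math.sqrt(p_s * (1.0 - p_s))


rho = corr_u_s(3.0, 0.04)
print(round(rho, 2))
```

Under these assumptions the correlation comes out close to 0.2, several times larger than the collider-induced correlations between E and U reported for the plausible scenarios.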

4 Discussion
Using simulations, we examined the extent to which the observed association between BMI and PCa death in localized PCa cases might be affected by collider stratification bias originating from the analysis of cases only. We specifically scrutinized the results of a recent Swedish cohort study of approximately 370,000 men, where a 5-kg/m2 higher BMI increased the risk of PCa-specific death in localized PCa cases by 18% (HR = 1.18, 95% confidence interval: 1.01–1.37) [2]. Our simulations showed that in this setting, the influence of collider bias is small and is an insufficient explanation for the inverse obesity paradox in PCa. This finding is robust and holds for plausible variations in the input parameters, as long as we do not assume an unreasonably high interaction between BMI and the unmeasured and/or unknown risk factors for PCa diagnosis. Collider bias distorts HRs (approximated by ORs in our simulations) by no more than 4% on the relative percentage bias scale. This corresponds to a distortion of roughly ±0.04 for the actual HR point estimate if the HR is not too far away from 1, as is the case for continuous BMI. Thus, the random variability in the PCa death HR estimate due to the finite sample size (the 95% confidence interval reported in the Swedish study mentioned above spans a range of 0.36) outweighs the collider bias by far.
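The comparison between collider bias and sampling variability made above can be written out as a short piece of arithmetic. As a rough illustration, assume the reported HR of 1.18 is taken as the reference value to which the maximum percentage bias of 4% is applied:

$$
|\Delta \mathrm{HR}| \approx 0.04 \times 1.18 \approx 0.047, \qquad
1.37 - 1.01 = 0.36, \qquad
\frac{0.36}{0.047} \approx 7.6 .
$$

Under this assumption, the width of the 95% confidence interval is more than seven times the largest distortion attributable to collider bias.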
As PCa is one of the most heritable cancers [20], genetic risk factors for PCa development and progression are strong candidates for introducing collider bias into PCa case-only survival analyses. Genome-wide association studies have identified nearly 300 germline genetic variants associated with PCa risk, as well as PCa mortality [21-25]. The derived polygenic risk scores (PRS) for PCa are potent predictors of PCa risk, and HRs for PCa risk as high as 11 (comparing the top 20% vs. bottom 20% of the PRS) have been reported [25]. Converting this HR back to the continuous PRS scale (assuming linearity) gives an HR of ~2.4 per 1-SD increase. HRs reported in most other studies were slightly smaller, usually converting to HRs of ~2.0 per 1-SD on a continuous scale [21-23]. However, such PRSs do not capture all genetic risk; some common variants with small effects may still be undetected, and rare variants with large effects are also not included. Indeed, one study demonstrated a substantially higher risk of PCa in carriers of rare, highly penetrant PCa genetic risk variants compared to non-carriers, for both low and high PRS values of common variants [23]. Considering this and the probable existence of yet unknown genetic factors, together with the fact that, beyond genetics, family history of PCa and/or environmental factors might affect PCa risk and mortality, the choice of ORs of 3 per 1-SD increase for the U→S and U→Y relationships in most of our simulations in Table 1 seems reasonable. However, we also investigated ORs as high as 10, and even in these extreme scenarios, the magnitude of the collider bias remained small. Notably, strong associations of U with both S and Y are necessary for the induction of collider bias, and only one strong association is not sufficient. Other variables, such as sociodemographic factors, are already routinely adjusted for in observational studies, and can thus be ruled out as drivers of collider bias.
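The conversion of the quintile-contrast HR of 11 into an HR of about 2.4 per 1-SD increase can be reproduced with a short sketch. It reflects one plausible reading of "assuming linearity" and is not necessarily the exact calculation used: the PRS is assumed to be standard normal, the log-HR linear in the PRS, and the top-versus-bottom contrast is taken to correspond to the difference between the mean PRS values of the two extreme quintiles. The function name is illustrative.

```python
import math
from statistics import NormalDist


def per_sd_hr(hr_contrast, tail_fraction):
    """Convert an HR comparing the top vs. bottom `tail_fraction` of a
    standard-normal score into an HR per 1-SD increase (log-linear effect)."""
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1.0 - tail_fraction)
    # Mean of a standard normal above `cutoff` is pdf(cutoff) / tail_fraction;
    # by symmetry the bottom-tail mean is its negative.
    top_mean = std_normal.pdf(cutoff) / tail_fraction
    gap_in_sd = 2.0 * top_mean
    return math.exp(math.log(hr_contrast) / gap_in_sd)


hr_per_sd = per_sd_hr(11.0, 0.20)
print(round(hr_per_sd, 1))
```

The extreme quintiles of a standard normal score lie about 2.8 SD apart on average, so the per-SD HR is roughly the 2.8th root of 11.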
In the investigated scenario of PCa-specific mortality, the potential for collider bias is small because the negative association of BMI with the likelihood of a diagnosis of localized PCa is small (HR per 5-kg/m2 ~ 0.9). For collider bias to become relevant, a substantially stronger association between BMI and localized PCa diagnosis is required. This conclusion does not depend on the specific cumulative incidence of PCa death (i.e., pY,BL), and thus would also hold for associations obtained from studies with a follow-up time resulting in a different, potentially larger, cumulative incidence than the 4% used in most of our simulations. Modification of the effect of BMI on PCa diagnosis by U is an especially influential and uncertain factor in introducing collider bias [14, 15] (Figure 3). However, we could not find any reports of BMI effect modification by genetic PCa risk in the literature; therefore, our assumption that, if present, the effect size of such an interaction is at most half the size (ORs 0.96 and 1.04) of the main effect of BMI on PCa risk (OR 0.92) appears reasonable.
A discussion of the potential of collider bias versus classical confounding bias and other sources of bias in the context of the association of obesity with PCa-specific mortality is warranted. Our simulations demonstrated that classical confounding bears a higher potential of relevantly affecting the observed results than collider stratification. Even relatively modest confounding, including residual confounding, e.g., for smoking, might distort the observed associations by a larger margin than that introduced by collider stratification in PCa cases only. However, to completely explain away the HR of ~1.2 for the relationship between BMI and PCa death (i.e., percentage bias ≥ 20%), very strong confounding would be necessary (Figure 4B); the existence of such confounders beyond the variables already routinely adjusted for in contemporary observational studies [2, 3] is very unlikely. Furthermore, full population and case-only analyses should be affected similarly, and thus classical confounding is also not an explanation for the often observed obesity paradox. Detection bias is another type of bias with a higher potential to distort observed associations than collider bias [4, 26, 27]. Detection bias is quite plausible in full population analyses considering that delayed detection of PCa is more likely in men with obesity compared to normal weight men because of hemodiluted PSA levels, enlarged prostate glands, and possibly lower frequency of asymptomatic PSA screening in men with obesity [2, 4, 26]. By adjusting for clinical characteristics at PCa diagnosis (e.g., TNM staging, PSA level, and Gleason score) in case-only analyses, as done in our previous study from which we derived effect sizes of BMI on PCa [2], the potential of detection bias was minimized. 
Moreover, interpreting associations between BMI and PCa mortality might also be complicated by analytical challenges arising from differences in age at PCa diagnosis and differential risks of non-PCa death according to BMI status, although modern statistical approaches aim to minimize this source of distortion. Taken together, these findings add further evidence to the rationale that the observed positive relationship between BMI and PCa death is real, and not a statistical artifact. Insulin resistance has recently been shown to be an important pathway through which obesity accelerates PCa death in PCa cases, thus providing a biological explanation for the relationship between BMI and PCa death [28]. Other plausible explanations include [4]: (i) alterations in sex hormone metabolism, particularly androgen deficiency [29], (ii) chronic inflammation characterized by altered levels of adipokines in men with obesity [30], and (iii) less successful treatment, such as higher rates of positive surgical margins, in men with obesity, with associated higher rates of disease recurrence [31].
Our simulation approach is flexible and allows the exploration of a wide range of scenarios, including interactions (Figures 2 and 3). However, we are only simulating simple scenarios with one unobserved S-Y confounder at a time. Furthermore, for our simulations, we used logistic regression models with a binary outcome variable Y, whereas the studies that we were emulating were prospective cohort studies, modeling the time to PCa death using Cox models. However, this simplification (ORs instead of HRs) is often done in simulation analyses and should not substantially affect our findings.
5 Conclusions
The results of our simulations demonstrate that collider stratification bias is unlikely to relevantly affect the positive association between BMI and PCa-specific mortality, as observed in the analyses of localized PCa cases only. Given that confounding and detection bias are also unlikely to completely explain away the observed associations because of the extensive adjustment of the statistical models, this adds further evidence that the observed positive relationship between BMI and PCa death is real, and not a statistical artifact. Our findings emphasize the importance of exploring alternative mechanisms beyond collider bias to better understand the underlying factors driving this paradox.
Author Contributions
T.S. was involved in conceptualization, funding acquisition, investigation, and writing – review and editing. C.H. was involved in methodology, validation, and writing – review and editing. J.F. was involved in conceptualization, formal analysis, investigation, methodology, visualization, and writing – original draft.
Acknowledgements
The authors have nothing to report.
Ethics Statement
The authors have nothing to report.
Consent
The authors have nothing to report.
Conflicts of Interest
The authors declare no conflicts of interest.
Open Research
Data Availability Statement
Only simulated data were used. The procedure for generating data is described in the Methods section.