Volume 59, Issue 1 pp. 39-50
ORIGINAL ARTICLE
Open Access

Prospective comparison of diagnostic tests for bile acid diarrhoea

Christian Borup

Corresponding Author

Department of Internal Medicine, Zealand University Hospital, Køge, Køge, Denmark

Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark

Correspondence

Christian Borup, Department of Internal Medicine, Zealand University Hospital, Lykkebækvej 1, Køge, Denmark.

Email: [email protected]

Contribution: Conceptualization (equal), Data curation (lead), Formal analysis (lead), Funding acquisition (lead), Investigation (equal), Methodology (equal), Project administration (lead), Software (lead), Visualization (lead), Writing - original draft (lead), Writing - review & editing (equal)

Lars Vinter-Jensen

Department of Medical Gastroenterology, Aalborg University Hospital, Aalborg, Denmark

Contribution: Investigation (equal), Project administration (equal), Writing - review & editing (equal)

Søren Peter German Jørgensen

Department of Gastroenterology and Hepatology, Aarhus University Hospital, Aarhus, Denmark

Contribution: Investigation (equal), Project administration (equal), Writing - review & editing (equal)

Signe Wildt

Unit of Medical and Surgical Gastroenterology, Hvidovre University Hospital, Hvidovre, Denmark

Contribution: Conceptualization (equal), Methodology (equal), Supervision (equal), Writing - review & editing (equal)

Jesper Graff

Department of Clinical Physiology and Nuclear Medicine, Hvidovre University Hospital, Hvidovre, Denmark

Contribution: Conceptualization (equal), Investigation (equal), Project administration (equal), Supervision (equal), Writing - review & editing (equal)

Tine Gregersen

Department of Nuclear Medicine and PET, Aarhus University Hospital, Aarhus, Denmark

Contribution: Investigation (equal), Project administration (supporting), Writing - review & editing (equal)

Anna Zaremba

Department of Medical Gastroenterology, Aalborg University Hospital, Aalborg, Denmark

Contribution: Investigation (equal), Project administration (supporting), Writing - review & editing (equal)

Trine Borup Andersen

Department of Nuclear Medicine, Aalborg University Hospital, Aalborg, Denmark

Contribution: Investigation (equal), Project administration (supporting), Writing - review & editing (equal)

Camilla Nøjgaard

Unit of Medical and Surgical Gastroenterology, Hvidovre University Hospital, Hvidovre, Denmark

Contribution: Investigation (supporting), Writing - review & editing (equal)

Hans Bording Timm

Unit of Medical and Surgical Gastroenterology, Hvidovre University Hospital, Hvidovre, Denmark

Contribution: Investigation (supporting), Writing - review & editing (equal)

Antonin Lamazière

Département de Métabolomique Clinique METOMICS, Hôpital Saint Antoine, Sorbonne University, Paris, France

Contribution: Formal analysis (equal), Methodology (equal), Resources (supporting), Validation (supporting), Writing - review & editing (equal)

Dominique Rainteau

Département de Métabolomique Clinique METOMICS, Hôpital Saint Antoine, Sorbonne University, Paris, France

Contribution: Formal analysis (equal), Methodology (equal), Resources (supporting), Validation (equal), Writing - review & editing (equal)

Svend Høime Hansen

Department of Clinical Biochemistry, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark

Contribution: Formal analysis (equal), Methodology (equal), Resources (supporting), Writing - review & editing (equal)

Jüri Johannes Rumessen

Department of Internal Medicine, Zealand University Hospital, Køge, Køge, Denmark

Contribution: Conceptualization (equal), Methodology (equal), Supervision (equal), Writing - review & editing (equal)

Lars Kristian Munck

Department of Internal Medicine, Zealand University Hospital, Køge, Køge, Denmark

Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark

Contribution: Conceptualization (lead), Funding acquisition (lead), Investigation (supporting), Methodology (lead), Project administration (lead), Supervision (lead), Writing - review & editing (lead)

First published: 05 October 2023
Citations: 2

The Handling Editor for this article was Professor Alexander Ford, and it was accepted for publication after full peer-review.

Summary

Background

Bile acid diarrhoea is often missed because gold standard nuclear medicine tauroselcholic [75-Se] acid (SeHCAT) testing has limited availability. Empirical treatment effect has unknown diagnostic performance, whereas plasma 7α-hydroxy-4-cholesten-3-one (C4) is inexpensive but lacks sensitivity.

Aims

To determine diagnostic characteristics of empirical treatment and explore improvements in diagnostics with potential better availability than SeHCAT.

Methods

This diagnostic accuracy study was part of a randomised, placebo-controlled trial of colesevelam. Consecutive patients with chronic diarrhoea attending SeHCAT testing had blood and stool sampled. Key thresholds were C4 > 46 ng/mL and SeHCAT retention ≤10%. A questionnaire recorded patient-reported empirical treatment effect. We analysed receiver operating characteristics and explored logistic regression and decision tree modelling using machine learning with internal validation.

Results

Ninety-six (38%) of 251 patients had SeHCAT retention ≤10%. The patient-reported effect of empirical treatment, assessed while bile acid test results were blinded, had 63% (95% confidence interval 44%–79%) sensitivity and 65% (47%–80%) specificity; C4 > 46 ng/mL had 47% (37%–57%) and 92% (87%–96%), respectively. A decision tree combining C4 ≥ 31 ng/mL with ≥1.1 daily watery stools (Bristol type 6 and 7) had 70% (51%–85%) sensitivity and 95% (83%–99%) specificity. The logistic regression model, including C4, the sum of measured stool bile acids and daily watery stools, had 77% (58%–90%) sensitivity and 93% (80%–98%) specificity.

Conclusions

Diagnosis of bile acid diarrhoea using empirical treatment was inadequate. Exploration suggested considerable improvements in the sensitivity of C4-based testing, offering potential widely available diagnostics. Further validation is warranted. ClinicalTrials.gov: NCT03876717.

1 INTRODUCTION

Bile acid diarrhoea is a common cause of chronic watery diarrhoea, affecting 1% of the general population.1, 2 Increased amounts of bile escaping small bowel reabsorption cause colonic watery secretion and peristalsis, leading to watery diarrhoea with urgency.2-5 Bile acid diarrhoea may be secondary to cholecystectomy, small bowel resection, or inflammation.6 However, in primary bile acid diarrhoea, the causes are genetic and physiological and consequently not detected by colonoscopy with biopsies or radiological imaging.2, 7, 8 The diagnostic yield of testing for bile acid diarrhoea in patients with diarrhoea-type irritable bowel syndrome is 32%, and therefore, guidelines recommend testing in this population.9-11 Unfortunately, the two gold standard tests are cumbersome and not widely available: tauroselcholic [75Se] acid (SeHCAT) one-week retention testing and the measurement of 48-hour total stool bile acid excretion on a diet with 100 g fat per day.12 Therefore, bile acid diarrhoea is often overlooked or mistaken for irritable bowel syndrome, which is detrimental to patient quality of life and increases healthcare costs.13, 14 In the absence of better options, diagnostic assessment of empirical treatment effect is common practice, although with unknown diagnostic performance.15, 16 Furthermore, with several treatment options available for bile acid diarrhoea, proper diagnosis is fundamental.15, 17-19 Plasma 7α-hydroxy-4-cholesten-3-one (C4) is a surrogate marker of hepatic bile acid synthesis and a cheaper diagnostic alternative that potentially could be widely available; however, compared with SeHCAT testing, C4 has a low sensitivity of about 50%.20-22 The diagnostic value of fibroblast growth factor 19 (FGF19)20, 23, 24 and of bile acids in spot stool samples has also been assessed.8, 25, 26 However, because of the many test options in use, meta-analyses on the prevalence and diagnosis of bile acid diarrhoea were limited by significant heterogeneity.9, 27
Therefore, we aimed to determine the diagnostic characteristics of empirical colesevelam treatment and explore improvements in diagnostic tests for bile acid diarrhoea with potential wide availability with reference to gold standard SeHCAT testing.

2 METHODS

2.1 Oversight

This diagnostic accuracy study was part of an investigator-initiated double-blinded placebo-controlled trial of colesevelam in bile acid diarrhoea (SINBAD). The treatment effects are reported separately.19 The study was approved by the Danish Region Zealand ethics committee (SJ-641), the Danish Data Protection Agency, the Danish Medicines Agency (EudraCT 2016-001452-22) and registered on ClinicalTrials.gov (NCT03876717), where the study protocol is available. The study was conducted following the Helsinki Declaration. All patients gave written informed consent before participation. All authors had access to the study data, reviewed and approved the final manuscript.

2.2 Patients

Consecutive patients referred for SeHCAT testing due to suspected bile acid diarrhoea and aged 18–79 years were eligible. A history of cholecystectomy or prior small or large intestine surgery was not a reason for exclusion. Patients with inflammatory bowel disease and microscopic colitis were excluded; the complete list of participation criteria has been published.19

2.3 Study procedures

Charts of patients referred for SeHCAT testing were prescreened for eligibility before invitations were sent out. Recruitment took place in conjunction with the first of the two visits required for SeHCAT testing. After informed consent, a blood sample was taken immediately. We logged the time of sampling, fasting state, statin or fibrate medication, and any consumption of alcohol in the previous 24 hours, because these are putative factors affecting FGF19 and C4 levels.28-32 Any statin or fibrate prescription was paused for seven days, i.e. until the second visit for SeHCAT testing. Patients answered the Short Health Scale, the Short Form 36 version 2, the gastrointestinal symptom rating scale and a Rome version III questionnaire on functional gastrointestinal disorders.33, 34 During the six days between the two hospital visits for SeHCAT testing, the patients recorded stool habits using a structured diary with Bristol stool scale pictograms. The patients could voluntarily collect a random stool sample at home, to be immediately frozen below −18 degrees Celsius. These stool samples were brought to the hospital at the second SeHCAT visit and stored at −80 degrees Celsius. At the second SeHCAT visit, a fasting morning blood sample was collected no later than 10:00 AM, with no alcohol consumption allowed in the preceding 24 hours. The study diary was tallied, and patients fulfilling the criteria for diarrhoea (≥3.0 bowel movements or ≥1.0 watery bowel movements [Bristol stool scale type 6 and 7] per day, averaged over the six-day diary) were eligible for randomisation to 12 days of placebo-controlled treatment with colesevelam, as detailed and reported.19 At the end of treatment, while both treatment allocation and diagnostic results were blinded, the patients answered a questionnaire: Was your diarrhoea cured – yes or no? Six months after study completion, we reviewed patient files for final diagnoses.

2.4 Diagnostic procedures

SeHCAT testing methodology has been reviewed.35 We defined one-week SeHCAT retention ≤10% as diagnostic for bile acid diarrhoea. High-performance liquid chromatography–tandem mass spectrometry (HPLC-MS/MS) was used to analyse plasma C417 and bile acid profiles in plasma and stool.36, 37 Stool samples were lyophilised to determine the levels of bile acid species per gram of freeze-dried faecal matter.37 The key C4 threshold for bile acid diarrhoea was >46 ng/mL.20, 22 The SeHCAT test result was disclosed to the patients by their physician after the 12 days of treatment but was kept blinded for the study investigators. C4 plasma samples were analysed en bloc after the study ended. All bile acids were summed for the total measured bile acids in absolute values. Cholic acid and chenodeoxycholic acid, whether unconjugated, sulphated or conjugated to glycine or taurine, were summed as primary bile acids and given as absolute values and percentages of the total measured bile acids. Likewise, amounts of conjugated and unconjugated ursodeoxycholic acid, deoxycholic acid, and lithocholic acid were summed to yield the secondary bile acids in absolute amounts and percentage of the total measured bile acids. Results are also given as sums according to sulphation or conjugation. We assessed the threshold >10% of primary bile acids in spot stool samples as used by others.8, 25 No threshold was prespecified for the total amount of measured bile acids in spot stool samples. FGF19 was analysed by enzyme-linked immunosorbent assay.31 We assessed the FGF19 thresholds <60 pg/mL and >204 pg/mL for diagnosis and screening, respectively.20

2.5 Statistical methods

Statistical analyses were done in R version 4.1.3 (2022). Missing data were handled by complete case analysis. The normality of the data distribution was assessed using quantile–quantile plots. Baseline descriptive statistics were grouped by SeHCAT- and C4-defined bile acid diarrhoea (R package gtsummary38). We performed receiver operating characteristics (ROC) analysis and diagnostic 2 × 2 cross-tabulation. We used multivariate linear regression to assess the adjusted effect of sample time, fasting, alcohol, and statin or fibrate medication on the change (delta) in FGF19 and C4 between the first and second SeHCAT visits. Logistic regression was used to explore predictors of a positive SeHCAT test.

Combined machine learning diagnostic models were trained in a 60% randomly sampled data partition controlling the rate of SeHCAT-positive tests, while the diagnostic performance was validated in the remaining 40% (R package caret39). The machine learning approach applied logistic regression with fivefold cross-validation repeated 10 times. Prespecified model covariates were C4, FGF19, the mean number of watery stools, and the percentage of primary bile acids. Stool total measured bile acids were highly significant in univariate testing and were added post hoc, while FGF19 was removed. Using C4, the mean number of watery stools and stool total measured bile acids (all three covariates log-transformed) gave the best predictive model. We assessed the assumptions and fit of the models above by residual diagnostics. We used simulated scaled residuals (R package DHARMa40) and the Hosmer–Lemeshow goodness-of-fit test for the logistic regression models. Collinearity was assessed by the variance inflation factor. Additionally, a decision tree was modelled by machine learning using repeated cross-validation as specified above in the same training and testing datasets (rpart function in the caret package39).
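
The partitioning step can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not the caret implementation or the authors' code: the function name `stratified_split`, the seed and the toy labels are assumptions made for illustration. The idea shown is to sample 60% within each SeHCAT class, so that the rate of positive tests is similar in the training and validation partitions.

```python
import random


def stratified_split(labels, train_fraction=0.6, seed=0):
    """Split indices into training and validation sets, class by class.

    Sampling within each class keeps the share of positives roughly
    equal in both partitions.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(train_fraction * len(idx))
        train_idx += idx[:n_train]
        test_idx += idx[n_train:]
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 1 = SeHCAT retention <=10%, 0 = retention >10%.
labels = [1] * 40 + [0] * 60
train_idx, test_idx = stratified_split(labels)

train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_rate, test_rate)
```

With the toy labels, both partitions keep the 40% positive rate of the full set, which is the property the study describes as controlling the rate of SeHCAT-positive tests.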

We considered p values <0.05 significant in general, and positive findings should be considered hypothesis generating with an inherent risk of statistical type 1 error. We adjusted the p values in the multiple comparisons done regarding the stool and plasma bile acid profiles by Holm's method of controlling the family-wise error rate.
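
Holm's step-down adjustment mentioned above can be sketched in a few lines. This is a generic Python illustration, not the study's R code, and the input p values are made up for the example. Each sorted p value is multiplied by the number of hypotheses still in play, capped at 1, and forced to be non-decreasing.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity across the sorted sequence.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical unadjusted p values from three comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print(adjusted)
```

In this toy example the smallest p value remains below 0.05 after adjustment, while the other two do not.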

3 RESULTS

Recruitment started on October 25, 2018, and the last clinical visit was on July 1, 2021. The last patient chart follow-up was on February 13, 2022. We prescreened 1124, invited 621 and enrolled 255 patients with chronic diarrhoea. Four patients did not complete SeHCAT testing and were withdrawn. SeHCAT retention ≤10%, diagnostic of bile acid diarrhoea, was found in 96 (38%) of 251 patients. C4 data were available in 233 patients; 54 (23%) had C4 > 46 ng/mL consistent with bile acid diarrhoea (Figure 1). We classified patients with a normal test result as SeHCAT- or C4-defined miscellaneous diarrhoea.

FIGURE 1. STARD diagram of patient flow according to the SeHCAT gold standard test ≤10% versus C4 threshold of 46 ng/mL. Medication causes for exclusion: previously on colesevelam, current treatment with anticoagulants or ciclosporin.

3.1 Patient characteristics

Patients with bile acid diarrhoea defined using C4 or SeHCAT testing had more frequent watery bowel movements than patients with miscellaneous diarrhoea (Table 1 and Table S1), and they reported a higher degree of diarrhoea symptoms on the gastrointestinal symptom rating scale (Table S2). The patients with bile acid diarrhoea tended to be older, have a higher body mass index and have higher levels of plasma triglycerides, glucose and liver enzymes. Furthermore, a history of cholecystectomy was more common in patients with bile acid diarrhoea. For the SeHCAT-defined diagnosis, a history of small intestinal surgery was more common in patients with bile acid diarrhoea by a factor of 2, and a history of large intestinal resection by a factor of 3; however, there were few cases and these differences did not reach statistical significance (Table 1). More details on the comorbidities and classification of patients with SeHCAT-defined bile acid diarrhoea are given in Table S3, and the final diagnoses of patients with a normal SeHCAT result are given in Table S4. The profiles of bile acid species in spot stool samples differed markedly: absolute levels of bile acids were higher by a factor of 2–3 in patients with bile acid diarrhoea, and patients with SeHCAT retention ≤10% had higher percentages of primary bile acids and lower percentages of secondary bile acids than patients with retention >10%. These findings were robust after controlling the family-wise error rate (Table S5). In contrast, the differences in plasma bile acid profiles were less marked and became non-significant after adjusting the p values (Table S6).

TABLE 1. Patient characteristics.
Characteristic | C4 > 46 ng/mL: bile acid diarrhoea (N = 54) | C4 ≤ 46 ng/mL: misc. diarrhoea (N = 179) | p | SeHCAT retention ≤ 10%: bile acid diarrhoea (N = 96) | SeHCAT retention > 10%: misc. diarrhoea (N = 155) | p
Patient history and demographic characteristics
Age (years) | 55 (39–64) | 47 (32–58) | 0.02 | 51 (37–62) | 46 (31–58) | 0.036
Body mass index (kg/m2) | 31 (26–35) | 27 (24–31) | <0.001 | 29 (26–34) | 26 (24–31) | 0.002
Female sex, n (%) | 35 (65) | 123 (69) | 0.59 | 59 (61) | 109 (70) | 0.15
Cholecystectomy, n (%) | 18 (33) | 25 (14) | 0.001 | 23 (24) | 20 (13) | 0.02
Small bowel resection, n (%) | 2 (3.7) | 2 (1.1) | 0.23 | 2 (2.1) | 2 (1.3) | 0.64
Right hemicolectomy, n (%) | 3 (5.6) | 5 (2.8) | 0.39 | 6 (6.2) | 3 (1.9) | 0.09
Rome 3 IBS-D, n (%) | 23 (56) | 72 (54) | 0.79 | 48 (65) | 52 (48) | 0.022
Rome 3 IBS-M, n (%) | 5 (12) | 17 (13) | 0.93 | 5 (6.8) | 18 (17) | 0.051
Rome 3 functional diarrhoea, n (%) | 7 (17) | 32 (24) | 0.37 | 14 (19) | 25 (23) | 0.54
Bowel habits
Total stools per day | 3.3 (2.3–4.8) | 2.8 (2.0–4.2) | 0.14 | 3.7 (2.5–5.2) | 2.7 (1.8–3.9) | <0.001
Bristol type 6 and 7 stools per day | 2.2 (1.2–4.0) | 1.5 (0.7–2.7) | 0.01 | 2.7 (1.5–4.3) | 1.2 (0.5–2.4) | <0.001
Bile acid diarrhoea biomarkers
SeHCAT one-week retention (%) | 4 (2–10) | 18 (10–29) | <0.001 | 4 (2–7) | 23 (16–31) | <0.001
SeHCAT ≤10%, n (%) | 41 (76) | 47 (26) | <0.001 | | |
C4 (ng/mL) | 65 (52–91) | 17 (10–27) | <0.001 | 43 (27–68) | 16 (9–26) | <0.001
C4 > 46 ng/mL, n (%) | | | | 41 (47) | 13 (9.0) | <0.001
FGF19 (pg/mL) | 47 (32–71) | 95 (63–154) | <0.001 | 74 (46–133) | 87 (58–149) | 0.11
Spot stool sample total measured bile acids | 16.4 (12.6–33.4) | 7.0 (4.3–13.6) | <0.001 | 17.2 (12.3–27.2) | 6.1 (3.6–9.2) | <0.001
Spot stool sample percentage primary bile acids | 15.9 (4.9–52.1) | 10.6 (3.9–32.3) | 0.12 | 29.2 (8.4–61.8) | 7.4 (3.4–18.4) | <0.001
Biochemistry
Triglycerides (mmol/L) | 1.7 (1.2–2.2) | 1.1 (0.8–1.6) | <0.001 | 1.7 (1.1–2.3) | 1.1 (0.8–1.4) | <0.001
Total cholesterol (mmol/L) | 4.8 (4.2–5.7) | 4.7 (4.1–5.5) | 0.4 | 4.8 (4.2–5.7) | 4.8 (4.0–5.5) | 0.41
HDL cholesterol (mmol/L) | 1.2 (1.1–1.7) | 1.4 (1.2–1.7) | 0.09 | 1.2 (1.1–1.6) | 1.5 (1.2–1.7) | 0.003
LDL cholesterol (mmol/L) | 2.7 (2.0–3.4) | 2.7 (2.2–3.3) | 0.95 | 2.8 (2.0–3.3) | 2.7 (2.3–3.3) | 0.90
Glucose; fasting (mmol/L) | 5.8 (5.3–6.2) | 5.3 (4.9–5.7) | 0.001 | 5.7 (5.2–6.2) | 5.3 (4.9–5.7) | <0.001
Bilirubin (micromol/L) | 8 (6–10) | 9 (6–12) | 0.16 | 8 (6–10) | 9 (6–12) | 0.086
Alanine aminotransferase (U/L) | 30 (21–49) | 24 (17–32) | <0.001 | 31 (23–47) | 23 (17–30) | <0.001
Alkaline phosphatase (U/L) | 83 (67–97) | 70 (57–87) | 0.002 | 78 (61–97) | 70 (57–84) | 0.003
  • Note: Patient characteristics by diagnosis of bile acid diarrhoea using C4 > 46 ng/mL or SeHCAT one-week retention ≤10%.
  • Abbreviations: FGF19, fibroblast growth factor 19; IBS, irritable bowel syndrome; IBS-D, diarrhoea-predominant irritable bowel syndrome; IBS-M, mixed-type irritable bowel syndrome; SF36v2, short form 36 version 2; SHS, short health scale.
  • a n = 44 and 134 (C4 columns); 75 and 104 (SeHCAT columns). Continuous data as medians (interquartile range); count data as total (percentage).

3.2 Predictors of bile acid diarrhoea in the patient history

Cholecystectomy was a weak predictor of C4- and SeHCAT-defined bile acid diarrhoea with adjusted odds ratios of 2.6 (95% confidence interval (CI) 1.1–5.9; p = 0.02) and 2.4 (1.1–5.6; p = 0.04), respectively. An increased mean number of watery stools predicted SeHCAT-defined bile acid diarrhoea with an adjusted odds ratio of 3.3 (1.5–7.5; p = 0.003) (Table 2).

TABLE 2. Diagnostic predictors in patient history, demographic characteristics, and bowel habits.
Predictor | C4 > 46 ng/mL: bile acid diarrhoea (N = 54) | C4 ≤ 46 ng/mL: misc. diarrhoea (N = 179) | Adjusted odds ratio | p | SeHCAT ≤ 10%: bile acid diarrhoea (N = 96) | SeHCAT > 10%: misc. diarrhoea (N = 155) | Adjusted odds ratio | p
Body mass index >30 kg/m2 | 30 (57) | 51 (29) | 2.6 (1.3–5.4) | 0.01 | 42 (45) | 43 (28) | 1.8 (0.9–3.4) | 0.09
Age >60 years | 17 (31) | 34 (19) | 2.3 (1.0–5.1) | 0.05 | 27 (28) | 28 (18) | 1.6 (0.8–3.4) | 0.21
Watery stools per day >1.0 | 41 (77) | 108 (61) | 1.5 (0.6–3.8) | 0.37 | 76 (84) | 76 (53) | 3.3 (1.5–7.5) | 0.003
Total stools per day >3.0 | 31 (58) | 82 (46) | 1.7 (0.8–3.8) | 0.20 | 58 (64) | 58 (40) | 1.6 (0.8–3.2) | 0.19
Severely affected by urgent bowel movements | 25 (48) | 64 (37) | 1.2 (0.6–2.4) | 0.62 | 49 (54) | 44 (31) | 1.9 (1.0–3.6) | 0.04
Cholecystectomy | 18 (33) | 25 (14) | 2.6 (1.1–5.9) | 0.02 | 23 (25) | 19 (12) | 2.4 (1.1–5.6) | 0.04
Right hemicolectomy | 3 (5.6) | 5 (2.8) | 1.1 (0.3–8.7) | 0.46 | 6 (6.0) | 3 (1.9) | 5.3 (1.0–43) | 0.07
  • Note: Predictors in patient history and physical examination of C4- and SeHCAT-defined bile acid diarrhoea; number of observations (% of total N). Odds ratios adjusted by the predictors given in the table; mean (95% confidence interval). Watery stools were defined as Bristol stool scale type 6 and 7. Urgent bowel movements were reported using the gastrointestinal symptom rating scale questionnaire and ‘severely affected’ was defined as ‘rather severely affected’ or worse.

3.3 Effect of blood sampling conditions on C4 and FGF19

In patients on statins, mean C4 was 18 (95% CI 5–31) ng/mL lower compared with samples taken after a week's pause of the prescription (p = 0.007). Violation of fasting, sampling time and recent alcohol consumption did not significantly affect C4. Violation of fasting affected FGF19 (Table S7).

3.4 Diagnostic value of empirical treatment response

Of the 251 patients, 168 received randomised treatment in the linked trial: 84 were allocated to placebo and 84 to colesevelam.19 Of the 84 patients on colesevelam, 66 answered the questionnaire regarding subjective treatment effect while diagnostics and treatment allocation were double-blinded. The patient-reported effect of empirical colesevelam treatment had 63% (95% CI 44%–79%) sensitivity and 65% (47%–80%) specificity (Table 3).

TABLE 3. Diagnostic characteristics of subjective response to empirical colesevelam treatment.
Questionnaire, subjective effect | SeHCAT ≤10% | SeHCAT >10% | Predictive value | Error rate
“Diarrhoea cured”, n | 20 | 12 | PPV 63% (40–74) | False positive rate 35% (20–54)
“Diarrhoea not cured”, n | 12 | 22 | NPV 65% (52–75) | False negative rate 38% (21–56)
Sensitivity and specificity | Sensitivity 63% (44–79) | Specificity 65% (47–80) | |

  • Note: Of 84 patients on colesevelam in the linked placebo-controlled trial, 66 answered the questionnaire item “Was your diarrhoea cured; yes or no” while diagnostics and treatment allocation remained double-blinded. Mean values with 95% confidence intervals.
  • Abbreviations: NPV, negative predictive value; PPV, positive predictive value.
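
The point estimates in Table 3 follow directly from the four cell counts. The short Python sketch below recomputes them; the function name `diagnostic_metrics` is an illustrative assumption, and confidence intervals are deliberately left out because the interval method used in the study is not stated here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Point estimates from a 2 x 2 diagnostic table (no confidence intervals)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Counts from Table 3, with SeHCAT retention <=10% as the reference standard:
# "cured" and SeHCAT <=10% = 20, "cured" and SeHCAT >10% = 12,
# "not cured" and SeHCAT <=10% = 12, "not cured" and SeHCAT >10% = 22.
table3 = diagnostic_metrics(tp=20, fp=12, fn=12, tn=22)
for name, value in table3.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity is 20/32 and specificity is 22/34, which round to the 63% and 65% reported in the table.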

3.5 Bivariate distributions of biomarkers and clinical symptoms

SeHCAT retention correlated with the number of watery stools (Spearman's rho (rs) = −0.37, p < 10⁻⁸), while C4 correlated to a lesser degree (rs = 0.20, p = 0.003), Figure S1. SeHCAT retention correlated closely with the levels of total measured bile acids in spot stool samples (rs = −0.72, p < 10⁻¹⁵) and to a lesser degree with the percentage of primary bile acids in spot stool samples (rs = −0.34, p < 10⁻⁸), Figure S2. Both the total measured amount of bile acids (rs = 0.36, p < 10⁻⁶) and the percentage of primary bile acids in spot stool samples (rs = 0.39, p < 10⁻⁷) correlated with the number of watery bowel movements, Figure S3.

3.6 Receiver operating characteristics analyses

With SeHCAT retention ≤10% as the diagnostic gold standard, the prespecified C4 threshold >46 ng/mL had 47% (95% CI 37%–57%) sensitivity and 92% (87%–96%) specificity. Exploring other cut-off values, Youden's index optimal threshold was 33 ng/mL with 71% (61%–80%) sensitivity and 84% (78%–90%) specificity. The C4 thresholds <20 and >60 ng/mL had 88% negative and positive predictive values, respectively (Table 4). The threshold of total measured bile acids in spot stool samples <7 μmol/g had a 93% negative predictive value; >16 μmol/g had an 88% positive predictive value. The ROC curves are plotted in Figure 2.
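
Youden's index is J = sensitivity + specificity − 1, and the optimal threshold is the one that maximises J. As a rough check, the Python sketch below applies this to the rounded C4 sensitivities and specificities listed in Table 4. It only compares the five tabulated thresholds, whereas the study's ROC analysis would have searched all observed C4 values, so it is an illustration under that simplification.

```python
def youden_j(sensitivity, specificity):
    """Youden's index for one threshold."""
    return sensitivity + specificity - 1


# (C4 threshold in ng/mL, sensitivity, specificity), rounded values from Table 4.
c4_rows = [
    (15, 0.93, 0.48),
    (20, 0.87, 0.64),
    (33, 0.71, 0.84),
    (46, 0.47, 0.92),
    (60, 0.32, 0.97),
]

best_threshold, best_sens, best_spec = max(
    c4_rows, key=lambda row: youden_j(row[1], row[2])
)
best_j = youden_j(best_sens, best_spec)
print(best_threshold, round(best_j, 2))
```

Among the tabulated thresholds, J is largest at >33 ng/mL, consistent with the optimal threshold reported above.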

TABLE 4. Receiver operator characteristics compared with SeHCAT.
ROC analysis versus SeHCAT ≤10% | ROC-AUC | Threshold | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Positive likelihood ratio | Negative likelihood ratio
Watery stools per day, n = 234 | 0.72 (0.65–0.78) | ≥1.0 | 87 (78–92) | 40 (32–48) | 47 (43–51) | 83 (73–89) | 1.4 (1.2–1.7) | 0.3 (0.2–0.6)
 | | ≥2.0 | 67 (56–76) | 66 (58–74) | 55 (48–62) | 76 (70–81) | 2.0 (1.5–2.6) | 0.5 (0.4–0.7)
 | | ≥3.0 | 39 (29–50) | 83 (75–88) | 58 (47–69) | 68 (64–72) | 2.2 (1.4–3.5) | 0.7 (0.6–0.9)
 | | ≥5.0 | 20 (12–30) | 95 (90–98) | 72 (53–86) | 66 (63–68) | 4.1 (1.8–9.5) | 0.8 (0.8–0.9)
C4 (ng/mL), n = 230 | 0.83 (0.78–0.89) | >15 | 93 (86–97) | 48 (39–56) | 53 (48–57) | 92 (84–96) | 1.8 (1.5–2.1) | 0.1 (0.1–0.3)
 | | >20 | 87 (80–93) | 64 (56–72) | 60 (54–66) | 88 (83–94) | 2.4 (1.9–3.0) | 0.2 (0.1–0.4)
 | | >33 | 71 (61–80) | 84 (78–90) | 73 (66–81) | 82 (78–87) | 4.4 (3.0–6.6) | 0.4 (0.3–0.5)
 | | >46 | 47 (37–57) | 92 (87–96) | 78 (67–87) | 74 (70–78) | 5.7 (3.2–10.2) | 0.6 (0.5–0.7)
 | | >60 | 32 (22–43) | 97 (94–99) | 88 (76–97) | 70 (67–73) | 11.7 (4.2–32) | 0.7 (0.6–0.8)
FGF19 (pg/mL), n = 230 | 0.56 (0.49–0.64) | <60 | 40 (30–51) | 74 (66–81) | 49 (40–58) | 67 (62–71) | 1.5 (1.1–2.2) | 0.8 (0.7–1.0)
 | | <204 | 89 (80–94) | 12 (7–18) | 38 (36–40) | 63 (45–78) | 1.0 (0.9–1.1) | 1.0 (0.5–2.0)
Stool total measured bile acids (μmol/g), n = 179 | 0.89 (0.84–0.93) | >7 | 93 (87–99) | 63 (53–71) | 64 (59–70) | 93 (87–98) | 2.5 (1.3–3.2) | 0.1 (0.05–0.3)
 | | >11 | 83 (73–91) | 84 (76–90) | 79 (71–86) | 87 (82–92) | 5.1 (3.2–7.9) | 0.2 (0.1–0.3)
 | | >16 | 55 (42–65) | 94 (89–98) | 88 (79–95) | 74 (69–79) | 9.5 (4.2–21) | 0.5 (0.4–0.6)
Stool primary bile acids (%), n = 179 | 0.71 (0.63–0.79) | >10 | 75 (63–84) | 59 (49–68) | 59 (50–63) | 76 (68–83) | 1.8 (1.4–2.3) | 0.4 (0.3–0.7)
 | | >25 | 52 (40–64) | 80 (71–87) | 65 (55–74) | 70 (64–75) | 2.6 (1.7–4.0) | 0.6 (0.5–0.8)
 | | >50 | 37 (26–49) | 91 (84–96) | 76 (61–86) | 67 (63–71) | 4.3 (2.2–8.6) | 0.7 (0.6–0.8)
Combined model | 0.95 (0.91–0.99) | Index >0.30 | 87 (69–96) | 85 (71–94) | 81 (64–93) | 90 (76–97) | 5.9 (2.8–13) | 0.2 (0.16–0.4)
 | | Index >0.50 | 77 (58–90) | 93 (80–98) | 88 (70–98) | 84 (71–94) | 10.5 (3.5–32) | 0.3 (0.1–0.5)
  • Note: Mean diagnostic performance characteristics (95% confidence intervals) compared with SeHCAT retention ≤10% as the gold standard. The key C4 threshold was >46 ng/mL; other thresholds are exploratory.
  • Abbreviations: FGF19, fibroblast growth factor 19; NPV, negative predictive value; PPV, positive predictive value.
  • a The model combined the logarithm to C4, total measured bile acids in stools, and baseline number of watery bowel movements (see Supplementary for model equation); diagnostic performance on internal validation.
FIGURE 2. Receiver operating characteristics (ROC) curves of all continuous diagnostics compared with the gold standard SeHCAT retention ≤10% as diagnostic for bile acid diarrhoea. Daily mean watery stools were defined as the number of Bristol stool scale type 6 and 7 stools. The area under the ROC curves and the diagnostic characteristics of selected thresholds are listed in Table 4.
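
The likelihood ratios in Table 4 can be read as multipliers on the pre-test odds. The Python sketch below shows the conversion for the key C4 threshold, taking the cohort prevalence of 96/251 as the pre-test probability. That choice is an assumption for illustration, since the C4 analysis used a slightly smaller subset, and the function name is hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


prevalence = 96 / 251  # SeHCAT retention <=10% in the full cohort

# Likelihood ratios for C4 > 46 ng/mL from Table 4 (rounded values).
p_if_positive = post_test_probability(prevalence, 5.7)
p_if_negative = post_test_probability(prevalence, 0.6)
print(f"{p_if_positive:.2f} {p_if_negative:.2f}")
```

Under these assumptions the post-test probability after a positive C4 result is close to the tabulated positive predictive value, and the probability after a negative result is close to one minus the negative predictive value.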

3.7 Explorative diagnostic modelling

Using C4, total measured bile acids in spot stool samples and the daily mean number of watery stools (all log-transformed) in machine learning regression modelling gave a model kappa value of 0.71. The area under the ROC curve in the internal validation dataset was 0.95 (95% CI 0.91–0.99), giving 77% (58%–90%) sensitivity and 93% (80%–98%) specificity (Table 4). Adjusted diagnostic odds ratios of the model covariates are found in Table 5, and the mathematical equation of the model is found in the Supporting Information material, page 16. Assessing the robustness of the modelling to the random 60% training and 40% testing split of the database returned areas under the ROC curve ranging from 0.93 to 0.95. Adding the percentage of primary bile acids to the model resulted in negligible improvement in a few iterations. The decision tree analysis, including C4 and the mean number of watery stools, suggested C4 ≥ 31 ng/mL in conjunction with ≥1.1 daily watery stools as diagnostic for bile acid diarrhoea, with 70% (51%–85%) sensitivity and 95% (83%–99%) specificity on internal validation (Table 6 and Figure S4). Validation of the diagnostic performance of the decision tree was possible in data from a previous cohort, which was recruited similarly to the current study.20, 22 However, plasma samples from these 71 patients were collected with no regard to ongoing statin therapy or recent alcohol consumption, and the samples had been kept at −80°C for 5 years. In this previous cohort, the decision tree model had a sensitivity of 62% (41%–80%) and a specificity of 91% (79%–98%).

TABLE 5. Combined exploratory regression model.
Predictor Change Adjusted odds ratio mean (95% confidence interval) p
Daily mean number of watery stools +50% 2.2 (1.5–3.7) <0.001
C4 +50% 1.7 (1.2–2.5) <0.01
Spot stool sample total measured bile acids +50% 1.8 (1.2–2.9) <0.01
  • Note: Exploratory combined logistic regression model with SeHCAT ≤10% as the diagnostic gold standard. The model was trained in 60% of the database using 5-fold repeated cross validation with 10 repetitions, and validated in the remaining 40% of the database. Model kappa value: 0.71.
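
Because the covariates were log-transformed, each odds ratio in Table 5 applies to a relative change (+50%) rather than an absolute one. For a log-transformed covariate in a logistic model, the odds ratio for multiplying the covariate by a factor c is c raised to a power fixed by the coefficient, so an odds ratio reported for one relative change can be rescaled to another. The Python sketch below shows that rescaling; the helper name and the doubling example are illustrative assumptions, and the output is arithmetic on the rounded table value, not a result reported by the study.

```python
import math


def rescale_odds_ratio(odds_ratio, reported_factor, new_factor):
    """Rescale an odds ratio for a log-transformed covariate.

    The odds ratio for multiplying the covariate by c is exp(beta * log(c)),
    so OR_new = OR_reported ** (log(new_factor) / log(reported_factor)).
    The base of the logarithm cancels out.
    """
    exponent = math.log(new_factor) / math.log(reported_factor)
    return odds_ratio ** exponent


# Table 5: +50% daily watery stools has an adjusted odds ratio of 2.2.
# Illustrative question: what would that imply for a doubling (+100%)?
or_doubling = rescale_odds_ratio(2.2, reported_factor=1.5, new_factor=2.0)
print(round(or_doubling, 2))
```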
TABLE 6. Diagnostic performance of the decision tree for diagnosis of bile acid diarrhoea.
Decision tree: C4 ≥ 31 ng/mL AND daily mean no. watery stools ≥1.1 | SeHCAT ≤10% | SeHCAT >10% | Predictive value | Likelihood ratio
Yes: bile acid diarrhoea, n | 21 | 2 | PPV 91% (73–98) | LR+ 14 (4–57)
No: not bile acid diarrhoea, n | 9 | 39 | NPV 81% (71–88) | LR− 0.3 (0.2–0.6)
Sensitivity and specificity | Sensitivity 70% (51–85) | Specificity 95% (83–99) | |

  • Note: Machine learning-derived decision tree defining bile acid diarrhoea as C4 ≥ 31 ng/mL in conjunction with a mean daily number of Bristol type 6 and 7 stools ≥1.1. Internally validated diagnostic performance; mean (95% confidence interval).
  • Abbreviations: LR−, negative diagnostic likelihood ratio; LR+, positive diagnostic likelihood ratio; NPV, negative predictive value; PPV, positive predictive value.
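
The point estimates in Table 6 follow directly from the four cell counts. The sketch below recomputes them in Python as a plain arithmetic check; the class and field names are illustrative and not taken from the study's own software, and the confidence intervals are not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test results cross-tabulated against the reference (SeHCAT <=10%)."""
    tp: int  # test positive, reference positive
    fp: int  # test positive, reference negative
    fn: int  # test negative, reference positive
    tn: int  # test negative, reference negative

    def metrics(self) -> dict:
        sens = self.tp / (self.tp + self.fn)
        spec = self.tn / (self.tn + self.fp)
        return {
            "sensitivity": sens,
            "specificity": spec,
            "ppv": self.tp / (self.tp + self.fp),
            "npv": self.tn / (self.tn + self.fn),
            # Likelihood ratios are undefined if specificity is exactly 1 or 0.
            "lr_pos": sens / (1 - spec),
            "lr_neg": (1 - sens) / spec,
        }


# Cell counts as given in Table 6 (internal validation of the decision tree).
table6 = TwoByTwo(tp=21, fp=2, fn=9, tn=39)
for name, value in table6.metrics().items():
    print(f"{name}: {value:.3f}")
```

After rounding, the printed values should agree with the point estimates reported in the table.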

All regression models had good fits in model diagnostics.

4 DISCUSSION

The limited availability of SeHCAT for the diagnosis of bile acid diarrhoea and the emergence of new treatment options emphasise the need for more readily available and valid diagnostic tests.14, 17-19 This diagnostic accuracy study included a prospective cohort of consecutive patients with chronic diarrhoea attending the reference SeHCAT test. The patients' subjective response to empirical colesevelam treatment was diagnostically inadequate, while we confirmed that the key C4 threshold of 46 ng/mL was specific but lacked sensitivity. Explorations suggested the sensitivity of C4 could be improved by lowering the C4 threshold to 31 ng/mL in conjunction with a diary-confirmed number of watery stools ≥1.1, or by using a regression model including C4, the mean diary-registered number of watery stools and spot stool sample total measured bile acids.

The optimal SeHCAT test threshold is not settled, and some consider values of 10%–15% or even 15%–20% a diagnostic grey zone.40, 41 Although not powered to distinguish SeHCAT thresholds, our data suggest worse diarrhoea symptoms in patients with SeHCAT retention of ≤10% and that the bowel habits of patients with SeHCAT values in the 10%–15% and 15%–20% ranges are more similar to patients with SeHCAT retention >20% (Table S1). This notion is supported by data from the linked trial on the treatment difference between colesevelam and placebo in ranges of SeHCAT retention, suggesting benefit of colesevelam over placebo only in the patients with SeHCAT retention of 10% or less and no apparent benefit in the SeHCAT ranges of 10.1%–15% and 15.1%–20%.19 We, therefore, find the threshold of 10% SeHCAT retention reasonable as diagnostic for bile acid diarrhoea.

The diagnostic performance of C4 reported here confirms our previous findings that the 46 ng/mL threshold had 92% (87%–96%) specificity but a low sensitivity of 47% (37%–57%), so about half of the patients with SeHCAT-defined bile acid diarrhoea would be overlooked using this C4 threshold.20 However, since C4 is inexpensive compared with SeHCAT (about 5% of the cost), its best use merits scrutiny. The C4 threshold of 33 ng/mL increased the sensitivity considerably to 71% (61%–80%) with only a slight decline in specificity to 84% (78%–90%). C4 values less than 20 ng/mL had an 88% (95% CI 83%–94%) negative predictive value to rule out bile acid diarrhoea, whereas values above 60 ng/mL had an 88% (76%–97%) positive predictive value, indicating the 20–60 ng/mL range to be a diagnostic grey zone.20, 22 However, our placebo-controlled treatment data from this patient cohort suggested similar remission rates in patients with C4 > 46 ng/mL as in patients with SeHCAT retention ≤10%.19 Moreover, the explorative decision tree analysis suggested that a C4 threshold of 31 ng/mL combined with a diary-recorded mean number of watery stools ≥1.1 may increase the sensitivity to 70% (51%–85%) while keeping 95% (83%–99%) specificity. Compared with the diagnostic performance of C4 alone, it appears that the sensitivity of the nearby C4 threshold of 33 ng/mL is retained and that adding the criterion of watery bowel movements ≥1.1 in the decision tree increased specificity to an acceptable level. The reliability of the proposed decision tree model was strengthened by replication of the diagnostic characteristics in our previous cohort.20 The slightly lower sensitivity of 62% of the decision tree in the previous dataset reflects a lower sensitivity of C4 overall that could be explained by differences in blood sampling conditions. The logistic regression model, which additionally included the covariate of spot stool sample total measured bile acids, increased the sensitivity to 77% (58%–90%). A recent study reported similar diagnostic performance with a mean sensitivity of 78% (64%–89%) and specificity of 93% (83%–98%); however, this was a case–control study with controls matched on sex, age, and body mass index, not on diarrhoea symptoms or bowel transit.42 We deem our study cohort consisting of prospective consecutive patients attending SeHCAT testing to have better external validity. Both our approaches could potentially limit the size of the grey zone of C4-based diagnostics compared with SeHCAT testing. The regression model does require complex HPLC-MS/MS analysis of bile acids in a random stool sample, whereas the decision tree model would readily be available where C4 is implemented.
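
The C4 zones and the exploratory decision tree rule described above can be written down as two small functions. This is only an illustrative sketch of the thresholds as reported; the function names and the example patients are hypothetical, and the rule remains exploratory pending further validation.

```python
def c4_zone(c4_ng_ml: float) -> str:
    """Place a plasma C4 value (ng/mL) in the zones reported in the text."""
    if c4_ng_ml < 20:
        return "low"   # reported 88% negative predictive value
    if c4_ng_ml > 60:
        return "high"  # reported 88% positive predictive value
    return "grey"      # 20-60 ng/mL, the diagnostic grey zone


def decision_tree_positive(c4_ng_ml: float, watery_stools_per_day: float) -> bool:
    """Exploratory rule: C4 >= 31 ng/mL AND daily mean watery stools >= 1.1."""
    return c4_ng_ml >= 31 and watery_stools_per_day >= 1.1


# Hypothetical patients: (C4 in ng/mL, mean daily number of Bristol type 6-7 stools).
examples = [(15.0, 2.0), (35.0, 1.5), (35.0, 0.4), (75.0, 3.0)]
for c4, stools in examples:
    print(c4, stools, c4_zone(c4), decision_tree_positive(c4, stools))
```

The second and third hypothetical patients share a grey-zone C4 value and differ only in stool diary data, which is the situation where the decision tree adds information beyond C4 alone.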

Where available, SeHCAT testing could be reserved only for patients with grey zone C4 values. If gold standard testing is unavailable, prescribing a sequestrant, typically colestyramine, to evaluate the empirical treatment effect may be the only approach, which recent guidelines deemed reasonable in settings without better diagnostic options.10 However, we here report a sensitivity of 63% (44%–79%) and specificity of 65% (47%–80%) for the patient-reported effect of colesevelam, figures that are insufficient for a diagnostic test and accordingly support guidelines that advise against the diagnostic use of empirical treatment effect.6 Our reported test performance was based on 20 (63%) of 32 patients with bile acid diarrhoea diagnosed by SeHCAT ≤10% compared with 12 (35%) of 34 with SeHCAT >10% deeming their diarrhoea cured on colesevelam after 12 days' treatment. This brief treatment duration is a limitation; however, similar figures were reported in a retrospective study with a follow-up time of 2–24 months, in which 71% of patients with bile acid diarrhoea improved on colesevelam, but so did 37% of patients without bile acid diarrhoea.43 With several treatment options available for bile acid diarrhoea, using the empirical effect of just one option, typically colestyramine, as diagnostic is, in our opinion, outdated.15, 17, 18, 44 This notion is underlined by the substantial rates of false-negative and false-positive tests of 38% and 35%, respectively, that we here report (Table 3). C4-based biochemical diagnosis of bile acid diarrhoea is fairly inexpensive with potential widespread availability and could, therefore, optimise the use of the expensive SeHCAT testing. Where gold standard testing is unavailable, C4-based testing, which could scale to the diagnostic needs, seems a considerable improvement over empirical treatment effect. Further validation is warranted.

Our analysis of bile acid profiles in spot stool samples demonstrated a close relationship between the total measured amount of bile acids and SeHCAT testing with an area under the ROC curve of 0.89 (0.84–0.93), markedly superior to the 0.71 (0.63–0.79) of the percentage of primary bile acids (Table 4, Figure 2). This finding is surprising, as recent diagnostic studies proposed the percentage of primary bile acids in spot stool samples as a preferred measure of bile acid diarrhoea.25, 26 In contrast to our results, a study including 113 patients reported a 0.69 area under the ROC curve against SeHCAT ≤10% for spot stool sample total bile acids measured by an enzymatic kit.45 However, the reported levels of faecal bile acids of median 9.9 (IQR 4.8–15.4) μmol/g, even in severe bile acid diarrhoea with SeHCAT retention <5%, were lower than the median 17.2 (IQR 12.3–27.2) μmol/g we found in patients with SeHCAT retention ≤10%. Differences in sample preparation could explain this discrepancy, as we lyophilised the stool samples to measure bile acids in dried faecal matter. The bile acid profiles in spot stool samples of patients with bile acid diarrhoea and miscellaneous diarrhoea were markedly different even when controlling the family-wise error rate. This was not true of the bile acid profiles in plasma, where apparent differences became non-significant when adjusting for multiple statistical comparisons. Unlike measurements of bile acids in stool, plasma levels are fundamentally affected by the hydrophilicity of each bile acid and by the hepatic first-pass effect.46 Although serum lipidomics have been proposed as diagnostic for bile acid diarrhoea, our data suggest that stool samples have better diagnostic applicability.42

Strengths of this study include the consecutive recruitment of prospective patients referred for SeHCAT testing, limiting the effect of selection bias on external validity. The patient groups in this study were similar in demographic characteristics to those of a large observational study and in biochemical profile to those reported in a recent cohort study.13, 47 The study included all current diagnostic modalities except the 48-hour stool collection, and the study size allowed data partitioning with internal validation of the explorative modelling. Although explorative, the diagnostic performance characteristics of the decision tree model including C4 and the mean number of watery stools were validated both in an internal dataset and in a previous cohort. Combining the measurement of C4 with a diary registration would be readily available where C4 is implemented. The performance characteristics of spot stool sample total measured bile acids were appealing; however, the complexity of the HPLC-MS/MS assay to measure numerous bile acid species is a limitation. This study being part of a trial imposed some limitations, as the criteria for inclusion and exclusion were tailored for the trial; notably, patients with inflammatory bowel disease and microscopic colitis were excluded, and the evaluation of clinical response was done after 12 days' treatment. Specific studies addressing these populations are needed. Logistic regression modelling handles collinearity poorly; therefore, we used few known putative covariates. Comprehensive machine learning might identify combinations of bile acid molecular species that are more specific; nevertheless, the kappa statistic of 0.71 of our diagnostic model indicates a substantial improvement in diagnostic classification. Finally, the exploratory findings of the study will need subsequent validation.

In conclusion, the effect of empirical colesevelam treatment was diagnostically inadequate. We confirm that the predefined threshold of the plasma biomarker C4 > 46 ng/mL is specific compared with SeHCAT for the diagnosis of bile acid diarrhoea but lacks sensitivity. Exploration suggested that lowering the C4 threshold to 31 ng/mL in conjunction with an increased number of watery stools could considerably improve the sensitivity, as could modelling combining C4 and spot stool sample total measured bile acids.

AUTHOR CONTRIBUTIONS

Christian Borup: Conceptualization (equal); data curation (lead); formal analysis (lead); funding acquisition (lead); investigation (equal); methodology (equal); project administration (lead); software (lead); visualization (lead); writing – original draft (lead); writing – review and editing (equal). Lars Vinter-Jensen: Investigation (equal); project administration (equal); writing – review and editing (equal). Søren Peter German Jørgensen: Investigation (equal); project administration (equal); writing – review and editing (equal). Signe Wildt: Conceptualization (equal); methodology (equal); supervision (equal); writing – review and editing (equal). Jesper Graff: Conceptualization (equal); investigation (equal); project administration (equal); supervision (equal); writing – review and editing (equal). Tine Gregersen: Investigation (equal); project administration (supporting); writing – review and editing (equal). Anna Zaremba: Investigation (equal); project administration (supporting); writing – review and editing (equal). Trine Borup Andersen: Investigation (equal); project administration (supporting); writing – review and editing (equal). Camilla Nøjgaard: Investigation (supporting); writing – review and editing (equal). Hans Bording Timm: Investigation (supporting); writing – review and editing (equal). Antonin Lamazière: Formal analysis (equal); methodology (equal); resources (supporting); validation (supporting); writing – review and editing (equal). Dominique Rainteau: Formal analysis (equal); methodology (equal); resources (supporting); validation (equal); writing – review and editing (equal). Svend Høime Hansen: Formal analysis (equal); methodology (equal); resources (supporting); writing – review and editing (equal). Jüri Johannes Rumessen: Conceptualization (equal); methodology (equal); supervision (equal); writing – review and editing (equal). 
Lars Kristian Munck: Conceptualization (lead); funding acquisition (lead); investigation (supporting); methodology (lead); project administration (lead); supervision (lead); writing – review and editing (lead).

ACKNOWLEDGMENTS

We are grateful to the patients for participation and to our funders and collaborators making this work possible. We thank research Majbritt Frost Nilsson (M.H.S.) for her crucial role in the project management of the Aalborg study site.

    CONFLICT OF INTEREST STATEMENT

    SW was on an advisory board for Bristol-Myers Squibb and was supported by Takeda Pharmaceuticals to attend the ECCO congress. All other authors declare no conflicts of interest.

    FUNDING INFORMATION

    This study was funded entirely by independent research grants, predominantly by a donation from the Fabrikant Vilhelm Pedersen og hustrus mindelegat by recommendation from the Novo Nordisk Foundation (NNF19OC0055844). Smaller donations were granted by the Region Zealand Health-Scientific Fund (R17A48B39, RSSF2017000645), the Axel Muusfeldts Fond (2017-771), Overlæge Johan Boserup og Lise Boserups Legat (20795-24), Aase og Ejnar Danielsens Fond (10-002035), Civilingeniør H.C. Bechgaard og hustru Ella Mary Bechgaards Fond (2017-1064/93), Prosektor Axel Emil Søeborg Ohlsen og ægtefælles Mindelegat (6386 MT/IV), the Foundation for Advancement of Medical Science under the A.P. Møller, and Chastine Mc-Kinney Møller Foundation (18-L-0394).

    AUTHORSHIP

    Guarantor of the article: Lars Kristian Munck.
