DNA Methylation Markers for Detection of Cholangiocarcinoma: Discovery, Validation, and Clinical Testing in Biliary Brushings and Plasma
Supported by the Jack and Maxine Zarrow Family Foundation, Multisite Gastrointestinal Cancer Detection by Stool DNA Methylation (CA214679), Paul Calabresi Program in Clinical-Translational Research (CA90628), Carol M. Gatton endowment for Digestive Diseases Research, Eddie Gong and Dana Clay, The Cholangiocarcinoma Foundation, and Mayo Clinic SPORE in Hepatobiliary Cancer (P50 CA210964).
Potential conflict of interest: Ms. Berger owns intellectual property rights in Exact Sciences. Dr. Cao was employed by Exact Sciences. Mr. Foote owns intellectual property rights in Exact Sciences. Dr. Kisiel consults, received grants from, and owns intellectual property rights in Exact Sciences. Mr. Mahoney owns intellectual property rights in Exact Sciences. Dr. Roberts advises and received grants from Bayer. He consults for AstraZeneca, The Lynx Group, and MJH Life Sciences. He advises Exact Sciences, Envision Communications, Gilead, OED Therapeutics, GRAIL, and Genentech. He received grants from Ariad, Glycotest, RedHill, TARGET, BTG International, and Wako Diagnostics. Mr. Taylor owns intellectual property rights in Exact Sciences. Ms. Yab owns stock and intellectual property rights in Exact Sciences.
Presented in part at the Liver Meeting, American Association for the Study of Liver Diseases, November 2015, San Francisco, California, and Digestive Disease Week, May 2017, Chicago, Illinois.
Abstract
Cholangiocarcinoma (CCA) has poor prognosis due to late-stage, symptomatic presentation. Altered DNA methylation markers may improve diagnosis of CCA. Reduced-representation bisulfite sequencing was performed on DNA extracted from frozen CCA tissues and matched to adjacent benign biliary epithelia or liver parenchyma. Methylated DNA markers (MDMs) identified from sequenced differentially methylated regions were selected for biological validation on DNA from independent formalin-fixed, paraffin-embedded CCA tumors and adjacent hepatobiliary control tissues using methylation-specific polymerase chain reaction. Selected MDMs were then blindly assayed on DNA extracted from independent archival biliary brushing specimens, including 12 perihilar cholangiocarcinoma, 4 distal cholangiocarcinoma cases, and 18 controls. Next, MDMs were blindly assayed on plasma DNA from patients with extrahepatic CCA (eCCA), including 54 perihilar CCA and 5 distal CCA cases and 95 healthy and 22 primary sclerosing cholangitis controls, balanced for age and sex. From more than 3,600 MDMs discovered in frozen tissues, 39 were tested in independent samples. In the clinical pilot of 16 MDMs on cytology brushings, methylated EMX1 (empty spiracles homeobox 1) had an area under the curve (AUC) of 0.98 (95% confidence interval [CI], 0.95-1.0). In the clinical pilot on plasma, a cross-validated recursive partitioning tree prediction model from nine MDMs was accurate for de novo eCCA (AUC, 0.88 [0.81-0.95]) but not for primary sclerosing cholangitis–associated eCCA (AUC, 0.54 [0.35-0.73]). Conclusion: Next-generation DNA sequencing yielded highly discriminant methylation markers for CCA. Confirmation of these findings in independent tissues, cytology brushings, and plasma supports further development of DNA methylation to augment diagnosis of CCA.
Abbreviations
- ACTB: actin beta
- AUC: area under the curve
- BMP3: bone morphogenetic protein 3
- CA19-9: carbohydrate antigen 19-9
- CCA: cholangiocarcinoma
- CI: confidence interval
- CpG: cytosine-phosphate-guanine
- dCCA: distal CCA
- DMR: differentially methylated region
- eCCA: extrahepatic CCA
- EMX1: empty spiracles homeobox 1
- FFPE: formalin-fixed, paraffin-embedded
- FISH: fluorescence in situ hybridization
- HOXA1: homeobox A1
- iCCA: intrahepatic CCA
- MDM: methylated DNA marker
- MSP: methylation-specific polymerase chain reaction
- pCCA: perihilar CCA
- PCR: polymerase chain reaction
- PSC: primary sclerosing cholangitis
- rPart: recursive partitioning tree
- RRBS: reduced-representation bisulfite sequencing
- TELQAS: target enrichment long-probe quantitative amplified signal
Cholangiocarcinoma (CCA) is an aggressive malignancy that accounts for 10%-15% of all hepatobiliary malignancies.(1, 2) The overall incidence of CCA appears to have increased over the past three decades.(2-5) CCA is associated with several established and possible risk factors, including cirrhosis, choledochal cysts, and chronic inflammatory disorders of the biliary tract, especially primary sclerosing cholangitis (PSC); however, most CCAs arise de novo and are considered sporadic.(1) Unfortunately, CCA is usually diagnosed following symptomatic presentation, which heralds advanced-stage disease, limiting application of curative treatments. Consequently, the overall 5-year survival of patients with CCA is estimated to be less than 10%.(5, 6) Patients with PSC are advised to undergo surveillance for CCA by serial serum carbohydrate antigen 19-9 (CA19-9) and imaging with ultrasound or magnetic resonance imaging (MRI) every 6-12 months.(7) The mortality benefit of early detection of CCA in patients with PSC by MRI has only recently been demonstrated.(8) Intrahepatic CCA (iCCA) presents on imaging as a liver mass, which can be biopsied. In contrast, the diagnosis of extrahepatic CCA (eCCA), which comprises both perihilar CCA (pCCA) and distal CCA (dCCA), remains difficult, and new modalities to complement imaging and invasive testing are urgently needed.(9)
Our group and others have hypothesized that aberrant DNA methylation is a biomarker class that could fill this unmet need. Aberrant DNA methylation of cytosine-phosphate-guanine (CpG) sites within the genome alters gene expression in human cancers.(10, 11) This phenomenon is already known to be broadly informative in cancer; for example, as few as four methylated gene promoters can perfectly discriminate tissues of colorectal cancers and adenomas from normal mucosae.(12) More recently, a stool-based assay of the methylated DNA markers (MDMs) bone morphogenetic protein 3 (BMP3) and NDRG4 has been approved by the U.S. Food and Drug Administration as part of a multitarget test for screening and early detection of colorectal cancer.(13-16)
In DNA from primary CCA tumor tissues, aberrant methylation has been observed in the promoters of tumor-suppressor genes with functional consequences.(17-19) Pilot clinical observations show that candidate DNA methylation markers applied to brush cytology specimens may accurately detect CCA.(20) Unbiased next-generation sequencing has identified differentially methylated regions (DMRs) of DNA that were developed into cell-free MDMs with substantial diagnostic value in cancers of the colorectum, esophagus, stomach, liver, and pancreas(21-25); however, this discovery approach has not yet been applied to CCA.
Therefore, we hypothesized that (1) DNA sequencing by the reduced-representation bisulfite sequencing (RRBS) technique on DNA extracted from frozen case and control tissues would identify highly discriminant MDMs for CCA; (2) these candidates would be confirmed in independent tissue samples; and (3) these candidates would show high discrimination for pCCA and dCCA cases from benign controls when applied to DNA extracted from cytology brushing samples and plasma.
Materials and Methods
The study protocol conformed to the ethical guidelines of the 1975 Declaration of Helsinki as reflected in a priori approval by the Mayo Clinic institutional review board. Neither prisoners nor institutionalized persons were studied.
Study Overview
Sequential and parallel case-control studies were conducted to identify, validate, and pilot DNA methylation biomarkers in CCA (Fig. 1).

Discovery Phase
RRBS was used to identify DMRs between DNA sequences of primary CCA tumors and matched frozen benign bile duct or liver parenchyma tissues and white blood cell controls. DMRs were defined as regions with ≥5 CpGs per 100 base pair (bp) reading frame that met two criteria. First, methylation percentage requirements were ≥10% in cases and <1% among controls; second, the methylation percentage was required to be ≥5-fold greater in cases than in controls. All DMRs meeting these performance criteria were then technically replicated by quantitative methylation-specific polymerase chain reaction (MSP) assays.
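The DMR selection rule above can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `Region` record, the function name, the example values, and the choice to summarize each group by a single methylation percentage are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical summary of one 100-bp window (illustrative fields only)."""
    name: str
    cpg_count: int            # number of CpGs in the window
    case_meth_pct: float      # summary methylation percentage in cases
    control_meth_pct: float   # summary methylation percentage in controls


def is_candidate_dmr(region, min_cpgs=5, min_case_pct=10.0,
                     max_control_pct=1.0, min_fold=5.0):
    """Apply the stated thresholds: CpG density, methylation levels, fold difference."""
    if region.cpg_count < min_cpgs:
        return False
    if region.case_meth_pct < min_case_pct:
        return False
    if region.control_meth_pct >= max_control_pct:
        return False
    # Written as a product so that 0% control methylation needs no division.
    return region.case_meth_pct >= min_fold * region.control_meth_pct


# Made-up regions, one passing and three failing a single criterion each.
regions = [
    Region("region_A", 7, 35.0, 0.4),   # passes every criterion
    Region("region_B", 4, 40.0, 0.2),   # too few CpGs
    Region("region_C", 9, 12.0, 1.5),   # control methylation too high
    Region("region_D", 6, 8.0, 0.1),    # case methylation too low
]
selected = [r.name for r in regions if is_candidate_dmr(r)]
print(selected)
```

Writing the fold criterion as a product rather than a ratio keeps regions with no detectable control methylation eligible.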
Biological Validation Experiments
After discovery and technical validation, case-control studies were conducted to confirm the top DMRs identified in discovery and technical validation phases on an independent tissue set. Because specific short-fragment DNA methylation aberrancies were targeted from this point forward, we refer to them as MDMs. Performance filtration criteria required the MDMs to show an area under the curve (AUC) > 0.75 and to have at least a 2.5-fold greater copy number in cases compared with controls, to be considered for further testing in clinical samples.
Parallel Clinical Pilots
Selected MDMs were then tested in a clinical pilot on independent biliary brushing samples. MSPs were used in all validation phases and were all performed by technicians blinded to all clinical data. Candidate MDMs were also tested on an independent archive of frozen blood plasma samples (≥1.5 mL) of age-balanced and sex-balanced patients with CCA, PSC cancer-free controls, and healthy controls.
Discovery Phase
Patients and Samples
From existing archives of the International Hepatobiliary Neoplasia Registry, frozen and formalin-fixed, paraffin-embedded (FFPE) CCA tissue samples were identified and matched to adjacent benign hepatobiliary tissues. Case samples included tissues from surgically resected CCAs; approximately 80 Mayo Clinic accessible patient samples were enrolled in the registry between 1988 and 2015. Patients who had received neoadjuvant therapy were excluded. Before DNA extraction, both case and control tissue slides were reviewed by an expert gastrointestinal pathologist to confirm the histological diagnosis of CCA and identify normal adjacent hepatic parenchyma or biliary epithelium. The anatomical classification of CCAs was confirmed by imaging studies along with surgical and pathology notes obtained from electronic medical records. These were classified into eCCA, including both pCCA and dCCA, and iCCA.(26) DNA was extracted from micro-dissected tissues, yielding at least 300 ng of DNA per sample using the Qiagen Blood and Tissue Mini and FFPE Tissue Mini kits (Valencia, CA).
To optimize our marker discovery, we chose to sequence and include multiple control tissues in our analysis that could potentially influence the clinical applicability of the markers. Given the inflammatory nature of PSC and its strong association with CCA, DNA from buffy coats was included in the sequencing to avoid any DNA methylation noise signal from white blood cells during MSP. However, PSC-associated tumors and adjacent control tissues were not available because these patients are rarely treated by surgical resection. Moreover, methylation sequences of normal pancreatic tissues discovered from our previous study were incorporated in the statistical filtration criteria to exclude background methylation of pancreatic origin, given the contiguity of the pancreatico-biliary ducts.
Library Preparation
Detailed methods for preparation of DNA libraries for sequencing and the sequencing protocol have been described.(27) Briefly, 300 ng of genomic DNA was fragmented by digestion with 10 units of MspI, a restriction enzyme that recognizes CpG-containing motifs, to enrich the sample CpG content. Digested fragments were ligated to methylated TruSeq adapters (Illumina, San Diego, CA) containing barcode sequences. Then, size selection of 160-bp to 340-bp fragments was performed using Agencourt AMPure XP SPRI beads/buffer (Beckman Coulter, Brea, CA). Samples then underwent bisulfite conversion using a modified EpiTect protocol (Qiagen).
Sequencing
Samples were loaded onto flow cells according to a prespecified lane assignment, with additional lanes reserved for internal assay controls. Sequencing was performed by the Next-Generation Sequencing Core at the Mayo Clinic Medical Genome Facility on the Illumina HiSeq 2000 (San Diego, CA). Standard Illumina pipeline software reported the sequencing data in the FASTQ format. A Streamlined Analysis and Annotation Pipeline for RRBS was used for sequence alignment and methylation extraction.(28)
Biological Validation Phase
Patients and Samples
DNA extracted from iCCA and eCCA FFPE tissue samples was compared with control groups consisting of benign hepatobiliary tissue. MDMs were selected from several sources (Supporting Table S1). These included RRBS from iCCA and eCCA as well as BMP3 and NDRG4, which were already coupled to the actin beta (ACTB) assay; bisulfite-treated ACTB was used as a reference of bisulfite treatment and DNA input. Given the heterogeneity of CCA, we assessed the similarities and differences of MDMs between iCCA and eCCA by using common samples across the biological validation tissue experiments, unless depleted; however, these were independent of the discovery and clinical pilot patients. After pathology confirmation of diagnoses, DNA was extracted from micro-dissected tissues using Qiagen kits and assayed using MSP.
Assay by Quantitative MSP
MSP methods have been described.(29) Briefly, bisulfite-treated DNA (1 µL) was the template for methylation quantification by fluorescence-based real-time polymerase chain reaction (PCR). Primers for each marker were designed to target the bisulfite-modified sequences (Integrated DNA Technologies [IDT], Coralville, IA). Each primer set was tested with assay-specific controls, including non-bisulfite-converted reference human genomic DNA (Novogen, Oakville, Canada), converted methylated reference (Millipore, Billerica, MA), converted unmethylated reference (Qiagen), and normal buffy coat–converted DNA. In addition, melting curve analysis was run on all primer sets to ensure that specific amplification was occurring. The analytical false-positive rate was <0.05% in all cases, and the analytical sensitivity for fully methylated sequences was 1:5,000.
PCRs for tissue DNA samples were performed with SYBR Green Master Mix (Roche, Mannheim, Germany). All reactions were run on Roche 480 LightCyclers. Bisulfite-treated CpGenome Universal Methylated DNA (Millipore) was used as a positive control and serially diluted to create standard curves for all plates. A region without CpG sites in the ACTB gene was also quantified with PCR using primers recognizing the bisulfite-converted sequence. The reported units, copies per sample, are the number of copies of methylated DNA present in the sample, as determined in comparison with serially diluted universally methylated DNA standards, normalized by the concentration of DNA input, before bisulfite treatment. This assay method was also used for the clinical brushing study.
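Quantification against serially diluted standards can be illustrated with a standard-curve fit. The sketch below assumes the conventional real-time PCR approach of regressing threshold cycle (Ct) on log10 copies; the Ct values, the 20 ng input, and all function names are hypothetical and are not taken from the study.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate methylated copies in a sample."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series of fully methylated standard DNA.
standard_copies = [10, 100, 1_000, 10_000, 100_000]
standard_ct = [34.7, 31.4, 28.1, 24.8, 21.5]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
marker_copies = copies_from_ct(28.1, slope, intercept)

# Normalization by DNA input (assumed here to be 20 ng before bisulfite treatment).
copies_per_ng = marker_copies / 20.0
print(round(slope, 2), round(marker_copies), round(copies_per_ng, 1))
```

A slope near -3.3 cycles per 10-fold dilution corresponds to roughly 100% amplification efficiency, which is why the hypothetical standards were chosen that way.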
Clinical Biliary Brushing Study in eCCA
Patients and Samples
Samples were obtained from residual specimens archived in the Mayo Clinic Tissue Registry. Brushings were taken for routine cytology analysis during endoscopic retrograde cholangiopancreatography (ERCP) from patients undergoing clinical evaluation for suspected malignant biliary strictures at our institution. Brushes were put into PreservCyt fixative (Hologic, Marlborough, MA) and submitted to the laboratory.
For routine cytology, ThinPrep (Hologic) slides were made and Papanicolaou-stained. A cytopathologist classified each case as nondiagnostic, negative, atypical, suspicious, or positive for malignancy. Patients were followed prospectively for at least 3 years. Electronic medical records were reviewed to collect patient age, sex, and PSC status. Positive intraluminal brush cytology defined the case status. Controls were required to have clinicopathologic absence of malignancy, which required exclusion of all of the following: positive cytology; polysomy demonstrated by UroVysion (Abbott, Abbott Park, IL) fluorescence in situ hybridization (FISH) of brush cytology specimen and either a malignant-appearing biliary tract stricture detected by radiologic imaging or a mass present on cross-sectional imaging studies; serum CA19-9 level >100 U/mL in the context of a malignant-appearing biliary tract stricture, demonstrated by radiologic imaging (in absence of bacterial cholangitis); or perihilar mass lesion with a malignant-appearing biliary tract stricture on cross-sectional imaging studies with metastasis by imaging studies and/or death from malignant disease.(30)
Patients were matched by age and gender. Based on cytology, FISH, clinical follow-up, and tissue diagnosis, samples were divided into eCCA cases, PSC controls, and non-PSC controls.
DNA was extracted using Dynabeads SILANE Viral NA Kit (Thermo Fisher Scientific, Waltham, MA). Markers were then assayed by quantitative MSP, as done previously.
Clinical Plasma Study
Patients and Samples
For the plasma study, frozen plasma samples (≥1.5 mL) archived from an independent cohort of patients with CCA, PSC controls, and healthy controls were identified. Healthy controls were balanced to cases by age and gender. PSC controls were required to have clinicopathologic absence of malignancy as outlined previously and were followed prospectively in the medical record for at least 3 years. Plasma samples obtained after treatment for CCA were excluded.
DNA was extracted from plasma and bisulfite converted overnight at 50°C with an EZ-96 DNA methylation kit (Zymo Research, Irvine, CA).
Candidate markers were then assayed by blinded technicians using target enrichment long-probe quantitative amplified signal (TELQAS) assays, a modification to quantitative allele-specific real-time target and signal amplification, as described.(15) Given the complexity of TELQAS assay design and the early-phase level of this clinical study, preselected markers with existing triplex TELQAS assay designs were chosen on the basis of representation in the RRBS data set for CCA and their absence in the control data. TELQAS assays include PCR primers, detection probes, and invasive oligos (IDT), GoTaq DNA Polymerase (Promega, Madison, WI), Cleavase 2.0 (Hologic), and fluorescence resonance energy transfer reporter cassettes containing fluorescein amidite, Quasar 670, and Hex (Biosearch Technologies, Novato, CA). Triplexes were assayed on the LightCycler 480 (Roche), and all results were normalized to the bisulfite-treated ACTB product amplified from the same sample.
Statistical Analysis
RRBS Discovery
Read-depth criteria were based on the desired statistical power (80%) to detect a 10% difference in the methylation percentage between any two groups in which the sample size of each group was approximately 18 individuals. Statistical significance was determined by overdispersed logistic regression of the methylation percentage per DMR, based on read counts. Each DMR was then ranked by P value, AUC, and the fold change of methylation percentage values among cases and controls.
Biological Validation
The ACTB-corrected copy number of each marker was used to estimate the AUC for discrimination of cases from controls, overall and per CCA subtype. For inclusion in subsequent experiments, candidate markers were required to demonstrate an AUC > 0.75 and have at least a 2.5-fold greater copy number in cases compared with controls.
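The two inclusion thresholds can be sketched as follows. The AUC is computed here as the Mann-Whitney probability that a randomly chosen case exceeds a randomly chosen control; using group means for the fold change is an assumption, because the text does not say which summary statistic was used, and the copy numbers are made up.

```python
from statistics import mean


def auc(cases, controls):
    """Probability that a random case value exceeds a random control value (ties count 0.5)."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def fold_change(cases, controls):
    """Ratio of mean copy number in cases to controls (mean is an assumed summary)."""
    control_mean = mean(controls)
    if control_mean == 0:
        return float("inf")
    return mean(cases) / control_mean


def passes_validation(cases, controls, min_auc=0.75, min_fold=2.5):
    """Apply the stated cutoffs: AUC > 0.75 and at least 2.5-fold higher copies in cases."""
    return auc(cases, controls) > min_auc and fold_change(cases, controls) >= min_fold


# Hypothetical ACTB-corrected copy numbers for two markers.
strong_cases, strong_controls = [120, 300, 45, 800, 60], [10, 30, 50, 5, 20]
weak_cases, weak_controls = [10, 20, 30], [15, 25, 28]

print(auc(strong_cases, strong_controls), passes_validation(strong_cases, strong_controls))
print(round(auc(weak_cases, weak_controls), 3), passes_validation(weak_cases, weak_controls))
```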
Clinical Brushing Pilot
Because primary tissues were not available from patients with PSC during the discovery phase, there was no preliminary estimate of marker levels in clinical samples from these patients. Therefore, the non-PSC group was considered to be the primary reference control group. The discriminant accuracy of each marker was summarized as an AUC with corresponding 95% confidence intervals (CIs) as well as the sensitivity of each marker using non-PSC controls as the reference group. A minimum of 10 cases and 10 controls provided sufficient power (80%) to distinguish an AUC of 0.8 from a null hypothesis of 0.5 at a two-sided significance level of 0.05.
Clinical Plasma Pilot
It was estimated that a minimum of 25 patients in the case group provided 80% power to distinguish an AUC of 0.7 from a null value of 0.5 with a one-sided significance level of 0.05. Marker combinations were studied using recursive partitioning trees (rParts).(31) The rPart model first selects a single MDM that provides the greatest separation, or branch split, between cases and controls. Once split, rPart incorporates the next MDM with the greatest separation between cases and controls within each branch point. This continues until the cross-validated stopping rule is achieved to avoid overfitting. To cross-validate the modeling, a bootstrap random sample of the full data set was generated to train the model (approximately two-thirds of the data), and samples not selected for training (approximately one-third) were set aside for testing. The rPart modeling process was carried out as stated previously, and the sensitivity, specificity, and AUC within the testing set were recorded. This entire process was repeated 500 times to estimate the cross-validated sensitivity, specificity, and AUC.
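The resampling scheme can be sketched in a few lines. This is a simplified stand-in, not the rPart model itself: only the first branch split is fitted (a single-marker threshold chosen by Gini impurity), values above the threshold are assumed to indicate a case, and the data are synthetic. Note that a bootstrap sample drawn with replacement contains about 63% of the distinct samples, which matches the roughly two-thirds training and one-third testing proportions described above.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Return (marker index, threshold) with the lowest weighted Gini impurity."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best[1], best[2]


def bootstrap_oob(rows, labels, n_rounds=500, seed=0):
    """Train on a bootstrap sample, test on the samples left out, and average."""
    rng = random.Random(seed)
    n = len(rows)
    sens, spec, oob_frac = [], [], []
    for _ in range(n_rounds):
        in_bag = [rng.randrange(n) for _ in range(n)]   # drawn with replacement
        drawn = set(in_bag)
        oob = [i for i in range(n) if i not in drawn]   # held-out test samples
        oob_frac.append(len(oob) / n)
        j, t = best_split([rows[i] for i in in_bag], [labels[i] for i in in_bag])
        oob_cases = [i for i in oob if labels[i] == 1]
        oob_controls = [i for i in oob if labels[i] == 0]
        if not oob_cases or not oob_controls:
            continue
        # Assumption: marker values above the threshold are called positive.
        sens.append(sum(1 for i in oob_cases if rows[i][j] > t) / len(oob_cases))
        spec.append(sum(1 for i in oob_controls if rows[i][j] <= t) / len(oob_controls))
    return sum(sens) / len(sens), sum(spec) / len(spec), sum(oob_frac) / len(oob_frac)


# Synthetic data: marker 0 separates the groups, marker 1 is noise.
data_rng = random.Random(1)
rows = [[data_rng.gauss(6, 1), data_rng.gauss(0, 1)] for _ in range(30)]
rows += [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(30)]
labels = [1] * 30 + [0] * 30

cv_sens, cv_spec, oob_fraction = bootstrap_oob(rows, labels)
print(round(cv_sens, 2), round(cv_spec, 2), round(oob_fraction, 2))
```

A full recursive partitioning tree would repeat `best_split` within each branch until a stopping rule is met; the resampling loop would stay the same.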
Results
Discovery
Blinded, randomly allocated DNA extracts from 18 eCCA and 17 iCCA tissue samples, each matched to 18 adjacent benign hepatobiliary parenchymal samples, were sequenced by RRBS. Between 2.5 million and 3.5 million CpG sites were mapped to the reference genome among the eCCA and iCCA cases and controls. After differential methylation and variance analysis, 3,674 DMRs were identified from the eCCA data and 9,303 DMRs from the iCCA data. To reduce the large number of candidate regions to a smaller validation set, we applied stringent performance cutoffs: AUC > 0.85, methylation in at least five contiguous CpGs, and logistic regression P value <0.001. Nine normal buffy coat–derived DNA samples were also sequenced, and selected DMRs were required to have less than 2% methylation in this cohort.
Biological Validation
The 23 iCCA and 16 eCCA MDM candidates that met selection cutoffs and had methylation signatures suitable for MSP primer design were brought forward for independent sample validation (Supporting Table S1A). MDMs discovered in iCCA were tested in independent samples from 25 iCCAs, for which there were 23 available matched nonneoplastic liver samples, and five eCCA tissues, for which there were two nonneoplastic adjacent bile duct control samples. The median AUC was 0.81 (95% CI, 0.75-0.87), and the median fold change was 9.6 (interquartile range, 6.1-14.5).
MDMs discovered in sequencing of eCCA were tested in independent samples from 14 eCCAs, for which there were two available matched nonneoplastic bile duct samples, and 18 iCCA tissues, for which there were 18 nonneoplastic adjacent liver control samples. The median AUC was 0.77 (0.67-0.84), and the median fold change was 5 (3.4-6.5).
Brushing Clinical Pilot
Candidate DMRs selected from the biological validation included VSTM2B.764, SALL1 (spalt like transcription factor 1), PTGDR (prostaglandin D2 receptor), KCNA1 (potassium voltage-gated channel subfamily A member 1), FER1L4.301, RYR2 (ryanodine receptor 2), DKFZP434H168, NTF3, S1PR1 (sphingosine-1-phosphate receptor 1), and KLF12 (Krüppel-like factor 12). In addition, we assayed four DMRs that were biologically validated among iCCA tissues; these included HOXA1 (homeobox A1), EMX1 (empty spiracles homeobox 1), ITGAL4, and CYP26C1 (cytochrome P450 family 26 subfamily C member 1). These iCCA markers were selected based on high signal-to-noise ratios in the biological validation.
Table 1 displays the clinical characteristics of the patients providing biliary brushing samples. Of the 16 cases, 12 had pCCA and 4 had dCCA. None of the cases had underlying PSC. Among the 18 controls with benign biliary strictures, 5 had PSC. Median follow-up after the brushings were obtained was >4 years. By design, there were no significant differences among cases and controls based on sex or age.
Characteristic | CCA (n = 16) | Controls (n = 13) | PSC Controls (n = 5) | P Value*
---|---|---|---|---
Age, years | | | | 0.057
Median (Q1, Q3) | 67 (62, 79) | 74 (63, 78) | 60 (46, 65) |
Sex, n (%) | | | | 1.000
Female | 3 (18.8%) | 3 (21.4%) | 1 (25%) |
Male | 13 (81.2%) | 10 (78.6%) | 4 (75%) |
PSC, n (%) | | | | 1.000
No | 11 (68.8%) | 13 (100%) | 0 (0%) |
Yes | 5 (31.2%) | 0 (0%) | 5 (100%) |
CA19-9, units/mL | | | | 0.342
Median (Q1, Q3) | 675 (10, 1,816) | 17 (5, 59) | 45 (33, 88) |
Site, n (%) | | | | NA
Perihilar | 12 (75%) | NA | NA |
Distal | 4 (25%) | NA | NA |
Follow-up, years | | | | NA
Median (Q1, Q3) | NA | 4.9 (4.1, 5.5) | 3.9 (3.3, 4.4) |
- * P value calculated relative to the pooled sample of controls.
- Abbreviations: NA, not applicable; Q1, first quartile; Q3, third quartile.
Across all case and non-PSC control samples, EMX1 and HOXA1 were each 100% sensitive at a specificity of 92% (Fig. 2). Even when including the PSC disease controls, methylated EMX1 maintained 100% sensitivity at a specificity of 89%. This translated to excellent discrimination by EMX1, which had an AUC of 0.98 (95% CI, 0.95-1.0) for both eCCA subtypes and control subtypes combined.
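The reported percentages can be reproduced with simple arithmetic. In the sketch below, the group sizes (16 cases, 13 non-PSC controls, 18 controls in total) come from the text, whereas the false-positive counts of 1 and 2 are inferred from the rounded specificities and are not stated in the source.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of cases called positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of controls called negative."""
    return true_neg / (true_neg + false_pos)


# EMX1 in brushings: all 16 eCCA cases detected.
emx1_sensitivity = sensitivity(true_pos=16, false_neg=0)

# Non-PSC controls only (n = 13); one false positive is inferred from 92%.
spec_non_psc = specificity(true_neg=12, false_pos=1)

# All controls including PSC (n = 18); two false positives are inferred from 89%.
spec_all_controls = specificity(true_neg=16, false_pos=2)

print(f"{emx1_sensitivity:.0%} {spec_non_psc:.0%} {spec_all_controls:.0%}")
```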

Plasma Clinical Pilot
For the plasma pilot, 59 eCCA cases (42 de novo, 17 PSC-associated) and 117 controls were included. CCA location was perihilar in 54 (91.5%) and distal in 5 (8.5%); American Joint Committee on Cancer stage was II or less in 25 (42.4%); and 33 (56%) were surgically treated with intent to cure (Table 2).
Characteristic | CCA (n = 59) | Control (n = 117) | P Value
---|---|---|---
Age, years | | | 0.254
Median (Q1, Q3) | 60 (52, 71.5) | 62 (58, 64) |
Sex, n (%) | | | 0.471
Female | 13 (22%) | 32 (27.4%) |
Male | 46 (78%) | 85 (72.6%) |
PSC, n (%) | | | 0.178
No | 42 (71.2%) | 95 (81.2%) |
Yes | 17 (28.8%) | 22 (18.8%) |
CA19-9, units/mL | | | 0.006
Median (Q1, Q3) | 220 (56, 1,008) | 69 (17.5, 87.75) |
Site, n (%) | | | NA
Perihilar | 54 (91.5%) | NA |
Distal | 5 (8.5%) | NA |
Follow-up, years | | | NA
Median (Q1, Q3) | NA | 2.8 (1.1, 4) |
Stage* | | | NA
I/II | 25 (42.4%) | NA |
III/IV | 34 (57.6%) | NA |
Surgical Intervention | | | NA
Transplanted | 16 (27.1%) | NA |
Resected | 17 (28.8%) | NA |
Inoperable | 26 (44.1%) | NA |
- * American Joint Committee on Cancer, 8th edition.
- Abbreviations: NA, not applicable; Q1, first quartile; Q3, third quartile.
rPart modeling from all nine markers classified eCCA with a sensitivity of 76% (95% CI, 63%-86%) at a specificity of 94% (88%-98%) and AUC of 0.9 (0.85-0.95). Importantly, the panel detected 64% (45%-80%) of eCCAs amenable to transplant or surgical resection and 63% (38%-84%) of eCCAs among patients with low serum CA19-9 (≤100 U/mL). This best-fit rPart model had an AUC of 0.91 (0.85-0.96) for de novo eCCA and 0.84 (0.71-0.97) for PSC-associated eCCA. Distributions of each MDM and the combined panel are shown in Supporting Fig. S1.
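As a plausibility check on the reported interval, an exact binomial (Clopper-Pearson) confidence interval can be computed for the overall sensitivity. Two assumptions are made here: the count of 45 detected cases is inferred from 76% of 59, and the exact method is chosen for illustration because the text does not state which interval method was used.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, iterations=100):
    """Bisection for f(p) = target, where f is decreasing in p on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion."""
    # Lower limit: P(X >= k | p) = alpha/2, i.e. cdf(k - 1) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


detected, total = 45, 59   # 45 is inferred from the reported 76% of 59 cases
point_estimate = detected / total
ci_lower, ci_upper = clopper_pearson(detected, total)
print(f"{point_estimate:.1%} ({ci_lower:.1%} to {ci_upper:.1%})")
```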
Cross-validation of the rPart model showed a decrease in overall AUC to 0.81 (0.74-0.88). The cross-validated prediction model showed strong accuracy for de novo eCCA (AUC, 0.88 [0.81-0.95]) but not for PSC-associated eCCA (AUC, 0.54 [0.35-0.73]) (P = 0.0016) (Fig. 3).

The cross-validated sensitivity of the MDM panel in patients with CCA with operable disease was 58% (39%-75%) compared to a sensitivity of 37% (16%-62%) observed for CA19-9 at a cutoff value of 100 units/mL.
In the publicly available data sets from the Cancer Genome Atlas (accessed September 23, 2020), all nine markers were significantly hypermethylated (Supporting Table S2). Expression was also significantly up-regulated or down-regulated for five of the nine genes that annotated to the MDMs. Supporting Table S2 also lists the function of each gene, all relevant to biological pathways in cancer.
Discussion
Without a priori bias to known CpG islands, RRBS successfully identified methylation markers associated with CCA. We successfully validated the selected DMRs as candidate MDMs by using MSP on DNA extracted from independent tissue samples. Due to the heterogeneity of CCA, we aimed to biologically validate our markers not only on eCCA tissues but also on iCCA tissues. The two most informative markers in eCCA brushing samples, EMX1 and HOXA1, were also broadly represented in iCCA tissues. We then successfully piloted selected MDMs in archival biliary brushing-extracted DNA and plasma-extracted DNA from patients with eCCA, demonstrating feasibility to diagnose eCCA at high sensitivity and specificity.
In brushing samples, EMX1 and HOXA1 each had a sensitivity of 100% in detecting both eCCA subtypes, at 92% specificity, referent to non-PSC controls with benign biliary strictures. When PSC controls were included in the specificity calculation, the EMX1 false-positive rate was only 11%.
These, and other MDMs, may have utility in addressing diagnostic challenges in eCCA, which include difficulty distinguishing benign from malignant biliary strictures by noninvasive imaging. Biliary tract tumors tend to grow longitudinally rather than radially, making their appearance inconspicuous on cross-sectional imaging. The only currently available serum biomarker, CA19-9, is poorly sensitive as a standalone test but may augment the sensitivity of noninvasive imaging, albeit with significant compromise in specificity.(32-34) Patients with dominant strictures on imaging currently undergo diagnostic ERCP for biopsy and cytologic sampling by brushing. Unfortunately, sensitivity of routine cytologic analysis is less than 40%.(35, 36) Sensitivity of cytology-based diagnosis has been greatly augmented by the addition of FISH, although overall sensitivity for early-stage CCA detection is less than 60%.(30, 37) Thus, despite recent advances, increased accuracy is still urgently needed for diagnostic and surveillance tests for CCA. The results of the current study suggest that the MDMs assayed from DNA obtained from biliary brushing should be pursued for complementarity to existing strategies aimed at diagnosis of CCA. We do not yet have data on how MDMs will complement existing strategies, but we plan to perform comparative and complementary studies to novel FISH probes, which are currently in clinical use at Mayo Clinic.
Plasma assay of MDMs appears feasible for diagnosis of eCCA relative to normal controls and PSC controls in the best-fit models. The cross-validated AUC remained strong at >0.8; however, when stratified for PSC status, the panel retained high accuracy for de novo CCA but not PSC-associated CCA. Although CCA is a dominant cause of morbidity and mortality in those with PSC, de novo CCA is substantially more common than PSC-associated CCA in the general population.(1) Population-level screening for CCA is not currently practiced due to low prevalence. Using the rationale of aggregate prevalence for screening of multiple cancers, investigative teams have approached this problem by studying tests that can detect multiple cancers from blood or other media,(38) and case-control and clinical utility studies confirm early feasibility.(39, 40)
Despite these encouraging results, several important limitations must be acknowledged. Although the MDMs were identified through a rigorous tissue discovery phase and biologic validation, the clinical pilot studies were powered to show feasibility against a noninformative result; these sample sizes did not allow for tight precision in the analysis of CCA subgroups, particularly analyses stratified for age and sex. CA19-9 data were not available for healthy controls; thus, we used a clinically accepted cutoff rather than modeling CA19-9 as a continuous variable in combination with MDMs. A second limitation was the nonavailability of PSC-associated CCA and PSC controls in the discovery process. At the time the RRBS was conducted, library-preparation protocols required DNA from frozen tissue. The frozen tissue archive from which discovery specimens were selected consisted of surgical resection residual samples, and patients with PSC are rarely treated with resection at our institution. This is the most likely reason why plasma assay of MDMs was noninformative for PSC-associated CCA in cross-validated analysis. We plan to address this limitation with future experiments using RRBS on plasma samples, in which PSC will be well represented. Another limitation is that only 39 MDM sequences out of more than 3,600 that were discovered by RRBS were tested in the biologic validation process. This is due to the inefficiency of MSP and the finite amount of DNA available from each patient sample. Targeted sequencing is a strategy with the potential to validate thousands of MDMs from the same amount of FFPE-extracted DNA.
In summary, the RRBS technique identified numerous highly informative DNA methylation markers of CCA. These candidate MDMs discovered from tissue can be analyzed in biliary brushing and plasma DNA, and they may serve as excellent diagnostic biomarkers for CCA and further enhance the performance of current standard diagnostic tests, such as conventional cytology or FISH. Further optimization of marker panels and of brush specimen and plasma collection conditions is anticipated to improve upon these encouraging findings. Large-scale prospective validation among higher-risk disease controls will be necessary before this technology can be used in routine patient care.
Acknowledgment
The authors dedicate this work to the memory of Dr. David A. Ahlquist (1951-2020). They thank Amanda Bedard for her administrative assistance and are grateful for the Genome Analysis Core (GAC) and co-directors Julie M. Cunningham, Ph.D., and Eric Wieben, Ph.D. GAC is supported, in part, by the Center for Individualized Medicine and the Mayo Clinic Comprehensive Cancer Center grant (National Cancer Institute P30CA15083).