Prognostic biomarkers and surrogate end points in PSC
See Article on Page 1867
Primary sclerosing cholangitis (PSC) is a rare, progressive cholestatic disease causing considerable morbidity and mortality1 and is one of the leading indications for liver transplantation (LT) in several Western countries.2 For patients with PSC, LT-free survival ranges from 13 to 21 years depending on the population under study.3, 4 Despite two decades of clinical trials, there is still no effective therapy to halt disease progression towards end-stage liver disease.1 Given the lack of effective treatment other than LT, the hepatology community and practitioners are increasingly acknowledging the unmet needs of PSC patients. The considerable advances over the last decade in unmasking the genetic underpinnings of PSC and in pathogenetic understanding from mouse models of PSC5, 6 have so far not been translated into clinically established drugs. The design of clinical trials of drugs intended to improve prognosis in PSC is hampered by the relatively slow progression of the disease, which leads to a low rate of clinically relevant end points. Thus, there is an urgent need to identify surrogate end points to overcome these shortcomings in clinical trials in PSC and to enable the translation of recent pathogenetic knowledge into benefit for patients.
The International PSC Study Group (IPSCSG) has given the establishment of surrogate end points for clinical trials in PSC top priority. A consensus paper regarding this pressing issue was recently published, concluding that alkaline phosphatase (ALP), transient elastography (TE) and histology appear to be the most promising candidates for surrogate end points in clinical trials for PSC.7 Furthermore, in March 2016, a workshop focusing on surrogate end points, “Trial Design and Endpoints for Clinical Trials in Adults and Children with Primary Sclerosing Cholangitis”, was arranged by the American Association for the Study of Liver Diseases (AASLD) and the U.S. Food and Drug Administration (FDA), hosting leading researchers in the field, industry representatives, patient organizations and regulatory authorities from both the USA (FDA) and Europe (EMA). The discussion brought clarification and insights into what is now needed to finally establish surrogate end points. However, the FDA stated that the evidence was still insufficient to draw conclusions regarding surrogate end points for PSC at this point.
Several prognostic risk models have been proposed in an attempt to overcome the shortage of clinically meaningful end points (Table 1). The PSC-specific revised Mayo risk score8 is the most widely used, and change in Mayo risk score has sometimes been employed as an outcome in clinical studies. However, its use as a surrogate end point has not been validated, and its horizon of only 4 years and its flat survival estimate curve in early-stage disease yield limited discriminant information. Furthermore, the Mayo risk score notably failed to forecast adverse outcomes in high-dose ursodeoxycholic acid (UDCA) treatment trials.9
Study | Year | Site | N | Biomarker | End point | Strengths | Weaknesses |
---|---|---|---|---|---|---|---|
Kim et al.8 | 2000 | USA | 405 | Revised Mayo score | LTX, death | Large cohort; multicentre; validation in independent cohort | Relying mainly on factors reflecting advanced disease |
Ponsioen et al.16 | 2002 | Netherlands | 174 | Amsterdam score (cholangiographic) | LTX, liver-related death | Long FU | ERC based; ERC is invasive and not indicated in regular FU |
Tischendorf et al.17 | 2007 | Germany | 273 | PSC score | LTX, death | Large cohort; long FU; both clinical variables and cholangiographic changes | Single centre |
Stanich et al.18 | 2011 | USA | 87 | ALP (normalization) | LTX, death, CCA | | Post hoc analysis (of high-dose UDCA trial); retrospective; no time limit to ALP reduction |
Al Mamari et al.19 | 2013 | UK | 139 | ALP (<1.5 × ULN) | Liver-related death incl. death from CCA, LTX, liver decompensation | Strict definition of ALP improvement; long FU | Retrospective; monocentre; referral centre based; no time limit to ALP reduction |
Lindström et al.20 | 2013 | Scandinavia | 195 | ALP (normalization or 40% reduction after 12 mo) | LTX, death, CCA | | Post hoc analysis (of UDCA trial) |
Rupp et al.21 | 2014 | Germany | 215 | ALP (1: normalization or 50% reduction compared to baseline or <1.5 × ULN within 6 mo; 2: <1.5 × ULN within 12 mo) | LTX, death | Prospective; long FU | Monocentre |
De Vries et al.22 | 2015 | Netherlands | 64 | Nakanuma score (histological) | LTX, liver-related death | | Histology is invasive and not indicated in regular FU; monocentre |
Corpechot et al.11 | 2014 | France | 168 | Fibroscan | LTX, death, liver decompensation | | Retrospective; monocentre |
Vesterhus et al.10 | 2015 | Norway | 167 (derivation); 138 (validation) | ELF | LTX, all-cause death | Validation of results in independent cohort; long FU | Retrospective; monocentre; referral centre based |
- ALP, alkaline phosphatase; CCA, cholangiocarcinoma; DS, dominant stricture; ELF, Enhanced Liver Fibrosis® test; ERC, endoscopic retrograde cholangiography; FU, follow-up; LTX, liver transplantation; UDCA, ursodeoxycholic acid; ULN, upper limit of normal.
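The ALP-based criteria listed in Table 1 differ enough that the same patient can count as a responder under one definition and a non-responder under another. The following Python sketch illustrates this under stated assumptions: the patient values and the ULN of 120 U/L are hypothetical, "normalization" is taken to mean ALP at or below ULN, and an "X% reduction" is taken to mean a relative fall of at least X% from baseline. The function names are illustrative only and do not reproduce the exact definitions or time windows used in the cited studies.

```python
from dataclasses import dataclass


@dataclass
class AlpPair:
    baseline: float   # ALP at baseline, U/L
    follow_up: float  # ALP at follow-up, U/L
    uln: float        # laboratory upper limit of normal, U/L


def relative_reduction(p: AlpPair) -> float:
    """Fractional fall in ALP from baseline to follow-up."""
    return (p.baseline - p.follow_up) / p.baseline


def normalized(p: AlpPair) -> bool:
    """Assumption: normalization means follow-up ALP <= ULN."""
    return p.follow_up <= p.uln


def below_1_5_uln(p: AlpPair) -> bool:
    """Follow-up ALP below 1.5 x ULN."""
    return p.follow_up / p.uln < 1.5


def normalized_or_reduced(p: AlpPair, fraction: float) -> bool:
    """Normalization, or a relative reduction of at least `fraction`."""
    return normalized(p) or relative_reduction(p) >= fraction


# Hypothetical patient: ALP falls from 400 to 230 U/L, with ULN 120 U/L.
p = AlpPair(baseline=400.0, follow_up=230.0, uln=120.0)

print(normalized(p))                   # False
print(below_1_5_uln(p))                # False
print(normalized_or_reduced(p, 0.40))  # True
print(normalized_or_reduced(p, 0.50))  # False
```

In this example a 42.5% fall in ALP satisfies only the criterion based on a 40% reduction, which is one reason why varying definitions make results difficult to compare across studies.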
Histology has so far been the gold standard, and histological improvement is still included as a primary end point in many clinical trials in PSC. There are several concerns associated with using histology as an end point: liver biopsy is an invasive procedure with potentially serious adverse events, and the patchy disease distribution of PSC gives rise to considerable sampling error and interobserver variability between pathologists. In recent years, noninvasive methods evaluating liver fibrosis, including ultrasound-based TE as well as the serum marker panel Enhanced Liver Fibrosis (ELF®) test (Siemens Medical Solutions Diagnostics Inc., Tarrytown, NY, USA), have been shown to predict clinical outcome in PSC patients in single studies, but validation in independent studies is lacking.10, 11 Furthermore, it is yet to be decided whether these (and other suggested prognostic biomarkers and scores) reflect disease stage or activity, of which the latter represents a more dynamic situation perhaps better suited for the assessment of treatment effect. A biomarker harbouring good prognostic power may be highly useful as a stratifier in clinical trials, but this does not necessarily translate into applicability as a surrogate marker. The optimal surrogate end point should react to intervention in a reliable and dynamic manner, reflecting change in disease activity within a reasonable time frame.
In this issue of Liver International, de Vries and colleagues12 report on the use of ALP as a prognostic marker in PSC. An increased level of ALP is the most commonly seen liver test abnormality in PSC. This reflects PSC's cholestatic nature with inflammation and fibrotic stricturing of the bile ducts. In primary biliary cholangitis (PBC), another cholestatic liver disease affecting the smaller bile ducts, a recently published retrospective analysis of a large international PBC cohort confirmed previous suggestions that reduction in ALP can predict outcome in PBC and thus may serve as a surrogate end point in clinical trials in PBC.13 Likewise, changes in ALP have been the most widely used primary efficacy end point to measure response to therapy in PSC, having been employed in all clinical trials in the past two decades. Several reports indicate that reduction in ALP within a certain time frame is of prognostic value in PSC. However, methodological shortcomings with short follow-up, varying definitions of reduction in ALP, assessment of continuous ALP changes on a group level and post hoc analyses in studies not designed for that analysis have precluded final conclusions regarding the usefulness of ALP as a surrogate end point. Therefore, the present article is a very welcome addition to the literature on the prognostic role of ALP in PSC.
By including 366 well-characterized PSC patients of whom 66 met one of the predefined end points, de Vries et al. investigated whether the level of ALP either at diagnosis or 1 year after the diagnosis and the change in ALP levels could serve as prognostic markers. The ALP level at both diagnosis and 1 year later was associated with reaching the end point, with the level at 1 year showing the best predictive value.
The strengths of this study by de Vries et al. include the large number of patients, a blended population-based and referral centre cohort and the admirable use of advanced statistical methods to tease out the optimal use of ALP as a prognostic marker. While the identified optimal threshold for ALP at 1-year post-baseline of 1.3 × ULN was similar to previously used thresholds, analyses showed improved predictive value of ALP used as a continuous variable. This is in line with recent reports concerning the predictive value of ALP in PBC.14, 15 A defined cut-off is needed should ALP be used for patient stratification in clinical trials, and in the clinical setting cut-offs may be felt to be easier to interpret; however, the resulting reduction in predictive power should be considered. Furthermore, the choice of thresholds is not trivial and should be developed based on cost-effectiveness studies.
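The loss of information that comes with a cut-off can be illustrated with a brief Python sketch. The ALP values below are hypothetical, and treating values strictly above 1.3 × ULN as the higher-risk group is an assumption made for illustration, not the study's stated rule.

```python
CUTOFF = 1.3  # 1-year ALP threshold, in multiples of ULN


def above_cutoff(ratio: float, cutoff: float = CUTOFF) -> bool:
    # Assumption: values strictly above the cut-off form the higher-risk group.
    return ratio > cutoff


# Hypothetical 1-year ALP values, expressed as multiples of ULN.
ratios = [0.8, 1.2, 1.4, 4.0]
groups = [above_cutoff(r) for r in ratios]

print(groups)  # [False, False, True, True]
```

After dichotomization, patients at 1.4 and 4.0 × ULN fall into the same group, whereas a model using ALP as a continuous variable can still distinguish between them.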
The results of this retrospective study need replication in an independent PSC cohort before conclusions concerning general validity can be drawn. The majority of the patients started on UDCA therapy at baseline, which is known to affect ALP values, and it would be of interest to explore the prognostic power of ALP in a larger PSC population not treated with UDCA. On the individual level, the naturally unpredictable fluctuating nature of ALP may limit the value of single measurements at any point in time for patient follow-up. Our comprehension of the utility and limitations of ALP as a prognostic biomarker as well as a potential surrogate end point would benefit from further exploration of the links between ALP and bile duct inflammation and cholangiographic changes, respectively. Further studies should attempt to establish the stage of the disease (in the absence of histology, by noninvasive liver stiffness measurements or the cholangiography-based Amsterdam score) to investigate the value of ALP as a prognostic marker at various stages of the natural history of PSC. Comparisons with the Mayo risk score and other proposed prognostic markers, and investigations of the effect of incorporating such markers (e.g. the ELF® test or liver stiffness measurement by FibroScan® [Echosens, Paris, France]) into a combined model including the 1-year ALP level, are warranted.
The study of de Vries et al. is only a first step towards stratified medicine in PSC. It is possible that a robust prognostic model might need to capture several aspects of the pathogenesis. Although these data pinpoint biomarkers of fibrosis as important for prognosis, inflammation of the bile ducts is important in early-stage PSC and a tempting treatment target; thus, better biomarkers of bile duct inflammation should be sought. It is probable that stage-specific surrogate end points, or coprimary end points reflecting distinct facets of the disease, might be needed in a complex disease like PSC. To advance the field in a rare disease like PSC, measures should be taken to include all patients in structured prospective databases, comprehensively compiling clinical and laboratory data. In addition, it should be a prerequisite for clinical trials in PSC to include potential biomarkers of disease activity and candidate surrogate end points as explorative end points to promote their validation.
Funding Information
Mette Vesterhus is funded by the Norwegian PSC Research Center.
Conflict of Interest
The authors do not have any disclosures to report.