Development and validation of a primary sclerosing cholangitis–specific patient-reported outcomes instrument: The PSC PRO
Potential conflict of interest: Dr. Younossi consults, advises for and received grants from Intercept. He consults for and received grants from Gilead. He consults for GlaxoSmithKline and Novo Nordisk. He advises for Bristol-Myers Squibb and AbbVie. Dr. Bowlus advises for and received grants from Intercept, Bristol-Myers Squibb, GlaxoSmithKline, and Takeda. He advises for Conatus. He received grants from Gilead, CymaBay, Tobira, Merck, Tawain J, and Target. Dr. Kowdley consults for, advises for, is on the speakers' bureau for, and received grants from Intercept. He advises for, is on the speakers' bureau for, and received grants from Gilead. He consults for and received grants from NGM. He received royalties from UpToDate. Dr. Muir consults for and received grants from Gilead. He consults for Conatus. He received grants from NGM. Dr. Lenderking is employed by Evidera and owns stock in Pfizer. Dr. Levy consults for and received grants from Novartis and Intercept. She consults for Target and Cara. She received grants from GlaxoSmithKline, Tobira, and CymaBay. Dr. Myers is employed by and owns stock in Gilead. Dr. Subramanian is employed by and owns stock in Gilead. Dr. McHutchison is employed by and owns stock in Gilead. Dr. Skalicky received grants from Gilead. Dr. Kleinman received grants from Gilead.
Supported by Gilead Sciences.
Abstract
Primary sclerosing cholangitis (PSC) is a chronic liver disease associated with inflammation and biliary fibrosis that leads to cholangitis, cirrhosis, and impaired quality of life. Our objective was to develop and validate a PSC-specific patient-reported outcome (PRO) instrument. We developed a 42-item PSC PRO instrument that contains two modules (Symptoms and Impact of Symptoms) and conducted an external validation. Reliability and validity were evaluated using clinical data and a battery of other validated instruments. Test-retest reliability was assessed in a subgroup of patients who repeated the PSC PRO after the first administration. One hundred two PSC subjects (44 ± 13 years; 32% male, 74% employed, 39% with cirrhosis, 14% with a history of decompensated cirrhosis, 38% history of depression, and 68% with inflammatory bowel disease [IBD]) completed PSC PRO and other PRO instruments (Short Form 36 V2 [SF-36], Chronic Liver Disease Questionnaire [CLDQ], Primary Biliary Cholangitis – 40 [PBC-40], and five dimensions [5-D Itch]). PSC PRO demonstrated excellent internal consistency (Cronbach alphas, 0.84-0.94) and discriminant validity (41 of 42 items had the highest correlations with their own domains). There were good correlations between PSC PRO domains and relevant domains of SF-36, CLDQ, and PBC-40 (R = 0.69-0.90; all P < 0.0001), but lower (R = 0.31-0.60; P < 0.001) with 5-D Itch. Construct validity showed that PSC PRO can differentiate patients according to the presence and severity of cirrhosis and history of depression (P < 0.05), but not by IBD (P > 0.05). Test-retest reliability was assessed in 53 subjects who repeated PSC PRO within a median (interquartile range) of 37 (27-47) days. There was excellent reliability for most domains with intraclass correlations (0.71-0.88; all P < 0.001). Conclusion: PSC PRO is a self-administered disease-specific instrument developed according to U.S. Food and Drug Administration guidelines. 
This preliminary validation study suggests good psychometric properties. Further validation of the instrument in a larger and more diverse sample of PSC patients is needed. (Hepatology 2018;68:155-165).
Abbreviations
- 5-D Itch: five dimensions
- CCA: cholangiocarcinoma
- CLDQ: Chronic Liver Disease Questionnaire
- ePRO: electronic PRO
- HRQL: health-related quality of life
- IBD: inflammatory bowel disease
- ICC: intraclass correlation coefficient
- IQR: interquartile range
- LT: liver transplantation
- MCID: minimal clinically important difference
- PBC-40: Primary Biliary Cholangitis-40
- PRO: patient-reported outcome
- PSC: primary sclerosing cholangitis
- SF-36: Short Form-36 V2
Primary sclerosing cholangitis (PSC) is an autoimmune liver disease that causes chronic injury to the biliary system, leading to obstructive cholangitis, advanced liver disease, and cholangiocarcinoma (CCA).1 PSC is primarily observed in patients with inflammatory bowel disease (IBD) and is associated with a variety of symptoms, including pruritus, fatigue, abdominal pain, chills, fever, weight loss, and jaundice.2, 3 Median transplant-free survival has been estimated to be 21 years from the time of diagnosis.1 Currently, there are no approved medical treatments for patients with PSC, whereas for patients with advanced disease, liver transplantation (LT) remains the only definitive treatment option.4
Symptoms of PSC are not only reflective of disease severity, but can also cause substantial impairment of patients' health-related quality of life (HRQL) and other patient-reported outcomes (PROs), such as fatigue or reduced work productivity5, 6; coexistent IBD can additionally impair PROs.6 In general, PROs are the endpoints designed to capture patients' experience with their disease and its treatment.7 For the purpose of PRO assessment, two types of instruments, namely generic and disease specific, are typically used.8 Generic instruments provide a global assessment of HRQL; therefore, such instruments would allow comparing PSC-associated HRQL impairment with that caused by other diseases. In contrast, disease-specific instruments are more responsive in detecting small, but clinically important, changes by tapping into the most typical symptoms and other specific aspects of the disease; such instruments can be used for more precise evaluation of the comprehensive burden of the disease and of potential treatment options.8 At present, no PSC-specific PRO questionnaire has been developed. Therefore, our aim was to develop and validate a PSC-specific PRO instrument that can be used in clinical research, including clinical trials of novel therapies, in patients with PSC.
Materials and Methods
DEVELOPMENT OF PSC PRO
A. Concept Elicitation
Following the literature review, which did not return any comprehensive disease-specific PRO instrument, concept elicitation research with 20 PSC patients (recruited from four clinical sites in Florida and North Carolina and PSC Partners Seeking a Cure) was conducted by a trained qualitative interviewer (A.S.) to document the symptom experience of patients with PSC, capture the patients' descriptions of their symptoms, identify the impact of symptoms on their lives, and gather information about the frequency of the symptoms and impacts (unpublished report). As a result, a draft version of PSC PRO was developed (Supporting Table S1). It included three modules: module 1: PSC Chronic, Everyday Symptoms (a nine-item daily assessment of PSC symptoms); module 2: PSC Acute Symptoms (a seven-item measure capturing periodic assessment of PSC symptoms associated with acute episodes of cholangitis); and module 3: PSC Impacts from Symptoms (a 29-item functional impact measure with seven domains). That draft version was then subjected to content validity assessment in a series of cognitive interviews with an independent cohort of PSC patients.
B. Cognitive Interviews
For the three rounds of cognitive interviews, participants were recruited from four clinical sites in Colorado, Florida, North Carolina, and Washington and from the PSC Partners Seeking a Cure Patient Registry (http://pscpartners.org/patient-registry/) in March-December 2015; the interviews were conducted by two trained interviewers for both telephone (round 1, paper version only) and in-person (rounds 2 and 3, electronic platform and paper versions) interviews. The sample included participants with PSC with or without IBD, without history of LT or CCA, or any other potentially confounding comorbid disease; thus, the study sample is supposed to be representative of the symptomatic PSC patient population expected to be enrolled in upcoming clinical trials. During the cognitive interviews, participants reviewed the instructions, item content, recall period, and response options for the draft PSC PRO instrument. All participants in the electronic PRO (ePRO) cognitive interviews reported a positive impression of the electronic device format and found it easy to move from screen to screen, despite the fact that nearly half reported being unfamiliar with technology, and none had previously used an ePRO device. There was no evidence that the ePRO formatting of either module changed the way respondents interpreted or responded to the instructions or questions.
After each round of cognitive interviews, the data were reviewed by the study team, and recommendations for changes in the instrument modules and individual items were made with additional review by a translation expert. Throughout the qualitative research work (both concept elicitation and cognitive interviews), a steering committee consisting of four expert hepatologists was closely involved in the research process and decision making. This committee provided expertise on the PSC condition and worked closely with investigators to review the PSC PRO item development process. The qualitative data were coded by two members of the study team with review by a senior reviewer (A.S.) and analyzed using ATLAS.ti software.9
EXTERNAL VALIDATION OF PSC PRO
A. Patient Population
A validation study was conducted to assess the reliability and validity of the PSC PRO tool in a separately enrolled group of patients with an established diagnosis of PSC (enrolled from November 2016 to June 2017); required sample size was calculated using the data from the cognitive interviews. Patients were ages 18-70 years, of both sexes, all ethnicities, with or without IBD, with or without cirrhosis, and without any other potentially confounding comorbidities. After approval of the study by the Institutional Review Board, patients were screened, verbally consented and interviewed by phone, and then asked to complete a medical history questionnaire followed by the PSC PRO and four other PRO questionnaires (the Short Form-36 version 2 [SF-36v2], the Chronic Liver Disease Questionnaire [CLDQ], the Primary Biliary Cirrhosis-40 [PBC-40], and the 5-D Itch Scale)9-13 using a secure ePRO website. Patients had unlimited time to answer each question, but for the purpose of calculating the total time to complete the battery, the maximum time to answer one question was capped at 5 minutes given that longer time would likely be attributed to an interruption. The first 50 patients were additionally asked to complete PSC PRO within 3 months after the initial administration for the purpose of test-retest reliability assessment.
B. Internal Consistency and Test-Retest Reliability
Internal consistency of PSC PRO was quantified by Cronbach's alpha coefficients for its domains without and then with one-item removal, as well as item-item and item-total correlations. The distributions of the domain scores were also evaluated for skewness and floor-ceiling effects.
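As an illustration of the internal consistency statistics named above, the following is a minimal sketch of Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of the total score), and of alpha recomputed with each item removed in turn. It is not the authors' SAS code; the function names and the example responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one domain.

    items: list of k lists, one per item, each holding that item's score
    for every respondent (same respondent order in every list).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent across the k items
    totals = [sum(resp) for resp in zip(*items)]
    sum_item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


def alpha_if_item_removed(items):
    """Alpha recomputed k times, each time leaving one item out."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Hypothetical responses: 4 items (1-5 scale) answered by 6 respondents
domain_items = [
    [1, 2, 2, 4, 5, 3],
    [1, 3, 2, 4, 4, 3],
    [2, 2, 1, 5, 5, 2],
    [1, 2, 3, 4, 5, 3],
]
alpha = cronbach_alpha(domain_items)
alpha_drop = alpha_if_item_removed(domain_items)
```

A large rise in alpha after dropping an item suggests that the item fits its domain poorly, whereas a large fall suggests that the domain leans heavily on that item.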
For the purpose of test-retest reliability assessment, a subgroup of patients without any evidence of clinical change (determined by the interviewer) completed PSC PRO twice within approximately 3 months. Correlations between the two administrations were calculated. The distributions of differences in scores between the two administrations were evaluated, and the median differences were compared to zero by a nonparametric test for matched pairs. Additionally, the intraclass correlation coefficients (ICCs; ICC(3,1) by Shrout-Fleiss14), another indicator of repeated-measures reliability, were calculated for the PSC PRO domains; a general linear model with subject ID and the administration ID (first or second) as two predictors of an outcome (the domain score) was used.
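The ICC(3,1) mentioned above can be sketched from the mean squares of a two-way model with subject and administration as factors: ICC(3,1) = (BMS − EMS) / (BMS + (k − 1) × EMS), where BMS is the between-subjects mean square, EMS is the residual mean square, and k is the number of administrations (two in this study). The code below is an illustrative reimplementation under those definitions, not the authors' SAS program; the function name and the example scores are hypothetical.

```python
def icc_3_1(scores):
    """ICC(3,1) of Shrout and Fleiss from a complete subjects-by-administrations table.

    scores: list of rows, one per subject; each row holds the k repeated
    measurements of the same domain score (no missing values).
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_admin = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_admin

    bms = ss_subjects / (n - 1)            # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)


# Hypothetical domain scores: first and second administration per subject
retest = [[1.5, 1.75], [2.0, 2.0], [3.25, 3.0], [1.0, 1.25], [4.0, 3.5]]
icc = icc_3_1(retest)
```

Because the administration effect is removed as a fixed factor, a uniform shift between the first and second administration does not lower ICC(3,1); only inconsistency in how subjects rank against each other does.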
C. Validity
Construct validity was assessed in a round of confirmatory factor analysis, by discriminant validity (correlations with own domains vs. other domains), and by associating the PSC PRO scores with potentially relevant demographic and clinical parameters (known-groups validity). The known-groups validation parameters were age, sex, the presence and severity of cirrhosis, history of depression, the presence of IBD, ulcerative colitis, and Crohn's disease, and other clinically relevant parameters with a prevalence of at least 10%; those were also qualitatively used to appreciate the magnitude of the minimal clinically important difference (MCID) for the domains of PSC PRO. Wilcoxon's nonparametric test was used to compare the PSC PRO scores between the groups of patients listed above. Additionally, in a round of convergent validity assessment, which used the data from a few validated generic and disease-specific PRO instruments (SF-36v2, CLDQ, PBC-40, and the 5-D Itch Scale) administered at the same time as PSC PRO, correlations between the most related domains of PSC PRO and of those instruments were expected to be the highest (such as between the Emotional Impact domain of PSC PRO and mental health–related domains of SF-36, Emotional or Worry domains of CLDQ, and Emotional domain of PBC-40; or between Physical Function of PSC PRO and Physical Functioning of SF-36, Activity/Energy or Fatigue of CLDQ, and Fatigue of PBC-40).
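The known-groups comparison described above can be sketched as a two-sample Wilcoxon rank-sum test. The version below uses midranks for ties and a two-sided normal approximation without continuity correction; both choices are assumptions made for illustration, and the options actually used in SAS are not stated in the text. Function names and example scores are hypothetical.

```python
from math import erfc, sqrt


def midranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (W, z, two-sided p), where W is the rank sum of group_a."""
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    ranks = midranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])
    mean_w = n1 * (n + 1) / 2
    # Permutation variance of W; reduces to n1*n2*(n+1)/12 when there are no ties
    var_w = n1 * n2 / (n * (n - 1)) * sum((r - (n + 1) / 2) ** 2 for r in ranks)
    if var_w == 0:
        return w, 0.0, 1.0  # all observations identical
    z = (w - mean_w) / sqrt(var_w)
    return w, z, erfc(abs(z) / sqrt(2))


# Hypothetical domain scores for two clinical groups
with_cirrhosis = [2.5, 3.0, 1.75, 4.0, 2.25]
without_cirrhosis = [1.0, 1.5, 1.25, 2.0, 1.0, 1.75]
w, z, p = rank_sum_test(with_cirrhosis, without_cirrhosis)
```

With small groups, an exact test would be preferable to the normal approximation used here.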
All quantitative analyses were made in SAS software (version 9.3; SAS Institute Inc., Cary, NC). The study was approved by the Western Institutional Review Board, and a waiver of signed consent was obtained. For subjects who were eligible for the study, a verbal consent by telephone was required.
Results
DEVELOPMENT OF PSC PRO
There were 26 PSC patients enrolled in cognitive interviews for assessment of content validity and development of the final PSC PRO after preliminary concept elicitation, including 20 with PSC+IBD (clinical and demographic parameters of the cohort in Supporting Table S2). Eighteen patients were interviewed in person and 8 by telephone. Based on the participants' feedback on clarity, completeness, and relevance of the items initially included in the draft PSC PRO, a number of changes were made (summarized in Supporting Table S3). In particular, originally included module 2 (PSC: Acute Symptoms) was eliminated and a few of its items were merged into the final PSC Symptoms module.
The final PSC PRO instrument includes two modules: module 1: PSC Symptoms and module 2: Impact of Symptoms (Supporting Table S4). In module 1, patients are asked about the severity of specific PSC symptoms on a 0-10 scale with a 24-hour recall period; the symptoms include abdominal pain or discomfort, itching, fatigue, jaundice, difficulty with concentration, nausea, fever, chills, and sweats. In module 2, patients are asked about the impact of PSC on their daily life; this module includes seven four-item domains covering various aspects of patients' functioning and well-being (Physical Function, Activities of Daily Living, Work Productivity, Role Function, Emotional Impact, Social/Leisure Impact, and Quality of Life). The two modules are intended to be scored separately, and in both, higher scores reflect worse health status. For module 1, item scores range from 0 to 10 and are summed, with a maximum score of 120. For module 2, item scores range from 1 to 5; they are combined within their domains, and the domain means are summed to give an overall Impact of Symptoms score. The scoring algorithms for the PSC Symptoms modules were further explored and confirmed using data from the external psychometric validation study.
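The scoring rules described above can be sketched as follows. The sketch assumes 12 scored items in module 1 (which is what a maximum of 120 on a 0-10 scale implies) and four items per module 2 domain; handling of missing responses is not described in the text and is not modeled here. Function and variable names are hypothetical.

```python
IMPACT_DOMAINS = (
    "Physical Function",
    "Activities of Daily Living",
    "Work Productivity",
    "Role Function",
    "Emotional Impact",
    "Social/Leisure Impact",
    "Quality of Life",
)


def score_symptoms(item_scores):
    """Module 1: sum of 0-10 item scores (0-120 with 12 scored items)."""
    if any(not 0 <= s <= 10 for s in item_scores):
        raise ValueError("module 1 items are scored 0-10")
    return sum(item_scores)


def score_impact(responses):
    """Module 2: mean of the 1-5 items per domain, then the sum of domain means.

    responses: dict mapping each domain name to its list of item scores.
    Returns (domain_means, total), where total ranges from 7 to 35.
    """
    domain_means = {}
    for domain in IMPACT_DOMAINS:
        items = responses[domain]
        if any(not 1 <= s <= 5 for s in items):
            raise ValueError("module 2 items are scored 1-5")
        domain_means[domain] = sum(items) / len(items)
    return domain_means, sum(domain_means.values())


# Hypothetical respondent
symptoms_total = score_symptoms([3, 0, 5, 2, 0, 1, 0, 0, 4, 6, 0, 2])
means, impact_total = score_impact({d: [2, 1, 3, 2] for d in IMPACT_DOMAINS})
```

Under these rules a respondent with no symptom impact scores 7 on the Total Impact of Symptoms, not 0, because each domain mean has a floor of 1.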
EXTERNAL VALIDATION OF PSC PRO
The external validation cohort included 102 patients with PSC who completed PSC PRO, SF-36, CLDQ, PBC-40, and 5-D Itch Scale instruments at baseline, including 67 (68%) with PSC + IBD, 37 (39%) with cirrhosis, age 44 ± 13 years, 32% male, 74% employed, 14% with history of hepatic decompensation, and 38% with history of depression (Supporting Table S5). The median (IQR) time to complete all five instruments using our secure ePRO website was 28 (21-41) minutes (minimum, 10; maximum, 132). The baseline PRO scores of the cohort are shown in Table 1 stratified by patients' cirrhosis status, and statistical characteristics of individual item responses and domain scores are in Supporting Table S6. Of the validation cohort, 53 patients without any evidence of clinical change had PSC PRO administered twice within a median of 37 days (IQR, 27-47).
PRO Instrument, Domain | Cirrhosis (N = 37) | No Cirrhosis (N = 58) | P Value | Overall (N = 95) |
---|---|---|---|---|
PSC PRO | ||||
Module 1: PSC Symptoms | 23.03 ± 16.99 | 13.07 ± 12.08 | 0.0048 | 16.95 ± 14.93 |
Module 2: Physical Function | 2.32 ± 1.13 | 1.56 ± 0.68 | 0.0008 | 1.86 ± 0.95 |
Module 2: Activities of Daily Living | 2.60 ± 1.14 | 1.88 ± 0.88 | 0.0026 | 2.16 ± 1.05 |
Module 2: Work Productivity | 2.14 ± 1.14 | 1.74 ± 0.88 | 0.19 | 1.87 ± 0.98 |
Module 2: Role Function | 2.37 ± 1.10 | 1.53 ± 0.75 | 0.0002 | 1.86 ± 0.99 |
Module 2: Emotional Impact | 2.99 ± 1.12 | 2.69 ± 1.01 | 0.22 | 2.81 ± 1.06 |
Module 2: Social/Leisure Impact | 2.38 ± 1.13 | 1.67 ± 0.79 | 0.0017 | 1.95 ± 0.99 |
Module 2: Quality of Life | 2.59 ± 1.12 | 1.85 ± 0.96 | 0.0017 | 2.14 ± 1.08 |
Module 2: Total Impact of Symptoms | 17.72 ± 7.20 | 12.98 ± 5.10 | 0.0018 | 14.83 ± 6.40 |
SF-36 | ||||
Physical Functioning | 73.11 ± 23.84 | 87.98 ± 16.03 | 0.0020 | 82.13 ± 20.69 |
Role Physical | 58.11 ± 29.12 | 76.97 ± 25.28 | 0.0018 | 69.55 ± 28.27 |
Bodily Pain | 64.84 ± 20.76 | 75.30 ± 20.69 | 0.0235 | 71.18 ± 21.23 |
General Health | 36.38 ± 23.60 | 50.25 ± 22.53 | 0.0059 | 44.79 ± 23.83 |
Vitality | 39.53 ± 28.11 | 53.18 ± 21.20 | 0.0114 | 47.81 ± 24.93 |
Social Functioning | 63.18 ± 29.16 | 78.95 ± 23.04 | 0.0101 | 72.74 ± 26.62 |
Role Emotional | 68.02 ± 31.03 | 80.85 ± 22.38 | 0.07 | 75.80 ± 26.72 |
Mental Health | 64.73 ± 25.71 | 71.49 ± 15.89 | 0.34 | 68.83 ± 20.47 |
Physical Component Summary | 44.13 ± 8.03 | 50.24 ± 7.49 | 0.0005 | 47.83 ± 8.23 |
Mental Component Summary | 42.77 ± 13.23 | 47.38 ± 9.10 | 0.11 | 45.56 ± 11.08 |
CLDQ | ||||
Abdominal Symptoms | 4.67 ± 1.40 | 5.54 ± 1.36 | 0.0026 | 5.20 ± 1.43 |
Activity | 5.38 ± 1.45 | 6.00 ± 1.09 | 0.0272 | 5.76 ± 1.27 |
Emotional | 4.61 ± 1.50 | 5.08 ± 1.06 | 0.19 | 4.90 ± 1.26 |
Fatigue | 3.86 ± 1.65 | 4.51 ± 1.37 | 0.06 | 4.25 ± 1.51 |
Systemic Symptoms | 5.06 ± 1.30 | 5.85 ± 0.94 | 0.0031 | 5.54 ± 1.16 |
Worry | 4.24 ± 1.78 | 4.69 ± 1.36 | 0.22 | 4.52 ± 1.54 |
Total CLDQ score | 4.63 ± 1.27 | 5.28 ± 0.90 | 0.0151 | 5.02 ± 1.10 |
PBC-40 | ||||
Cognitive | 2.23 ± 1.13 | 1.73 ± 0.80 | 0.0464 | 1.92 ± 0.97 |
Emotional | 2.77 ± 1.12 | 2.41 ± 1.09 | 0.12 | 2.55 ± 1.11 |
Fatigue | 2.64 ± 1.09 | 2.24 ± 0.92 | 0.07 | 2.39 ± 1.00 |
Itch | 1.76 ± 1.31 | 1.26 ± 0.95 | 0.10 | 1.45 ± 1.12 |
Social | 2.76 ± 0.94 | 2.07 ± 0.74 | 0.0005 | 2.34 ± 0.88 |
Symptoms | 1.98 ± 0.66 | 1.73 ± 0.62 | 0.08 | 1.82 ± 0.65 |
Total PBC-40 score | 2.36 ± 0.81 | 1.91 ± 0.59 | 0.0055 | 2.08 ± 0.72 |
5-D Itch Scale | 12.42 ± 4.61 | 9.37 ± 3.51 | 0.0016 | 10.55 ± 4.22 |
A. Internal Consistency
Both first and second administrations of PSC PRO were used for assessment of internal consistency (N = 155 entries). A round of confirmatory factor analysis validated the original design with seven domains for module 2: 99.8% of the sample variance could be explained by exactly seven factors. Furthermore, an excellent internal consistency was detected in all eight PSC PRO domains—Cronbach's alphas ranged from 0.84 to 0.94 (Table 2). Additionally, the changes in Cronbach's alphas after sequential one-item exclusions did not exceed 0.09 (Table 2), therefore suggesting that the items from the same domains were neither too correlated with nor too different from one another. The greatest variability in Cronbach's alphas was found in the Activities of Daily Living domain (–0.09 after excluding Item 20—“running errands,” +0.04 after excluding Item 21—“avoiding certain foods”). The item-to-own domain correlations were above 0.7 for all but one item from the module 2: Impact of Symptoms; the only exception with correlation of 0.488 was the abovementioned Item 20. Those correlations, however, were substantially lower for the module 1: PSC Symptoms (minimum, 0.28; maximum, 0.74; median, 0.64), suggesting a greater independence of individual symptoms of PSC from each other (Table 2).
Domain | Cronbach's α | Cronbach's α With One Item Removed | Item-to-Own-Domain Correlations | Item-to-Item Correlations |
---|---|---|---|---|
Module 1: PSC Symptoms | 0.890 | 0.873 to 0.898 | 0.280 to 0.735 | 0.06 to 0.90 |
Module 2: Physical Function | 0.912 | 0.864 to 0.916 | 0.713 to 0.864 | 0.62 to 0.86 |
Module 2: Activities of Daily Living | 0.856 | 0.768 to 0.899 | 0.488 to 0.810 | 0.38 to 0.90 |
Module 2: Work Productivity | 0.925 | 0.884 to 0.931 | 0.736 to 0.876 | 0.59 to 0.84 |
Module 2: Role Function | 0.908 | 0.844 to 0.903 | 0.726 to 0.890 | 0.57 to 0.84 |
Module 2: Emotional Impact | 0.911 | 0.856 to 0.897 | 0.763 to 0.879 | 0.62 to 0.79 |
Module 2: Social/Leisure Impact | 0.929 | 0.892 to 0.938 | 0.737 to 0.877 | 0.64 to 0.82 |
Module 2: Quality of Life | 0.941 | 0.914 to 0.931 | 0.829 to 0.883 | 0.75 to 0.83 |
The distribution of the PSC PRO domain scores suggested skewness toward lower values, which would reflect a better health status (Fig. 1A-H). Indeed, the proportion of values of less than 3.0 (on a 1-5 scale) exceeded 70% for six of seven domains of module 2: Impact of Symptoms, while less than 10% of patients had domain scores of 4.0 or greater (on a 1-5 scale used for the domains of that module); the only less-skewed domain was Emotional Impact, which had a mode value of 2.0 rather than 1.0 (Fig. 1F). Furthermore, despite the potential maximum of 120 (all the symptoms present at their worst), the maximal observed value for the module 1: PSC Symptoms score did not exceed 70, whereas 78% of patients had that domain score of less than 30 (Fig. 1A). In that context, the ceiling effect was not significant given that few or none at all of the patients had the highest possible scores in any of the items or domains (<10% for 36 of 40 scored items, 10%-20% for the remaining 4 items, and <3% for the domain scores). At the same time, the floor effect was more pronounced for both items and domains: 35 of the 40 items were scored so that >30% of patients had the lowest possible values (of the remaining five items, three belong to the Emotional Impact domain and two others are physical and mental tiredness symptoms from module 1), whereas 26 of 40 items had >40% of patients with the lowest scores and 15 of 40 had >50%. Furthermore, from 24% to 36% of patients also had the lowest possible module 2 domain scores, with the only exception of the Emotional Impact domain, where that proportion was 5.8%.
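The floor and ceiling proportions reported above are simple shares of respondents at the lowest and highest possible value of an item or domain. A minimal sketch, with a hypothetical function name and made-up responses:

```python
def floor_ceiling(scores, lowest, highest):
    """Return (floor, ceiling): the fractions of scores at the scale's extremes."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores given")
    at_floor = sum(1 for s in scores if s == lowest)
    at_ceiling = sum(1 for s in scores if s == highest)
    return at_floor / n, at_ceiling / n


# Hypothetical responses to one module 2 item (1-5 scale)
item_scores = [1, 1, 2, 1, 3, 5, 1, 2, 1, 4]
floor_share, ceiling_share = floor_ceiling(item_scores, lowest=1, highest=5)
```

For module 1 items the same function would be called with `lowest=0` and `highest=10`.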

B. Validity
Discriminant validity analysis showed that 41 of 42 items had the highest absolute correlations with their own domains; the only exception was Item 18 (“completing household chores”), which correlated slightly, but not significantly, more strongly with the Physical Function domain than with its own Activities of Daily Living domain (0.87 vs. 0.84; P = 0.32).
Known-groups validity of PSC PRO with reference to demographic and clinical parameters is summarized in Table 3 and Supporting Table S7. No difference in PSC PRO scores between patients of different age groups, patients with IBD, ulcerative colitis, aphthous mouth ulcers, or Crohn's disease was found (all P > 0.04). On the other hand, male PSC patients had significantly better scores in comparison to females in five of seven module 2: Impact of Symptoms domains. Furthermore, as expected, patients with cirrhosis, complications of cirrhosis and, in particular, history of weight loss or malnutrition and patients with a history of depression or anxiety had significantly impaired PSC PRO scores.
PSC PRO Domain | Male | Female | P Value |
---|---|---|---|
N | 33 | 69 | |
Module 1: PSC Symptoms | 14.79 ± 14.68 | 18.41 ± 15.08 | 0.22 |
Module 2: Physical Function | 1.58 ± 0.96 | 2.01 ± 0.94 | 0.0052 |
Module 2: Activities of Daily Living | 1.81 ± 0.95 | 2.33 ± 1.07 | 0.0163 |
Module 2: Work Productivity | 1.68 ± 0.94 | 2.00 ± 1.02 | 0.16 |
Module 2: Role Function | 1.50 ± 0.81 | 2.04 ± 1.02 | 0.0049 |
Module 2: Emotional Impact | 2.60 ± 1.13 | 2.90 ± 1.08 | 0.17 |
Module 2: Social/Leisure Impact | 1.59 ± 0.97 | 2.14 ± 1.00 | 0.0025 |
Module 2: Quality of Life | 1.73 ± 1.03 | 2.35 ± 1.09 | 0.0029 |
Module 2: Total Impact of Symptoms | 12.48 ± 6.18 | 16.03 ± 6.49 | 0.0052 |

PSC PRO Domain | IBD | No IBD | P Value |
---|---|---|---|
N | 67 | 32 | |
Module 1: PSC Symptoms | 18.72 ± 14.75 | 15.28 ± 15.48 | 0.15 |
Module 2: Physical Function | 1.89 ± 1.00 | 1.91 ± 0.92 | 0.69 |
Module 2: Activities of Daily Living | 2.18 ± 1.02 | 2.23 ± 1.14 | 0.88 |
Module 2: Work Productivity | 1.98 ± 1.06 | 1.76 ± 0.87 | 0.33 |
Module 2: Role Function | 1.90 ± 1.00 | 1.87 ± 1.00 | 0.77 |
Module 2: Emotional Impact | 2.96 ± 1.03 | 2.58 ± 1.20 | 0.08 |
Module 2: Social/Leisure Impact | 2.08 ± 1.08 | 1.81 ± 0.87 | 0.36 |
Module 2: Quality of Life | 2.31 ± 1.14 | 1.93 ± 0.98 | 0.12 |
Module 2: Total Impact of Symptoms | 15.53 ± 6.65 | 14.20 ± 6.38 | 0.27 |

PSC PRO Domain | Cirrhosis | No Cirrhosis | P Value |
---|---|---|---|
N | 37 | 58 | |
Module 1: PSC Symptoms | 23.03 ± 16.99 | 13.07 ± 12.08 | 0.0048 |
Module 2: Physical Function | 2.32 ± 1.13 | 1.56 ± 0.68 | 0.0008 |
Module 2: Activities of Daily Living | 2.60 ± 1.14 | 1.88 ± 0.88 | 0.0026 |
Module 2: Work Productivity | 2.14 ± 1.14 | 1.74 ± 0.88 | 0.19 |
Module 2: Role Function | 2.37 ± 1.10 | 1.53 ± 0.75 | 0.0002 |
Module 2: Emotional Impact | 2.99 ± 1.12 | 2.69 ± 1.01 | 0.22 |
Module 2: Social/Leisure Impact | 2.38 ± 1.13 | 1.67 ± 0.79 | 0.0017 |
Module 2: Quality of Life | 2.59 ± 1.12 | 1.85 ± 0.96 | 0.0017 |
Module 2: Total Impact of Symptoms | 17.72 ± 7.20 | 12.98 ± 5.10 | 0.0018 |

PSC PRO Domain | Complications of Cirrhosis | No Complications of Cirrhosis | P Value |
---|---|---|---|
N | 14 | 83 | |
Module 1: PSC Symptoms | 27.43 ± 14.74 | 14.60 ± 14.04 | 0.0025 |
Module 2: Physical Function | 2.75 ± 1.20 | 1.65 ± 0.78 | 0.0006 |
Module 2: Activities of Daily Living | 3.09 ± 0.96 | 1.94 ± 0.95 | 0.0003 |
Module 2: Work Productivity | 2.69 ± 1.31 | 1.69 ± 0.83 | 0.0438 |
Module 2: Role Function | 2.68 ± 0.98 | 1.66 ± 0.87 | 0.0004 |
Module 2: Emotional Impact | 3.16 ± 1.00 | 2.66 ± 1.07 | 0.12 |
Module 2: Social/Leisure Impact | 2.80 ± 1.20 | 1.74 ± 0.84 | 0.0018 |
Module 2: Quality of Life | 2.96 ± 1.22 | 1.93 ± 0.97 | 0.0026 |
Module 2: Total Impact of Symptoms | 20.46 ± 7.24 | 13.42 ± 5.52 | 0.0006 |

PSC PRO Domain | Depression | No Depression | P Value |
---|---|---|---|
N | 38 | 63 | |
Module 1: PSC Symptoms | 23.26 ± 15.51 | 13.51 ± 13.60 | 0.0008 |
Module 2: Physical Function | 2.29 ± 1.11 | 1.62 ± 0.78 | 0.0032 |
Module 2: Activities of Daily Living | 2.55 ± 1.13 | 1.94 ± 0.95 | 0.0068 |
Module 2: Work Productivity | 2.39 ± 1.18 | 1.58 ± 0.74 | 0.0023 |
Module 2: Role Function | 2.39 ± 1.09 | 1.56 ± 0.78 | 0.0001 |
Module 2: Emotional Impact | 3.45 ± 1.12 | 2.43 ± 0.91 | <0.0001 |
Module 2: Social/Leisure Impact | 2.59 ± 1.16 | 1.60 ± 0.73 | <0.0001 |
Module 2: Quality of Life | 2.85 ± 1.10 | 1.74 ± 0.89 | <0.0001 |
Module 2: Total Impact of Symptoms | 18.70 ± 7.06 | 12.58 ± 5.14 | <0.0001 |
- No difference between the PSC PRO domain scores was found in patients of different age groups, with and without IBD, ulcerative colitis or Crohn's disease, aphthous mouth ulcers, or arthritis (all P > 0.04). The Total Impact of Symptoms equals the sum of the seven module 2 domains.
From the magnitude of difference in the PSC PRO scores observed in this study between the clinically meaningful separate groups, particularly patients with and without IBD (Table 3 and Supporting Table S7), we suggest the MCID for PSC PRO scores to be 4 for module 1: Symptoms and 0.3 for domains of module 2: Impact of Symptoms, until a systematic MCID determination study for this instrument is performed.
Correlations of PSC PRO domains with domains of other validated PRO instruments are shown in Table 4. Although all correlations were statistically significant, the highest correlated domains for the module 1: PSC Symptoms were those related to fatigue and vitality. Role Physical of SF-36 and Fatigue of CLDQ were the highest or close to the highest correlated domains for Physical Function, Activities of Daily Living, Work Productivity, Role Function, and Quality of Life domains of PSC PRO. On the other hand, the Emotional Impact PSC PRO domain was expectedly the most correlated with the Mental Health domain of SF-36, Worry domain of CLDQ, and Emotional domain of PBC-40. Furthermore, the Social/Leisure Impact domain was the most correlated with Social Functioning domain of SF-36, Fatigue domain of CLDQ, and Social domain of PBC-40. Finally, of all PSC PRO domains, the 5-D Itch Scale was the most correlated with module 1: PSC Symptoms score (R = 0.60; P < 0.0001; Table 4), and of module 1 items, the highest correlation of 0.84 (P < 0.0001) was expectedly found for Item 4 (“Over the past 24 hours, how bad was your itching at its worst?”).
PRO Instrument/Domain | Module 1: PSC Symptoms | Module 2: Physical Function | Module 2: Activities of Daily Living | Module 2: Work Productivity | Module 2: Role Function | Module 2: Emotional Impact | Module 2: Social/Leisure Impact | Module 2: Quality of Life |
---|---|---|---|---|---|---|---|---|
SF-36 | ||||||||
Physical Functioning | –0.65 | –0.84 | –0.70 | –0.68 | –0.64 | –0.37 | –0.67 | –0.65 |
Role Physical | –0.71 | –0.83 | –0.74 | –0.83 | –0.81 | –0.55 | –0.80 | –0.84 |
Bodily Pain | –0.67 | –0.71 | –0.69 | –0.62 | –0.64 | –0.51 | –0.66 | –0.66 |
General Health | –0.68 | –0.70 | –0.70 | –0.60 | –0.64 | –0.54 | –0.62 | –0.73 |
Vitality | –0.73 | –0.72 | –0.67 | –0.74 | –0.74 | –0.60 | –0.75 | –0.80 |
Social Functioning | –0.64 | –0.67 | –0.64 | –0.68 | –0.77 | –0.66 | –0.81 | –0.83 |
Role Emotional | –0.71 | –0.72 | –0.65 | –0.78 | –0.75 | –0.60 | –0.77 | –0.77 |
Mental Health | –0.55 | –0.47 | –0.51 | –0.61 | –0.61 | –0.76 | –0.64 | –0.68 |
CLDQ | ||||||||
Abdominal Symptoms | –0.67 | –0.62 | –0.63 | –0.55 | –0.58 | –0.46 | –0.56 | –0.56 |
Activity | –0.62 | –0.67 | –0.68 | –0.58 | –0.58 | –0.38 | –0.56 | –0.55 |
Emotional | –0.77 | –0.70 | –0.74 | –0.75 | –0.69 | –0.69 | –0.71 | –0.74 |
Fatigue | –0.82 | –0.79 | –0.72 | –0.75 | –0.69 | –0.55 | –0.73 | –0.76 |
Systemic Symptoms | –0.76 | –0.76 | –0.68 | –0.70 | –0.62 | –0.46 | –0.63 | –0.60 |
Worry | –0.54 | –0.47 | –0.51 | –0.56 | –0.63 | –0.90 | –0.62 | –0.68 |
PBC-40 | ||||||||
Symptoms | 0.62 | 0.62 | 0.62 | 0.54 | 0.50 | 0.47 | 0.59 | 0.54 |
Itch | 0.53 | 0.47 | 0.43 | 0.39 | 0.33 | 0.24 | 0.31 | 0.29 |
Fatigue | 0.80 | 0.79 | 0.77 | 0.79 | 0.77 | 0.56 | 0.76 | 0.79 |
Cognitive | 0.68 | 0.70 | 0.63 | 0.75 | 0.65 | 0.49 | 0.70 | 0.70 |
Emotional | 0.57 | 0.49 | 0.51 | 0.56 | 0.60 | 0.74 | 0.63 | 0.68 |
Social | 0.69 | 0.70 | 0.73 | 0.74 | 0.82 | 0.68 | 0.84 | 0.87 |
5-D Itch Scale | 0.60 | 0.55 | 0.50 | 0.56 | 0.47 | 0.31 | 0.42 | 0.40 |
- In bold are the strongest correlations for each PSC PRO domain. All P < 0.01.
C. Test-Retest Reliability
Fifty-three patients completed the PSC PRO instrument for a second time. The median interval between administrations was 37 days (IQR, 27-47). In test-retest analysis of PSC PRO items, all but one item were significantly correlated between the two administrations (P < 0.02); the only exception was one of the PSC symptoms (fever) included in module 1. The correlations between repeated measurements of the domain scores ranged from 0.70 (Work Productivity) to 0.85 (Activities of Daily Living; all P < 0.0001; Table 5). Furthermore, all the score differences, except for the module 1: PSC Symptoms score, had a median of 0, and none of the module 2: Impact of Symptoms domains returned a statistically significant difference between the two administrations (all P > 0.05; Table 5). Finally, ICC values ranged from 0.71 to 0.86 for all of the PSC PRO domain scores (Table 5).
PSC PRO Domain | Correlation Between Administrations | Difference Between Administrations, Median (IQR) | Difference Between Administrations, P Value | ICC (95% CI) |
---|---|---|---|---|
Module 1: PSC Symptoms | 0.84 | 1 (–2 to 7) | 0.04 | 0.78 (0.65-0.87) |
Module 2: Physical Function | 0.83 | 0 (0.0-0.5) | 0.17 | 0.83 (0.72-0.90) |
Module 2: Activities of Daily Living | 0.85 | 0 (–0.25 to 0.5) | 0.06 | 0.85 (0.76-0.91) |
Module 2: Work Productivity | 0.70 | 0 (–0.25 to 0.5) | 0.67 | 0.71 (0.54-0.82) |
Module 2: Role Function | 0.83 | 0 (0.00-0.25) | 0.06 | 0.83 (0.72-0.90) |
Module 2: Emotional Impact | 0.82 | 0 (–0.25 to 0.25) | 0.88 | 0.86 (0.77-0.92) |
Module 2: Social/Leisure Impact | 0.80 | 0 (–0.25 to 0.25) | 0.18 | 0.81 (0.69-0.89) |
Module 2: Quality of Life | 0.79 | 0 (–0.25 to 0.25) | 0.29 | 0.81 (0.69-0.88) |
Module 2: Total Impact of Symptoms | 0.88 | 0.3 (–1 to 2) | 0.12 | 0.88 (0.80-0.93) |
- Abbreviation: CI, confidence interval.
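For readers unfamiliar with the intraclass correlation, the following is a minimal sketch of how a test-retest ICC can be computed from paired scores. It assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the text does not state which ICC form was used in this study, so this choice is an assumption. The function name `icc_2_1` and the example scores are hypothetical and are not study data.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: list of rows, one per subject, each holding k repeated measurements
    (k = 2 for a test-retest design). Assumed ICC form, not confirmed by the source.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of administrations
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares (subjects, administrations, residual).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-administration mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical paired scores: (first administration, second administration).
identical = [[1, 1], [2, 2], [3, 3]]
shifted = [[1, 2], [2, 3], [3, 4]]   # every score rises by 1 on retest
print(icc_2_1(identical), icc_2_1(shifted))
```

Under this absolute-agreement form, a systematic shift between administrations lowers the ICC even when the ranking of subjects is unchanged, which is why the ICC is reported alongside the simple correlation and the median difference.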
Discussion
Increasingly recognized as important, PROs reflect patients' experience with their disease and its treatment.8 In this context, a validated PRO instrument for PSC is needed to complement the clinical outcomes of these patients. Unlike other liver diseases, such as hepatitis C, for which a disease-specific PRO instrument is available and highly effective treatment regimens have been approved, PSC currently has neither an effective treatment nor a disease-specific PRO instrument.15 Because the development of any new treatment regimen for PSC requires both relevant clinical outcomes and PROs, a validated PRO instrument will be important for future clinical trials in PSC.
In this study, we followed a well-established systematic protocol to develop and validate the PSC-specific PRO instrument. In the development phase of the study, input from both experts and patients was the key component in selecting items for the draft instrument. The draft was then assessed for clarity, relevance, and completeness by additional patients with PSC, in the context of their experience with the disease, and by clinicians who routinely see PSC patients in their practices. After modification based on this systematically collected feedback, the final PSC PRO consists of two modules and a total of 42 items. The instrument is simple enough to be self-administered in both paper and electronic form, and it takes approximately 7-15 minutes to complete.
The external validation phase of PSC PRO indicated a number of strengths for this instrument. First, there is sufficient evidence for its content validity as determined by both PSC patients and experts. There is also significant evidence for the internal consistency reliability and validity of the instrument, with Cronbach's alpha values ranging between 0.84 and 0.94 for all its domains and with sufficiently high item-to-own-domain correlations. It is, however, important to note that although the reported Cronbach's alphas are high, they remain below 0.95, a level above which there would be a high risk of item redundancy.16 In addition, our test-retest reliability analysis showed that repeated administrations of PSC PRO, in the absence of clinical changes in patients, return high correlations between repeated measurements for the items and domain scores (all ICCs >0.70). The exception was some of the items from module 1: Symptoms, which is expected given the nature of the studied outcomes: most are symptoms that are not meant to be long-lasting and thus can occur and resolve relatively quickly.
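As an illustration of the internal consistency statistic discussed above, the following minimal sketch computes Cronbach's alpha from an item-score matrix using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score). The function name `cronbach_alpha` and the example responses are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one domain.

    responses: list of respondents, each a list of k item scores for the domain.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])  # number of items in the domain
    item_vars = [variance([resp[j] for resp in responses]) for j in range(k)]
    total_var = variance([sum(resp) for resp in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: three respondents answering a three-item domain.
# Items that move in lockstep give alpha = 1, the fully redundant extreme.
lockstep = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(cronbach_alpha(lockstep))
```

The lockstep example shows why very high alphas are read as a warning sign: when every item carries the same information, alpha approaches 1 and the additional items add respondent burden without adding measurement content.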
Finally, construct validity testing for PSC PRO returned high correlations of the domain scores with relevant domains of other validated general and liver disease-specific PRO instruments, such as SF-36 and CLDQ, respectively. More important, the domain scores also correlated with domains of PBC-40, which has been validated in another cholestatic liver disease with overlapping symptoms. On the other hand, none of the reported correlations was “too high” (such as exceeding 0.90), which would again indicate redundancy of the instrument. Of the four instruments used, the one least strongly correlated with PSC PRO was the 5-D Itch Scale. This is expected, given that the overall impact of PSC on quality of life is driven by a number of largely independent symptoms, with itching being only one of them. Furthermore, we have also shown significant evidence supporting its known-groups validity based on the differences in the scores associated with patients' sex, cirrhosis status, and the presence of psychiatric comorbidities, all consistent with past reports.17, 18 Although the study was not designed to describe the MCID for PSC PRO, from the differences in the scores we observed in the construct validity and test-retest analyses, we estimate the MCID to be around 4 for module 1 and around 0.3 for the domains of module 2. These estimates require validation in appropriately designed studies, such as those using Patient's Global Impressions or Clinical Global Impressions rating measures.19
The study limitations include the use of self-reported, unverified clinical data and a relatively small sample size for external validation, with possibly limited variability in clinically relevant parameters. Indeed, given that PSC is a heterogeneous disease with a wide spectrum of severity, a large-scale validation of the instrument is needed, involving patients with more diverse demographics, various comorbidities, and different stages of disease. Although there are limited population-based data on the demographics of PSC, past studies suggest a higher prevalence of PSC in males,20 whereas our sample includes more female patients; this may introduce bias in the studied PROs. On the other hand, the prevalence of IBD in this study was nearly identical to that reported in a meta-analysis of epidemiologic studies (68% vs. 70%).21 However, because IBD is common in patients with PSC, fully capturing the impact of both IBD and PSC on PROs requires that the PSC PRO be coadministered with an IBD-specific instrument (the Inflammatory Bowel Disease Questionnaire).22 Longitudinal studies are needed to validate the sensitivity and responsiveness of the instrument to improvement of clinical parameters in PSC patients.
In summary, PSC PRO has been developed as a disease-specific instrument to provide researchers with a means of assessing PROs in patients with PSC. The results of this initial round of validation indicate good psychometric properties. Nevertheless, the instrument must be validated in a larger sample and a wider range of patients. Once validated, the outcomes captured by PSC PRO will add to the clinical endpoints of upcoming clinical trials for patients with PSC and help ensure that the comprehensive assessment of treatment options includes an accurately measured patient experience.