Five-year follow-up study on quantitative muscle magnetic resonance imaging in facioscapulohumeral muscular dystrophy: The link to clinical outcome
Baziel G. M. van Engelen and Nicol C. Voermans contributed equally to this work.
Abstract
Background
It is unclear how changes in quantitative muscle magnetic resonance imaging (MRI) relate to changes in clinical outcome in facioscapulohumeral muscular dystrophy (FSHD), although this information is crucial for optimal use of MRI as an imaging biomarker in trials. We therefore assessed muscle MRI and clinical outcome measures in a large longitudinal prospective cohort study.
Methods
All patients were assessed by MRI at baseline and at 5-year follow-up, employing 2pt-Dixon and turbo inversion recovery magnitude (TIRM) sequences, after which fat fraction and TIRM positivity of 19 leg muscles were determined bilaterally. The MRI compound score (CoS) was defined as the mean fat fraction of all muscles weighted for cross-sectional area. Clinical outcome measures included the Ricci-score, FSHD clinical score (FSHD-CS), MRC sumscore (MRC-SS), and motor-function-measure (MFM).
Results
We included 105 FSHD patients [mean age 54 ± 14 years, median Ricci-score 7 (range 0–10)]. The median change in the MRI-CoS over 5 years was 2.0% (range −4.6 to +12.1; P < 0.001). The median change over 5 years in clinical outcome measures was small in all measures, with z-scores ranging from 5.0 to 7.2 (P < 0.001). The change in MRI-CoS correlated with the change in FSHD-CS and Ricci-score (ρ = 0.25 and ρ = 0.23, respectively; P < 0.05). The largest median increase in MRI-CoS was seen in baseline subgroups with an MRI-CoS of 20–40% (6.1%), with ≥2 TIRM positive muscles (3.5%) or with an FSHD-CS of 5–10 (3.1%).
Conclusions
This 5-year study showed significant changes in MRI and clinical outcome measures and a significant correlation between changes in MRI-CoS and changes in clinical outcome measures. In addition, we identified subgroups of patients that are most prone to radiological disease progression. This knowledge further establishes quantitative MRI parameters as prognostic biomarkers in FSHD and as efficacy biomarkers in upcoming clinical trials.
Introduction
Facioscapulohumeral muscular dystrophy (FSHD) is an autosomal dominantly inherited muscular dystrophy with slow progression and a highly variable disease course and severity. The disease mostly manifests with weakness in facial and scapular muscles; truncal and lower extremity muscle weakness develops later. FSHD patients range from individuals with isolated facial weakness to wheelchair dependent individuals.1 Currently, FSHD cannot be cured and disease progression cannot be prevented, so the treatment for FSHD is supportive.
Over the last few years, the understanding of the (epi)genetic mechanisms causing FSHD has increased. This has led to the development of potential therapies and the first human clinical trials.2 More trials are expected in the upcoming years. These trials face a number of potential obstacles, of which one is the lack of sensitive disease outcome measures and biomarkers.3 Because of the variable yet generally slow progression of FSHD, existing clinical outcome measures may be unable to detect meaningful treatment effects in clinical trials, unless these trials have a long duration and include large numbers of patients.4
An important approach for upcoming FSHD clinical trials is to incorporate biomarkers that reflect the underlying disease process, such as muscle imaging. Muscle magnetic resonance imaging (MRI) has been shown to be highly sensitive to intramuscular fat replacement and oedema, which are both indicators of disease activity in myopathies, including FSHD.5-7
Fat replacement correlates strongly with several clinical outcome measures in FSHD cross-sectionally.8, 9 Longitudinal quantitative MRI studies showed significant annual increases in fat replacement of leg muscles, with fat fraction increases ranging from 1.0% to 6.7%.9-13 Clinical outcome measures during the 1 or 2 year follow-up in these studies often remained stable, rendering it impossible to find an association between changes in fat replacement and clinical function.9, 10, 13 This is of great concern, because potential biomarkers for clinical trials need to relate to clinical outcome to be clinically relevant. Thus, while significant changes in fat replacement can be determined over follow-up times of several months, clinical outcome measures probably need longer observation periods (>2 years) to show changes.
Another imaging biomarker proposed in FSHD is muscle hyperintensity seen on short tau inversion recovery (STIR) or turbo inversion recovery magnitude (TIRM) MRI sequences (similar T2 weighted sequences with nulling of the fat signal), which may reflect intramuscular oedema.14 Longitudinal studies in FSHD have shown that STIR hyperintense muscles progress faster towards fat replacement. However, the relation between STIR or TIRM hyperintensity and clinical measures of disease progression remains incompletely understood.12, 14-16
This study assesses a large number of clinically heterogeneous FSHD patients over a 5-year follow-up period using a combination of quantitative leg muscle MRI and clinical outcome measures. We hypothesized that this would provide better insight into MRI progression across the entire spectrum of disease severity. Furthermore, we tested the presumed correlation between changes in imaging biomarkers and changes in clinical outcome measures. Both will add to the optimal use of MRI signatures as imaging biomarkers in future clinical trials.
Methods
Patients
We invited all 140 FSHD patients who underwent MRI scanning in our observational cohort study between 2014 and 2015.8 Patient selection was as previously published; all patients were 18 years and older and had genetically confirmed FSHD. None of the subjects received drug treatment for FSHD prior to this study. Follow-up MRI scans were performed at the Radiology department of the Radboud University Medical Center, Nijmegen, the Netherlands, from May 2019 to December 2020. The study protocol was approved by the medical ethics committee region Arnhem-Nijmegen, the Netherlands. All patients gave written informed consent prior to participating.
Clinical outcome measures
All patients were scored by one trained physician (SV) using multiple clinical outcome measures, following the same protocol as described previously.8 In short: Muscle strength was scored using manual muscle testing according to the Medical Research Council (MRC).17 MRC-scores of hip flexion, hip abduction, knee flexion, knee extension, foot dorsal flexion and foot plantar flexion were determined and combined in the mean MRC sum score (MRC-SS) of the legs. The functional motor abilities of the participants were assessed using the 32-item motor function measure (MFM).18 FSHD clinical severity was scored using the ‘Ricci-score’ and the FSHD clinical score (FSHD-CS).19, 20
Magnetic resonance imaging acquisition
All MRI exams were performed following the same protocol as described previously.8 All MRI images were acquired on the same Siemens 3T scanner. Due to a scanner system upgrade, the baseline images were acquired on a TIM Trio and the follow-up images on a Prisma system. Patients were positioned feet first in supine position in the magnet, after which MRI was performed of both upper and lower legs with phased-array coils placed around them. Scout images were made in three orthogonal directions to guide positioning of imaging slices. First, a transverse Dixon sequence was acquired (field of view 271 × 435 mm, matrix size 200 × 320, echo time 2.45/3.675 ms, 72 slices with slice thickness 5 mm, slice gap 0 mm). A flip angle of 3° and repetition time of 10 ms were used, as recommended to minimize T1 weighting.21 For the upper legs, the most proximal slice was positioned at the top of the trochanter major and for the lower leg, the most proximal slice was positioned on the top of the fibular head. Subsequently, the upper legs and lower legs were also imaged with a TIRM sequence (field of view 271 × 435 mm, matrix size 160 × 356, repetition time 4000 ms, echo time 40 ms, inversion time 220 ms, 36 slices with slice thickness 5 mm, slice gap 5 mm, flip angle 150°). These TIRM stacks were centred at the middle slice of the corresponding Dixon stack. Due to the upgrade of the MR system, the follow-up Dixon acquisitions differed from the baseline acquisitions in echo times, which were set at out-phase/in-phase 1.23/2.46 ms.
Magnetic resonance imaging analysis
Predefined anatomic landmarks were used to determine which slices of the MRI dataset were to be analysed, as reported previously.8 In the upper leg, the slices at one-third and two-thirds of the distance between the spina iliaca anterior superior and the upper edge of the patella were selected. In the lower leg, slices were used at two-thirds and five-sixths of the distance between the lateral malleolus and the lower edge of the patella for the gastrocnemius and one-third and two-thirds of the distance between these landmarks for the other lower leg muscles. These distances between anatomical landmarks were marked on the skin using fish-oil capsules that appear bright on the MRI images. Two authors (S. V. and K. M.) visually verified the selected slices to ensure consistency between baseline and follow-up scans.
A fat fraction map was calculated in Matlab version R2018a (Mathworks, Natick, MA, USA) from the Dixon fat and water images that were reconstructed by the Siemens Syngo Via software (fat image/[fat image + water image]). Two clinicians (S. V. and J. J.) drew regions of interest (ROIs) around 12 thigh muscles and seven lower leg muscles, bilaterally, using ImageJ software (Fiji).22 To ensure consistency between both raters, all drawn ROIs were checked by a second clinician and interrater variability was assessed on a subset of 20 scans. These ROIs were then used to calculate the average fat fraction (FF) of each individual muscle.
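The fat fraction calculation described above can be illustrated with a minimal sketch. It is written in Python rather than the Matlab used in the study, and the function names, the nested-list image representation and the handling of background pixels are assumptions made for illustration only; it is not the authors' analysis code.

```python
def fat_fraction_map(fat_image, water_image):
    """Per-pixel fat fraction in percent: 100 * fat / (fat + water)."""
    ff_map = []
    for fat_row, water_row in zip(fat_image, water_image):
        ff_row = []
        for fat, water in zip(fat_row, water_row):
            total = fat + water
            # Assumption: pixels without any signal are marked None
            # instead of dividing by zero.
            ff_row.append(100.0 * fat / total if total > 0 else None)
        ff_map.append(ff_row)
    return ff_map


def mean_roi_fat_fraction(ff_map, roi_pixels):
    """Average fat fraction over the (row, column) pixels of one muscle ROI."""
    values = [ff_map[r][c] for r, c in roi_pixels if ff_map[r][c] is not None]
    return sum(values) / len(values)


# Hypothetical 2 x 2 fat and water images (arbitrary signal units).
fat = [[10, 30], [0, 50]]
water = [[90, 70], [0, 50]]
ff = fat_fraction_map(fat, water)
roi = [(0, 0), (0, 1), (1, 1)]  # hypothetical ROI drawn around one muscle
muscle_ff = mean_roi_fat_fraction(ff, roi)
```

In this toy example the three ROI pixels have fat fractions of 10%, 30% and 50%, so the muscle FF is their mean.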
Due to an MR-system upgrade between the baseline and follow-up study, the Dixon reconstruction software was changed, which influenced the calculated FF. The relation between FF values from four FSHD patients examined just before and after the upgrade fitted a linear model.
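A linear correction of this kind could be sketched as follows. The paired FF values, the direction of the correction (mapping post-upgrade values onto the pre-upgrade scale) and the function names are all hypothetical; the actual model coefficients are not reported in this section, and the sketch uses ordinary least squares as one plausible fitting method.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_correction(ff_post_upgrade, slope, intercept):
    """Map an FF value measured after the upgrade onto the pre-upgrade scale."""
    return slope * ff_post_upgrade + intercept


# Hypothetical paired FF values (%) of the same muscles scanned just before
# and just after the upgrade; these are NOT the study data.
ff_after = [5.0, 20.0, 40.0, 60.0]
ff_before = [7.5, 24.0, 46.0, 68.0]

slope, intercept = fit_line(ff_after, ff_before)
corrected = apply_correction(30.0, slope, intercept)
```

With such a model, follow-up FF values can be expressed on the baseline scale before ΔFF is computed, so that the change reflects disease progression rather than the reconstruction software.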
TIRM images were visually assessed by two clinicians (S. V. and J. J.) for the presence of signal hyperintensities, and each individual muscle scored as TIRM positive (TIRM+) or TIRM negative (TIRM-). When the scoring differed between observers, consensus was reached by consultation of a third clinician with experience in FSHD (KM). Muscle trophicity was evaluated by assessing changes in normal muscle area in the distal slice, following the equation: . A normal muscle variable averaged for the number of muscles combined was calculated for each patient for both baseline and follow-up and the change between them.
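The normal muscle area calculation can be illustrated under an explicit assumption. A commonly used definition is the cross-sectional area (CSA) multiplied by the non-fat fraction, that is, CSA × (1 − FF/100); whether this matches the equation used in this study is an assumption, and the function names and values below are hypothetical.

```python
def normal_muscle_area(csa_mm2, fat_fraction_pct):
    """ASSUMED definition: CSA multiplied by the non-fat fraction of the muscle."""
    return csa_mm2 * (1.0 - fat_fraction_pct / 100.0)


def mean_normal_muscle_area(muscles):
    """Average normal muscle area over a list of (CSA in mm2, FF in %) pairs."""
    areas = [normal_muscle_area(csa, ff) for csa, ff in muscles]
    return sum(areas) / len(areas)


# Hypothetical patient with two muscles in the distal slice.
baseline = [(1000.0, 10.0), (500.0, 20.0)]
follow_up = [(950.0, 12.0), (500.0, 25.0)]

# Negative values indicate loss of normal muscle area over time.
change = mean_normal_muscle_area(follow_up) - mean_normal_muscle_area(baseline)
```

Under this assumed definition, a muscle can lose normal muscle area either by shrinking (lower CSA) or by fat replacement (higher FF), which is why both quantities enter the calculation.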
Magnetic resonance imaging compound score
To assess lower extremity fat replacement for each patient and to compare MRI outcome to clinical outcome measures, an additional FF variable was calculated: The FF compound-score (FF-CoS). It was calculated for both baseline and follow-up by calculating the average FF of all individual muscles weighted for cross-sectional area (CSA) (left and right leg, distal and proximal slice). We excluded the proximal semimembranosus muscle from the calculation, because it was often not present in the proximal slice. The ΔFF-CoS was calculated by subtracting the FF-CoS at baseline from the FF-CoS at follow-up.
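The compound score described above is a CSA-weighted mean, which can be sketched as follows. The function name, the tuple layout and the example values are illustrative assumptions; in the study the list would contain every analysed muscle (both legs, both slices) of one patient.

```python
def ff_compound_score(measurements):
    """CSA-weighted mean fat fraction over (FF in %, CSA in mm2) pairs."""
    total_csa = sum(csa for _, csa in measurements)
    return sum(ff * csa for ff, csa in measurements) / total_csa


# Hypothetical patient with two muscle measurements at each time point.
baseline = [(10.0, 1000.0), (50.0, 500.0)]
follow_up = [(12.0, 1000.0), (56.0, 500.0)]

ff_cos_baseline = ff_compound_score(baseline)
ff_cos_follow_up = ff_compound_score(follow_up)

# Delta FF-CoS: follow-up minus baseline, in percentage points.
delta_ff_cos = ff_cos_follow_up - ff_cos_baseline
```

Weighting by CSA means that a large muscle contributes more to the score than a small one with the same FF; in the example the larger muscle with 10% FF pulls the baseline score towards 23.3% rather than the unweighted mean of 30%.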
To compare our data with data of other longitudinal cohorts, we excluded non-penetrant carriers and wheelchair dependent patients at baseline from specific analyses. Next, we selected a subset of muscles with the highest increase in FF between baseline and follow-up. This FF-CoS-subset is expected to detect disease progression and treatment effects with a shorter MRI analysis protocol, particularly in the ROI analyses.
Statistical analyses
All statistical analyses were performed using IBM SPSS Statistics (SPSS Inc., Chicago, IL, USA), version 25. Descriptive statistics were calculated for all measures and are presented as mean, standard deviation (SD) and range if the data were normally distributed and as median and inter-quartile range (IQR) if the data were non-normally distributed. Most earlier quantitative MRI studies use the mean and standard deviation for FF results, but our FF variables were not normally distributed. Therefore, we used the median and IQR for all analyses performed, but presented both the median and the mean for all FF outcomes. We used the mean FF in the discussion to compare our results to earlier studies. A paired t-test was performed to test the mean difference between baseline and follow-up for normally distributed data, and the Wilcoxon signed rank test, which is quantified by z-scores, was used for skewed data. An ANOVA was performed on dummy variables to test the difference for multiple groups. Spearman rho correlation coefficients (ρ) were calculated to determine the relation between mean FF and clinical outcome measures and between change in FF and change in clinical outcome measures. Missing values due to MR artefacts were rare and, when encountered, were excluded pairwise. P-values of < 0.05 were considered statistically significant.
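The two non-parametric statistics named above can be illustrated with a small, self-contained sketch. The study itself used SPSS; the Python functions below are an illustrative assumption, using the standard rank-based Spearman coefficient and the tie-corrected normal approximation of the Wilcoxon signed rank test without continuity correction. The sign convention (positive z when follow-up values are higher) is likewise an assumption.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def wilcoxon_z(baseline, follow_up):
    """Wilcoxon signed rank z-score (normal approximation, zero differences dropped)."""
    diffs = [f - b for b, f in zip(baseline, follow_up) if f != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4.0
    # Tie correction: subtract sum(t^3 - t) / 48 over groups of tied |differences|.
    tie_counts = {}
    for r in ranks:
        tie_counts[r] = tie_counts.get(r, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values()) / 48.0
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0 - tie_term)
    return (w_plus - mean_w) / sd_w


# Hypothetical paired scores for four patients (not study data).
z = wilcoxon_z([1, 2, 3, 4], [2, 4, 6, 8])
rho = spearman_rho([1, 2, 3, 4, 5], [2, 4, 5, 8, 10])
```

With this convention, a score that increases in most patients at follow-up yields a positive z, and a score that decreases yields a negative z.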
Results
Participants
One hundred and five FSHD patients participated in this 5-year follow-up study. Of the remaining 35 patients from the baseline study, 6 had died, 11 participated in the follow-up study without MRI examination, and 18 were lost to follow-up. Follow-up visits were planned 4.7 ± 0.3 years after the baseline visit. Patient characteristics are listed in Table 1.
Variable | N = 105 |
---|---|
Sex, n | Female: 43 (41%); Male: 62 (59%) |
Age, years (mean ± SD [range]) | 54 ± 14 [23–80] |
FSHD type, n | Type 1: 98 (93.3%); Type 2: 6 (5.7%); Type 1 + 2: 1 (1%) |
D4Z4 repeat size in units (mean ± SD [range]) | 6 ± 1.7 [2–9] |
Symptomatic patients, n | 97 (92.4%) |
Wheelchair dependent patients, n | 6 (5.7%) |
Asymptomatic participants, n | 5 (4.8%) |
Non-penetrant participants, n | 3 (2.9%) |
BMI, kg/m2 (mean ± SD [range]) | 25.7 ± 3.7 [18–38] |
Change in fat fraction
Over 4.7 years, the compound fat fraction increased by a median ΔFF-CoS of 2.0% [IQR 0.1–4.2; range −4.6 to +12.1] (P < 0.001). After exclusion of all non-penetrant cases (n = 5) and wheelchair dependent individuals (n = 1), the median ΔFF-CoS was 2.4% [0.5–4.7].
The majority of muscles showed a significant increase in FF at follow-up (Figure 1 and Tables S1 and S2). The highest median ΔFF was found in the gastrocnemius lateral head (2.2%), the adductor magnus (2.0%), semimembranosus (1.3%), adductor longus (1.2%), biceps femoris caput longus (1.1%), and semitendinosus (1.0%).

Strong interrater agreement was found for the assessment of fatty replacement (intraclass correlation coefficient 0.99, P < 0.001).
Change in muscle trophicity
The median change in normal muscle area over 4.7 years was −22.6 mm2 [IQR −77.7 to +14.6] (P < 0.001). Almost all upper leg muscles showed significant decreases in normal muscle area, as opposed to most lower leg muscles, which showed stable or slightly increasing normal muscle area (tibialis anterior muscle, extensor digitorum muscle, peroneus muscle, and tibialis posterior muscle) (Table S3).
Change in turbo inversion recovery magnitude hyperintensity
At baseline, 49/105 patients (46.7%) had at least one TIRM+ muscle and 130/3990 muscles (3.3%) were TIRM+. At follow-up, 80/105 patients had at least one TIRM+ muscle (76.2%) and 351/3990 muscles were TIRM+ (8.8%). The number of TIRM+ muscles of each patient at baseline (median 0 [0–2]) differed significantly from the number at follow-up (median 3 [0.5–5], P < 0.001). Muscles that were most often TIRM+ at follow-up were the tibialis anterior (26%), medial gastrocnemius muscle (25%) and the vastus lateralis muscle (23%).
At baseline, we found 38 TIRM+ muscles with a normal FF (<15%).23 Of these 38 muscles, 27 (71.1%) were still TIRM+ at follow-up and 24 (63.2%) showed an increase in fat replacement (Figure 2). The median ΔFF of all muscles that were TIRM+ at baseline (+8.9% [1.3–18.8]) was significantly higher than the median ΔFF of all muscles that were TIRM- at baseline (+0.6% [0.3–0.9], P < 0.001).

The change in several clinical outcome measures was dependent on the number of TIRM+ muscles at baseline (FSHD-CS: P = 0.013; MRC-SS: P < 0.001) and on the number of TIRM+ muscles at follow-up (FSHD-CS: P = 0.006; Ricci-score: P = 0.009; MFM: P = 0.001; MRC-SS: P = 0.686).
Factors associated with differences in fat fraction compound score
ΔFF-CoS differed neither between men and women (P = 0.136) nor between FSHD1 and FSHD2 patients (P = 0.276). ΔFF-CoS depended on the clinical outcome measures at baseline (FSHD-CS and Ricci-score; P < 0.001), the baseline FF-CoS (P < 0.001) and the number of TIRM+ muscles at baseline (P < 0.05), but not on age or repeat size (P = 0.213 and 0.691, respectively).
Conversely, the ΔMFM and the ΔMRC-SS depended on the baseline FF-CoS (P < 0.05), whereas the ΔRicci-score and ΔFSHD-CS did not (P = 0.063 and P = 0.315, respectively).
Subgroups with higher median delta fat fraction compound score depending on baseline magnetic resonance imaging outcomes
The following subgroups showed ΔFF-CoS values higher than the values found in the entire cohort: patients with a baseline FF-CoS between 20% and 40% had a ΔFF-CoS of 6.1% [2.5–9.5] (n = 23; Figure 3A), patients with ≥1 TIRM+ muscle at baseline had a ΔFF-CoS of 3.1% [1.0–4.9] (n = 49) and patients with ≥2 TIRM+ muscles at baseline had a ΔFF-CoS of 3.5% [1.8–6.5] (n = 32; Figure 3B).

Subgroups with higher median delta fat fraction compound score depending on baseline clinical outcome measures
The following subgroups showed ΔFF-CoS values higher than the values found in the entire cohort: patients with a baseline Ricci-score of 4–8 had a ΔFF-CoS of 3.1% [1.4–5.2] (n = 61) and patients with a baseline FSHD-CS of 5–10 had a ΔFF-CoS of 3.1% [1.7–5.1] (n = 45; Figure 3C).
Combination of subgroups with higher median delta fat fraction compound score
When the abovementioned subgroups based solely on baseline MRI outcome (FF-CoS 20–40% and ≥2 TIRM+ muscles) were combined, the median ΔFF-CoS was 7.4% [3.8–10.1] (n = 9). When the subgroup based on baseline TIRM outcome (≥2 TIRM+ muscles) was combined with the FSHD-CS 5–10 subgroup, the median ΔFF-CoS was 3.8% [3.0–7.1] (n = 18). Comparing the patients in the subgroup based solely on MRI outcome with the patients in the subgroup based on TIRM+ muscles combined with FSHD-CS showed that 6 patients were identified in both subgroups, 3 patients were identified only by the MRI criteria, and 12 patients were identified only by the combined TIRM and clinical criteria. Combining all criteria (FF-CoS 20–40%, ≥2 TIRM+ muscles, FSHD-CS 5–10) into one subgroup did not increase the median ΔFF-CoS (7.2% [4.1–10.0]; n = 6), and excluding this subgroup from the cohort gave a median ΔFF-CoS of 1.7% [0.1–3.7] (n = 99).
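The subgroup logic above amounts to filtering patients on baseline criteria, taking the median ΔFF-CoS of each subgroup and comparing membership between subgroups. A minimal sketch is given below; the patient records, field names and inclusive range boundaries are hypothetical and serve only to show the mechanics.

```python
from statistics import median


def select_subgroup(patients, ff_range=None, min_tirm=None, cs_range=None):
    """Return patients meeting every baseline criterion that is supplied.

    Assumption: range boundaries are inclusive.
    """
    selected = []
    for p in patients:
        if ff_range is not None and not (ff_range[0] <= p["ff_cos"] <= ff_range[1]):
            continue
        if min_tirm is not None and p["tirm_pos"] < min_tirm:
            continue
        if cs_range is not None and not (cs_range[0] <= p["fshd_cs"] <= cs_range[1]):
            continue
        selected.append(p)
    return selected


# Hypothetical baseline records (not study data).
patients = [
    {"id": "A", "ff_cos": 25.0, "tirm_pos": 3, "fshd_cs": 7, "delta_ff_cos": 7.0},
    {"id": "B", "ff_cos": 30.0, "tirm_pos": 2, "fshd_cs": 3, "delta_ff_cos": 8.0},
    {"id": "C", "ff_cos": 10.0, "tirm_pos": 4, "fshd_cs": 6, "delta_ff_cos": 3.0},
    {"id": "D", "ff_cos": 8.0, "tirm_pos": 0, "fshd_cs": 2, "delta_ff_cos": 0.5},
]

# MRI-only criteria: FF-CoS 20-40% and at least two TIRM+ muscles.
mri_group = select_subgroup(patients, ff_range=(20.0, 40.0), min_tirm=2)
# TIRM plus clinical criteria: at least two TIRM+ muscles and FSHD-CS 5-10.
mixed_group = select_subgroup(patients, min_tirm=2, cs_range=(5, 10))

mri_median = median(p["delta_ff_cos"] for p in mri_group)
mixed_median = median(p["delta_ff_cos"] for p in mixed_group)

# Overlap between the two subgroups, by patient identifier.
mri_ids = {p["id"] for p in mri_group}
mixed_ids = {p["id"] for p in mixed_group}
in_both = mri_ids & mixed_ids
mri_only = mri_ids - mixed_ids
mixed_only = mixed_ids - mri_ids
```

Applied to the cohort, the same set operations correspond to the 6 patients identified by both definitions, the 3 identified only by the MRI criteria and the 12 identified only by the second definition.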
Asymptomatic and non-penetrant cases
Of the four asymptomatic and five non-penetrant patients in our baseline cohort, one progressed from non-penetrant to asymptomatic and one from non-penetrant to symptomatic. The eight remaining asymptomatic/non-penetrant patients at follow-up (mean age 48 ± 12 years, range 32–71) showed a median FF-CoS at follow-up of 5.5% [5.0–5.9] and a median ΔFF-CoS of −0.1% [−0.4 to 0.1]. At follow-up, two patients (of whom one was non-penetrant) each showed one new TIRM+ muscle, neither of which showed evident progression in FF (−1.6% and 0.6%).
Facioscapulohumeral muscular dystrophy type 2
The seven FSHD2 patients (four men; mean age 51 ± 8 years) had a higher FF-CoS than the FSHD1 patients (n = 98), both at baseline (median 23.3% [14.5–43.1] vs. 11.5% [7.1–26.8]) and at follow-up (median 26.8% [17.7–48.8] vs. 14.2% [8.2–32.7]). The median increase in FF-CoS was higher in the FSHD2 patients, though not significantly so: 3.5% [2.9–5.6] versus 1.7% [0.1–4.1] in FSHD1 patients (P = 0.276).
Change in clinical outcome measures
Median baseline and follow-up results from all clinical outcome measures are provided in Table 2. Except for the MRC-SS (P = 0.756), all clinical outcome measures progressed significantly at follow-up, with z-scores ranging from 5.0 to 7.2 (P < 0.001). Out of all clinical outcome measures, the highest change over 5 years' time was found in the MFM, with a mean of −5.6 (on a scale of 0 to 100; P < 0.001). The changes in these clinical outcome measures were independent of the baseline score: subgroup analysis based on baseline FSHD-CS showed no significant differences between groups (P = 0.173).
Clinical test (score range) | Baseline, median [IQR] | Follow-up, median [IQR] | 5-year change, mean ± SD | Difference (z-score) | P-value |
---|---|---|---|---|---|
MFM total (0–100) | 91.7 [74.5–97.9] | 87.1 [63.0–97.9] | −5.6 ± 6.4 | −7.2 | <0.001* |
MFM D1 (0–100) | 85.9 [53.2–97.4] | 71.8 [32.7–97.4] | −8.2 ± 10.7 | −6.7 | <0.001* |
Ricci-score (0–10) | 6.0 [3.0–7.0] | 7.0 [3.0–8.0] | 0.6 ± 1.2 | 5.0 | <0.001* |
FSHD-CS total (0–15) | 6.0 [3.0–10.0] | 7.0 [3.5–11.0] | 0.9 ± 1.4 | 5.7 | <0.001* |
FSHD-CS pelvic (0–5) | 1.0 [0.0–3.0] | 2.0 [0.0–3.5] | 0.6 ± 1.0 | 5.6 | <0.001* |
MRC-SS | 4.4 [3.5–5] | 4.5 [3.6–5] | −0.5 ± 3.5 | −0.3 | 0.756 |
* P-value < 0.05.
Correlation between fat fraction and clinical outcome measures
At baseline, the FF-CoS correlated strongly with all clinical outcome measures used (MFM: ρ = −0.850, P < 0.001; Ricci-score: ρ = 0.834, P < 0.001; FSHD-CS: ρ = 0.838, P < 0.001; MRC-SS: ρ = −0.825, P < 0.001). At follow-up, the FF-CoS showed equally strong correlations with the clinical outcome measures: the FSHD-CS and Ricci-score correlated the strongest (ρ = 0.853 and ρ = 0.856, respectively; P < 0.001), but correlations with the MRC-SS and MFM were high as well (ρ = −0.833 and ρ = −0.838, respectively; P < 0.001). Because there was a change in all clinical outcome measures used at follow-up, we were able to correlate the change in clinical outcome measures with the change in MRI compound score. The ΔFF-CoS correlated with ΔFSHD-CS (ρ = 0.248; P = 0.011), the ΔFSHD-CS subscore pelvic (ρ = 0.338; P < 0.001) and ΔRicci-score (ρ = 0.229; P = 0.019), but not with the other outcome measures used (Figure 4).

We identified a subset of muscles with the highest increase in FF between baseline and follow-up. This FF-CoS-subset consisted of the adductor magnus, semimembranosus, adductor longus, biceps femoris caput longus, semitendinosus and gastrocnemius lateral head. The FF-CoS-subset correlated significantly and strongly with clinical outcome measures at baseline and follow-up, with ρ ranging from 0.76 (P < 0.001) to 0.83 (P < 0.001). The ΔFF-CoS-subset also correlated with the delta of most clinical outcome measures used: ΔRicci-score (ρ = 0.198; P = 0.044), ΔFSHD-CS (ρ = 0.206; P = 0.035) and ΔFSHD-CS subscore pelvic (ρ = 0.277; P = 0.004).
Discussion
This study provides 5-year follow-up data on a large clinically and genetically well characterized, heterogeneous cohort of FSHD patients combining quantitative muscle MRI fat fraction (FF) and the assessment of MRI TIRM positivity with clinical outcome measures. We showed significant changes in both MRI parameters and clinical outcome measures, and a significant correlation between the change in MRI fat fraction compound score (ΔFF-CoS) and the change in FSHD clinical score (FSHD-CS) and Ricci-score. In addition, the results show which subgroups of patients and which leg muscles are prone to radiologically determined disease progression. Vice versa, MRI biomarkers FF-CoS and number of TIRM+ muscles have a predictive value for clinical progression. This knowledge further establishes MRI as a prognostic biomarker in FSHD and as an efficacy biomarker in upcoming clinical trials. We discuss the main findings below.
Change in fat fraction
The mean change in FF-CoS (3.1% in 5 years, excluding non-penetrant and wheelchair dependent individuals) is lower than expected compared with earlier studies using quantitative MRI (mean annual change of 1.0–6.7%).9-11, 13 Variations in reported FF progression at the patient level are expected, as these measurements depend on multiple factors that may differ between studies. For instance, they might be caused by the high phenotypical variability in FSHD, the selection of muscles included in an analysis (e.g. as observed in our study and that of others, most upper leg muscles progress faster than lower leg muscles),24 variability in the observational period of a longitudinal study (which ranges from about 3 months to 2 years), and the disease stage of selected muscles at baseline. Muscles with a baseline FF between roughly 20% and 50% generally have the fastest progression rate. Our cohort had a relatively low baseline compound FF compared with some others (FF-CoS of 19 ± 14% vs. 30 ± 35%)10 and a lower mean FF of calf and hamstring muscles (27–42% vs. 44–45%).9 Furthermore, the number and location of the analysed transversal slices may have a significant impact on reported progression rates,25 given the knowledge that the distal muscle part has faster fatty replacement rates. Finally, differences in MRI acquisition methods and data processing may contribute to differences in reported FF between studies, for example, T1-weighting in the MRI acquisition. This may lead to an overestimation of reported FF compared with our data, which are not T1 weighted.
The relatively stable fat content over time found in the entire cohort emphasizes an important aspect for the definition of inclusion criteria for future trials. In order to use muscle MRI as a prognostic biomarker, inclusion criteria need to be carefully chosen: they should be based on imaging characteristics that are known to capture high FF changes, such as the analysis of specific muscles known to be affected often in FSHD.
Change in muscle trophicity
A small decrease in normal muscle area was found only in upper leg muscles, while lower leg normal muscle area remained stable or even increased slightly. We hypothesize that the normal muscle mass in FSHD decreases over time; lower leg muscle hypertrophy has been described in many other muscular dystrophies, but it is not typical for FSHD.26 Possibly, very mild lower leg muscle hypertrophy exists in FSHD, but we think our findings are at least partially attributable to measurement errors. Because we evaluated single MRI slices, we could only evaluate normal muscle area and not muscle volume. Due to the very small size of lower leg muscles in the evaluated slice, inconsistencies in ROI placement are likely to occur. Nevertheless, changes in leg muscle trophicity, specifically evaluation of entire muscle volumes, should be a topic for future research.
Subgroup analyses
The ΔFF-CoS is higher in specific subgroups of patients, defined on baseline MRI outcome (FF-CoS and number of TIRM+ muscles) or clinical outcome (FSHD-CS). For the MRI outcome, our finding of higher ΔFF in TIRM+ and intermediately fatty replaced muscles is comparable to earlier studies using smaller cohorts (≤45 patients). These studies showed that individual muscles with an FF below 10–20% or higher than 60–70% are less likely to progress in a 12-month interval.9, 11, 12 The peak of change of individual muscles was always found in intermediately fatty replaced muscles, but with varying definitions (for instance FF between 25–75% and 40–50%).11, 24 Multiple studies have demonstrated that TIRM+ muscles show an increased progression of fatty replacement.11, 14, 15, 24
Change in turbo inversion recovery magnitude outcome
While our study confirms that most TIRM+ muscles at baseline show an increase of fat replacement compared with TIRM- muscles, a small subset of baseline TIRM+ muscles remain TIRM+ after 5 years, some even without an increase in FF. This might imply that not all TIRM+ muscles progress to fat replacement in the following 5 years. TIRM+ might even be a reversible process if the underlying disease mechanism is targeted adequately. This will have to be investigated in the upcoming phase-3 trials. Nevertheless, our data show that the number of TIRM+ muscles has a predictive value for disease progression.
Asymptomatic and non-penetrant participants
Analysis on our subgroup of asymptomatic and non-penetrant participants showed that penetrance increases at follow-up, which confirms that penetrance in FSHD continues to increase in adulthood.27 The remaining asymptomatic and non-penetrant participants from our cohort did show normal leg muscle FF at follow-up and the mean change in FF-CoS in this group was negligible (−0.2%), comparable to the changes in FF found in lower extremity muscles of healthy adults.28
Correlation between magnetic resonance imaging fat replacement and clinical outcome measures
Because patients in our entire cohort showed significant progression in the majority of clinical outcome measures during the follow-up period, we were able to confirm the presumed longitudinal correlation between FF and several clinical outcome measures during disease progression in FSHD (on average ρ = 0.3; P < 0.001). The consequence of these moderate longitudinal correlations between FF-CoS and clinical outcome measures is that, at an individual patient level, the change in FF-CoS cannot be taken as a substitute for the clinical progression and vice versa. The discrepancy between strong cross-sectional and moderate longitudinal correlations between imaging biomarkers and clinical outcome measures has also been observed in other muscular dystrophies.29, 30 This discrepancy can have multiple causes. First and foremost, given the consistent finding across different studies, we suspect a methodological limitation of the instruments used. Estimation of change requires a higher degree of instrument sensitivity and reliability, which is likely not met by many of the instruments. This is particularly true for slowly progressive diseases like FSHD, in which the measured differences are very small and have a wide variance. Second, variance in clinical outcome is induced by participant dependent factors such as effort, fatigue, fear or learning effects, which play no role in MRI outcome. Third, functionality measured by clinical outcome measures such as the ones used in our study depends on more than the leg muscles alone, so information on, for instance, truncal muscles might be needed for an adequate comparison. Finally, quantification of functionality or strength using clinical outcome measures may relate to the extent of fat replacement and inflammation as seen on MRI images, but they are obviously not the same.
Disease activity as seen on MRI images may be less subjective and probably precedes decreasing functionality, depending on the muscles affected and their function: if one hamstring muscle shows more intensive fat replacement, patients will not often notice a decrease in functionality, as long as the other hamstring muscles can sufficiently compensate. However, this may change if the other hamstring muscles deteriorate as well. Conversely, if a tibialis anterior muscle gets severely affected, a patient will often notice a decrease in functionality soon. Even when a patient notices slight changes in functionality, compensation mechanisms and rehabilitation advice can largely sustain it, up to a certain threshold that cannot be easily predicted and is patient dependent.
Limitations
Because our MRI protocol was drafted in 2014, we applied 2pt Dixon and TIRM imaging to both legs of all participants, in contrast to more recent protocols that often use 3pt Dixon and water T2 sequences and cover the whole body. While 3pt Dixon images are expected to result in more accurate FF, in longitudinal studies the difference in FF is more relevant. The results of our study, as well as those of other investigations, indicate that 2pt-Dixon approaches, which are more commonly available on clinical MRI systems, can accurately quantify FF differences over time. As TIRM imaging provides rather qualitative than quantitative data as obtained by water T2 sequences, the latter might be more objective. MRI advances during the long follow-up period of our study forced us to work with an upgrade of the MR system. This is less likely to occur in shorter longitudinal studies, but is expected to hamper future extended clinical trial protocols. We were able to correct our data for the upgrade, repeated the analysis and showed that our main conclusions did not change. Therefore, our approach may be valuable in future studies encountering similar upgrade challenges.
Because we used manual segmentation of individual leg muscles, a cumbersome and time-consuming method, we were only able to assess four slices per leg. This could have increased the variability of the results. However, given the cohort size, this is not expected to influence the generalizability of the data. Next, because our protocol is very time-consuming, it is not suited for screening patients for clinical trials. To overcome this, we determined which subset of leg muscles showed similar correlations with clinical outcome measures as found for the compound score of all muscles combined. We presume that the FF-CoS-subset is most suitable as a screening method. Additional validity analyses are needed to confirm this. Recently, different segmentation methods have been evaluated to assess biomarkers in neuromuscular diseases, such as single muscle, muscle group and global muscle segment assessment. These (semi-)automatic approaches are faster and less prone to error than manually segmented single muscle assessment and may therefore be better suited for screening.9, 31, 32 Finally, other indexes quantifying muscle fatty replacement, such as muscle fatty infiltration (MFI), may be used as an alternative, but these were not available at the time of the baseline study.33
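As an illustration of how a compound score restricted to a muscle subset could be computed, the following is a minimal sketch that assumes the definition given in the Methods (mean fat fraction weighted for cross-sectional area). The function name, the muscle selection and all numerical values are hypothetical and are not study data.

```python
def compound_score(fat_fraction, csa, muscles=None):
    """Return the CSA-weighted mean fat fraction (%) over the selected muscles.

    fat_fraction: dict mapping muscle name -> fat fraction in percent
    csa:          dict mapping muscle name -> cross-sectional area (any unit)
    muscles:      optional iterable of muscle names; defaults to all muscles
    """
    selected = list(fat_fraction) if muscles is None else list(muscles)
    total_csa = sum(csa[m] for m in selected)
    if total_csa <= 0:
        raise ValueError("total cross-sectional area must be positive")
    return sum(fat_fraction[m] * csa[m] for m in selected) / total_csa


# Hypothetical values for illustration only (not study data).
ff = {"tibialis_anterior": 40.0, "semimembranosus": 20.0, "soleus": 10.0}
area = {"tibialis_anterior": 5.0, "semimembranosus": 10.0, "soleus": 25.0}

# Compound score over all muscles versus a hypothetical two-muscle subset.
full_score = compound_score(ff, area)
subset_score = compound_score(ff, area, ["tibialis_anterior", "semimembranosus"])
```

Because the weights are renormalized within the subset, a subset score is itself a weighted mean fat fraction and can be compared directly with the score over all muscles.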
Conclusions
This unique data set, acquired in a large natural history study with a follow-up period of 5 years, has enabled us to demonstrate a relationship between longitudinal changes in MRI outcome measures and changes in clinical outcome, establishing quantitative MRI as a prognostic biomarker in FSHD. We identified a selection of leg muscles and subgroups of patients displaying the most progression in MR FF. Potentially, these subgroups of patients are most eligible for therapy testing in early phase clinical trials. Including them probably increases the chance of finding an intervention effect. This knowledge establishes quantitative MRI as an efficacy biomarker in upcoming clinical trials and substantiates the utility of MRI-based inclusion criteria for clinical trials, resembling criteria that have already been employed in some phase-2 studies in FSHD.7, 34 Future research needs to build on this knowledge, developing and validating more feasible MRI screening and segmentation methods.
Acknowledgements
The authors thank the patients for their cooperation in this study. Several authors of this publication are members of the Radboudumc Center of Expertise for neuromuscular disorders (Radboud-NMD), Netherlands Neuromuscular Center (NL-NMD) and the European Reference Network for rare neuromuscular diseases (EURO-NMD). This work has been presented at the 265th ENMC workshop on Muscle Imaging in FSHD. We thank the ENMC for organizing this workshop and all participants for their valuable input to the discussion, in particular Hermien Kan, associate professor of Radiology at Leiden University Medical Center, the Netherlands. The authors of this manuscript certify that they comply with the ethical guidelines for authorship and publishing in the Journal of Cachexia, Sarcopenia and Muscle.35
Conflicts of Interest
S. C. Vincenten, D. van As, J. Jansen and A. Heerschap declare that they have no conflict of interest. K. Mul has received a research grant from the FSHD Society, a research grant from the FSHD Stichting and a consultancy fee paid to the institution from Avidity Biosciences. L. Heskamp has received a consulting fee paid to the institution from AMRA Medical. B. G. van Engelen has received a grant from the Prinses Beatrix Spierfonds, a grant from the Dutch FSHD Society, a grant from the Stichting Spieren voor Spieren, a patent (Myositis; patent number EP2012740236, date of filing 5-7-2012), a consulting fee paid to the institution from Fulcrum, a consulting fee paid to the institution from Arrowhead and a consulting fee paid to the institution from Facio. N. C. Voermans has received support from the ENMC for two international research workshops on FSHD and a consulting fee from Fulcrum paid to the institution, and holds an unpaid leadership role in the FSHD European Trial Network as part of FSHD Europe.