Volume 1, Issue 2 pp. 125-132
ORIGINAL ARTICLE
Open Access

Reliability of the MRI-based Brain Atrophy and Lesion Index in the evaluation of whole-brain structural health

Tao Gu

Department of Radiology, Beijing Hospital, National Center of Gerontology, Beijing, China

Simon Fraser University ImageTech Laboratory, Surrey Memorial Hospital, Surrey, British Columbia, Canada

Hui Guo

Simon Fraser University ImageTech Laboratory, Surrey Memorial Hospital, Surrey, British Columbia, Canada

Department of Diagnostic Imaging, Tianjin Medical University General Hospital, Tianjin, China

Xiaowei Song (Corresponding Author)

Simon Fraser University ImageTech Laboratory, Surrey Memorial Hospital, Surrey, British Columbia, Canada

Health Sciences & Innovation, Fraser Health Authority, Surrey, British Columbia, Canada

Correspondence: Xiaowei Song, Health Sciences and Innovation, Surrey Memorial Hospital, Fraser Health Authority, 13750 96th Avenue, Surrey, British Columbia V3V 1Z2, Canada ([email protected]).
First published: 06 August 2018
Citations: 2

Abstract

Background

The Brain Atrophy and Lesion Index (BALI), which evaluates several common aging-related MRI changes in combination, has been validated as a feasible method to assess the status of structural brain health. Previous studies have been based primarily on older participants and high-field MRI. Here, we tested the generalizability of the BALI by examining its measurement properties in a wide age range at both high and conventional MRI field strengths.

Methods

Subjects (n = 229) who had T2WI at either 1.5T or 3.0T were grouped into younger (age ≤ 60 years) and older (age > 60 years) groups. Image evaluation and scoring were performed independently by two experienced neuroradiologists who have mastered the BALI method. Inter- and intrarater agreement rates were examined comparing age groups and field strengths.

Results

The intraclass correlation coefficient for the BALI total score was consistently high under each experimental condition (interrater ICC ≥ 0.92, 95% CI: 0.84-0.96), with no statistical difference between age groups (Fisher Z = 1.43) or field strengths (Z = 0.60). The reliability for BALI category subscores ranged from moderate to perfect (eg, 0.85 at 3.0T vs 0.57 at 1.5T for GA), was similar for both age groups, and was typically greater at 3.0T than at 1.5T.

Conclusion

The BALI based on T2WI can be reliably applied to the evaluation of whole-brain health in both younger and older adults at both field strengths, although high-field MRI is preferable.

1 INTRODUCTION

Aging is a process characterized by deficit accumulation over time.1 As is the situation with the body,2 various degenerative changes can also accumulate in the brain,3 where they interact to overwhelm repair processes, causing high-level failure in brain functions that can lead to cognitive decline and dementia.4 To collectively evaluate the additive effects of several common structural deficits on brain function, a semiquantitative rating scale, the Brain Atrophy and Lesion Index (BALI), has been validated using multiple research datasets.5, 6 The BALI assesses global atrophy (GA) and lesions in both the supratentorial and infratentorial compartments. Categorized changes are assessed as lesions in the gray matter (eg, cortical infarcts) and dilated perivascular spaces in the subcortical white matter as well as lesions in the periventricular regions, deep white matter, basal ganglia, and the surrounding regions.4-6 By integrating multiple deficits in the aging brain into one scale, the BALI has been used as a proxy measure of brain health status and a way to model the dynamic brain changes in the process of aging.3, 4, 7

To date, the BALI has been shown to be sensitive to age and cognition, to differentiate people with different cognitive diagnoses and those at high risk of cognitive decline, and to predict conversion to dementia.3-8 Moderate-to-high reliability has been reported for the BALI based on different sequences.4, 9, 10 Nevertheless, previous studies have chiefly used high-field MRI (eg, 3.0 tesla) with a relatively high signal-to-noise ratio and assessed the brains of older subjects with many age-associated changes.4, 9, 10 It is yet to be determined (a) whether the BALI method can also be reliably applied in evaluating people of younger ages and (b) whether the data acquired from clinically conventional MRI may also be used to score the BALI with acceptable reliability.

To test the generalizability of the BALI, we examined its measurement properties using data from adults across a wide age range and at both high and conventional MRI field strengths.

2 METHODS

2.1 Participants

We accessed data (n = 229) from a convenience sample of adults who underwent a general health evaluation at Beijing Hospital from August 16, 2016, to August 31, 2017, agreed to have MRI brain scans at either 3.0T or 1.5T, and had not been diagnosed with terminal malignancy, stroke, heart disease, or cognitive decline. The sample was 72% male, with ages ranging between 25 and 80 years (mean age = 48.3 ± 12.5 years); over 95% of the participants were married, 91% had a college or university degree, and 82% were employed (Table 1).

Table 1. Characteristics of the sample
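Category | Case | Age (mean ± SD) | Male (%) | Married (%) | High education (%) | Working on a job (%)
Overall | N = 229 | 48.3 ± 12.5 | 72.1 | 95.2 | 91.3 | 82.5
Overall | n = 48 | 46.2 ± 9.0 | 85.4 | 93.8 | 95.8 | 100.0
Age group, younger | N = 194 | 44.5 ± 9.2 | 74.2 | 94.8 | 93.3 | 96.9
Age group, younger | n = 35 | 46.3 ± 8.2 | 94.3 | 97.1 | 97.1 | 100.0
Age group, older | N = 35 | 69.4 ± 5.7 | 60 | 97.1 | 80.0 | 2.9
Age group, older | n = 35 | 69.4 ± 5.7 | 60 | 97.1 | 80.0 | 2.9
Field strength, 1.5T | N = 60 | 47.0 ± 11.4 | 75.0 | 100.0 | 98.3 | 91.7
Field strength, 1.5T | n = 35 | 43.0 ± 9.0 | 75.0 | 100.0 | 97.2 | 100.0
Field strength, 3.0T | N = 169 | 48.8 ± 12.9 | 71.0 | 93.5 | 88.8 | 79.3
Field strength, 3.0T | n = 35 | 46.4 ± 9.0 | 85.7 | 91.4 | 94.3 | 100.0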
  • N, sample size; n, randomly selected subsample size; SD, standard deviation.

2.2 MRI scans

Whole-brain scans were acquired using one of four MRI scanners: two 3.0T scanners (Discovery MR750; General Electric Medical Systems, Waukesha, WI, USA; and Achieva; Philips Medical Systems, Best, The Netherlands) and two 1.5T scanners (MAGNETOM Espree, Siemens, Germany; and Optima MR360; General Electric Medical Systems). Nearly three quarters (73.8%) of the MRI data were acquired at 3.0T. BALI scoring was based on the evaluation of two-dimensional T2-weighted imaging (2D T2WI). The sequence parameter settings were as follows: TR/TE = 2500-5600/90-110 ms; flip angle = 90° or 140-160°; field of view: 230 mm; matrix size: 180 × 256; slice thickness: 5.0 mm; and 24 axial slices to cover the whole brain. Detailed parameter settings were optimized for scanner specifics.

2.3 Evaluation of the Brain Atrophy and Lesion Index

As described elsewhere,4-10 the BALI is a semiquantitative summary rating scale, adapted from several well-established scales that assess localized structural changes.11, 12 Changes that are integrated into the BALI evaluation include gray matter lesions (eg, cortical infarcts) and subcortical dilated perivascular spaces (GM-SV), deep white matter lesions (DWM), periventricular white matter lesions (PV), lesions in the basal ganglia and surrounding areas (BG), lesions in the infratentorial compartment (IT), and GA.

Applying the BALI rating schema, a value between 0 and 3 was assigned to assess a change in each category, with a higher score indicating greater severity. In the categories DWM and GA, values of 0-5 were used, allowing the capture of more severe changes and thereby avoiding ceiling effects. The “other findings” category was included to record other possible changes such as neoplasm, trauma, idiopathic normal-pressure hydrocephalus, focal asymmetry, and deformity, each of which is sometimes seen in older adults. The BALI total score was calculated as the sum of the subscores of all seven categories.5
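The scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the authors' scoring software: the category codes follow the abbreviations used in the text, while the `OTHER` label and its 0-3 range for the “other findings” category, and the function name `bali_total`, are assumptions made for the example.

```python
# Illustrative sketch only. Category codes follow the text; the "OTHER"
# label and its 0-3 range for "other findings" are assumptions.
MAX_SCORE = {
    "GM-SV": 3,  # gray matter lesions and subcortical dilated perivascular spaces
    "DWM": 5,    # deep white matter lesions (extended 0-5 range)
    "PV": 3,     # periventricular white matter lesions
    "BG": 3,     # lesions in the basal ganglia and surrounding areas
    "IT": 3,     # lesions in the infratentorial compartment
    "GA": 5,     # global atrophy (extended 0-5 range)
    "OTHER": 3,  # other findings (range assumed)
}


def bali_total(subscores):
    """Check that all seven subscores are present and in range, then sum them."""
    if set(subscores) != set(MAX_SCORE):
        raise ValueError("expected exactly these categories: %s" % sorted(MAX_SCORE))
    for category, value in subscores.items():
        if not isinstance(value, int) or not 0 <= value <= MAX_SCORE[category]:
            raise ValueError("%s must be an integer in 0-%d" % (category, MAX_SCORE[category]))
    return sum(subscores.values())


# Hypothetical ratings for one subject (not taken from the study data).
example = {"GM-SV": 1, "DWM": 2, "PV": 1, "BG": 0, "IT": 1, "GA": 1, "OTHER": 0}
total = bali_total(example)
print(total)
```

Under these assumed ranges the total score can run from 0 to 25, and an out-of-range subscore raises an error instead of silently inflating the total.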

Figure 1 shows examples from the sample of BALI categories rated with different subscores.

Figure 1. Examples showing the evaluation of the Brain Atrophy and Lesion Index (BALI) using T2-weighted MRI

2.4 Reliability assessment

Images were evaluated independently by two experienced neuroradiologists (TG and HG) with 10 and 12 years of diagnostic imaging experience, respectively, and familiarity with the BALI method. Imaging evaluation and BALI scoring were performed with the subjects' demographic information and scan field strength masked.

For the assessment of the interrater agreement rate, the two raters each assessed a subsample of 105 (45.9%) subjects. For statistical analysis and group comparison, the subjects from the selected subsample were divided into different experimental conditions, that is, younger (≤60 years) vs older (>60 years) age groups, and conventional (1.5T) vs high (3.0T) MRI field strengths, with n = 35 for each comparison group (accounting for 18%-100% of the original sample under each condition; Table 1). For the assessment of the intrarater agreement rate, each rater independently evaluated the same sample twice on separate days. The order of the image evaluation was determined using a random number generator.

2.5 Statistical analysis

For the BALI total score (interval data), the interrater and intrarater agreement rates were assessed using the intraclass correlation coefficient (ICC; moderate: 0.5-0.75; good: 0.75-0.90; excellent: >0.90);13 comparisons of the ICC between different raters or different experimental conditions were made using the Fisher Z test. For the BALI subscores (categorical data), the interrater and intrarater agreement rates were assessed using the Cohen K coefficient (moderate: 0.41-0.6; substantial: 0.61-0.80; perfect: 0.81-1.0).14 Differences in the BALI total score between experimental conditions (age group and MRI field strength) were examined for each rater using two-way ANOVA.
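As an illustration of the subscore analysis, the following sketch computes a Cohen K (kappa) coefficient for two raters and maps it onto the agreement bands quoted above. It assumes the unweighted form of the coefficient, which the text does not specify; the function names, the "below moderate" label, the rounding to two decimals before banding, and the example ratings are all hypothetical and are not taken from the study data or from SPSS.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: share of cases on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over scores.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical score")
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Label a kappa value with the bands quoted in the text (two-decimal rounding assumed)."""
    k = round(kappa, 2)
    if k >= 0.81:
        return "perfect"
    if k >= 0.61:
        return "substantial"
    if k >= 0.41:
        return "moderate"
    return "below moderate"  # label assumed; the text defines no lower band


# Hypothetical subscores from two raters for six subjects.
rater_1 = [0, 1, 1, 2, 0, 1]
rater_2 = [0, 1, 2, 2, 0, 1]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2), agreement_band(kappa))
```

In this toy example the raters agree on five of six cases (observed agreement 0.83) against a chance agreement of 0.33, which gives a coefficient of 0.75.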

All statistical analyses were performed using IBM SPSS Statistics version 22 (IBM, Chicago, IL, USA). The level of statistical significance was set at P < 0.050. The 95% confidence intervals (95% CI) were reported together with the mean values whenever appropriate.

3 RESULTS

As detailed in Table 1, the randomly selected subsamples for reliability testing represented the samples under the different experimental conditions well with regard to age and other demographic features. The BALI total score differed by age (4.20 ± 2.27 for the younger group vs 11.37 ± 2.79 for the older group), which was consistent for both raters and with no age × field strength interaction (F(age) = 77.02/64.73, F(field strength) = 2.51/5.22, F(interaction) = 0.34/0.22).

Considering the reliability of the BALI total score (Table 2), the ICC for the interrater agreement was 0.94 (95% CI = 0.90-0.97) for the overall sample. The interrater ICC values were also excellent for each of the different experimental conditions: 0.96 for younger, 0.92 for older, 0.92 for 1.5T, and 0.94 for 3.0T. There was no statistical difference in the ICC between age groups (Z = 1.43, P = 0.153) or field strengths (Z = 0.60, P = 0.549).
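The group comparisons above can be checked with a short worked example. The article does not spell out the formula, so the sketch below assumes the standard Fisher r-to-z comparison of two independent coefficients, applied to the ICC values as if they were correlations, with a two-sided normal P value and n = 35 per comparison group as listed in Table 1. The function name `fisher_z_test` is hypothetical.

```python
import math


def fisher_z_test(r1, n1, r2, n2):
    """Compare two independent coefficients via Fisher's r-to-z transformation.

    Returns the Z statistic and a two-sided P value from the standard normal.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)        # Fisher transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))    # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal tail area
    return z, p


# Interrater ICCs reported in the text; n = 35 per group is taken from Table 1.
z_age, p_age = fisher_z_test(0.96, 35, 0.92, 35)      # younger vs older
z_field, p_field = fisher_z_test(0.94, 35, 0.92, 35)  # 3.0T vs 1.5T
print(round(z_age, 2), round(p_age, 3))
print(round(z_field, 2), round(p_field, 3))
```

Under these assumptions the Z statistics round to 1.43 and 0.60, matching the reported values, and the P values fall within rounding distance of the reported 0.153 and 0.549.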

Table 2. Intra- and interrater reliability of the Brain Atrophy and Lesion Index (BALI) total score assessment
BALI total score | Overall | Age group: Young | Age group: Older | Field strength: 1.5T | Field strength: 3.0T
Rater 1, mean ± SD | 4.20 ± 2.27 | 4.20 ± 2.27 | 11.37 ± 2.79 | 4.19 ± 2.35 | 4.86 ± 2.26
Rater 1, intrarater agreement rate (95% CI) | 0.88 (0.77-0.94) | 0.88 (0.77-0.94) | 0.91 (0.82-0.95) | 0.93 (0.86-0.96) | 0.91 (0.82-0.95)
Rater 2, mean ± SD | 4.69 ± 2.26 | 4.69 ± 2.26 | 11.26 ± 3.16 | 4.03 ± 2.04 | 5.34 ± 2.16
Rater 2, intrarater agreement rate (95% CI) | 0.95 (0.90-0.97) | 0.95 (0.90-0.97) | 0.96 (0.92-0.98) | 0.88 (0.78-0.94) | 0.95 (0.90-0.97)
Interrater agreement rate (95% CI) | 0.94 (0.90-0.97) | 0.96 (0.93-0.98) | 0.92 (0.85-0.96) | 0.92 (0.84-0.96) | 0.94 (0.87-0.97)
Fisher Z (Sig) between experimental conditions | Age groups: 1.43 (0.153) | Field strengths: 0.60 (0.549)
  • SD, standard deviation; CI, confidence interval; Sig, level of significance.
  • a Used to analyze the reliability of the interrater agreement rate between different field strengths and age groups.

The ICC for intrarater agreement of the BALI total score was consistently high for each rater (ICC ≥ 0.90 for the total sample). There was no significant difference in the intrarater ICC between younger (≥0.88) and older (≥0.91) age groups (Z ≤ 1.82, P ≥ 0.069), or between 1.5T (≥0.88) and 3.0T (≥0.91; Z ≤ 1.61, P ≥ 0.107; Table 2).

The reliability for the BALI category subscores was also high for the overall sample tested; the interrater Cohen K coefficients ranged between substantial (0.61) for DWM and perfect (0.85) for GM-SV (Table 3). The Cohen K values were higher in the younger than in the older subjects for GM-SV (0.83 vs 0.46) and BG (0.79 vs 0.52), but not for other subscores; in contrast, the Cohen K values were consistently higher (except for DWM) at 3.0T (0.68 for IT to 0.85 for GA) than at 1.5T (0.50 for BG to 0.65 for GM-SV), suggesting more robust identification of these brain changes at the higher field strength (Table 3).

Table 3. Intra- and interrater reliability in the evaluation of the Brain Atrophy and Lesion Index (BALI) subscores
BALI subcategory and score | Overall | Age group: Young | Age group: Older | Field strength: 1.5T | Field strength: 3.0T
Gray matter lesions and subcortical dilated perivascular spaces (GM-SV)
Rater 1, mean ± SD | 1.38 ± 0.49 | 1.37 ± 0.49 | 1.77 ± 0.60 | 1.22 ± 0.49 | 1.49 ± 0.51
Rater 1, intrarater agreement rate | 0.82 | 0.81 | 0.77 | 0.55 | 0.89
Rater 2, mean ± SD | 1.44 ± 0.50 | 1.46 ± 0.51 | 1.71 ± 0.57 | 1.25 ± 0.44 | 1.57 ± 0.50
Rater 2, intrarater agreement rate | 0.83 | 0.77 | 0.72 | 0.46 | 0.94
Interrater agreement rate | 0.85 | 0.83 | 0.46 | 0.65 | 0.83
Deep white matter lesions (DWM)
Rater 1, mean ± SD | 1.04 ± 0.74 | 1.00 ± 0.77 | 2.31 ± 0.58 | 0.89 ± 0.71 | 1.09 ± 0.78
Rater 1, intrarater agreement rate | 0.62 | 0.48 | 0.61 | 0.70 | 0.55
Rater 2, mean ± SD | 1.15 ± 0.62 | 1.14 ± 0.65 | 2.43 ± 0.70 | 0.97 ± 0.65 | 1.23 ± 0.65
Rater 2, intrarater agreement rate | 0.65 | 0.72 | 0.63 | 0.78 | 0.62
Interrater agreement rate | 0.61 | 0.57 | 0.51 | 0.57 | 0.58
Periventricular white matter lesions (PV)
Rater 1, mean ± SD | 0.35 ± 0.53 | 0.34 ± 0.54 | 1.54 ± 0.82 | 0.25 ± 0.44 | 0.37 ± 0.55
Rater 1, intrarater agreement rate | 0.78 | 0.69 | 0.65 | 0.55 | 0.77
Rater 2, mean ± SD | 0.38 ± 0.49 | 0.34 ± 0.48 | 1.66 ± 0.77 | 0.44 ± 0.50 | 0.37 ± 0.49
Rater 2, intrarater agreement rate | 0.66 | 0.71 | 0.56 | 0.70 | 0.60
Interrater agreement rate | 0.78 | 0.75 | 0.64 | 0.56 | 0.76
Lesions in the basal ganglia and surrounding areas (BG)
Rater 1, mean ± SD | 0.54 ± 0.77 | 0.51 ± 0.74 | 2.09 ± 0.78 | 0.56 ± 0.91 | 0.66 ± 0.84
Rater 1, intrarater agreement rate | 0.76 | 0.77 | 0.70 | 0.55 | 0.74
Rater 2, mean ± SD | 0.58 ± 0.82 | 0.57 ± 0.82 | 2.03 ± 0.82 | 0.22 ± 0.64 | 0.74 ± 0.89
Rater 2, intrarater agreement rate | 0.65 | 0.62 | 0.66 | 0.56 | 0.65
Interrater agreement rate | 0.77 | 0.79 | 0.52 | 0.50 | 0.76
Lesions in the infratentorial regions (IT)
Rater 1, mean ± SD | 0.60 ± 0.64 | 0.57 ± 0.66 | 1.66 ± 0.97 | 0.69 ± 0.86 | 0.66 ± 0.64
Rater 1, intrarater agreement rate | 0.68 | 0.71 | 0.80 | 0.77 | 0.71
Rater 2, mean ± SD | 0.81 ± 0.84 | 0.74 ± 0.78 | 1.40 ± 1.01 | 0.75 ± 0.77 | 0.83 ± 0.86
Rater 2, intrarater agreement rate | 0.60 | 0.60 | 0.85 | 0.74 | 0.58
Interrater agreement rate | 0.64 | 0.67 | 0.69 | 0.52 | 0.68
Global atrophy (GA)
Rater 1, mean ± SD | 0.52 ± 0.68 | 0.40 ± 0.60 | 1.91 ± 0.82 | 0.53 ± 0.65 | 0.57 ± 0.70
Rater 1, intrarater agreement rate | 0.75 | 0.67 | 0.68 | 0.75 | 0.81
Rater 2, mean ± SD | 0.52 ± 0.68 | 0.43 ± 0.61 | 1.83 ± 0.75 | 0.36 ± 0.64 | 0.60 ± 0.70
Rater 2, intrarater agreement rate | 0.57 | 0.59 | 0.69 | 0.46 | 0.52
Interrater agreement rate | 0.77 | 0.71 | 0.76 | 0.57 | 0.85

Under any given condition, the intrarater agreement rates for any BALI category subscore were at least moderate and mostly comparable between the raters (Table 3).

4 DISCUSSION

In the present study, we investigated the reliability of the BALI evaluation using routinely acquired clinical 2D T2-weighted MRI at 1.5T and 3.0T in a convenience sample of younger and older adult participants. Reliability tests were conducted on the BALI total score and each of the category subscores. Inter- and intrarater agreement rates were compared between age groups (younger vs older) and field strengths (1.5T vs 3.0T). Our data suggested consistently high inter- and intrarater agreement rates for the BALI total score under different experimental conditions, which did not differ between the age groups or field strengths. The data also suggested moderate-to-perfect inter- and intrarater agreement rates for each BALI category subscore regardless of age group and field strength, even though the reliability for a few categories was higher at 3.0T than at 1.5T.

Previous research has repeatedly shown that the BALI can be used to capture and summarize several common structural changes in the aging brain, thereby providing a way to study the impact of global structural changes on brain function in older adults.5-10 It is known that various structural brain changes can coexist with brain aging, reflecting heterogeneous profiles in the process of aging.15 Importantly, brain changes, including cerebral volume loss, can start early in the adult life span.16 Many of these changes can be clinically less significant or meaningful when considered individually, but when combined, they produce additive effects on function.4 Indeed, our previous work has shown that the BALI total score differed significantly by age and in older people with different cognitive conditions.5-8 Here, the high reliability of the BALI total score and subscores in both younger and older subjects suggests that the BALI method may also be used in studying younger adults. Thus, the evidence from the present study supports extending the BALI method beyond older ages, allowing enhanced generalizability of the BALI for the global assessment of brain health changes in broader study settings. This contribution is significant given the opportunity it offers to investigate the accumulation of deficits in the brain prior to older adulthood.

The BALI focuses on morphologic changes from the widely available routine clinical sequences.10 Here, we have further tested the reliability of the BALI evaluation using MR imaging acquired at both 1.5T and 3.0T. Consistent with a previous report on older adults,9 the data from the present study involving both older and younger adults have confirmed that the method is easy to master and the evaluation time is manageable.5 The interrater agreement rate of the BALI total scores is satisfactory at both field strengths and by both raters, indicating the robustness of the BALI total score as a global measure of structural brain health. Meanwhile, the increased reliability for several category subscores at 3.0T demonstrates the advantages of high field strength: the higher signal-to-noise ratio can benefit BALI evaluation, allowing the raters to more sensitively identify subtle changes and robustly grade the BALI categories.

Our data must be interpreted with caution. In this study, both raters who evaluated the images are experts in diagnostic neuroimaging and have mastered the BALI method. Whether nonneuroradiologist raters may require additional training and practice to achieve the same level of reliability for each experimental condition remains to be determined. However, previous studies have shown highly reliable rating scores for older participants using 3.0-T T1-weighted MRI by nonneuroradiologist raters trained with the BALI method (using the rating schema descriptions, atlas, examples, and case discussions).5, 17, 18 Furthermore, our study used a subsample of the subjects from one study site for the reliability testing. Although the statistical analyses suggested satisfactory reliability, the extent to which the findings can be generalized to the general population deserves further research with increased sample sizes.

In conclusion, our study suggests that multiple structural brain changes can be collectively evaluated with the use of the BALI total score in adults of a wide age range. The BALI score based on both 1.5-T and 3.0-T T2WI is highly reliable in capturing global brain changes that can accumulate across the adult life course. High-field MRI can further improve the robustness in detecting subtle changes.

ACKNOWLEDGMENTS

The authors sincerely acknowledge Dr. K. Rockwood and Dr. W. Siu for critical discussions and Ms. Betty Chinda for proofreading the manuscript. This research was partly supported by Capital's Funds for Health Improvement and Research of China (2014-4-4052). Additional funding for data analysis was from Canadian Institutes of Health Research (CSE-125739) and Surrey Hospital & Outpatient Centre Foundation (2015-030, G2017-001).

CONFLICT OF INTEREST

Tao Gu receives a fellowship award from Beijing Hospital to conduct postdoctoral research in Canada. Hui Guo receives a fellowship award from the China Scholarship Council to conduct postdoctoral research in Canada.
