Validation of a predictive model for obstructive sleep apnea in people with Down syndrome
Funding information: HRSA MCHB, Grant/Award Number: R40MC25322-01-00; Eunice Kennedy Shriver National Institute of Child Health and Human Development, Grant/Award Number: F32HD068101; National Institutes of Health, Grant/Award Numbers: 4T32GM007748-38, 4T32GM007748-37
Abstract
Detecting obstructive sleep apnea (OSA) is important to both prevent significant comorbidities in people with Down syndrome (DS) and untangle contributions to other behavioral and mental health diagnoses. However, laboratory-based polysomnograms are often poorly tolerated, unavailable, or not covered by health insurance for this population. In previous work, our team developed a prediction model that seemed to hold promise in identifying which people with DS might not have significant apnea and, consequently, might be able to forgo a diagnostic polysomnogram. In this study, we sought to validate these findings in a novel set of participants with DS. We recruited an additional 64 participants with DS, ages 3–35 years. Caregivers completed the same validated questionnaires, and our study team collected vital signs, physical exam findings, and medical histories that were previously shown to be predictive. Patients then had a laboratory-based polysomnogram. The best modeling had a validated negative predictive value of 50% for an apnea–hypopnea index (AHI) > 1/hTST and 73.7% for AHI >5/hTST. The positive predictive values were 60% and 39.1%, respectively. As such, a clinically reliable screening tool for OSA in people with DS was not achieved. Patients with DS should continue to be monitored for OSA according to current healthcare guidelines.
Abbreviations
- 5-HIAA: 5-hydroxyindoleacetic acid
- AHI: apnea–hypopnea index
- CSHQ: Children's Sleep Habits Questionnaire
- DOPAC: 3,4-dihydroxyphenylacetic acid
- DS: Down syndrome
- GABA: γ-aminobutyric acid
- LLM: logic learning machine
- NPV: negative predictive value
- OSA: obstructive sleep apnea
- PEA: phenylethylamine
- PPV: positive predictive value
- SpO2: wake peripheral oxyhemoglobin saturation
- SRBD: Sleep-Related Breathing Disorders Scale of the Pediatric Sleep Questionnaire
- TST: total sleep time
1 INTRODUCTION
Obstructive sleep apnea (OSA) is a significant comorbidity for the majority of the estimated 212,000 people with Down syndrome (DS) in the United States (Antonarakis et al., 2020; de Graaf et al., 2017), 417,000 people with DS in Europe (de Graaf et al., 2021), and millions more with DS worldwide (GBD 2017 Disease and Injury Incidence and Prevalence Collaborators, 2018). Some studies suggest that the prevalence of OSA in pediatric-aged people with DS is anywhere between 55% and 97% (Austeng et al., 2014; Ng et al., 2006; Shott, 2006), compared with 1%–4% in the neurotypically developing pediatric population (Lumeng & Chervin, 2008). Adults with DS also have a high prevalence of OSA—78% in one study (Giménez et al., 2018)—sometimes presenting as a new diagnosis, but often occurring as a persistent or recurrent condition from childhood (Jensen et al., 2012).
Making an early diagnosis is important. When left undetected or untreated in people with DS, OSA can lead to deficits in executive functioning (Chen et al., 2013), a loss of verbal IQ and cognitive flexibility (Breslin et al., 2014), a decrease in visuoperceptual skills (Andreou et al., 2002), and an increase in mood disorders (Capone et al., 2013). OSA might also explain symptoms that are often misinterpreted as behavioral disorders, mental health diagnoses, or dementia. Currently, the gold standard diagnostic tool for this population is an overnight polysomnogram conducted in a sleep laboratory, located in a hospital or sleep center (Bull et al., 2022). While noninvasive, these sleep studies are often not well tolerated by many people with DS who have sensory processing disorders (McGuire & Chicoine, 2021). For many other people with DS, laboratory-based sleep studies are not readily available or may simply not be covered by healthcare insurers, leading to undue delays in the diagnosis. People with DS also do not always have access to home-based sleep studies, and, when they do, are often not able to tolerate the testing even in the home setting.
Previous work sought to identify a combination of low-cost or simplified procedures that could accurately predict OSA in people with DS, so as to potentially bypass the need for polysomnograms (Allareddy et al., 2016; Elsharkawi et al., 2017; Jayaratne et al., 2017; Skotko et al., 2017). A preliminary model—incorporating parental surveys, medication history, anthropometric measurements (e.g., height, weight), vital signs, age, and physical exam findings—had a cross-validated negative predictive value (NPV) of 90% for moderate and severe OSA in this population (Skotko et al., 2017). In short, the model seemed to hold promise in identifying which people with DS might not have significant apnea and consequently could obviate the need for a diagnostic polysomnogram. The purpose of this study was to validate the findings in a novel set of participants with DS, and thus determine if the predictive model could be implemented in a clinical setting.
2 MATERIALS AND METHODS
2.1 Participants
All people with DS, ages 3–35 years, who were already attending the DS Program at Massachusetts General Hospital, were invited to participate in the validation phase of this study. The upper limit of 35 years was chosen to exclude any effects of potentially undiagnosed mild cognitive impairment or Alzheimer disease dementia, which can begin to occur at this age. People with DS were excluded if they (a) already had an adenotonsillectomy, adenoidectomy, or tonsillectomy, (b) had a sleep study within the past 6 months, (c) were actively being treated with CPAP or BiPAP for obstructive or central sleep apnea, or (d) participated in the development phase of our project (Skotko et al., 2017). Patients were also excluded if they were unwilling to complete a sleep study. All participants needed to have at least one caregiver who could read and write primarily in English or Spanish. Written informed consent or assent was obtained from the participants with DS, based on their age and legal guardianship status. As needed, written informed consent was obtained from parents and/or legal guardians. This study was approved by the Massachusetts General Brigham Institutional Review Board (protocol 2012P002062).
The participants in the development phase were previously described (Skotko et al., 2017).
2.2 Study procedures
Prior to their first clinical visit, one parent/guardian of the patient with DS completed two previously validated questionnaires about the sleeping habits of their son or daughter with DS: the Sleep-Related Breathing Disorders (SRBD) Scale of the Pediatric Sleep Questionnaire and the Children's Sleep Habits Questionnaire (Chervin et al., 2000; Owens et al., 2000). The survey instruments were available in both English and Spanish and were both used in the development phase of our research (Skotko et al., 2017).
Each study participant then had one in-person visit followed by one overnight polysomnogram. The purpose of the in-person visit was to collect the objective measurements that were part of the previously published predictive model (Skotko et al., 2017). These included weight, height, blood pressure, hypertension percentile, awake SpO2, neck circumference, presence of macroglossia, Mallampati score classification, and Friedman score classification. We also recorded if the patient was being medically treated for asthma, thyroid disease, and gastroesophageal reflux disorder. Age, gender, ethnicity, and race were collected, as they were also variables in the previously published predictive model (Skotko et al., 2017).
Each overnight polysomnogram was performed at either the Massachusetts General Hospital for Children Sleep Laboratory (for pediatric-aged participants) or the Massachusetts General Hospital Sleep Center (for adult-aged participants). One board-certified physician analyzed all of the pediatric-aged polysomnograms based on the American Academy of Sleep Medicine (AASM)'s Manual for the Scoring of Sleep and Associated Events Respiratory Rules for Children (American Academy of Sleep Medicine, 2015), and was blinded to other aspects of the study. Another board-certified physician analyzed all of the adult-aged polysomnograms and was similarly unaware of the findings using the predictive model. For adult-aged participants, hypopnea was defined as a >30% decrease in airflow or respiratory effort, by nasal pressure signal excursion, lasting ≥10 s, in association with a 4% or greater oxygen desaturation. Apnea was defined as a ≥90% drop in thermal sensor excursion lasting ≥10 s. The apnea–hypopnea index (AHI), calculated as the total hypopneas and apneas per hour of total sleep time (TST), was the main outcome measure. The scoring for the polysomnograms in the pilot phase was previously described (Skotko et al., 2017). We also included two exploratory variables from the polysomnograms themselves: the “3% oxygen desaturation index” (i.e., the number of SpO2 drops of 3% or more per hour of sleep) and a component of the oximetry signal spectral decomposition previously identified as predictive of OSA in neurotypical children (“the third statistical moment of the full spectrum”).
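As an illustration of the outcome measure defined above, the following minimal Python sketch computes an events-per-hour index from event counts and total sleep time. The function name, its parameters, and the example counts are hypothetical and are not taken from the study data; the same arithmetic applies to the 3% oxygen desaturation index when desaturation events are counted instead.

```python
def events_per_hour_of_sleep(n_events: int, total_sleep_time_min: float) -> float:
    """Return the number of events per hour of total sleep time (TST)."""
    if total_sleep_time_min <= 0:
        raise ValueError("total sleep time must be positive")
    return n_events / (total_sleep_time_min / 60.0)


def apnea_hypopnea_index(n_apneas: int, n_hypopneas: int,
                         total_sleep_time_min: float) -> float:
    """AHI = (apneas + hypopneas) per hour of total sleep time."""
    return events_per_hour_of_sleep(n_apneas + n_hypopneas, total_sleep_time_min)


# Hypothetical example: 12 apneas and 30 hypopneas over 420 minutes of sleep.
example_ahi = apnea_hypopnea_index(12, 30, 420)
print(f"AHI = {example_ahi:.1f}/hTST")
```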
Additional exploratory variables included the measurement of urine metabolic markers that had been previously studied in people with DS (Elsharkawi et al., 2017). Assays included epinephrine, norepinephrine, dopamine, serotonin, glycine, taurine, γ-aminobutyric acid (GABA), glutamate, phenylethylamine, aspartic acid, histamine, 3,4-dihydroxyphenylacetic acid, 5-hydroxyindoleacetic acid, tyramine, and tryptamine. Urine creatinine level was measured for each sample, and individual urine neurotransmitter levels were corrected for corresponding urine creatinine concentration. Urine samples were collected, whenever possible, from participants on the night of and morning after the polysomnogram and analyzed as previously reported (Elsharkawi et al., 2017; Skotko et al., 2017).
2.3 Statistical analyses
A priori estimates indicated that a sample size of 80 participants with completed sleep studies in the validation dataset would have 87% power to confirm an independent predictor of OSA identified during the discovery phase with the following assumptions: the predictor is normally distributed, the odds ratio over 1 SD for predicting OSA is at least 2.0, a two-tailed test with p < 0.05 is considered significant, and the prevalence of OSA is 50% in the cohort. Study data were collected and managed using Research Electronic Data Capture (REDCap) electronic data capture tools hosted at Massachusetts General Hospital (Harris et al., 2009, 2019).
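The source does not state which formula produced the power estimate. As a hedged sketch, a commonly used normal approximation for a logistic regression with a single standardized, normally distributed predictor appears to give a similar value under the stated assumptions. The function name and its parameters are illustrative only, and the choice of approximation is an assumption rather than the authors' documented method.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_power_logistic(n: int, odds_ratio_per_sd: float,
                          prevalence: float, alpha: float = 0.05) -> float:
    """Approximate power to detect a continuous predictor in logistic regression.

    Assumption: normal approximation in which
    z_power = sqrt(n * p * (1 - p)) * ln(OR per SD) - z_(1 - alpha/2).
    """
    standard_normal = NormalDist()
    beta = log(odds_ratio_per_sd)                      # log odds ratio per 1 SD
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)   # two-tailed critical value
    z_power = sqrt(n * prevalence * (1 - prevalence)) * beta - z_alpha
    return standard_normal.cdf(z_power)


# Assumptions stated in the text: n = 80, OR per SD = 2.0, prevalence = 50%.
power = approx_power_logistic(n=80, odds_ratio_per_sd=2.0, prevalence=0.5)
print(f"Approximate power: {power:.2f}")
```

Under this approximation the result is close to the 87% reported above, although dedicated power software may use a different method.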
A model aiming to predict three levels of OSA severity (none: AHI ≤ 1/hTST; mild: 1 < AHI ≤ 5/hTST; moderate to severe: AHI > 5/hTST) was developed using a logic learning machine (LLM) in the Rulex 4.5 suite (https://www.rulex.ai/), as previously described (Skotko et al., 2017).
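The three severity levels used as the prediction target can be expressed as a simple thresholding rule. The sketch below encodes only the cut-offs stated above; the function name and label strings are illustrative, and it does not reproduce the logic learning machine itself.

```python
def osa_severity(ahi: float) -> str:
    """Map an AHI value (events per hour of TST) to the study's three levels."""
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi <= 1:
        return "none"                 # AHI <= 1/hTST
    if ahi <= 5:
        return "mild"                 # 1 < AHI <= 5/hTST
    return "moderate to severe"       # AHI > 5/hTST


for value in (0.4, 1.0, 3.2, 5.0, 12.7):
    print(value, "->", osa_severity(value))
```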
3 RESULTS
3.1 Participants
At Massachusetts General Hospital, 84 new participants were consented to participate in this study. Of these, 64 completed a sleep study. Of the 20 consented participants who did not complete a sleep study, 9 were lost to follow-up, 7 had parents who no longer wished to pursue a sleep study after consenting, 2 were determined to have met exclusionary criteria after consenting (both pursued an adenotonsillectomy before the sleep study was performed), 1 patient could not tolerate the sleep study, and 1 parent chose to pursue a sleep study out-of-state and results were not available. We compared our new sample of participants from Massachusetts General Hospital (MGH; N = 64) to the previously reported participants from Boston Children's Hospital (BCH; N = 102; Skotko et al., 2017). The previously reported participants were significantly younger (BCH: M = 7.2; SD = 4.3; MGH: M = 12.2; SD = 9.2; p < 0.001). As such, we combined the cohorts and then randomly assigned the participants to a new training cohort (N = 82) and a new validation cohort (N = 84). These new cohorts did not differ significantly based on race, age, or hospital where they were evaluated (Table 1). Furthermore, the participants in the two cohorts did not differ based on their SRBD total scores, CSHQ total scores, or severity of OSA (Table 1).
| Variable | Options | Overall (N = 166) | Validation cohort (N = 84) | Training cohort (N = 82) | p-value |
|---|---|---|---|---|---|
| Hospital | BCH | 102 (61.4%) | 46 (54.8%) | 56 (68.3%) | 0.07 |
|  | MGH | 64 (38.6%) | 38 (45.2%) | 26 (31.7%) |  |
| Sex | Female | 73 (44.0%) | 40 (47.6%) | 33 (40.2%) | 0.34 |
|  | Male | 93 (56.0%) | 44 (52.4%) | 49 (59.8%) |  |
| Race | White | 108 (65.1%) | 52 (61.9%) | 56 (68.3%) | 0.55 |
|  | Black/African American | 13 (7.8%) | 9 (10.7%) | 4 (4.9%) |  |
|  | Asian | 7 (4.2%) | 3 (3.6%) | 4 (4.9%) |  |
|  | Other race | 10 (6.0%) | 7 (8.3%) | 3 (3.7%) |  |
|  | Multiracial | 5 (3.0%) | 2 (2.4%) | 3 (3.7%) |  |
|  | Unknown | 23 (13.9%) | 11 (13.1%) | 12 (14.6%) |  |
| Age (years) | M ± SD (range) | 9.1 ± 7.1 (3.0–34.7) | 9.3 ± 7.3 (3.0–32.6) | 8.9 ± 6.9 (3.0–34.7) | 0.71 |
| SRBD total score | M ± SD (range) | 0.35 ± 0.22 (0.00–0.89) | 0.35 ± 0.22 (0.00–0.89) | 0.35 ± 0.22 (0.00–0.86) | 0.81 |
| CSHQ total score | M ± SD (range) | 47.4 ± 7.9 (33.0–73.0) | 46.9 ± 7.5 (33.0–65.7) | 48.0 ± 8.4 (33.3–73.0) | 0.42 |
| BMI (percentile) | M ± SD (range) | 75.9 ± 24.8 (2.4–100) | 79.4 ± 21.1 (11.1–100) | 72.4 ± 27.8 (2.4–99.7) | 0.07 |
| Days from first visit to sleep study | M ± SD (range) | 116 ± 149 (0–899) | 110 ± 119 (0–611) | 122 ± 176 (0–899) | 0.60 |
| OSA | 0 ≤ AHI ≤ 1 | 67 (40.4%) | 35 (41.7%) | 32 (39.0%) | 0.51 |
|  | 1 < AHI ≤ 5 | 48 (28.9%) | 21 (25.0%) | 27 (32.9%) |  |
|  | 5 < AHI | 51 (30.7%) | 28 (33.3%) | 23 (28.0%) |  |
- Abbreviations: AHI, apnea–hypopnea index; BCH, Boston Children's Hospital; BMI, body mass index; CSHQ, Children's Sleep Habits Questionnaire; MGH, Massachusetts General Hospital; OSA, obstructive sleep apnea; SRBD, Sleep-Related Breathing Disorders Scale of the Pediatric Sleep Questionnaire.
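The random reassignment of the combined cohort described in Section 3.1 can be sketched as follows. The source does not specify the randomization procedure, so this example assumes a simple unstratified shuffle; the participant identifiers, the seed, and the function name are hypothetical.

```python
import random


def split_cohort(participant_ids, n_training, seed):
    """Randomly assign participants to training and validation cohorts.

    Assumption: simple (unstratified) random assignment with a fixed seed
    so that the split is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_training], shuffled[n_training:]


# Hypothetical identifiers for the 102 BCH and 64 MGH participants.
combined = [f"BCH-{i:03d}" for i in range(1, 103)] + \
           [f"MGH-{i:03d}" for i in range(1, 65)]

training, validation = split_cohort(combined, n_training=82, seed=2017)
print(len(training), "training;", len(validation), "validation")
```

After a split like this, baseline characteristics would typically be compared between cohorts, as reported in Table 1.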
3.2 Predictive model
A new LLM model was generated on the new training cohort using the variables in the final model of our previously published work—that is, demographics, survey instruments, physical examination, and medical history (Skotko et al., 2017). We generated additional LLM models adding the exploratory urinary biomarker and oximetry variables described above.
In our validation sample, 49 (58.3%) participants fulfilled criteria for OSA, that is, AHI >1/hTST (Table 2). The best LLM model had a validated NPV of 50% (95% CI: 23.0%–77.0%) for AHI >1/hTST and 73.7% (95% CI: 56.9%–86.6%) for AHI >5/hTST. The positive predictive values (PPV) were 60% (95% CI: 47.6%–71.5%) and 39.1% (95% CI: 25.1%–54.6%), respectively. The specificity was 20.0% and 50.0%, and the sensitivity was 85.7% and 64.3%, respectively. The false negative rates were 14.3% and 35.7%, and the false positive rates were 80.0% and 50.0%, respectively (Table 2 and Figure 1).
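The source does not name the method used for the 95% confidence intervals. As a sketch under the assumption that an exact (Clopper-Pearson) binomial interval was used, the code below computes such an interval with the standard library only. The NPV for AHI >1/hTST corresponds to 7 true negatives among 14 negative predictions (counts from Table 2); the function names are illustrative.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided binomial confidence interval for x successes in n trials."""

    def solve_increasing(func, target):
        # Bisection for func(p) == target, where func increases with p on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if func(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= x | p) = alpha / 2
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: P(X <= x | p) = alpha / 2, i.e. P(X > x | p) = 1 - alpha / 2
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


npv_lower, npv_upper = clopper_pearson(7, 14)
print(f"NPV 95% CI: {100 * npv_lower:.1f}%-{100 * npv_upper:.1f}%")
```

For 7 of 14, this interval is consistent with the 23.0%–77.0% reported above, which suggests, but does not confirm, that an exact method was used.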
| Predicted results (rows) vs. results from polysomnograms (columns) | 0 ≤ AHI ≤ 1 | 1 < AHI ≤ 5 | AHI > 5 | Total |
|---|---|---|---|---|
| 0 ≤ AHI ≤ 1 | 7 | 5 | 2 | 14 |
| 1 < AHI ≤ 5 | 15 | 1 | 8 | 24 |
| AHI > 5 | 13 | 15 | 18 | 46 |
| Total | 35 | 21 | 28 | 84 |
- Abbreviations: AHI, apnea–hypopnea index; OSA, obstructive sleep apnea.
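The predictive values, sensitivity, and specificity reported in Section 3.2 follow directly from the counts in Table 2 once the three categories are collapsed at each AHI threshold. The sketch below shows that arithmetic; the variable and function names are illustrative, and only the counts come from the table.

```python
# Table 2 counts: rows = predicted category, columns = polysomnogram category.
# Category order: none (AHI <= 1), mild (1 < AHI <= 5), moderate/severe (AHI > 5).
TABLE_2 = [
    [7, 5, 2],
    [15, 1, 8],
    [13, 15, 18],
]


def collapse(table, first_positive):
    """Collapse the 3x3 table to (tp, fp, fn, tn).

    Categories with index >= first_positive are treated as "positive".
    """
    tp = fp = fn = tn = 0
    for predicted, row in enumerate(table):
        for actual, count in enumerate(row):
            if predicted >= first_positive and actual >= first_positive:
                tp += count
            elif predicted >= first_positive:
                fp += count
            elif actual >= first_positive:
                fn += count
            else:
                tn += count
    return tp, fp, fn, tn


def diagnostic_metrics(tp, fp, fn, tn):
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


results = {}
for label, first_positive in [("AHI > 1/hTST", 1), ("AHI > 5/hTST", 2)]:
    results[label] = diagnostic_metrics(*collapse(TABLE_2, first_positive))
    print(label, {k: f"{100 * v:.1f}%" for k, v in results[label].items()})
```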

For AHI >5/hTST, the negative likelihood ratio was 0.71, and the positive likelihood ratio was 1.29. As such, the diagnostic odds ratio was 1.8, meaning that a patient has a 1.8-fold higher odds of moderate/severe OSA when the model predicts moderate/severe OSA than when the model does not predict moderate/severe OSA. Or conversely, a patient has a 1.8-fold higher odds of not having moderate/severe OSA when the model does not predict moderate/severe OSA than when the model does predict moderate/severe OSA.
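For reference, the likelihood ratios and diagnostic odds ratio quoted above can be worked out from the Table 2 counts collapsed at AHI >5/hTST (18 true positives, 28 false positives, 10 false negatives, 28 true negatives). The symbols TP, FP, FN, and TN are introduced here for illustration and follow the standard definitions:

$$
\mathrm{LR}^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}} = \frac{18/28}{28/56} \approx 1.29,
\qquad
\mathrm{LR}^{-} = \frac{1 - \text{sensitivity}}{\text{specificity}} = \frac{10/28}{28/56} \approx 0.71
$$

$$
\mathrm{DOR} = \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}} = \frac{\mathrm{TP} \times \mathrm{TN}}{\mathrm{FP} \times \mathrm{FN}} = \frac{18 \times 28}{28 \times 10} = 1.8
$$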
When the urinary biomarkers were added as exploratory variables into the LLM, the diagnostic odds ratio did not improve (Table S1). When the overnight oximetry values were added as exploratory variables and a new LLM was generated, the NPV slightly improved to 74.5% and the diagnostic odds ratio improved slightly to 2.4 when the urinary biomarkers were not included (Table S2). The NPV was 72.4%, and the diagnostic odds ratio was 2.3 when the urinary biomarkers were also included (Table S3).
Across the four models, the strongest predictive accuracy for AHI >5/hTST was a PPV of 46.2%, an NPV of 75.0%, and a diagnostic odds ratio of 2.4 (Table S4 and Figure 1).
4 DISCUSSION
In our previously published work, we developed a model that had a cross-validated NPV of 90% for moderate or severe apnea (AHI >5/hTST) in people with DS between the ages of 3 and 24 years (Skotko et al., 2017). This promising result suggested that a “negative” result could obviate the need for a polysomnogram, using a screen of low-cost measures that could be collected in the primary care office. The current validation study, an important next step before such a model could be routinely used in a clinical setting, yielded suboptimal results. In this validation cohort, the model had a validated NPV of 74% for moderate or severe apnea (AHI >5/hTST) in people with DS between the ages of 3 and 35 years. This means that a “negative” screen accurately predicted no or mild apnea about 74% of the time. Given the substantial impact that untreated moderate to severe OSA can impose on the intellectual abilities (Breslin et al., 2014) and health (Andreou et al., 2002; Capone et al., 2013; Chen et al., 2013) of people with DS, providing false reassurance to roughly one in four people with DS who screen negative would not be clinically justified. The NPV also did not substantially improve when other exploratory variables, such as urinary biomarkers and oximeter readings, were included.
Similar to our previously published work, the models also did not have strong PPVs. A “positive” screen on the validation models accurately predicted moderate or severe apnea less than 50% of the time. The diagnostic odds ratio of 3.1 in our previous work (Skotko et al., 2017) was reduced to 2.4 in our best validation model in this study. Positive likelihood ratios above 10 and negative likelihood ratios below 0.1 are considered to provide convincing diagnostic evidence, whereas those above 5 and below 0.2 provide strong diagnostic evidence (Deeks, 2001). The results of our best validation model did not meet either of these thresholds.
There can be several explanations for these differences. First, the LLM model in the previous work was likely an overfit of the available data, an issue that occurs in any data modeling to some degree, but one that can be substantial when applying flexible machine learning techniques to small training sets. When validated on a novel sample, the model did not perform as strongly. In addition, the broad phenotypic diversity of people with DS may have further diminished the generalizability of the model predictions when applied to a different population set. We sought to include the physical characteristics and co-occurring medical conditions most likely to predict OSA in our original modeling. Any unmeasured variables should be explored in future studies.
While the current study did not find a combination of variables that could reliably screen for OSA in people with DS, many novel elements were evaluated across both studies, including lateral cephalograms, 3D photogrammetry, and urinary biomarkers (Skotko et al., 2017). Future research could avoid duplicating these efforts, while building upon those elements that seemed to yield more robust contributions to the predictive model, including physical exam findings, caregiver-completed surveys, and anthropometric measurements. Other researchers have also attempted but not succeeded in producing reliably predictive OSA screens for this population (de Miguel-Diez et al., 2003; Jheeta et al., 2013; Marcus et al., 1991; Shires et al., 2010; Shott, 2006; Stores & Stores, 2014).
This study was not without limitations. Children under the age of 3 years were not included given the anticipated difficulty for them to comply with the more involved study elements required for inclusion in the model. The promising noninvasive elements (e.g., caregiver-completed questionnaires and simple physical examination findings) might show more promise in a younger cohort. Our primary outcome measure was total AHI, which is composed of the central apnea index and the obstructive AHI. A possibility remains that some variables might have been predictive of these individual components. This study also did not include the totality of possible predictive measures. We did not include objective measurements of participants' intellectual abilities, although we are not aware of any previous research demonstrating a pathogenic relationship between intellectual capacity and OSA. As technology advances—particularly personal wearable devices tracking sleep quality (Alma et al., 2022)—future research efforts might still be valuable.
In summary, notwithstanding extensive efforts, a clinically reliable screening tool for OSA in people with DS was not achieved. As such, people with DS should continue to be monitored according to current healthcare guidelines. The American Academy of Pediatrics (AAP) recommends that physicians screen annually for symptoms of OSA in people with DS. Regardless of symptoms, though, all children with DS should be referred for a polysomnogram between 3 and 4 years of age, given the high prevalence of OSA in this population (Bull et al., 2022). While the medical guidelines for adults with DS do not address screening for OSA (Tsou et al., 2020), a recent literature review and expert consensus suggested that adults should also be regularly evaluated for symptoms, as the OSA burden remains significant (Capone et al., 2018; Chicoine & McGuire, 2010; Jensen & Bulova, 2014).
AUTHOR CONTRIBUTIONS
Conceptualization: Brian G. Skotko, Alexandra Garza Flores, and David Gozal. Data curation: Alexandra Garza Flores, Mary Ellen McDonough, and Vasiliki Patsiogiannis. Formal analysis: Damiano Verda, Marco Muselli, David Gozal, and Eric A. Macklin. Funding acquisition: Brian G. Skotko. Investigation: Brian G. Skotko, Alexandra Garza Flores, Ibrahim Elsharkawi, Vasiliki Patsiogiannis, and Mary Ellen McDonough. Methodology: Brian G. Skotko, Alexandra Garza Flores, Ibrahim Elsharkawi, Vasiliki Patsiogiannis, Mary Ellen McDonough, Roberto Hornero, and David Gozal. Project administration: Brian G. Skotko, Alexandra Garza Flores, Mary Ellen McDonough, and David Gozal. Resources: Brian G. Skotko. Software: Vasiliki Patsiogiannis and Marco Muselli. Supervision: Brian G. Skotko and David Gozal. Validation: Eric A. Macklin. Visualization: Eric A. Macklin. Writing – original draft: Brian G. Skotko. Writing – review and editing: Brian G. Skotko, Alexandra Garza Flores, Ibrahim Elsharkawi, Vasiliki Patsiogiannis, Mary Ellen McDonough, Marco Muselli, Roberto Hornero, David Gozal, and Eric A. Macklin.
ACKNOWLEDGMENTS
We thank all of the sleep center technologists, nurses, and staff at Massachusetts General Hospital Adult Sleep Lab, especially Karen Gannon, and Massachusetts General Hospital for Children Pediatric Sleep Disorders Program, especially Ellen Grealish. We further thank Drs. Bernard Kinane and Matt Bianci for their interpretations of the polysomnograms. Drs. Allie Schwartz, Jose Florez, and Jessica McCannon helped with patient recruitment; Jennifer Dever and Leah McDonough provided valuable research assistance. Dr. Cynthia Morton provided the opportunity for Alexandra Garza Flores to participate in this study through T32 grants. We thank Shimon Sharon and Yossi Shamir of ToolsGroup and Ester Pescio of Rulex, Inc., for providing access to the statistical analyses. We are grateful to Nonin for donating the WristOx2 Model 3150 used in this study.
FUNDING INFORMATION
HRSA MCHB, Grant number: R40MC25322-01-00; NICHD, Grant number: F32HD068101; NIH Grant number: 4T32GM007748-37 and 4T32GM007748-38.
CONFLICT OF INTEREST
Dr. Skotko occasionally consults on the topic of Down syndrome through Gerson Lehrman Group. He receives remuneration from Down syndrome nonprofit organizations for speaking engagements and associated travel expenses. Dr. Skotko received annual royalties from Woodbine House, Inc., for the publication of his book, Fasten Your Seatbelt: A Crash Course on Down Syndrome for Brothers and Sisters. Within the past 2 years, he has received research funding from F. Hoffmann-La Roche, Inc., AC Immune, and LuMind IDSC Down Syndrome Foundation to conduct clinical trials for people with Down syndrome. Dr. Skotko is occasionally asked to serve as an expert witness for legal cases where Down syndrome is discussed. Dr. Skotko serves in a nonpaid capacity on the Honorary Board of Directors for the Massachusetts Down Syndrome Congress and the Professional Advisory Committee for the National Center for Prenatal and Postnatal Down Syndrome Resources. Dr. Skotko has a sister with Down syndrome.
Open Research
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.