Bronchopulmonary dysplasia: A problem of prediction or a problem of diagnosis?
While mortality and most morbidities are decreasing in extremely low gestational age neonates (ELGAN), bronchopulmonary dysplasia (BPD) rates are not.1, 2 This trend persists despite a huge number of research papers on BPD (more than 2500 in the last 5 years in PubMed alone), and it reflects the many failures and few successes in preventing the disease.
If we are not able to prevent it, are we at least able to predict it, which would be a prerequisite for better defining populations for clinical studies? In this issue of Acta Paediatrica, Baker and Davis3 use the NICHD ‘BPD outcome estimator’4 on an Australian sample of ELGAN and find that its discriminatory performance, measured as the area under the ROC curve (AUC), is 0.81-0.84 (depending on the day of life of the infant) in predicting the composite outcome of BPD or death.
Their results roughly confirm a previous meta-analysis that had tested 19 clinical models for BPD prediction.5
Surely, an AUC of 0.8 is too low to allow clinical decisions on an individual patient, and predicting the composite of BPD or death is easier than predicting BPD alone.5 But how about populations? The AUC can be interpreted as the probability that the score of a randomly selected infant with BPD will be greater than the score of a randomly selected non-BPD infant. Therefore, an AUC of 0.8 implies that 20% of such pairs are ranked in the wrong order.
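The pairwise reading of the AUC given above can be made concrete with a minimal Python sketch. The function name `auc_by_pairs` and the ten scores are invented for illustration only; they are not data from Baker and Davis or from the NICHD estimator.

```python
from itertools import product


def auc_by_pairs(scores_with_outcome, scores_without_outcome):
    """Fraction of (case, non-case) pairs in which the case has the higher score.

    Tied pairs count as half a correct ranking.
    """
    pairs = list(product(scores_with_outcome, scores_without_outcome))
    concordant = sum(
        1.0 if case > control else 0.5 if case == control else 0.0
        for case, control in pairs
    )
    return concordant / len(pairs)


# Hypothetical predicted risks, invented for illustration only.
bpd_scores = [0.9, 0.8, 0.7, 0.6, 0.25]
no_bpd_scores = [0.75, 0.5, 0.3, 0.2, 0.1]

auc = auc_by_pairs(bpd_scores, no_bpd_scores)  # 20 of 25 pairs correctly ranked
misranked = 1 - auc                            # share of pairs in the wrong order
```

With these invented scores, 20 of the 25 possible pairs are ranked correctly, so the AUC is 0.8 and one pair in five is misranked.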
Moreover, the NICHD score substantially underestimated the real BPD risk in the sample; that is, it is poorly calibrated (‘uncalibrated’). Poor calibration is also a problem frequently encountered with BPD prediction.5 The AUC value is agnostic to poor calibration because it is based on the ranks of scores, not their actual values. But for clinicians and parents, this error matters.
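Why the AUC cannot detect miscalibration can be sketched in a few lines of Python. The example below is an illustration under stated assumptions: the risks and outcomes are invented, and halving every predicted risk is simply one convenient way of producing a score that underestimates risk while leaving the ranking of infants untouched.

```python
from itertools import product


def auc_by_pairs(cases, controls):
    """Probability that a random case outranks a random control (ties = 0.5)."""
    pairs = list(product(cases, controls))
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0 for a, b in pairs
    ) / len(pairs)


def mean(values):
    return sum(values) / len(values)


# Hypothetical predicted risks, invented for illustration only.
risk_in_cases = [0.9, 0.8, 0.7, 0.6, 0.25]     # infants with the outcome
risk_in_controls = [0.75, 0.5, 0.3, 0.2, 0.1]  # infants without the outcome

observed_frequency = len(risk_in_cases) / (len(risk_in_cases) + len(risk_in_controls))

# A second, deliberately miscalibrated score: every predicted risk is halved.
halved_cases = [r / 2 for r in risk_in_cases]
halved_controls = [r / 2 for r in risk_in_controls]

auc_original = auc_by_pairs(risk_in_cases, risk_in_controls)
auc_halved = auc_by_pairs(halved_cases, halved_controls)

mean_predicted_original = mean(risk_in_cases + risk_in_controls)
mean_predicted_halved = mean(halved_cases + halved_controls)
```

In this toy sample the observed frequency is 0.5; the original score predicts 0.51 on average, the halved score only 0.255, yet both have exactly the same AUC because the ordering of infants is identical.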
Why is it so difficult to predict BPD? One reason is the high inter-hospital variability of BPD frequency, greater than that of other morbidities,1 probably partly related to healthcare-associated, non-biological reasons. This makes it difficult to generalise findings from one study to another.
Another reason would be the use of the wrong predictors. The NICHD algorithm uses 5 variables: GA, birth weight, sex, mean FiO2 on a given day and mode of ventilation. One might rightly argue that these data are too coarse to allow a fine-tuned prediction. Moreover, the importance of these variables (and the amount of information provided) is vastly different. GA and birth weight convey much more information than sex and result in a sort of ceiling effect. For instance, at ≤24 weeks more than 80% of infants have BPD.6 This baseline risk, before any other information on an individual baby is known, is already highly shifted towards BPD, and other variables have limited scope to increase or decrease the risk.
But low granularity of our prediction variables is only part (and perhaps a small part) of the problem. A second reflection concerns the definition of BPD, which is very different from that of other diagnoses.
During the last 50 years, since the initial description of BPD by Northway et al, the disease itself and its definitions have changed.6, 7 The inadequacies of definitions of BPD have been recently reviewed,6, 8 focusing especially on changes in the population of neonates at risk of BPD and in available modes of ventilation.
Clearly, the complexity of BPD puts it outside the scope of this commentary.6-8 I would like to consider here another critical aspect of the diagnosis, which sets BPD in a category of its own and affects prediction as well: the constraint of diagnosing BPD at 36 weeks. This feature was absent from Northway's definition; it was introduced by Shennan in 1988 and repeated in 2001 by the NICHD6 (the 2 most frequently used definitions of BPD), and is by now entrenched in the mind of all neonatologists, although BPD starts much earlier than 36 weeks, even prenatally.6
The rationale for choosing the 36-week timeline was to maximise the prediction of long-term adverse pulmonary outcomes in infancy; that is, BPD at 36 weeks is a predictive, surrogate diagnosis.
It is worth noting that for other complications of prematurity that do not occur immediately after birth but take some days or weeks to develop, such as periventricular leucomalacia and retinopathy of prematurity (ROP), a strict timepoint for evaluation and diagnosis is not used. For instance, the neonatal community considers an infant as having ROP if s/he has ROP at any time after birth, not at a prespecified timepoint such as 36 weeks. Nor is ‘ROP at 36 weeks’ chosen for its ability to predict later myopia or other eye problems.
Thus, BPD suffers from a dual nature: that of a genuine disease of preterm infants and that of a surrogate outcome (though the ability of a BPD diagnosis to predict later lung conditions is rather low6, 7). This second, surrogate, aspect has prevailed, obscuring the first. Any review of BPD contains details on the pathophysiology of cardiopulmonary impairment. Studies on the respiratory course of ELGAN demonstrate that lung disease in the first 14 days follows distinct, well-recognisable trajectories in these infants, with different outcomes and ‘BPD-at-36-weeks’ risks.6 Different clinical phenotypes of BPD have been described.9 Yet, the diagnostic process squanders this information and is based only on the level of respiratory support.
Neonates who according to current definitions have respiratory problems between the first week of life (when symptoms can be due to respiratory distress syndrome, RDS) and 36 weeks post-menstrual age (when it is BPD) live or die in a diagnostic limbo. They do not have RDS anymore and cannot have BPD.
Until now, if they died of respiratory insufficiency—not an infrequent occurrence—they could not be counted as BPD deaths. Now things are changing: the recent NICHD workshop8 has finally proposed that infants dying because of respiratory failure between 14 days and 36 weeks have severe (grade IIIA) BPD.
Still, infants who die from other causes before 36 weeks can no longer be diagnosed with BPD; therefore, death and BPD are ‘competing events’. As epidemiologic theory has clearly shown, if we examine only the survivors (or survivors + BPD grade IIIA deaths), the estimation of risk factors and of their contribution to outcomes will be distorted by a ‘collider-stratification’ (selection) bias.10
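The mechanism of this bias can be sketched with a small simulation. Everything in it is a hypothetical assumption made for illustration: an early exposure that raises mortality but has no effect at all on the lungs, an unmeasured ‘frailty’ that raises both mortality and the risk of BPD, and arbitrary probabilities. The simulated ‘BPD’ is a latent state assigned to every infant, including those who die, so that the whole cohort can be compared with the survivors.

```python
import random


def simulate(n=200_000, seed=1):
    """Return BPD risk by exposure, in the whole cohort and in survivors only."""
    rng = random.Random(seed)
    # counts[group][exposed] = [number of infants, number with latent BPD]
    counts = {
        "all": {False: [0, 0], True: [0, 0]},
        "survivors": {False: [0, 0], True: [0, 0]},
    }
    for _ in range(n):
        exposed = rng.random() < 0.5  # hypothetical early exposure
        frail = rng.random() < 0.5    # hypothetical unmeasured frailty
        # Death before 36 weeks depends on BOTH exposure and frailty (the collider).
        died = rng.random() < 0.05 + 0.25 * exposed + 0.60 * frail
        # Latent BPD depends on frailty only: the exposure has NO effect on it.
        bpd = rng.random() < 0.10 + 0.60 * frail
        for group in (["all"] if died else ["all", "survivors"]):
            counts[group][exposed][0] += 1
            counts[group][exposed][1] += bpd
    return {
        group: {exp: with_bpd / total for exp, (total, with_bpd) in by_exp.items()}
        for group, by_exp in counts.items()
    }


risk = simulate()
difference_all = risk["all"][True] - risk["all"][False]
difference_survivors = risk["survivors"][True] - risk["survivors"][False]
```

Under these assumptions the exposure is unrelated to BPD in the whole cohort, yet among survivors the exposed infants show a clearly lower BPD risk: exposed infants who survive are disproportionately the non-frail ones. Restricting the analysis to survivors therefore makes a harmful exposure look protective for the lung, and a prediction model trained on survivors would inherit the same distortion.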
Though selection bias is usually discussed in the context of causal studies, it affects predictive studies as well. This problem is little considered in practice: only 5% of the papers examined by Hines et al7 reported whether mortality before 36 weeks was considered. But selection bias can produce wrong results11: in the presence of censoring by death, neither adjusting for selection bias nor leaving it unadjusted is a viable choice.11
The alternative chosen by Baker and Davis (and by innumerable other neonatologists) is to use a composite outcome (‘death or BPD’). In many randomised trials, this strategy is the default approach to the handling of competing risks. Because randomisation on average cancels out confounding, this strategy is sound as we can assume that all outcomes are in fact caused by the treatment.
Conversely, in observational studies, where confounding exists and the assumption of only one cause does not hold, creating a composite outcome (i) attributes the same weight to death and BPD, which is questionable, and (ii) causes the estimates of risk factors, and hence their weight in prediction, to depend on the frequency of the 2 outcomes.11 This is the case for ventilation, which worsens BPD risk but is a life-saving procedure. Interestingly, in the study by Baker and Davis3 the predictive ability did not change much from day 0 to day 28, as if ventilation mode and FiO2 did not add much to the current diagnostic framework of BPD.
In the original NICHD study,4 death and BPD had about the same frequency (13% each, 26% in total), versus the present study3 where the relative frequency of death to BPD was 1:4, and the total frequency 48%: the higher overall frequency explains the poor calibration discussed above.
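How the weight of a predictor of the composite outcome depends on the mix of death and BPD can be shown with simple arithmetic. The sketch below rests on loudly stated assumptions: the two relative risks (0.5 for death, 1.5 for BPD) are invented, the overall frequencies quoted above are used as if they were the risks among unexposed infants, and death and BPD are treated as mutually exclusive. Only the outcome frequencies come from the text (13% + 13%, and 48% split 1:4, that is 9.6% deaths and 38.4% BPD).

```python
def composite_relative_risk(p_death, p_bpd, rr_death, rr_bpd):
    """Relative risk of 'death or BPD' for a factor with separate effects on each.

    Assumes death and BPD are mutually exclusive, so their risks simply add.
    """
    risk_unexposed = p_death + p_bpd
    risk_exposed = rr_death * p_death + rr_bpd * p_bpd
    return risk_exposed / risk_unexposed


# Hypothetical factor: halves the risk of death, raises the risk of BPD by 50%.
RR_DEATH, RR_BPD = 0.5, 1.5

# Outcome mix of the original NICHD study: 13% deaths, 13% BPD.
rr_original = composite_relative_risk(0.13, 0.13, RR_DEATH, RR_BPD)

# Outcome mix of the present study: 48% in total, death to BPD 1:4.
rr_present = composite_relative_risk(0.096, 0.384, RR_DEATH, RR_BPD)
```

With these numbers the same factor appears to have no association with the composite outcome in the first setting (relative risk about 1.0) and to be a risk factor in the second (about 1.3), although its effects on death and on BPD are identical in both.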
What is the way out of this? The new NICHD classification8 accounts for respiratory deaths before 36 weeks, though for other causes of death the problem of competing events remains.
BPD is a very complex syndrome,6, 8, 9 not the state of an infant at a certain time, as if at the finish line of a race. A modest proposal would be to return the BPD diagnosis to its role of indicating a not uncommon clinical condition: elaborating on its pathophysiological substrates, employing diagnostic criteria shared within the neonatal community, allowing the diagnosis to be made at any time after the first week of life, and relinquishing the 36-week timeline, whose merit appears to be that of a (misclassified) surrogate of a later diagnosis of pulmonary/vascular disease.
The 36-week timeline—suitably renamed—could be maintained to allow comparisons of secular trends in grouped data, but not as a neonatal outcome proper in individual subjects.
CONFLICTS OF INTEREST
The author has no conflicts of interest to declare.