Volume 2025, Issue 1 2048711
Research Article
Open Access

Validation of the Frailty-Adjusted Prognosis Tool for 30-Day Mortality in Older Emergency Department Patients

A. Z. Szczesna

Emergency Department, University Hospital Basel, University of Basel, Petersgraben 2, Basel, CH-4031, Switzerland, unibas.ch

S. K. Nissen

Department of Clinical Medicine, University of Southern Denmark, Campusvej 55, Odense, DK-5230, Denmark, sdu.dk

Department of Emergency Medicine, Odense University Hospital, J. B. Winsløws Vej 4, Odense, DK-5000, Denmark, ouh.dk

M. Brabrand

Department of Clinical Medicine, University of Southern Denmark, Campusvej 55, Odense, DK-5230, Denmark, sdu.dk

Department of Emergency Medicine, Odense University Hospital, J. B. Winsløws Vej 4, Odense, DK-5000, Denmark, ouh.dk

R. Bingisser

Emergency Department, University Hospital Basel, University of Basel, Petersgraben 2, Basel, CH-4031, Switzerland, unibas.ch

C. H. Nickel (Corresponding Author)

Emergency Department, University Hospital Basel, University of Basel, Petersgraben 2, Basel, CH-4031, Switzerland, unibas.ch

First published: 04 July 2025
Academic Editor: Nikhat Kaura

Abstract

Aim: Accurate prognostication in the older population in the emergency department (ED) is a crucial but difficult skill. Both frailty and vital signs are independent predictors of mortality, but relying on vital signs alone underestimates risk. The Frailty-adjusted Prognosis tool (FaP-ED) was developed to predict 30-day mortality in older ED patients by combining vital signs with degree of frailty. We aim to validate FaP-ED in an independent ED population for 30-day mortality prediction.

Method: This study is based on a single-centre, observational prospective cohort of undifferentiated consecutive ED patients ≥ 65 years. FaP-ED combines the National Early Warning Score (NEWS) and the Clinical Frailty Scale (CFS) in multivariable logistic regression. We assessed discrimination of FaP-ED using the area under the receiver operating characteristic curve (AUROC) and calibration using slope and intercept.

Results: Among 1166 analysed patients, median age was 78 years, and 53.0% were female. In total, 2.7% died within 30 days of presentation to ED. The median NEWS was 1.0 and the median CFS was 3. FaP-ED showed good discrimination with an area under the curve (AUC) of 0.84 in comparison with NEWS (AUC 0.82) and CFS (AUC 0.79) as well as good calibration (slope 1.05; intercept −0.27) compared to NEWS (slope 1.13; intercept −0.24) and CFS (slope 0.95; intercept −0.57).

Conclusion: FaP-ED showed robust prognostic performance in temporal validation, with less biased estimates than NEWS and CFS alone. It could be implemented as an integral adjunct in addition to holistic, pragmatic, patient-centred care of the older population.

Trial Registration: ClinicalTrials.gov identifier: NCT05400707

1. Introduction

Emergency physicians are seeing an increasing number of older patients presenting with higher levels of comorbidity, frailty and acuity [1]. Recognition of illness severity and the implications thereof are crucial to guide appropriate management in this vulnerable and complex population.

Early Warning Scores (EWSs) are tools based on vital signs designed to identify patients at higher risk of clinical deterioration and mortality, so that appropriate measures can be initiated [2]. The National Early Warning Score (NEWS) is a frequently used tool to predict clinical deterioration and risk of 24 h mortality [3, 4]. However, vital signs are neither sensitive nor specific for the prediction of severe illness, intensive care unit (ICU) admission or death in older adults [5]. Recent findings have shown that age substantially affects the ability of NEWS to predict 24 h mortality, particularly in patients aged 80 years and older, where a severe underestimation of in-hospital mortality has been observed [6].

Frailty has been shown to be an independent predictor of mortality [7–10] and is an even stronger predictor of mortality than age [11]. It is defined as a state with decreased reserve and limited resistance to stressors which in turn increases the risk of adverse outcomes [12–14]. Frailty is common among older patients in the emergency department (ED) with reported prevalence rates between 21% and 67% [14, 15]. A recent international flash mob study involving 14 European countries reported that 40% of older adults in ED were living with frailty [16]. It has been suggested that incorporating frailty into the assessment of older patients in ED could allow for more accurate evaluation, enabling timely and appropriate clinical actions [13]. There are many different tools to assess frailty and patient vulnerability, notably the gold standard comprehensive geriatric assessment (CGA) which requires considerable resources [17]. The Clinical Frailty Scale (CFS) is a pragmatic frailty assessment tool, which takes less than a minute to complete on average [18], and has been validated for use in ED [14, 19, 20]. CFS independently predicts hospital admission, 1-month survival after ICU admission, survival 1 year after traumatic injury, outcomes after cardiopulmonary resuscitation, hospital length of stay and in-hospital, 30-day and 1-year mortality [14, 21–25].

Given that frailty outperforms age in terms of mortality prediction [11], we see the need for frailty-adjusted tools to improve predictive accuracy. The FaP-ED tool was developed to assess 30-day mortality in older adults by combining frailty and aggregated vital signs in multivariable logistic regression. Internal validation showed accurate risk assessment [26]. External validation, including temporal validation, is required to assess whether a prediction model is reproducible and generalisable to new and different patient groups [27]. We aim to temporally validate the FaP-ED tool on a new cohort of patients aged 65 and older, to determine its value as a useful adjunct for clinicians when making decisions regarding care.

2. Materials and Methods

2.1. Study Design

This study is based on a single-centre prospective cohort with consecutive sampling. All patients aged 65 years and above presenting to the ED at the University Hospital of Basel between April 25 and May 30, 2022, were assessed for inclusion 24 h a day, 7 days a week. Data collection conformed to the principles outlined in the Declaration of Helsinki, and the study was approved by the local ethics committee (https://www.eknz.ch, Nr. 236/13). The University Hospital of Basel is a tertiary care centre with an annual census of approximately 55,000 patients aged 16 and above; approximately one-third of patients are aged 65 and above [28]. Reporting of this study adheres to the Transparent Reporting of a multivariable model for Individual Prognosis or Diagnosis (TRIPOD) guidelines for prognostic modelling studies [29].

2.2. Selection of Participants

Patients aged 65 and above were consecutively screened for inclusion at presentation to the ED. Only index visits were included, meaning that if patients re-presented to the ED within 30 days during the inclusion period, subsequent presentations were excluded from the analysis. Cognitively impaired patients were included to increase generalisability and minimise bias according to recommendations [30]. Given the all-comer nature and low risk category of the study, we did not obtain written consent. Patients were informed that they could withdraw from the study at any point.

2.3. Data Collection

Vital signs used for NEWS, namely, systolic blood pressure, heart rate, respiratory rate, oxygen saturation, use of supplemental oxygen, temperature and consciousness, were recorded as part of standard triage at presentation. NEWS was developed and updated by the Royal College of Physicians in 2012 and 2017, respectively, and is a widely used scoring system based on aggregated vital signs to predict clinical deterioration. The score assigns between 0 and 3 points per vital sign measured, with higher points equating to a higher degree of abnormality; 2 points are allocated if the patient is receiving supplemental oxygen [3, 4]. We used an adaptation of the original NEWS score which assessed consciousness using the ACVPU scale. In case of any missing vital sign values, a subsequent chart review of digital patient records including scanned paper notes for all patients in question was performed, to ensure data completeness. All vital signs recorded within 30 min of patient arrival in ED were included.

The CFS is a commonly used score to assess frailty [31]. It was developed within the Canadian Study of Health and Aging and is a 9-level ordinal scale ranging from very fit (score 1) to living with very severe frailty (score 8), and score 9 is reserved for those who are terminally ill [32]. CFS was assessed by both triage clinicians (both nurses and liaison physicians) and study personnel. We primarily used triage clinicians’ CFS score for our analyses. If this was missing it was substituted by CFS score assigned by study personnel, given that good interrater reliability was demonstrated for both groups despite differences in clinical experience and training. Triage clinicians were taught using case studies, while study personnel received general training regarding CFS as well as training material provided in an application [28].

2.4. Outcomes

The primary outcome measure was 30-day all-cause mortality, obtained from patients’ electronic health records, official registries or insurance records. This measure was chosen as a pragmatic marker of prognosis which could help guide goals of care conversations, bearing in mind its limitations in a holistic medical context.

2.5. Statistical Analysis

FaP-ED uses multivariable logistic regression combining NEWS and CFS [26]. Patients with a CFS 9 (‘terminally ill’) were excluded from analyses due to the risk of overestimating the association between CFS and mortality, in accordance with previous studies [10, 14, 21, 26]. As done previously during development and internal validation, CFS 1 (‘very fit’) was combined with CFS 2 (‘fit’) due to absence of events. Given the very low rate of missing vital signs, we considered multiple imputation appropriate. Multiple imputation of missing vital signs was performed after chart review using the chained equation procedure [33], using the same parameters that were used in internal validation. Fits from internal validation were used for FaP-ED, NEWS and CFS to enable overall comparison [26].
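
The general structure of such a model can be sketched in Python as follows. This is an illustration only and not the published FaP-ED model: the coefficient values are hypothetical placeholders (the fitted values come from the development study [26] and are not reproduced here), and the function names, the linear NEWS term and the categorical coding of CFS are assumptions made for the sketch. The merging of CFS 1 with CFS 2 and the exclusion of CFS 9 follow the description above.

```python
import math

# HYPOTHETICAL coefficients for illustration only; NOT the published FaP-ED fit.
INTERCEPT = -6.0
BETA_NEWS = 0.25  # assumed linear effect per NEWS point
BETA_CFS = {      # assumed categorical effects, reference category = CFS 1-2
    "1-2": 0.0, "3": 0.3, "4": 0.6, "5": 0.9, "6": 1.4, "7": 1.9, "8": 2.5,
}


def cfs_category(cfs: int) -> str:
    """Map a CFS score to a model category, merging CFS 1 and CFS 2."""
    if cfs == 9:
        raise ValueError("CFS 9 (terminally ill) is excluded from the analysis")
    if not 1 <= cfs <= 8:
        raise ValueError("CFS must be an integer from 1 to 9")
    return "1-2" if cfs <= 2 else str(cfs)


def predicted_risk(news: int, cfs: int) -> float:
    """Return a predicted 30-day mortality probability from the logistic model."""
    linear_predictor = INTERCEPT + BETA_NEWS * news + BETA_CFS[cfs_category(cfs)]
    return 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    # Compare a low-acuity, non-frail patient with a high-acuity, frail patient.
    for news, cfs in [(1, 3), (5, 7)]:
        print(f"NEWS={news}, CFS={cfs}: risk={predicted_risk(news, cfs):.3f}")
```

With real coefficients, the same function would return the individual predicted probabilities that are then evaluated for discrimination and calibration.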

Predictive performance was assessed using measures of discrimination and calibration. The area under the receiver operating characteristic curve (AUROC) was used to assess discrimination, while calibration was assessed using slope and intercept.
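
As an illustration of the discrimination measure, the AUROC can be read as the probability that a randomly chosen nonsurvivor receives a higher predicted risk than a randomly chosen survivor. The following Python sketch computes it directly from that definition. The analyses in this study were performed in R, so this is not the code used here, and the function name `auroc` and the toy data are assumptions for illustration.

```python
def auroc(y_true, y_score):
    """AUROC as the share of (event, non-event) pairs ranked correctly.

    Ties between an event and a non-event count as one half.
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


if __name__ == "__main__":
    # Toy data: 1 = died within 30 days, 0 = survived.
    outcomes = [0, 0, 1, 1]
    risks = [0.1, 0.4, 0.35, 0.8]
    print(auroc(outcomes, risks))  # 3 of 4 pairs are ranked correctly: 0.75
```

The pairwise loop is quadratic in the worst case, which is unproblematic for a cohort of this size; rank-based formulas give the same value more efficiently.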

Using the outcome prevalence of 0.054 from internal validation [26], a minimal sample size of 575 with 32 events was deemed necessary with the pmsampsize package for R. All statistical analyses were performed using R, Version 4.3.1 [34].

3. Results

In total, 1349 patients were included in the cohort. Of these, 0.7% (N = 10) were undergoing either mechanical ventilation or cardiopulmonary resuscitation at presentation, CFS was missing for 8.4% (N = 113), 0.4% (N = 6) were terminally ill with CFS 9 and 4.0% (N = 54) were lost to follow-up, leaving 86.4% (N = 1166) for analyses. A total of 126 vital sign values among 82 patients were missing in the dataset. A chart review of these patients was performed, allowing for an additional 64 vital sign values to be completed, leaving 62 missing vital signs among 43 patients which were multiply imputed. Possible reasons for incomplete study documentation for these patients are the high-pressure clinical environment and rapid onward transfer to ICU, catheter labs or operating theatres. Median age was 78 years (IQR: 13), and 53.0% (N = 618) were female. In total, 2.7% (N = 31) died within 30 days of presentation to ED. The median NEWS was 1.0 (IQR: 3.0), and the median CFS was 3 (IQR: 2). Further patient characteristics are shown in Table 1.
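
The participant flow above can be checked with a few lines of arithmetic. The Python sketch below uses only the counts reported in this paragraph; the variable and function names are illustrative.

```python
included = 1349
exclusions = {
    "mechanical ventilation or CPR at presentation": 10,
    "missing CFS": 113,
    "CFS 9 (terminally ill)": 6,
    "lost to follow-up": 54,
}
deaths = 31


def pct(n, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100.0 * n / total, 1)


analysed = included - sum(exclusions.values())

if __name__ == "__main__":
    for reason, n in exclusions.items():
        print(f"{reason}: {n} ({pct(n, included)}%)")
    print(f"analysed: {analysed} ({pct(analysed, included)}%)")
    print(f"30-day mortality: {deaths} ({pct(deaths, analysed)}%)")
```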

Table 1. Baseline characteristics for survivors and nonsurvivors.

| Characteristic | 30-day survivors (N = 1135) | Nonsurvivors (N = 31) | All (N = 1166) |
| --- | --- | --- | --- |
| Age (years), median [IQR] | 78 [13] | 83 [14] | 78 [13] |
| Sex: female, n (%) | 603 (53.1) | 15 (48.4) | 618 (53.0) |
| Sex: male, n (%) | 532 (46.9) | 16 (51.6) | 548 (47.0) |
| ESI∗∗, median [IQR] | 3.0 [1.0] | 2.0 [1.0] | 3.0 [1.0] |
| ESI∗∗, missing, n (%) | 4 (0.4) | 0 (0.0) | 4 (0.3) |
| NEWS∗∗, median [IQR] | 1.0 [2.0] | 5.0 [5.0] | 1.0 [3.0] |
| NEWS∗∗, missing, n (%) | 41 (3.6) | 2 (6.5) | 43 (3.7) |
| CFS∗∗ 1, n (%) | 43 (3.8) | 0 (0.0) | 43 (3.7) |
| CFS∗∗ 2, n (%) | 219 (19.3) | 3 (9.7) | 222 (19.0) |
| CFS∗∗ 3, n (%) | 332 (29.3) | 0 (0.0) | 332 (28.5) |
| CFS∗∗ 4, n (%) | 190 (16.7) | 3 (9.7) | 193 (16.6) |
| CFS∗∗ 5, n (%) | 149 (13.1) | 2 (6.5) | 151 (13.0) |
| CFS∗∗ 6, n (%) | 126 (11.1) | 8 (25.8) | 134 (11.5) |
| CFS∗∗ 7, n (%) | 65 (5.7) | 10 (32.3) | 75 (6.4) |
| CFS∗∗ 8, n (%) | 11 (1.0) | 5 (16.1) | 16 (1.4) |

  • Note: NEWS scores were calculated from vital signs available post-chart review, prior to multiple imputation procedure.
  • Sex refers to binary sex categorisation at birth.
  • ∗∗Emergency Severity Index (ESI), NEWS and CFS refer to scores obtained at presentation to ED.

FaP-ED demonstrated an AUROC of 0.84 (95% CI = 0.75–0.93), compared to NEWS (AUROC 0.82, 95% CI = 0.74–0.90) and CFS (AUROC 0.79, 95% CI = 0.68–0.89) (see Figure 1). FaP-ED discriminated significantly better than CFS overall (DeLong: Z = −2.9, p < 0.05) as well as in the > 80 years age group (DeLong: Z = −2.8, p < 0.05). FaP-ED was not significantly better than NEWS in any age group (overall: DeLong: Z = −0.6, p = 0.57; 65–80 years: DeLong: Z = −0.3, p = 0.78; > 80 years: DeLong: Z = −0.2, p = 0.87) or than CFS in the 65–80 years age group (DeLong: Z = −1.1, p = 0.29). Calibration of FaP-ED was also maintained, with slope 1.05 and intercept −0.27, compared to NEWS (slope 1.13, intercept −0.24) and CFS (slope 0.95, intercept −0.57). Further details of prognostic performance, including performance across different age categories, are given in Table 2. Likelihood ratios for 30-day mortality by NEWS and CFS score are shown in Figure 2. Calibration plots for FaP-ED, NEWS and CFS for age categories 65–80 years, > 80 years and overall can be found in the Supporting Information.

Figure 1. Receiver operating characteristic curves for FaP-ED, CFS and NEWS. AUROC with corresponding 95% CI is printed for all three tools.

Table 2. Calibration and discrimination measures for FaP-ED, NEWS and CFS.

| Tool | Age category | Calibration intercept | Calibration slope | AUC | 95% CI |
| --- | --- | --- | --- | --- | --- |
| FaP-ED | Overall | −0.27 | 1.05 | 0.841 | 0.749–0.932 |
| FaP-ED | 65–80 years | −0.53 | 1.11 | 0.828 | 0.623–1.000 |
| FaP-ED | > 80 years | −0.18 | 0.99 | 0.823 | 0.711–0.936 |
| NEWS | Overall | −0.24 | 1.13 | 0.816 | 0.736–0.897 |
| NEWS | 65–80 years | −0.61 | 1.26 | 0.802 | 0.612–0.992 |
| NEWS | > 80 years | 0.12 | 1.09 | 0.815 | 0.727–0.903 |
| CFS | Overall | −0.57 | 0.95 | 0.788 | 0.683–0.892 |
| CFS | 65–80 years | −0.94 | 0.95 | 0.778 | 0.547–1.000 |
| CFS | > 80 years | −0.52 | 0.89 | 0.759 | 0.633–0.884 |

  • Note: Intercept, slope and AUC are shown for overall age category, as well as 65–80 years and > 80 years, respectively.

Figure 2. Likelihood ratios for 30-day mortality. This heatmap displays likelihood ratios for 30-day mortality for individuals aged ≥ 65 by NEWS and CFS with cutoffs at 5 and 10 for potential clinical impact [35].

4. Discussion

FaP-ED has maintained predictive accuracy in temporal validation, both in terms of discrimination and calibration [26]. The estimates according to FaP-ED are less biased than those according to NEWS and CFS alone.

There are two aspects to consider when assessing the performance of a prediction model: discrimination and calibration. Discrimination is defined as the ability of said model to discern between high and low-risk groups, while calibration is defined as the accuracy of the risk estimation, that is, the agreement between observed and predicted event rates [36, 37]. Calibration is a vital element of predictive performance; good discrimination between high- and low-risk groups is of questionable clinical relevance if poor calibration results in systematic risk over- or underestimation, leading in turn to under- or overtreatment and the potentially harmful consequences thereof.
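
To make the calibration measures concrete, the following Python sketch estimates a calibration slope and intercept from predicted risks and observed outcomes. It rests on stated assumptions: the slope is taken from a logistic regression of the outcome on the logit of the predicted risk, and the intercept is shown in its "calibration-in-the-large" form with the slope fixed at 1. This is a common convention, but the text does not specify which variant was used, and the study's analyses were run in R rather than with this code. Function names and toy data are illustrative.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def _logit(p):
    return math.log(p / (1.0 - p))


def calibration_slope(y, p_hat, max_iter=100, tol=1e-10):
    """Fit y ~ a + b * logit(p_hat) by Newton-Raphson and return (a, b).

    b is the calibration slope; p_hat values must lie strictly in (0, 1).
    """
    lp = [_logit(p) for p in p_hat]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, lp):
            mu = _sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu          # score for the intercept
            g1 += (yi - mu) * xi   # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            raise ValueError("singular information matrix")
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def calibration_in_the_large(y, p_hat, max_iter=100, tol=1e-10):
    """Fit y ~ a + offset(logit(p_hat)), slope fixed at 1, and return a."""
    lp = [_logit(p) for p in p_hat]
    a = 0.0
    for _ in range(max_iter):
        mus = [_sigmoid(a + xi) for xi in lp]
        gradient = sum(yi - mu for yi, mu in zip(y, mus))
        information = sum(mu * (1.0 - mu) for mu in mus)
        step = gradient / information
        a += step
        if abs(step) < tol:
            break
    return a


if __name__ == "__main__":
    # Toy data: predictions of 10% and 90%, observed event rates of 20% and 80%.
    y = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2
    p_hat = [0.1] * 10 + [0.9] * 10
    print(calibration_slope(y, p_hat))         # slope below 1: predictions too extreme
    print(calibration_in_the_large(y, p_hat))  # near 0: no average over- or underestimation
```

Under this convention a slope of 1 and an intercept of 0 indicate ideal calibration, a slope below 1 indicates predictions that are too extreme, and a negative intercept indicates that predicted risks are on average too high.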

Particularly in terms of calibration, it appears that NEWS and CFS are biased in opposite directions, and that FaP-ED mitigates this bias and is more robust in temporal validation despite increased complexity. In terms of discrimination and calibration for patients aged 80 and above, FaP-ED appears to be even more precise compared to NEWS and CFS. Overall, FaP-ED is less age-sensitive and more accurate than NEWS and CFS for predicting 30-day mortality.

Risk prediction in the older population is challenging, and relying on vital signs alone underestimates risk [5, 6, 13]. Various suggestions and attempts have been made to adapt EWS to increase their prognostic value, for example by adjusting for age, sex or mobility [6, 38, 39]. Frailty is a reliable predictor of mortality [7–11, 14, 21–25]. It includes premorbid mobility status and is arguably a more meaningful measure of functional reserve than age alone, making it a particularly appropriate factor with which to adapt EWS. A previous systematic review indicated a lack of clinical prediction tools in the ED that shift the odds of adverse outcomes before and after testing in a meaningful way [35]. A positive likelihood ratio for 30-day mortality above 5 (a five-fold increase in odds), and especially above 10, with FaP-ED as shown in Figure 2 represents an improvement over previously reported positive likelihood ratio values, which ranged from 1.26 to 1.74 for 6-week post-ED mortality for tools measuring constructs closely related to frailty (premorbid functional ability, presence of delirium and malnutrition) [35, 40]. These findings underscore the importance of combining illness severity and frailty assessment for individualised patient assessment.
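
The way a likelihood ratio shifts the odds can be illustrated with a short Python sketch. The cell counts behind Figure 2 are not reported in the text, so the example uses the CFS 7 row of Table 1 on its own, which is an assumption made purely for illustration and not a FaP-ED result. Function names are illustrative.

```python
def stratum_likelihood_ratio(deaths_in_stratum, total_deaths,
                             survivors_in_stratum, total_survivors):
    """Likelihood ratio = P(stratum | died) / P(stratum | survived)."""
    return (deaths_in_stratum / total_deaths) / (survivors_in_stratum / total_survivors)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    # CFS 7 counts from Table 1: 10 of 31 nonsurvivors, 65 of 1135 survivors.
    lr_cfs7 = stratum_likelihood_ratio(10, 31, 65, 1135)
    baseline = 31 / 1166  # overall 30-day mortality in this cohort
    print(f"LR for CFS 7: {lr_cfs7:.2f}")
    print(f"post-test probability: {post_test_probability(baseline, lr_cfs7):.3f}")
    print(f"post-test probability at LR 10: {post_test_probability(baseline, 10.0):.3f}")
```

For CFS 7 this yields a likelihood ratio of about 5.6 and a post-test probability equal to 10/75 (about 13.3%), which matches the observed mortality in that row of Table 1.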

In the face of an ageing population, it is important that clinicians appreciate what matters most to older patients, so that appropriate, patient-centred decisions regarding medical care can be made [41, 42]. Thirty-day mortality alone is a unidimensional outcome and must be interpreted with care. Nonetheless, the insight FaP-ED delivers regarding prognosis could provide an integral context for pragmatic and holistic discussions with patients and their relatives regarding further diagnostics and treatment. It may also help to identify patients who might benefit from further assessment, for example, in the form of the CGA or components thereof [13].

The strengths of this study include the large cohort, consecutive sampling and the inclusion of patients with mild cognitive impairment in accordance with previous recommendations [30, 43]. A further strength is the handling of missing data. The chart review conducted allowed for more complete data for patients treated in resuscitation bays, where vital signs were consistently documented on paper by the clinical team but not always successfully transferred to study documentation by study personnel. Additionally, multiple imputation was performed for missing vital signs as recommended [44]. It has been shown that multiple imputation is a method which is not used frequently enough in the development and validation of EWS, despite the clear recommendation in its favour [37].

It is worth noting that mortality rates were lower in this cohort compared to the development cohort (5.4% for included patients), and that this could affect the performance of FaP-ED. One possible reason for the difference in mortality rate could be a difference in case mix. The recruitment procedure was identical in both years. We are not aware of any methodological differences which could have affected mortality rates.

Missing CFS entries were less frequent than those in other studies [16, 45]. In our setting, patients lacking a clinical team CFS entry were more frequently male, younger, were assigned lower triage scores (i.e., more urgent) and were more likely to be admitted to ICU [28].

Limitations include the fact that the recruitment periods for development and temporal validation did not differ in season (development: 18.03.–20.05.2019; temporal validation: 25.04.–30.05.2022) and that recruitment took place in the same location. An external validation study in a different ED, ideally in a different country, at a different time of year would further increase generalisability. Additionally, the number of missing CFS scores (N = 113, 8.4%) was considerably greater for our cohort compared to the development cohort (N = 40, 1.8%). This could be due to decreased awareness among clinical staff as well as high workload and time pressures.

In terms of future research required in this area, it would be useful to evaluate phase 3 of prediction modelling [46], namely, assessing the impact of FaP-ED on therapeutic management and patient outcomes. This could be done in the form of an implementation study measuring the effect on clinical outcomes such as hospital admission, length of stay or ICU admission as well as patient-centred outcomes. A further research question is the comparison of the predictive performance of FaP-ED with age-adjusted EWS such as the International Early Warning Score (IEWS) [39].

5. Conclusion

FaP-ED is a promising clinical prognostic tool which could complement holistic, pragmatic and patient-centred discussions between clinicians and patients and their relatives regarding future care. It offers a superior assessment of risk for older patients in EDs, surpassing the accuracy of NEWS and CFS when used independently, and is less sensitive to changes over time. Further research regarding the implementation of FaP-ED is required, assessing whether it can improve clinical outcomes.

Nomenclature

  • ACVPU: Alert-confusion-voice-pain-unresponsive
  • AUC: Area under the curve
  • AUROC: Area under the receiver operating characteristic curve
  • CFS: Clinical Frailty Scale
  • CGA: Comprehensive geriatric assessment
  • ED: Emergency department
  • EWS: Early Warning Score
  • FaP-ED: Frailty-adjusted Prognosis in the Emergency Department
  • ICU: Intensive care unit
  • IEWS: International Early Warning Score
  • IQR: Interquartile range
  • NEWS: National Early Warning Score
  • TRIPOD: Transparent Reporting of a multivariable model for Individual Prognosis or Diagnosis

Disclosure

    The sponsor had no role in the design, methods, data collection, analysis or preparation of the manuscript. The funding body had no influence on study design, data collection, interpretation or decision on publication. An earlier version of the abstract was published as a supplement in Swiss Medical Weekly (Supplementum 276: Abstracts of the 8th Annual Spring Congress of the Swiss Society of General Internal Medicine, published 21.05.2024).

    Conflicts of Interest

    The authors declare no conflicts of interest.

    Funding

    This work was supported by the EMERGE scientific funds from the University Hospital Basel.

    Acknowledgements

    The authors acknowledge the administration and clinical team of the ED of Basel University Hospital for their help with data collection, with special thanks to Monika Stadler.

      Supporting Information

      Calibration plots for FaP-ED, NEWS and CFS for age categories 65–80 years, > 80 years and overall can be found in the supporting information.

      Data Availability Statement

      The data that support the findings of this study are available from the corresponding author upon reasonable request.
