ORIGINAL ARTICLE

Open Access

Developing a prediction model for all-cause mortality risk among patients with type 2 diabetes mellitus in Shanghai, China

中国上海地区建立的2型糖尿病患者全因死亡风险预测模型

Jiying Qi

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Shanghai National Clinical Research Center for Metabolic Diseases, Key Laboratory for Endocrine and Metabolic Diseases of the National Health Commission of the PR China, Shanghai Key Laboratory for Endocrine Tumor, State Key Laboratory of Medical Genomics, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Ping He,

Ping He

Link Healthcare Engineering and Information Department, Shanghai Hospital Development Center, Shanghai, China

Search for more papers by this author

Huayan Yao,

Huayan Yao

Computer Net Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Yanbin Xue,

Yanbin Xue

Computer Net Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Wen Sun,

Wen Sun

Wonders Information Co. Ltd., Shanghai, China

Search for more papers by this author

Ping Lu,

Ping Lu

Wonders Information Co. Ltd., Shanghai, China

Search for more papers by this author

Xiaohui Qi,

Xiaohui Qi

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Zizheng Zhang,

Zizheng Zhang

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Renjie Jing,

Renjie Jing

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Bin Cui,

Corresponding Author

Bin Cui

[email protected]

orcid.org/0000-0002-7293-5740

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Correspondence

Guang Ning and Bin Cui, Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.

Email: [email protected] and [email protected]

Search for more papers by this author

Guang Ning,

Corresponding Author

Guang Ning

[email protected]

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Correspondence

Email: [email protected] and [email protected]

Search for more papers by this author

Jiying Qi,

Jiying Qi

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Ping He,

Ping He

Link Healthcare Engineering and Information Department, Shanghai Hospital Development Center, Shanghai, China

Search for more papers by this author

Huayan Yao,

Huayan Yao

Computer Net Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Yanbin Xue,

Yanbin Xue

Computer Net Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Wen Sun,

Wen Sun

Wonders Information Co. Ltd., Shanghai, China

Search for more papers by this author

Ping Lu,

Ping Lu

Wonders Information Co. Ltd., Shanghai, China

Search for more papers by this author

Xiaohui Qi,

Xiaohui Qi

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Zizheng Zhang,

Zizheng Zhang

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Renjie Jing,

Renjie Jing

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Search for more papers by this author

Bin Cui,

Corresponding Author

Bin Cui

[email protected]

orcid.org/0000-0002-7293-5740

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Correspondence

Email: [email protected] and [email protected]

Search for more papers by this author

Guang Ning,

Corresponding Author

Guang Ning

[email protected]

Department of Endocrine and Metabolic Diseases, Shanghai Institute of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Correspondence

Email: [email protected] and [email protected]

Search for more papers by this author

First published: 16 December 2022

https://doi.org/10.1111/1753-0407.13343

Jiying Qi, Ping He, and Huayan Yao contributed equally to this work.

Funding information: National Key R&D Program of China, Grant/Award Number: 2018YFC1314802

Share a link

Email
Wechat
Bluesky

Abstract

Background

All-cause mortality risk prediction models for patients with type 2 diabetes mellitus (T2DM) in mainland China have not been established. This study aimed to fill this gap.

Methods

Based on the Shanghai Link Healthcare Database, patients diagnosed with T2DM and aged 40-99 years were identified between January 1, 2013 and December 31, 2016 and followed until December 31, 2021. All the patients were randomly allocated into training and validation sets at a 2:1 ratio. Cox proportional hazards models were used to develop the all-cause mortality risk prediction model. The model performance was evaluated by discrimination (Harrell C-index) and calibration (calibration plots).

Results

A total of 399 784 patients with T2DM were eventually enrolled, with 68 318 deaths over a median follow-up of 6.93 years. The final prediction model included age, sex, heart failure, cerebrovascular disease, moderate or severe kidney disease, moderate or severe liver disease, cancer, insulin use, glycosylated hemoglobin, and high-density lipoprotein cholesterol. The model showed good discrimination and calibration in the validation sets: the mean C-index value was 0.8113 (range 0.8110–0.8115) and the predicted risks closely matched the observed risks in the calibration plots.

Conclusions

This study constructed the first 5-year all-cause mortality risk prediction model for patients with T2DM in south China, with good predictive performance.

摘要

背景: 中国大陆地区2型糖尿病(T2DM)患者全因死亡风险预测模型尚未建立, 本项研究旨在填补这一空白。

方法: 基于上海申康医疗数据库, 筛选出2013年1月1日至2016年12月31日期间年龄在40-99岁的T2DM患者, 随访至2021年12月31日。所有患者按2:1的比例随机分配到训练组和验证组。采用Cox比例风险模型建立全因死亡风险预测模型。通过鉴别(Harrell c指数)和校准(校准图)对模型性能进行评价。

结果: 在中位随访6.93年期间, 共有399784例T2DM患者最终入选, 其中68318例死亡。最终的预测模型包括年龄、性别、心力衰竭、脑血管疾病、中度或重度肾病、中度或重度肝病、癌症、胰岛素使用、HbA_1c和高密度脂蛋白-胆固醇。模型在验证组中表现出良好的判别能力, 平均c指数值为0.8113(范围为0.8110-0.8115)。在校准图中, 预测的风险与观测到的风险密切匹配, 表明模型校准良好。

结论: 本研究构建了中国南方地区首个T2DM患者5年全因死亡风险预测模型, 具有较好的预测性能。

1 INTRODUCTION

Diabetes is a chronic metabolic disease that severely threatens human health. More than 6.7 million adults were estimated to have died from diabetes or its complications in 2021, accounting for 12.2% of all-cause deaths worldwide.¹ Previous studies have noted that the all-cause mortality in patients with diabetes is almost twice that of those without diabetes; furthermore, the younger the age and the longer the duration of diabetes, the higher the risk of mortality.^2-7 Early screening of high-risk diabetic patients is undoubtedly crucial so that we can enhance clinical management and provide timely intervention, thereby reducing the risk of premature mortality. One of the practical and effective approaches is to develop all-cause mortality risk prediction models for the diabetic population.

In the past decade or so, many prediction models have been well developed in Western populations, such as the Estimation of Mortality Risk in Type 2 Diabetic Patients (ENFORCE) model,^{8, 9} the Risk Equations for Complications of Type 2 Diabetes (RECODe) model,¹⁰ the Building, Relating, Assessing, and Validating Outcomes (BRAVO) model,¹¹ and the UK Prospective Diabetes Study (UKPDS) Outcomes Model 2 model.¹² Due to differences in genetics, socioeconomic factors, and diabetes management approaches, these prediction models developed based on Western populations are often not directly applicable to Asian populations. Compared with other populations, Asian populations develop diabetes at a younger age, have a higher risk of complications, suffer longer from complications, and die earlier.^{7, 13-15} In contrast, prediction models constructed based on Asian populations are still limited, all from Hong Kong^16-19 and Taiwan^20-22 in China. To date, there are no relevant studies in mainland China. Therefore, we conducted this study to develop a 5-year all-cause mortality prediction model based on a large-scale population with type 2 diabetes mellitus (T2DM) in Shanghai, China.

2 METHODS

2.1 Data source

This study was conducted using the Shanghai Link Healthcare Database (SLHD), a representative clinical database covering ＞99% of the residents, developed and operated by the Shanghai Hospital Development Center (an administrative department of the Shanghai Municipal People's Government). In China, government-run hospitals are classified as primary (grade I), secondary (grade II), and tertiary (grade III) hospitals according to their capabilities in medical care, medical education, and medical research, with tertiary hospitals being the best. The Shanghai Hospital Development Center is responsible for monitoring 35 tertiary hospitals, all of which are required by administrative regulations to upload general medical practice data (i.e., outpatient visits, emergency department visits, and hospital admissions) to the SLHD. The SLHD has released data for academic research since 2013, which requires review and approval to access. Mortality data were obtained from the Shanghai Big Data Center.

Any personally identifiable information was scrambled to protect privacy, so the study was exempt from institutional review board approval because the researchers were blinded to patient identities. All diseases were identified according to the International Classification of Diseases, 10th Revision and relevant diagnosis (Table S1).

2.2 Study population

First, we identified patients aged 40–99 years who were diagnosed with T2DM between January 1, 2013 and December 31, 2016 (n = 418 730). Follow-up started from the date of the first diagnosis of T2DM until death or December 31, 2021, whichever came first. After excluding patients with <1 year of follow-up (n = 18 946), a total of 399 784 patients with T2DM were finally enrolled in the study (Figure S1).

2.3 Candidate predictors

Candidate predictors were selected based on data availability and clinical relevance, including age, sex, hypertension, dyslipidemia, diabetic complications, ischemic heart disease, peripheral vascular disease, heart failure, cerebrovascular disease, dementia, chronic lung disease, moderate or severe kidney disease, mild liver disease, moderate or severe liver disease, cancer, insulin, oral antidiabetic drugs, antihypertensive drugs, lipid-lowering drugs, anticoagulant drugs, aspirin, other antiplatelet drugs, nonsteroidal anti-inflammatory drugs, glycosylated hemoglobin (HbA1c), total cholesterol, high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglyceride. Comorbidities and medications were assessed between the earliest date of recording and the end date of the 1-year lag period after cohort entry. Biochemical indicators were defined as the closest recorded values within 2 years before and after enrollment.

Multiple imputation was performed to handle missing values of biochemical indicators. The imputation model included candidate predictors and event indicators. A total of 10 datasets were imputed, and the data distribution of biochemical indicators in the original and imputed datasets indicated good imputation (Table S2). Subsequent analyses were repeated for each imputed dataset. Based on asymptotic theory, Rubin developed a set of rules to combine the estimates and standard errors (SEs) of each imputed dataset into an overall estimate and SE to provide valid statistical results.^{23, 24} The results of this study were combined by Rubin's rule or shown as the median, interquartile range (IQR), or full range of the 10 estimates.^{23, 24}

2.4 Statistical analysis

All the patients were randomly allocated into training and validation sets at a 2:1 ratio. Baseline characteristics between the training and validation sets were compared using the Student's t test for continuous variables and the chi-square test for categorical variables. Standardized mean differences (SMDs) were also calculated to assess the comparability of baseline characteristics between the training and validation sets, with values less than 10% indicating relative balance.

Cox proportional hazards models were used to develop the all-cause mortality prediction model. To account for the nonlinearity of the continuous variables (age, HbA1c, total cholesterol, HDL-C, and LDL-C), we transformed them using restricted cubic splines with four knots placed at the respective 5th, 35th, 65th, and 95th sample percentiles.²⁵ Variable selection was performed within each imputed training dataset. Initially, univariate Cox proportional hazards models were used to identify significant predictors. The major predictors were retained directly to form the basic model, including age and sex. Starting from the basic model, an additional candidate predictor was selected for inclusion in the multivariate Cox proportional hazards model at each step, and then the model was evaluated for improvement. Predictors incorporated into the model should significantly improve discrimination and integrated discrimination improvement, and reduce the Akaike information criterion. The Akaike information criterion is a measure of the model's goodness of fit, with a lower value indicating a better fit.²⁶ If a predictor was retained in at least 8/10 of the imputed training datasets, it would be eventually included. Interaction terms between age and other factors and between sex and other factors were also assessed, which did not significantly improve model performance. Predictors included in the final model were age, sex, heart failure, cerebrovascular disease, moderate or severe kidney disease, moderate or severe liver disease, cancer, insulin use, HbA1c, and HDL-C. Based on the selected variables, models were fitted in each of the 10 imputed training datasets. The estimates were combined using Rubin's rule to obtain coefficients and SEs, as well as hazard ratios (HRs) and 95% confidence intervals (CIs).

The developed model was applied to the 10 imputed validation datasets to evaluate model performance. Discrimination was assessed by estimating Harrell C-index, with higher values indicating better performance. Calibration was assessed by plotting the predicted event probabilities against the observed event probabilities. The risk prediction model was internally validated using 100 bootstrap samples. In addition, to examine the effect of missing data, we repeated the validation using complete case analysis. All statistical analyses were performed using R language software, version 4.1.2 (R Foundation for Statistical Computing, Vienna, Austria).

3 RESULTS

A total of 399 784 patients with T2DM were enrolled in this study. During a median follow-up of 6.93 years (IQR 5.64–8.31 years), 68 318 patients died, with an all-cause mortality of 2.54 per 100 person-years. The median age was 63 years (IQR 56–72 years), and 51.11% were male. Table 1 shows the baseline characteristics of patients with T2DM in the training set (n = 266 523) and validation set (n = 133 261). The SMD values for all baseline characteristics between the training and validation sets were much lower than 10%, implying that the two were well balanced.

TABLE 1. Baseline characteristics of the study population

Variables	Training set (n = 266 523)	Validation set (n = 133 261)	p value	SMD
Age, median (IQR)	63 (56–72)	63 (56–72)	0.981	0.001
Male, n (%)	136 408 (51.18)	67 905 (50.96)	0.182	0.004
Comorbidities, n (%)
Hypertension	112 973 (42.39)	56 736 (42.58)	0.260	0.004
Dyslipidemia	33 767 (12.67)	16 873 (12.66)	0.948	<0.001
Diabetic complications	34 522 (12.95)	17 526 (13.15)	0.079	0.006
Ischemic heart disease	51 025 (19.14)	25 598 (19.21)	0.630	0.002
Peripheral vascular disease	4333 (1.63)	2163 (1.62)	0.961	<0.001
Heart failure	10 268 (3.85)	5033 (3.78)	0.243	0.004
Cerebrovascular disease	40 491 (15.19)	20 418 (15.32)	0.285	0.004
Dementia	2244 (0.84)	1096 (0.82)	0.535	0.002
Chronic lung disease	24 764 (9.29)	12 482 (9.37)	0.445	0.003
Moderate or severe kidney disease	7018 (2.63)	3497 (2.62)	0.875	0.001
Mild liver disease	22 718 (8.52)	11 334 (8.51)	0.846	0.001
Moderate or severe liver disease	2698 (1.01)	1375 (1.03)	0.574	0.002
Cancer	19 191 (7.20)	9538 (7.16)	0.623	0.002
Medications, n (%)
Insulin	97 011 (36.40)	48 323 (36.26)	0.399	0.003
Oral antidiabetic drugs	180 598 (67.76)	90 145 (67.65)	0.464	0.002
Antihypertensive drugs	145 440 (54.57)	72 664 (54.53)	0.805	0.001
Lipid-lowering drugs	94 366 (35.41)	47 086 (35.33)	0.653	0.002
Anticoagulant drugs	5826 (2.19)	2932 (2.20)	0.780	0.001
Aspirin	76 110 (28.56)	38 112 (28.60)	0.780	0.001
Other antiplatelet drugs	46 286 (17.37)	22 946 (17.22)	0.246	0.004
Nonsteroidal anti-inflammatory drugs	73 898 (27.73)	37 057 (27.81)	0.592	0.002
Biochemical indicators, median (IQR)^a
HbA1c (%)	7.100 (6.300–8.400)	7.100 (6.300–8.400)
Total cholesterol (mmol/L)	4.660 (3.890–5.460)	4.660 (3.890–5.460)
HDL-C (mmol/L)	1.140 (0.950–1.380)	1.140 (0.950–1.380)
LDL-C (mmol/L)	2.800 (2.170–3.460)	2.800 (2.175–3.460)
Triglyceride (mmol/L)	1.420 (1.010–2.055)	1.430 (1.010–2.050)

Abbreviations: HbA1c, glycosylated hemoglobin; HDL-C, high-density lipoprotein cholesterol; IQR, interquartile range; LDL-C, low-density lipoprotein cholesterol; SMD, standardized mean difference.
^a Median of the values of the 10 imputed datasets.

Variable selection was performed in each of the 10 imputed training datasets. Predictors included in the final model were age, sex, heart failure, cerebrovascular disease, moderate or severe kidney disease, moderate or severe liver disease, cancer, insulin use, HbA1c, and HDL-C. Table 2 shows the results of univariate and multivariate Cox proportional hazards models for all-cause mortality in the training set, including coefficients (SEs) and HRs (95% CIs). In the multivariate Cox proportional hazards model, all variables were significant except for the second cubic term of the spline function for age and HDL-C. The 5-year all-cause mortality risk prediction equation for patients with T2DM was as follows, where

{(u)}_{+} = \{\begin{matrix} u, if u > 0, \\ 0, if u \leq 0 . \end{matrix}

\begin{array}{l} 100 \times (1 - {0.9980421}^{\exp} (0.06667041 \times age + 1.860926 e^{(- 05)} \\ \times {(age - 46)}_{+}^{3} - 1.435202 e^{(- 05)} \times {(age - 59)}_{+}^{3} \\ - 2.177196 e^{(- 05)} \times {(age - 68)}_{+}^{3} + 1.751472 e^{(- 05)} \\ \times {(age - 84)}_{+}^{3} + 0.24859731 \times (male = = TRUE) \\ + 0.42522929 \times (heart failure = = TRUE) \\ + 0.22311493 \times (cerebrovascular disease = = TRUE) \\ + 0.71501785 \times (moderate or severe kidney disease = = TRUE) \\ + 0.76391954 \times (moderate or severe liver disease = = TRUE) \\ + 0.77068201 \times (cancer = = TRUE) \\ + 0.45461855 \times (insulin use = = TRUE) \\ - 0.10718135 \times HbA 1 c + 0.03508194 \times {(HbA 1 c - 5.50)}_{+}^{3} \\ - 0.0776974 \times {(HbA 1 c - 6.60)}_{+}^{3} + 0.04498314 \\ \times {(HbA 1 c - 7.70)}_{+}^{3} - 0.002367679 \times {(HbA 1 c - 11.20)}_{+}^{3} \\ - 0.82047966 \times HDL ­ C + 0.8167382 \times {(HDL ­ C - 0.72)}_{+}^{3} \\ - 1.162871 \times {(HDL ­ C - 1.02)}_{+}^{3} + 0.07287736 \\ \times {(HDL ­ C - 1.27)}_{+}^{3} + 0.2732556 \times {(HDL ­ C - 1.85)}_{+}^{3})) \end{array}

TABLE 2. Results of univariate and multivariate Cox proportional hazards models for all-cause mortality in the training set

	Coefficient (SE)	HR (95% CI)	Coefficient (SE)	HR (95% CI)
	Univariate		Multivariate
Age
X	0.065 (0.004)	1.067 (1.060–1.075)	0.067 (0.004)	1.069 (1.061–1.077)
S₁	0.040 (0.009)	1.041 (1.022–1.059)	0.027 (0.009)	1.027 (1.009–1.046)
S₂	−0.064 (0.028)	0.938 (0.888–0.990)	−0.021 (0.028)	0.980 (0.928–1.034)
Sex	0.129 (0.009)	1.137 (1.116–1.158)	0.249 (0.011)	1.282 (1.256–1.309)
Heart failure	1.267 (0.016)	3.550 (3.443–3.660)	0.425 (0.016)	1.530 (1.481–1.580)
Cerebrovascular disease	0.752 (0.011)	2.122 (2.077–2.168)	0.223 (0.011)	1.250 (1.223–1.278)
Moderate or severe kidney disease	1.190 (0.018)	3.289 (3.172–3.410)	0.715 (0.019)	2.044 (1.968–2.123)
Moderate or severe liver disease	0.908 (0.033)	2.480 (2.324–2.646)	0.764 (0.034)	2.147 (2.010–2.293)
Cancer	0.937 (0.014)	2.551 (2.484–2.620)	0.771 (0.014)	2.161 (2.103–2.221)
Insulin use	0.752 (0.009)	2.122 (2.083–2.161)	0.455 (0.011)	1.576 (1.543–1.609)
HbA1c
X	−0.125 (0.020)	0.882 (0.848–0.918)	−0.107 (0.020)	0.898 (0.863–0.935)
S₁	1.205 (0.151)	3.336 (2.464–4.516)	1.140 (0.142)	3.126 (2.354–4.151)
S₂	−2.659 (0.341)	0.070 (0.035–0.139)	−2.524 (0.321)	0.080 (0.042–0.152)
HDL-C
X	−1.238 (0.061)	0.290 (0.257–0.328)	−0.821 (0.068)	0.440 (0.384–0.505)
S₁	1.686 (0.299)	5.395 (2.975–9.783)	1.043 (0.315)	2.837 (1.513–5.321)
S₂	−2.673 (0.838)	0.069 (0.013–0.366)	−1.485 (0.867)	0.227 (0.040–1.277)

Abbreviations: CI, confidence interval; HbA1c, glycosylated hemoglobin; HDL-C, high-density lipoprotein cholesterol; HR, hazard ratio; S₁, first cubic term; S₂, second cubic term; SE, standard error; X, linear term.

Example: 65 years old; male; history of heart failure, cerebrovascular disease, moderate or severe kidney disease, moderate or severe liver disease, and cancer; use of insulin; HbA1c level = 12.00%, HDL-C level = 1.00 mmol/L.

5-year all-cause mortality risk ≈ 86.94%

\begin{array}{l} 100 \times (1 - {0.9980421}^{\exp} (0.06667041 \times 65 + 1.860926 e^{(- 05)} \\ \times {(65 - 46)}_{+}^{3} - 1.435202 e^{(- 05)} \times {(65 - 59)}_{+}^{3} \\ + 0.24859731 + 0.42522929 + 0.22311493 \\ + 0.71501785 + 0.76391954 + 0.77068201 + 0.45461855 \\ - 0.10718135 \times 12.00 + 0.03508194 \times {(12.00 - 5.50)}_{+}^{3} \\ - 0.0776974 \times {(12.00 - 6.60)}_{+}^{3} + 0.04498314 \\ \times {(12.00 - 7.70)}_{+}^{3} - 0.002367679 \times {(12.00 - 11.20)}_{+}^{3} \\ - 0.82047966 \times 1.00 + 0.8167382 \times {(1.00 - 0.72)}_{+}^{3})) \end{array}

Table 3 shows the C-index of the 10 imputed validation datasets and complete dataset. The prediction model had good discrimination in all imputed validation datasets, with a mean C-index value of 0.8113 (range 0.8110–0.8115). In the internal validation of 100 bootstrap samples, the mean C-index value was 0.8113 (range 0.8110–0.8116). Figure 1 shows the calibration curves of the 5-year all-cause mortality risk prediction model in the 10 imputed validation datasets. All calibration curves were very close to each other. Although the risk in the middle part was slightly underestimated, overall, there was no significant difference between the predicted and observed risks, indicating that the prediction model was well calibrated. Sensitivity analysis using complete case analysis included 155 354 patients with T2DM, which also observed good discrimination (C-index 0.8087) and calibration (Figure S2).

TABLE 3. C-index of the imputed validation datasets and complete dataset

	C-index	95% CI	100 bootstrap C-index
Imputed validation datasets
1	0.8114	(0.8088–0.8140)	0.8114
2	0.8113	(0.8087–0.8140)	0.8114
3	0.8113	(0.8087–0.8140)	0.8113
4	0.8112	(0.8085–0.8138)	0.8112
5	0.8110	(0.8083–0.8136)	0.8110
6	0.8111	(0.8085–0.8138)	0.8111
7	0.8114	(0.8087–0.8140)	0.8114
8	0.8113	(0.8086–0.8139)	0.8113
9	0.8115	(0.8088–0.8141)	0.8115
10	0.8115	(0.8089–0.8142)	0.8116
Completed dataset	0.8087	(0.8063–0.8112)	0.8088

Abbreviation: CI, confidence interval.

Details are in the caption following the image — **FIGURE 1**
Open in figure viewer PowerPoint

Calibration curves of the 5-year all-cause mortality risk prediction model in the imputed validation datasets

4 DISCUSSION

This study constructed the first 5-year all-cause mortality risk prediction model for patients with T2DM in south China. The model achieved good discrimination and calibration by using highly accessible predictors collected in routine clinical settings, including age, sex, heart failure, cerebrovascular disease, moderate or severe kidney disease, moderate or severe liver disease, cancer, insulin use, HbA1c, and HDL-C.

Over the past decade or so, many all-cause mortality risk prediction models have been constructed for patients with T2DM. However, due to differences in ethnicity, genetics, socioeconomic factors, and disease management approaches, prediction models developed based on specific populations are usually not directly applicable to other populations. Several models have been established to predict the risk of all-cause mortality in Western populations with T2DM.^{8-12, 27-31} Although some of these models were developed based on multiethnic populations,^{10-12, 28-31} the vast majority included ethnicity as a significant predictor.^{10, 12, 28-31} Previous studies have pointed out some unique characteristics of Asian diabetic populations, such as younger age of onset of diabetes, higher risk of complications, and earlier time of death.^13-15 Nevertheless, the prediction models developed based on Asian populations cover very few regions, only Hong Kong and Taiwan in China.^16-22 It is necessary to construct specific all-cause mortality risk prediction models for patients with T2DM in mainland China, which could provide more accurate risk prediction.

Notably, the data sources used to develop the models can also affect predictive validity. Whether models developed based on non-real-world data can be well applied to the real world remains a question. It has been found that prediction models perform much less well in the clinical trial setting than in the real world.^{8-10, 32} This may be related to differences in patients' intrinsic motivation and health awareness. Volunteers participating in clinical trials may place greater value on health, which to some extent reduces their mortality risk, thus affecting model performance. In this study, we developed the risk prediction model based on real-world data to ensure its effectiveness when applied in real-world clinical settings.

The SLHD is the largest medical database in mainland China, which provided a good basis for the design and implementation of this study. We identified a number of candidate predictors based on data availability and clinical relevance. The final prediction model included age, sex, heart failure, cerebrovascular disease, moderate or severe kidney disease, moderate or severe liver disease, cancer, insulin use, HbA1c, and HDL-C, which balanced the predictive validity and parsimony of the model well.

Glycemic control is definitely crucial for patients with T2DM. Clinically, the main indicator used to evaluate glycemic control status is HbA1c, which reflects the average plasma glucose level over the past 2–3 months. Insulin, as the most effective class of antidiabetic drugs, is commonly used in patients whose plasma glucose levels cannot be well controlled by oral antidiabetic drugs. In general, patients with higher HbA1c levels or those using insulin have a longer duration and more severe disease. Therefore, HbA1c and insulin play a crucial role in predicting the risk of all-cause mortality among patients with T2DM.^{8-11, 17-22, 27, 33, 34} In addition, heart failure, cerebrovascular disease, moderate or severe kidney disease, moderate or severe liver disease, and cancer are all major diseases that severely threaten human health and life, and their effects on mortality are well recognized.^{10, 11, 18, 19, 22} HDL-C level is also a common predictor in all-cause mortality prediction models, which may be related to the fact that it is an important factor for cardiovascular disease.^{8-10, 18, 19} Regular screening, early identification, and active intervention for these diseases should be advocated in order to delay disease progression and reduce the risk of premature death.

This study has some limitations. The first is the data limitation. Although the SLHD is the largest medical database in mainland China, there is a relatively high proportion of missing data for some variables, such as biochemical indicators, duration of diabetes, body mass index, smoking, and alcohol consumption. It is worth noting that such missing data were usually not random and may indicate patient characteristics, health awareness level, and health status. Directly removing observations with missing values would introduce severe bias. Therefore, we performed multiple imputation to handle missing data (widely considered the best approach) and conducted complete case analysis to examine the effect of missing data. Second, due to resource constraints, the prediction model was only internally validated, and further external validation in other populations is advisable. Third, only a 5-year all-cause mortality risk prediction model was constructed in this study, and future studies with longer follow-up periods are needed to update the model to predict the mortality risk at 10, 15, and more years.

In conclusion, this study constructed the first 5-year all-cause mortality risk prediction model for patients with T2DM in south China, which achieved good discrimination and calibration by using highly accessible predictors collected in routine clinical settings. The model provides a powerful tool for clinicians to identify high-risk diabetic patients, which can enhance clinical management and facilitate timely intervention, ultimately improving survival quality and extending life expectancy.

ACKNOWLEDGEMENTS

This study was funded by the National Key R&D Program of China (No. 2018YFC1314802).

DISCLOSURE STATEMENT

The authors declare that there is no conflict of interest.

Supporting Information

REFERENCES

1 International Diabetes Federation. Diabetes Atlas (10th edition) 2021. https://diabetesatlas.org/atlas/tenth-edition/, Accessed February 2022.
Google Scholar
2Rao Kondapally Seshasai S, Kaptoge S, Thompson A, et al. Diabetes mellitus, fasting glucose, and risk of cause-specific death. N Engl J Med. 2011; 364(9): 829-841.
10.1056/NEJMoa1008862
PubMed Web of Science® Google Scholar
3Campbell PT, Newton CC, Patel AV, Jacobs EJ, Gapstur SM. Diabetes and cause-specific mortality in a prospective cohort of one million U.S. adults. Diabetes Care. 2012; 35(9): 1835-1844.
10.2337/dc12-0002
PubMed Web of Science® Google Scholar
4Tancredi M, Rosengren A, Svensson A-M, et al. Excess mortality among persons with type 2 diabetes. New Engl J Med. 2015; 373(18): 1720-1732.
10.1056/NEJMoa1504347
CAS PubMed Web of Science® Google Scholar
5Bragg F, Holmes MV, Iona A, et al. Association between diabetes and cause-specific mortality in rural and urban areas of China. JAMA. 2017; 317(3): 280-289.
10.1001/jama.2016.19720
PubMed Web of Science® Google Scholar
6Yang JJ, Yu D, Wen W, et al. Association of diabetes with all-cause and cause-specific mortality in Asia: a pooled analysis of more than 1 million participants. JAMA Netw Open. 2019; 2(4):e192696.
10.1001/jamanetworkopen.2019.2696
PubMed Web of Science® Google Scholar
7Tseng CH. Mortality and causes of death in a national sample of diabetic patients in Taiwan. Diabetes Care. 2004; 27(7): 1605-1609.
10.2337/diacare.27.7.1605
PubMed Web of Science® Google Scholar
8De Cosmo S, Copetti M, Lamacchia O, et al. Development and validation of a predicting model of all-cause mortality in patients with type 2 diabetes. Diabetes Care. 2013; 36(9): 2830-2835.
10.2337/dc12-1906
CAS PubMed Web of Science® Google Scholar
9Copetti M, Shah H, Fontana A, et al. Estimation of mortality risk in type 2 diabetic patients (ENFORCE): an inexpensive and parsimonious prediction model. J Clin Endocrinol Metab. 2019; 104(10): 4900-4908.
10.1210/jc.2019-00215
PubMed Web of Science® Google Scholar
10Basu S, Sussman J, Berkowitz S, Hayward R, Yudkin J. Development and validation of risk equations for complications of type 2 diabetes (RECODe) using individual participant data from randomised trials. Lancet Diabetes Endocrinol. 2017; 5(10): 788-798.
10.1016/S2213-8587(17)30221-8
PubMed Web of Science® Google Scholar
11Shao H, Fonseca V, Stoecker C, Liu S, Shi L. Novel risk engine for diabetes progression and mortality in USA: building, relating, assessing, and validating outcomes (BRAVO). Pharmacoeconomics. 2018; 36(9): 1125-1134.
10.1007/s40273-018-0662-1
PubMed Web of Science® Google Scholar
12Hayes A, Leal J, Gray A, Holman R, Clarke P. UKPDS outcomes model 2: a new version of a model to simulate lifetime health outcomes of patients with type 2 diabetes mellitus using data from the 30 year United Kingdom prospective diabetes study: UKPDS 82. Diabetologia. 2013; 56(9): 1925-1933.
10.1007/s00125-013-2940-y
CAS PubMed Web of Science® Google Scholar
13Yoon KH, Lee JH, Kim JW, et al. Epidemic obesity and type 2 diabetes in Asia. Lancet. 2006; 368(9548): 1681-1688.
10.1016/S0140-6736(06)69703-1
PubMed Web of Science® Google Scholar
14Ramachandran A, Ma RC, Snehalatha C. Diabetes in Asia. Lancet. 2010; 375(9712): 408-418.
10.1016/S0140-6736(09)60937-5
PubMed Web of Science® Google Scholar
15Ma RC, Chan JC. Type 2 diabetes in east Asians: similarities and differences with populations in Europe and the United States. Ann N Y Acad Sci. 2013; 1281(1): 64-91.
10.1111/nyas.12098
PubMed Web of Science® Google Scholar
16Yang X, So W, Tong P, et al. Development and validation of an all-cause mortality risk score in type 2 diabetes. Arch Intern Med. 2008; 168(5): 451-457.
10.1001/archinte.168.5.451
PubMed Web of Science® Google Scholar
17Wan E, Fong D, Fung C, et al. Prediction of five-year all-cause mortality in Chinese patients with type 2 diabetes mellitus-a population-based retrospective cohort study. J Diabetes Complications. 2017; 31(6): 939-944.
10.1016/j.jdiacomp.2017.01.017
PubMed Web of Science® Google Scholar
18Quan J, Pang D, Li T, et al. Risk prediction scores for mortality, cerebrovascular, and heart disease among Chinese people with type 2 diabetes. J Clin Endocrinol Metab. 2019; 104(12): 5823-5830.
10.1210/jc.2019-00731
PubMed Web of Science® Google Scholar
19Lee S, Zhou J, Leung K, et al. Development of a predictive risk model for all-cause mortality in patients with diabetes in Hong Kong. BMJ Open Diabetes Res Care. 2021; 9(1):e001950.
10.1136/bmjdrc-2020-001950
PubMed Web of Science® Google Scholar
20Li T, Li C, Liu C, et al. Development and validation of prediction models for the risks of diabetes-related hospitalization and in-hospital mortality in patients with type 2 diabetes. Metabolism. 2018; 85: 38-47.
10.1016/j.metabol.2018.02.003
PubMed Web of Science® Google Scholar
21Liu C, Li C, Wang M, Yang S, Li T, Lin C. Building clinical risk score systems for predicting the all-cause and expanded cardiovascular-specific mortality of patients with type 2 diabetes. Diabetes Obes Metab. 2021; 23(2): 467-479.
10.1111/dom.14240
CAS PubMed Web of Science® Google Scholar
22Chiu S, Chen Y, Lu J, Ng S, Chen C. Developing a prediction model for 7-year and 10-year all-cause mortality risk in type 2 diabetes using a hospital-based prospective cohort study. J Clin Med. 2021; 10(20): 4779.
10.3390/jcm10204779
CAS PubMed Web of Science® Google Scholar
23Rubin DB. Multiple Imputation for Nonresponse in Surveys. Vol 81. John Wiley & Sons; 2004.
Google Scholar
24Marshall A, Altman D, Holder R, Royston P. Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines. BMC Med Res Methodol. 2009; 9: 57.
10.1186/1471-2288-9-57
PubMed Web of Science® Google Scholar
25Harrell FE Jr. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis[M]. Springer; 2015.
10.1007/978-3-319-19425-7
Google Scholar
26Akaike H. A new look at the statistical model identification. IEEE T Automat Contr. 1974; 19(6): 716-723.
10.1109/TAC.1974.1100705
CAS Web of Science® Google Scholar
27Raffield L, Hsu F, Cox A, Carr J, Freedman B, Bowden D. Predictors of all-cause and cardiovascular disease mortality in type 2 diabetes: diabetes heart study. Diabetol Metab Syndr. 2015; 7: 58.
10.1186/s13098-015-0055-y
PubMed Web of Science® Google Scholar
28Wells B, Jain A, Arrigain S, Yu C, Rosenkrans W, Kattan M. Predicting 6-year mortality risk in patients with type 2 diabetes. Diabetes Care. 2008; 31(12): 2301-2306.
10.2337/dc08-1047
PubMed Web of Science® Google Scholar
29Wells B, Roth R, Nowacki A, et al. Prediction of morbidity and mortality in patients with type 2 diabetes. Peer J. 2013; 1:e87.
10.7717/peerj.87
PubMed Web of Science® Google Scholar
30McEwen L, Karter A, Waitzfelder B, et al. Predictors of mortality over 8 years in type 2 diabetic patients: translating research into action for diabetes (TRIAD). Diabetes Care. 2012; 35(6): 1301-1309.
10.2337/dc11-2281
PubMed Web of Science® Google Scholar
31Robinson T, Elley C, Kenealy T, Drury P. Development and validation of a predictive risk model for all-cause mortality in type 2 diabetes. Diabetes Res Clin Pract. 2015; 108(3): 482-488.
10.1016/j.diabres.2015.02.015
PubMed Web of Science® Google Scholar
32Basu S, Sussman J, Berkowitz S, et al. Validation of risk equations for complications of type 2 diabetes (RECODe) using individual participant data from diverse longitudinal cohorts in the U.S. Diabetes Care. 2018; 41(3): 586-595.
10.2337/dc17-2002
PubMed Web of Science® Google Scholar
33Tseng CH. Factors associated with cancer- and non-cancer-related deaths among Taiwanese patients with diabetes after 17 years of follow-up. PLoS One. 2016; 11(12):e0147916.
10.1371/journal.pone.0147916
PubMed Web of Science® Google Scholar
34Tseng CH. Use of insulin and mortality from breast cancer among Taiwanese women with diabetes. J Diabetes Res. 2015; 2015: 1-8.
10.1155/2015/678756
Web of Science® Google Scholar

Volume15, Issue1

January 2023

Pages 27-35

Developing a prediction model for all-cause mortality risk among patients with type 2 diabetes mellitus in Shanghai, China

中国上海地区建立的2型糖尿病患者全因死亡风险预测模型