Volume 15, Issue 4 pp. 338-348

ORIGINAL ARTICLE

Open Access

Development and validation of risk prediction models for large for gestational age infants using logistic regression and two machine learning algorithms

使用Logistic回归和两种机器学习算法开发和验证大于胎龄儿风险预测模型

Ning Wang,

Ning Wang

Department of Endocrinology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Haonan Guo,

Haonan Guo

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Yingyu Jing,

Yingyu Jing

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Yifan Zhang,

Yifan Zhang

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Bo Sun,

Bo Sun

orcid.org/0000-0002-3217-0579

Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China

Search for more papers by this author

Xingyan Pan,

Xingyan Pan

Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Huan Chen,

Huan Chen

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Jing Xu,

Jing Xu

Department of Endocrinology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Mengjun Wang,

Mengjun Wang

Department of Endocrinology, Xi'an, China

Search for more papers by this author

Xi Chen,

Xi Chen

Department of Epidemiology and Statistics, School of Public Health, Medical College, Zhejiang University, Hangzhou, China

Search for more papers by this author

Lin Song,

Corresponding Author

Lin Song

[email protected]

Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China

Correspondence

Lin Song, Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China.

Email: [email protected]

Wei Cui, Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.

Email: [email protected]

Search for more papers by this author

Wei Cui,

Corresponding Author

Wei Cui

[email protected]

orcid.org/0000-0002-5666-0919

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Correspondence

Lin Song, Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China.

Email: [email protected]

Wei Cui, Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.

Email: [email protected]

Search for more papers by this author

Ning Wang,

Ning Wang

Department of Endocrinology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Haonan Guo,

Haonan Guo

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Yingyu Jing,

Yingyu Jing

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Yifan Zhang,

Yifan Zhang

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Bo Sun,

Bo Sun

orcid.org/0000-0002-3217-0579

Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China

Search for more papers by this author

Xingyan Pan,

Xingyan Pan

Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Huan Chen,

Huan Chen

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Jing Xu,

Jing Xu

Department of Endocrinology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Search for more papers by this author

Mengjun Wang,

Mengjun Wang

Department of Endocrinology, Xi'an, China

Search for more papers by this author

Xi Chen,

Xi Chen

Department of Epidemiology and Statistics, School of Public Health, Medical College, Zhejiang University, Hangzhou, China

Search for more papers by this author

Lin Song,

Corresponding Author

Lin Song

[email protected]

Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China

Correspondence

Lin Song, Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China.

Email: [email protected]

Wei Cui, Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.

Email: [email protected]

Search for more papers by this author

Wei Cui,

Corresponding Author

Wei Cui

[email protected]

orcid.org/0000-0002-5666-0919

Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China

Correspondence

Lin Song, Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, China.

Email: [email protected]

Wei Cui, Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.

Email: [email protected]

Search for more papers by this author

First published: 08 March 2023

https://doi.org/10.1111/1753-0407.13375

Ning Wang, Haonan Guo and Yingyu Jing contributed equally to this work.

Share a link

Email
Wechat
Bluesky

Abstract

Background

Large for gestational age (LGA) is one of the adverse outcomes during pregnancy that endangers the life and health of mothers and offspring. We aimed to establish prediction models for LGA at late pregnancy.

Methods

Data were obtained from an established Chinese pregnant women cohort of 1285 pregnant women. LGA was diagnosed as >90th percentile of birth weight distribution of Chinese corresponding to gestational age of the same-sex newborns. Women with gestational diabetes mellitus (GDM) were classified into three subtypes according to the indexes of insulin sensitivity and insulin secretion. Models were established by logistic regression and decision tree/random forest algorithms, and validated by the data.

Results

A total of 139 newborns were diagnosed as LGA after birth. The area under the curve (AUC) for the training set is 0.760 (95% confidence interval [CI] 0.706–0.815), and 0.748 (95% CI 0.659–0.837) for the internal validation set of the logistic regression model, which consisted of eight commonly used clinical indicators (including lipid profile) and GDM subtypes. For the prediction models established by the two machine learning algorithms, which included all the variables, the training set and the internal validation set had AUCs of 0.813 (95% CI 0.786–0.839) and 0.779 (95% CI 0.735–0.824) for the decision tree model, and 0.854 (95% CI 0.831–0.877) and 0.808 (95% CI 0.766–0.850) for the random forest model.

Conclusion

We established and validated three LGA risk prediction models to screen out the pregnant women with high risk of LGA at the early stage of the third trimester, which showed good prediction power and could guide early prevention strategies.

摘要

背景:大于胎龄儿(Large for gestational age, LGA)是危害母儿生命健康的不良妊娠结局之一。本研究旨在建立妊娠晚期LGA的预测模型。

方法:研究对象来自一个1285名中国孕妇队列。LGA诊断为大于同性别新生儿胎龄对应的中国出生体重分布的第90百分位数。根据胰岛素敏感性和胰岛素分泌指标将GDM孕妇分为3型。通过logistic回归和决策树/随机森林算法建立模型, 并通过上述数据进行验证。

结果:139例新生儿出生后被诊断为LGA。由7项常用临床指标(包括血脂)和GDM亚型组成的logistic回归模型的训练集AUC为0.760 (95% CI 0.706 ~ 0.815), 内部验证集AUC为0.748 (95% CI 0.659 ~ 0.837)。两种机器学习算法建立的预测模型中, 决策树模型的训练集和内部验证集的AUC分别为0.813 (95%CI 0.786 ~ 0.839)和0.779 (95%CI 0.735 ~ 0.824), 随机森林模型AUC分别为0.854 (95%CI 0.831 ~ 0.877)和0.808 (95%CI 0.766 ~ 0.850)。

结论:本研究建立并验证了3种LGA风险预测模型, 可在孕晚期筛选出LGA高风险孕妇, 预测效能较好, 可指导早期预防策略。

1 INTRODUCTION

Fetal overgrowth increases the risk of many adverse outcomes for both mother and offspring, such as prolonged labor, cesarean section, shoulder dystocia, admission to the neonatal intensive care unit, and neonatal hypoglycemia after birth, increased risk of obesity and diabetes at childhood and adulthood.^{1, 2} Large for gestational age (LGA) is diagnosed as birth weight >90th percentile based on standard growth charts of different population.³ Fetuses who were diagnosed as LGA at the second trimester are associated with an increasing risk of LGA at birth.⁴ Fetal overgrowth is directly affected by the nutrients in maternal circulation at the third trimester.⁵ LGA is generally diagnosed before delivery, which leaves a limited phase for early intervention. Therefore, it is necessary to establish LGA prediction models at the early stage of the third trimester, to identify and intervene with pregnant women at high risk of giving birth to LGA offspring, in order to reduce the incidence of LGA at birth.

Gestational diabetes mellitus (GDM) subtypes, which were identified by insulin sensitivity and insulin secretion, may have different mechanisms in causing LGA infants.^{6, 7} However, the risk of fetal overgrowth is still increased even after strict glycemic management and in nondiabetic obese mothers.^{8, 9} The increased risk of fetal overgrowth in these models may be associated with the increased levels of lipid profile.^{10, 11} A meta-analysis indicates that the abnormal maternal lipid profile, which includes increased triglycerides (TG) and decreased high-density-lipoprotein cholesterol (HDL-C) levels during pregnancy, are associated with the risk of giving birth to LGA for pregnant women.¹²

To our knowledge, there is a limited LGA prediction model that includes GDM subtype as a variable; therefore we aimed to establish a model for early screening and intervention based on GDM subtypes and pregnancy-related glucolipid metabolism indexes, in combination with other LGA risk factors, which could be used at the early stage of the last trimester to guide prevention strategies.

2 METHODS

2.1 Data sources

The LGA prediction model was developed based on 1285 pregnant women with relatively complete clinical data during pregnancy from the Department of Obstetrics at the Northwest Women's and Children's Hospital from November 2019 to March 2020. All participants have given their informed consent for participation in this study. We included pregnant women ≥18 years old with full-term fetuses and relatively complete clinical data at the first and second trimesters for further analysis. Women with prepregnancy type 1 diabetes, type 2 diabetes, or monogenic diabetes were excluded, as were women who met the result of oral glucose tolerance test (OGTT) for fasting blood glucose (FBG) ≥7.1 mmol/L or 2 h plasma glucose ≥11.1 mmol/L, and other metabolic or chronic diseases, including pregnancy complicated with thyroid disease, preeclampsia/hypertension, liver, kidney, or cardiovascular disease. All the pregnant women included in the study did not undergo OGTT at the first trimester and had an FBG <5.1 mmol/L at that period.

2.2 Diagnoses and outcome

LGA was diagnosed as >90th percentile of birth weight distribution of Chinese corresponding to gestational age of the same-sex newborns.³ GDM or normal glucose tolerance (NGT) were diagnosed according to the International Diabetes and Pregnancy Study Group (IADPSG) criteria¹³ at the 24th–28th gestational week. GDM subtypes were classified according to the definition by Powe and colleagues on the basis of the distributions on the composite insulin sensitivity index (ISI) and Stumvoll I index of the NGT group (control),⁷ which were calculated by following equations^{14, 15} (BMI, body mass index; GLU, glucose; FINS, fasting insulin):

ISI composite index = \frac{10000}{\sqrt{FBG \times FINS} \times (average GLU \times average INS)}

Stumvoll I index = 2032 + 4.681 \times INS 0 - 135.0 \times GLU 120 + 0.995 \times INS 120 + 27.99 \times BMI - 269.1 \times GLU 0

According to Powe's definition,⁷ GDM-resistance women and GDM-dysfunction women were defined as below the 25th percentile range of the ISI composite index and the Stumvoll I index based on the distribution of the indexes for the NGT group, respectively. The GDM-mixed women have the characters of ISI composite index and the Stumvoll I index below the 25th percentile range of the NGT group, and we excluded the women with both ISI composite index and the Stumvoll I index above the 75th percentile range of the NGT group. Because the proportion of LGA in the GDM-dysfunction and GDM-mixed subtypes were similar, with no statistically significant difference compared with NGT group, we combined these two groups as the non-GDM-resistance group for further analysis.

2.3 Definitions and stratification of clinical indexes

Gestational weight gain was calculated as the maternal weight before production minus the prepregnancy weight, while maternal weight before and during pregnancy were recorded by the electronic medical records. Age at menarche was stratified by 11 years old, because current studies found that menarche age <11 years old was a risk factor for metabolic diseases and fetal overgrowth during pregnancy.^16-18 The variables in the model were screened out by the least absolute shrinkage and selection operator (LASSO) regression analysis, the levels of FBG, cholesterol (CHO), TG, HDL-C, and low-density-lipoprotein cholesterol (LDL-C) at the 28th–32th gestational week were stratified by 5.3 mmol/L, 8.2 mmol/L, 4.63 mmol/L, 1.24 mmol/L, and 4.92 mmol/L, according to the trimester-specific reference intervals (TSRIs) for blood lipid profiles in Chinese pregnant women.¹⁹

2.4 Data collection and plasma biochemical parameters detection

We made a questionnaire survey to collect the general information of the pregnant women, such as smoking status, alcohol consumption, history of macrosomia, and parity. The plasma biochemical parameters such as lipid profile were detected during the 28th–32th gestational weeks of pregnancy. Standard methods at the certified lab were employed to measure all the laboratory tests, which include glucose oxidase approach for FBG, with an intra- and interassay variation factor 2.1% and 2.6%, respectively. The plasma lipid profile, such as TG, total CHO, HDL-C, and LDL-C, were tested by the enzyme catalyzed approach. The indexes of the thyroid associated antibodies and insulin were detected by commercially kits (thyroglobulin antibody, TG-Ab [R-A-07-01, 30-3000 IU/mL], thyroidperoxidase antibodies, TPOAb [R-A-08-01, 10-1000 IU/mL], insulin [R-C-01-01, 5–180 mU/mL], 3 V Bioengineering, China). The biochemical indicators of liver, including aspartate transaminase, alanine transaminase, total protein, albumin, globulin, vitamin B12, and ferritin, were detected by the kits that were produced by Shandong 3 V Bioengineering Company, China, and measured by Hitachi 7600 automatic 49 biochemical analyzer.

2.5 Statistical analysis

Means ± SD and median (interquartile range) were employed to manifest the normally distributed continuous variables and nonnormally distributed continuous variables, respectively. The former variables were analyzed by the t test and the later by Mann–Whitney U test, chi-square test for the categorical variables when compared two groups. Differences across the four groups (NGT and three GDM subtypes) were compared using one-way analysis of variance for normally distributed continuous variables, Kruskal–Wallis test for the skewed distributed continuous variables, or chi-square test for categorical variables. When a p < .05, pairwise comparisons between the NGT group and each GDM subgroups were made using the Tukey's test, Dunn's test, or chi-square test, respectively. p values for pairwise comparisons were adjusted using the Bonferroni correction.

2.6 Multiple imputation

We conduct multiple imputation using chained equations to replace missing values within 20% of the total, which including vitamin B12, the indexes of plasma lipid profile, and ferritin at the third trimester. We employed five estimation models according to the size of the data and the capacity of the software (Jupyter Notebook 6.4.5, python 3.9.7).

2.7 The logistic regression modeling strategies

The data set was randomly assigned by random sample (SPSS 26.0), approximately 70% of all cases for the training set (n = 900) and the rest for the validation set (n = 385). LASSO regression analysis (STATA 15.0) and the cross-verification LASSO logistic regression (3 folds, seed 123) with the largest lamda for mean squared prediction error within one SE were conducted to select and determine the variables. Multivariate logistic regression (SPSS 22.0) was employed to modeling the strategies (backward variable selection). We used STATA (version 15.0) to plot the nomograph for generalizing, area under the curve (AUC) to reflect the estimated average optimism of the prediction accuracy, calibration curves to measure the probability of the relationship between the model and observed rate in sets, decision curve analysis (DCA) curves to show the net return, and Hosmer-Lemeshow test to calculate the differences of the predicted and the true value.

2.8 The machine learning (ML) algorithms

We processed the data with Random Under-Sampling and Synthetic Minority Over-Sampling Technique (SMOTE) because of the imbalanced quantity of GDM women (402) and NGT women (883); we also used the random undersample to trim most of the classes to reduce the overfitting risk caused by SMOTE. We used decision tree (DT) and random forest (RF) to develop the LGA prediction models by Jupyter Notebook (Anaconda) 6.4.5, and employed Graphviz 2.38 to plot the graphs. DT is made up of a single root node and several internal nodes and leaf nodes, with CART algorithms based on Gini coefficient. RF is an integrate algorithm layered on the top of multiple DT classifiers, which are randomly constructed and controlled by several selected characteristic variables. Bagging and random feature selection were employed to conduct the learning process. We compared the ML models by various approaches: area under the receiver operator characteristic curve, precision, recall, F1-score, accuracy, and specificity.

3 RESULTS

3.1 Baseline data

Out of 3829 pregnant women, 1285 women underwent the 75 g OGTT at the 24th–28th gestational week, 402 (10.5%) of whom were diagnosed as GDM according to the IADPSG criteria¹³ and 139 newborns were diagnosed as LGA. Among GDM women, 193 patients (48.0%) were classified into the GDM-resistance group with the clinical character of insulin resistance (ISI composite 47.56 ± 14.54 vs. 139.14 ± 43.75, p < 0.001, Table 1) and insulin hypersecretion (Stumvoll I index 1498.54 ± 301.78 vs. 944.32 ± 244.16, p < .001) compared with the NGT group. A total of 138 patients (34.3%) were assigned to the GDM-dysfunction group with the character of normal insulin sensitivity and decreased insulin secretion (Stumvoll I index 664.45 ± 143.22 vs. 944.32 ± 244.16, p < .001), and 71 patients (17.7%) were classified into the GDM-mixed group with characters of insulin resistance (ISI composite index 80.91 ± 17.34 vs. 139.14 ± 43.75, p < .05) and decreased insulin secretion (Stumvoll I index 689.34 ± 188.56 vs. 944.32 ± 244.16, p < .001). All women in the three GDM subtypes exhibited an elder maternal age (all p < .001), and a higher percentage of family diabetes history (all p < .05) compared to the NGT group. Only the GDM-resistance group showed an elevated pre-BMI (25.60 ± 2.95 vs. 21.30 ± 2.80, p < .001) and infant birth weight relative to the NGT group. For other factors, such as gestational week, smoking status, alcohol consumption, and gestational weight gain, there were no significant differences of the three GDM subtypes compared with the NGT group.

TABLE 1. The clinical characteristics of the NGT and GDM subtypes.

	GDM-resistance	p^a	GDM-dysfunction	p^a	GDM-mixed	p	NGT
Number (n)	193		138		71		883
Maternal age (years)	31.77 ± 4.52	< .001	31.65 ± 3.61	< .001	32.44 ± 3.83	< .001	30.27 ± 3.69
Family history of diabetes mellitus (n, %)	36 (18.6)	< .001	20 (14.4)	.007	10 (14.1)	.026	48 (5.4)
Pre-BMI (kg/m²)	25.60 ± 2.95	< .001	20.80 ± 2.00	—	20.95 ± 1.66	—	21.30 ± 2.80
Gestational week (weeks)	38.78 ± 1.41	—	38.99 ± 1.24	—	38.94 ± 1.17	—	39.00 ± 1.23
Smoking status (n, %)	5 (2.6)	—	0 (0)	—	2 (2.8)	—	19 (2.1)
Alcohol consumption (n, %)	1 (0.5)	—	0 (0)	—	0 (0)	—	3 (0.3)
GWG (kg)	12.32 ± 1.34	—	13.12 ± 1.53	—	12.78 ± 1.51	—	12.64 ± 1.67
Infant birth weight (g)	3496.35 ± 552.14	< .001	3401.01 ± 426.48	—	3317.18 ± 382.87	—	3333.92 ± 400.57
LGA	56 (29.0)	< .001	9 (6.5)	—	5 (7.0)	—	69 (7.8)
The second trimester
OGTT
FBG (mmol/L)	5.20 ± 0.46	< .001	5.06 ± 0.47	< .001	5.04 ± 0.49	< .001	4.80 ± 0.44
1 h glucose OGTT (mmol/L)	9.32 ± 1.79	< .001	8.84 ± 1.80	.001	9.13 ± 1.78	< .001	8.09 ± 1.85
2 h glucose OGTT (mmol/L)	7.56 ± 1.39	< .001	7.58 ± 1.34	< .001	7.68 ± 1.62	.002	6.97 ± 1.40
AUC (glucose)	15.7 ± 1.24	< .001	15.16 ± 1.12	< .001	15.49 ± 1.45	< .001	13.98 ± 1.07
Fasting insulin (uU/mL)	16.00 ± 4.65	< .001	7.51 ± 2.59	—	10.74 ± 2.99	—	8.81 ± 3.69
1-h insulin OGTT (uU/mL)	139.32 ± 41.23	< .001	46.26 ± 18.85	.041	49.43 ± 17.61	.046	61.30 ± 14.65
2-h insulin OGTT (uU/mL)	147.34 ± 42.45	< .001	39.37 ± 19.31	.027	41.02 ± 14.87	.034	55.86 ± 16.28
AUC (insulin)	220.94 ± 40.23	< .001	69.75 ± 15.76	< .001	75.31 ± 11.23	< .001	93.64 ± 15.31
Insulin sensitivity (ISI composite index)	47.56 ± 14.54	< .001	159.26 ± 36.76	—	80.91 ± 17.34	.026	139.14 ± 43.75
Insulin secretion (Stumvoll I index)	1498.54 ± 301.78	< .001	664.45 ± 143.22	< .001	689.34 ± 188.56	< .001	944.32 ± 244.16

Note: Data are presented as n (%) for categorical variables, median (interquartile range) or mean (SD) for continuous variables. Differences across the three groups (NGT and two GDM subtypes) were compared using one-way analysis of variance for normally distributed continuous variables, Kruskal–Wallis test for the skewed distributed continuous variables, or chi-squared test for categorical variables. When p < .05, pairwise comparisons between the NGT group and each GDM subgroups were made using the Tukey's test, Dunn's test, or chi-square test, respectively. p values for pairwise comparisons were adjusted using the Bonferroni correction.
Abbreviations: AUC, area under the curve; FBG, fasting blood glucose; GDM, gestational diabetes mellitus; GWG, gestational weight gain; ISI, insulin sensitivity index; LGA, large gestational age infant; NGT, normal glucose tolerance; OGTT, oral glucose tolerance test; pre-BMI, pre-pregnancy body mass index.

3.2 LGA risk prediction model based on logistic regression analysis

In our study, the training set (900 pregnant women) and the validation set (385 pregnant women) were randomly assigned from the cohort. The demographic and clinical characteristics of the two groups were shown in Table 2; no significant statistic differences were found between the two sets of the included variables.

TABLE 2. The clinical characteristics of the training cohort and the validation cohort.

Variables	Training cohort (n = 900)	Validation cohort (n = 385)	p
NGT (n, %)	616 (68.4)	267 (69.4)	.748
GDM-resistance (n, %)	135 (15.0)	58 (15.1)	.976
Non GDM-resistance (n, %)	146 (16.2)	63 (16.4)	.950
LGA (n, %)	97 (10.8)	42 (10.9)	.945
Maternal age (year)	30.9 ± 3.8	30.5 ± 4.0	.561
T2DM family history (n, %)	84 (9.3)	30 (7.8)	.373
pre-BMI (kg/m²)	21.9 ± 3.2	21.9 ± 3.0	.099
Age at menarche ≤11 years (n, %)	66 (7.3)	18 (4.7)	.077
ART (n, %)	61 (6.7)	23 (6.0)	.593
Thyroid antibodies + (TPOAb/TgAb)	129 (14.3)	44 (11.4)	.162
Above IOM recommended GWG	169 (18.7)	76 (19.7)	.687
History of macrosomia	30 (3.3)	12 (3.1)	.842
Parity	1.30 (0.66)	1.33 (0.63)	.190
Vitamin B₁₂ (pg/mL)	58.99 (37.21–82.54)	65.84 (51.93–81.84)	.194
Ferritin (ng/mL)	45.65 ± 37.21	43.72 ± 37.21	.303
Total protein (g/L)	69.33 (65.21–74.91)	69.03 (64.88–74.18)	.851
Albumin (g/L)	43.11 ± 8.45	40.11 ± 7.87	.203
Globulin (g/L)	29.28 ± 3.33	28.73 ± 3.06	.280
ALT (U/L)	18.12 ± 3.12	18.18 ± 2.4	.924
AST (U/L)	19.18 ± 4.76	18.80 ± 5.12	.537
CHO (mmol/L)	6.1 ± 1.2	6.1 ± 1.1	.701
TG (mmol/L)	3.6 ± 1.6	3.6 ± 1.6	.568
HDL-C (mmol/L)	1.8 ± 0.4	1.8 ± 0.4	.524
LDL-C (mmol/L)	3.1 ± 0.8	3.2 ± 0.9	.392
FBG (mmol/L)	4.3 ± 0.8	4.4 ± 0.7	.136

Note: Data are presented as n (%) for categorical variables, median (interquartile range) or mean (SD) for continuous variables. Differences between the two groups were analyzed by the t test and the Mann–Whitney U test for the normally distributed continuous variables and the skewed distributed continuous variables, Chi-square test was employed to analysis the categorical variables.
Abbreviations: ALT, alanine aminotransferase; ART, assisted reproductive technology; AST, aspartate aminotransferase; CHO, total cholesterol; FBG, fasting blood glucose; GDM, gestational diabetes mellitus; HDL-C, high-density lipoprotein cholesterol; IOM, Institute of Medicine; LDL-C, low-density lipoprotein cholesterol; NGT, normal glucose tolerance; pre-BMI, pre-pregnancy body mass index; T2DM, type 2 diabetes mellitus; TG, triglyceride.

3.3 LASSO logistic regression and cross-validation of LASSO logistic regression for the selection of variables

The LASSO logistic regression analysis and cross-validation were employed to select the variables (Figure 1 in Appendix S1). Maternal age stratification, type 2 diabetes mellitus (T2DM) family history, GDM subtypes, macrosomia history, age at menarche ≤11 years, CHO, TG, HCL-C, LDL-C, and FBG were screened out as variables that constitute the final model.

We employed multivariable logistic regression analysis to calculate the relevant β-coefficients and constant based on the 10 variables screened out by LASSO logistic regression. The final multivariate logistic regression prediction model was developed by eight variables based on the backward step algorithm (Table 3). The predicted LGA risk is estimated by the equation:

P = \frac{1}{1 + \exp . (- x)}, X = - 3.162 + 0.611 Maternal Age 30 - 34 y / 0.920 Maternal Age 35 - 39 y / 0.683 Maternal Age \geq 40 y + 0.606 T 2 DM family history - 0.367 GDM nonresistance / + 1.294 GDM resistance + 1.580 macrosomia + 1.522 CHO - 0.679 TG + 1.104 LDL - C + 1.151 FBG .

TABLE 3. The development of model by multivariate logistic regression.

Variables	B	SE	Wald	p	OR	95% CI
Maternal age <30 years (reference)			7.634	.054
30–34 years	0.611	0.287	4.537	.033	1.843	1.050–3.233
35–39 years	0.920	0.356	6.684	.010	2.510	1.249–5.043
≥40 years	0.683	0.720	0.900	.343	1.979	0.483–8.109
T2DM family history	0.606	0.347	3.055	.081	1.833	0.929–3.615
NGT (reference)			27.025	<.001
GDM nonresistance	−0.367	0.376	0.954	.329	0.692	0.331–1.447
GDM resistance	1.294	0.280	21.356	<.001	3.648	2.107–6.315
macrosomia	1.580	0.479	10.902	<.001	4.856	1.901–12.408
CHO	1.522	0.443	11.819	.001	4.580	1.924–10.905
TG	−0.679	0.355	3.656	.056	0.507	0.253–1.017
LDL-C	1.104	0.518	4.544	.033	3.017	1.093–8.329
FBG	1.151	0.360	10.246	1	0.001	3.161–1.562
Constant	−3.162	0.261	146.958	1	0.000	0.042

Abbreviations: CHO, total cholesterol; CI, confidence interval; FBG, fasting blood glucose; GDM, gestational diabetes mellitus; LDL-C, low-density lipoprotein cholesterol; LGA, large for gestational age; NGT, normal glucose tolerance; OR, odds ratio; T2DM, type 2 diabetes mellitus; TG, triglyceride.

3.4 Nomograph

We employed nomogram to visualize and popularize the LGA prediction model (Figure 1).

Details are in the caption following the image — **FIGURE 1**
Open in figure viewer PowerPoint

Nomogram for predicting LGA risk. The total score is calculated by the sum score of FBG, LDL-C, TG, CHO, pregnancy history of macrosomia, FBG (1 if ≥5.3 mmol/L and 0 if <5.3 mmol/L), LDL-C (1 if ≥4.92 mmol/L and 0 if <4.92 mmol/L), TG (1 if ≥4.63 mmol/L and 0 if <4.63 mmol/L), CHO (1 if ≥8.2 mmol/L and 0 if <8.2 mmol/L), macrosomia (0 if no and 1 if yes), NGT/GDM subtypes (0 if NGT, −0.367 if the GDM nonresistance subtype, 1.294 if the GDM-resistance subtype), T2DM family history (0 if no and 1 if yes), and maternal age (0 if <30, 1 if 30–34, 2 if 35–39, 3 if ≥40). CHO, cholesterol; FBG, fasting blood glucose; GDM, gestational diabetes mellitus; LDL-C, low density lipoprotein cholesterol; LGA, large for gestational age; NGT, normal glucose tolerance; T2DM, type 2 diabetes mellitus; TG, triglyceride.

3.5 Validation of the prediction model

The AUC for the training set is 0.760 (95% CI 0.706–0.815), and 0.748 (95% CI 0.659–0.837) for the validation set (Figure 2A,D in Appendix S1), which suggested a good discriminative capacity of the LGA prediction model. The calibration curves of the model showed that the result of the training set and the validation set corresponded well (Figure 2B,E in Appendix S1). The DCA plot was shown in Figure 2C,F in Appendix S1, which indicated certain positive net benefits in the prediction model among majority threshold probabilities. The Hosmer–Lemeshow test demonstrated a nonsignificant statistical difference in each set (All p > .05, Table 4), which suggested a high consistency. The generalization of the logistic regression model is presented in Table 5.

TABLE 4. Hosmer and Lemeshow test.

Set	Chi-square	Sig.
Training set	6.346	0.500
Validation set	5.119	0.529

TABLE 5. The generalization of the models.

Models	AUC _{Training set} (95% CI)	AUC _{Validation set} (95% CI)	Precision	Recall	F1-score	Accuracy	Specificity
Logistic regression model	0.760 (0.706–0.815)	0.748 (0.659–0.837)	0.638	0.600	0.618	0.641	0.680
Decision tree	0.813 (0.786–0.839)	0.779 (0.735–0.824)	0.672	0.668	0.670	0.696	0.719
Random forest	0.854 (0.831–0.877)	0.808 (0.766–0.850)	0.707	0.695	0.701	0.725	0.751

Abbreviations: AUC, area under the curve; CI, confidence interval.

3.6 Machine learning models of the LGA prediction model

Figure 3 in Appendix S1 shows receiver operating characteristic curves of the training set and validation set of the DT model and the RF model respectively. Spearman correlation coefficient and Pearson correlation coefficient were employed to sort all variables relative to LGA group and normal birth weight group for the two ML models (Figure 4 in Appendix S1). Figure 5 in Appendix S1 and 6 show the tree structure of the two ML models. The generalization of the two ML models is presented in Table 5. Our ML models achieved an expected discrimination.

4 DISCUSSION

We established three LGA prediction models that could be used at an early stage of the third trimester by the logistic regression analysis (eight variables) and the two ML algorithms (DT and RF, all variables). The AUCs for the training sets and validation sets are 0.760 (95% CI 0.706–0.815) and 0.748 (95% CI 0.659–0.837) for the logistic regression analysis model, 0.813 (95% CI 0.786–0.839) and 0.779 (95% CI 0.735–0.824) for the DT model, 0.854 (95% CI 0.831–0.877) and 0.808 (95% CI 0.766–0.850) for the RF model.

Women of GDM subtypes inclined to have LGA infants might be caused by different mechanisms.^{6, 7} Women with GDM-resistance showed a character of relatively higher pre-BMI than the NGT group.^{7, 20} Previous study established the LGA prediction model including GDM subtypes as a variable.²¹ However, the model showed a limited prediction power with an AUC of 0.698 for the training set, which may be related to the following reasons: the variables in the model were all in without screened by LASSO regression; the diagnosis criteria of GDM was based on the results of OGTT at the 24th–32th gestational week ranged by different regions, rather than the OGTT at the 24th–28th gestational week recommended by IADPSG; and the effect of plasma lipid profile at late pregnancy on LGA was not considered. Lene and colleagues²² found that the combination of Matsuda index (insulin sensitivity) and their disposition index (DI) significantly increased the prediction power of LGA than the GDM subtypes; however, DI is calculated as Matsuda index multiplied by Stumvoll I index (insulin secretions), which amplified the weight of Matsuda index and without the consideration of collinearity between the two continuous variables. At the same time, as a multicenter study, Lene and colleagues²² classified the GDM subtypes based on the summarized data of all centers worldwide to calculate the normal range of insulin sensitivity and insulin secretion indexes, without taking into account the existing differences caused by race.

Maternal overweight/obesity is closely related to the birth weight of newborns.^{23, 24} However, pre-BMI stratification was not included in the LASSO regression analysis in our logistic regression model. We speculate that it might be related to the obesity character of the GDM-resistance subtype, which leads to the limited importance of pre-BMI in the logistic regression model. Actually, we combined the influence of insulin sensitivity/secretions and obesity as GDM subtypes, which reflect the whole pathophysiological process leading to the elevated blood glucose during pregnancy.

Modifiable risk factors are essential to identify and remeasure the relationship of the risk factor and outcomes, which is important for prevention. Maternal FBG and lipid profile at the early stage of the third trimester were also considered as two modifiable independent variables in the published LGA prediction model,^{25, 26} which leaves enough time for diet and exercise interventions before delivery. Maternal lipid profile of TG, HDL-C, and LDL-C were increased within a physiological range from the second to third trimester during pregnancy²⁷; however, an independent relationship emerged between overrange lipid profile and fetal overgrowth,¹⁰ including the level of TG at late pregnancy.²⁵ A prospective study showed that the level of maternal fasting and postprandial TG manifested a stronger power in predicting neonatal fat content than FBG after strict glycemic control.²⁸ According to the TSRIs for blood lipid profiles in Chinese pregnant women,¹⁹ the levels of FBG, CHO, TG, HDL-C, and LDL-C at the 28th–32th gestational week above 5.3 mmol/L, 8.2 mmol/L, 4.63 mmol/L, 1.24 mmol/L, and 4.92 mmol/L are associated with adverse pregnancy outcomes. However, studies on the effect of improving maternal lipid profiles at the 3rd trimester on fetal outcomes are limited.

The relationship between maternal dyslipidemia and fetal overgrowth is mediated by placenta lipid transport capacities.²⁹ Lipases in placental microvillous membrane could hydrolyze plasma lipoproteins, such as TG, phospholipids, and other lipid nutrients in maternal circulation, and release to nonesterized fatty acid (NEFA). The translocation of NEFA in the placenta is through simple diffusion of fatty acid transporter proteins, fatty acid translocases, fatty acid binding proteins, and major facilitator superfamily domain containing protein2a in a transporter-controlled manner.^{30, 31} NEFAs can cross the base membrane into the fetal circulation and then be absorbed by the fetal liver and eventually cause fetal overgrowth.³² Our previous study found that elevated placental lipid content (TG, CHO) at the third trimester of overweight pregnant women enhanced placental mTORC1-RPS6 and ERK1/2 signaling, leading to the rise of cord blood insulin level and birth weight.¹¹

The relationship between previous macrosomia delivery history and the risk of having LGA might be caused by diet and the awareness of monitoring fetal overgrowth at the third trimester. T2DM and GDM have many overlapped susceptible genes.^{33, 34} The function of islet cells decreased accompanied with the increasing of maternal age.³⁵ Accompanied by advanced maternal age, pregnancy increased the metabolic load of mother, revealed small defects (associated with T2DM genes) of islet cell for the mother. Finally, leading to the elevated blood glucose during pregnancy, and increases the risk of fetal overgrowth.

For the comparison among the three prediction models, the logistic regression model including eight common variables and the formula/nomograph could be used for external verification. By contrast, the two ML models including all variables are difficult to popularize due to the algorithm limitation. As for ML, the prediction models established by DT and RF show relatively good performance in the validation set, with improved receiver operating characteristics, accuracy, sensitivity, and specificity relative to the traditional logistic model, which is due to the advanced algorithms of ML. In details, DT and RT have certain features of selection ability, which can show the ranking of feature importance intuitively; DT has certain interpretability, and the structure of the tree can be visualized; RF has the ability to prevent overfitting, and the performance of accuracy is better than most single algorithms. However, ML models still have shortcomings. To be specific, the algorithm of some ML is a black box, which cannot be reflected by concise diagrams or formulas. In addition, ML models are difficult to popularize than logistic models due to the high requirements on computing power and computing environment.

The advantages of our research include that the definition of LGA is based on the distribution of birth weight corresponding to the sex of newborn. It has reported that the combination of predictors including glycemic measures, BMI, and maternal age showed an increased power in predicting LGA than individual indicators,^{36, 37} and our LGA prediction model has taken all these indicators into account. The continuous variables in our model were transformed into dichotomous variables according to the reported target during pregnancy, avoiding collinearity among variables. Modifiable variables were included in our models, which can be used to access the remission after some interventions.

The limitations of our study include a relatively small sample size and single-center study, which limited the further adjustment for ethnicity differences and other confounding factors.

In conclusion, we established three LGA prediction models that can be used at early stage of the third trimeste to identify pregnant women with a high risk of giving birth to LGA and to guide prevention strategies at an early stage of the third trimester.

AUTHOR CONTRIBUTIONS

Ning Wang, Lin Song, and Wei Cui designed the work presented by the article. Ning Wang, Haonan Guo, and Yingyu Jing completed the meta-analysis and drafted and revised the article. Bo Sun, JJing Xu, Huan Chen, and Mengjun Wang collected the data and revised the article for critically important content. Lin Song and Wei Cui final approved of the version to be published. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGEMENTS

The authors thank all obstetricians of the Northwest Women and Children's Hospital who participated in the study and contributed to the collection of data.

FUNDING INFORMATION

We acknowledge grant funding of the Natural Science Foundation of Shaanxi Province (No. 2020GXLH-Y-029, 2019JQ069, 2019JM262), the Natural Science Foundation of China (No. 81801459; No. 82071732; No. 81741079), the Natural Science Foundation for Postdoctoral Scientists of China (No. 2018M641001, No. 2016M600799), the Clinical Research Award of the First Affiliated Hospital of Xi'an Jiaotong University, China (No. XJTU1AF-CRF-2019-007).

CONFLICT OF INTEREST STATEMENT

The authors declare no potential conflicts of interest.

CONSENT TO PARTICIPATE

Individuals were informed about the use of their data and were offered an opt-out. All of the included participants gave their informed consent. The data were used anonymously. Written informed consent was obtained from the parents.

Open Research

DATA AVAILABILITY STATEMENT

The data associated with the paper are not publicly available but are available from the corresponding author on reasonable request.

Supporting Information

REFERENCES

1Barbour LA. Metabolic culprits in obese pregnancies and gestational diabetes mellitus: big babies, big twists, big picture: the 2018 Norbert Freinkel Award Lecture. Diabetes Care. 2019; 42(5): 718-726.
10.2337/dci18-0048
CAS PubMed Web of Science® Google Scholar
2Barbour LA, Farabi SS, Friedman JE, et al. Postprandial triglycerides predict newborn fat more strongly than glucose in women with obesity in early pregnancy. Obesity (Silver Spring). 2018; 26(8): 1347-1356.
10.1002/oby.22246
CAS PubMed Web of Science® Google Scholar
3Dai L, Deng C, Li Y, et al. Population-based birth weight reference percentiles for Chinese twins. Ann Med. 2017; 49(6): 470-478.
10.1080/07853890.2017.1294258
PubMed Web of Science® Google Scholar
4Rekawek P, Liu L, Getrajdman C, et al. Large-for-gestational age diagnosed during second-trimester anatomy ultrasound and association with gestational diabetes and large-for-gestational age at birth. Ultrasound Obstet Gynecol. 2020; 56(6): 901-905.
10.1002/uog.21930
CAS PubMed Web of Science® Google Scholar
5Kiserud T, Benachi A, Hecher K, et al. The World Health Organization fetal growth charts: concept, findings, interpretation, and application. Am J Obstet Gynecol. 2018; 218(2 S): S619-S629.
10.1016/j.ajog.2017.12.010
PubMed Web of Science® Google Scholar
6Wang N, Song L, Sun B, et al. Contribution of gestational diabetes mellitus heterogeneity and prepregnancy body mass index to large-for-gestational-age infants-a retrospective case-control study. J Diabetes. 2021; 13(4): 307-317.
10.1111/1753-0407.13113
CAS PubMed Web of Science® Google Scholar
7Powe CE, Allard C, Battista MC, et al. Heterogeneous contribution of insulin sensitivity and secretion defects to gestational diabetes mellitus. Diabetes Care. 2016; 39(6): 1052-1055.
10.2337/dc15-2672
CAS PubMed Web of Science® Google Scholar
8Evers IM, de Valk HW, Mol BW, ter Braak EW, Visser GH. Macrosomia despite good glycaemic control in type I diabetic pregnancy; results of a nationwide study in The Netherlands. Diabetologia. 2002; 45(11): 1484-1489.
10.1007/s00125-002-0958-7
CAS PubMed Web of Science® Google Scholar
9Catalano PM, HD MI, Cruickshank JK, et al. The hyperglycemia and adverse pregnancy outcome study: associations of GDM and obesity with pregnancy outcomes. Diabetes Care. 2012; 35(4): 780-786.
10.2337/dc11-1790
CAS PubMed Web of Science® Google Scholar
10Berglund L, Brunzell JD, Goldberg AC, et al. Evaluation and treatment of hypertriglyceridemia: an Endocrine Society clinical practice guideline. J Clin Endocrinol Metab. 2012; 97(9): 2969-2989.
10.1210/jc.2011-3213
CAS PubMed Web of Science® Google Scholar
11Song L, Wang N, Peng Y, Sun B, Cui W. Placental lipid transport and content in response to maternal overweight and gestational diabetes mellitus in human term placenta. Nutr Metab Cardiovasc Dis. 2022; 32(3): 692-702.
10.1016/j.numecd.2021.12.018
CAS PubMed Web of Science® Google Scholar
12Wang J, Moore D, Subramanian A, et al. Gestational dyslipidaemia and adverse birthweight outcomes: a systematic review and meta-analysis. Obes Rev. 2018; 19(9): 1256-1268.
10.1111/obr.12693
CAS PubMed Web of Science® Google Scholar
13American DA. Diagnosis and classification of diabetes mellitus. Diabetes Care. 2011; 34(Suppl 1): S62-S69.
10.2337/dc11-S062
PubMed Web of Science® Google Scholar
14Matsuda M, DeFronzo RA. Insulin sensitivity indices obtained from oral glucose tolerance testing: comparison with the euglycemic insulin clamp. Diabetes Care. 1999; 22(9): 1462-1470.
10.2337/diacare.22.9.1462
CAS PubMed Web of Science® Google Scholar
15Stumvoll M, Van Haeften T, Fritsche A, Gerich J. Oral glucose tolerance test indexes for insulin sensitivity and secretion based on various availabilities of sampling times. Diabetes Care. 2001; 24(4): 796-797.
10.2337/diacare.24.4.796
CAS PubMed Web of Science® Google Scholar
16Wang L, Yan B, Shi X, et al. Age at menarche and risk of gestational diabetes mellitus: a population-based study in Xiamen, China. BMC Pregnancy Childbirth. 2019; 19(1): 138.
10.1186/s12884-019-2287-6
PubMed Web of Science® Google Scholar
17Chen L, Li S, He C, et al. Age at menarche and risk of gestational diabetes mellitus: a prospective cohort study among 27,482 women. Diabetes Care. 2016; 39(3): 469-471.
10.2337/dc15-2011
CAS PubMed Web of Science® Google Scholar
18Li H, Shen L, Song L, et al. Early age at menarche and gestational diabetes mellitus risk: results from the healthy baby cohort study. Diabetes Metab. 2017; 43(3): 248-252.
10.1016/j.diabet.2017.01.002
CAS PubMed Web of Science® Google Scholar
19Zheng W, Zhang L, Tian Z, Zhang L, Liang X, Li G. Establishing reference ranges of serum lipid level during pregnancy and evaluating its association with perinatal outcomes: a cohort study. Int J Gynaecol Obstet. 2022; 156(2): 361-369.
10.1002/ijgo.13636
CAS PubMed Web of Science® Google Scholar
20Wang N, Peng Y, Wang L, et al. Risk factors screening for gestational diabetes mellitus heterogeneity in Chinese pregnant women: a case-control study. Diabetes Metab Syndr Obes. 2021; 14: 951-961.
10.2147/DMSO.S295071
PubMed Web of Science® Google Scholar
21Gibbons KS, Chang AMZ, Ma RCW, et al. Prediction of large-for-gestational age infants in relation to hyperglycemia in pregnancy - a comparison of statistical models. Diabetes Res Clin Pract. 2021; 178:108975.
10.1016/j.diabres.2021.108975
PubMed Web of Science® Google Scholar
22Madsen LR, Gibbons KS, Ma RCW, et al. Do variations in insulin sensitivity and insulin secretion in pregnancy predict differences in obstetric and neonatal outcomes? Diabetologia. 2021; 64(2): 304-312.
10.1007/s00125-020-05323-0
CAS PubMed Web of Science® Google Scholar
23Liu Y, Guo F, Zhou Y, Yang X, Zhang Y, Fan J. The interactive effect of Prepregnancy overweight/obesity and isolated maternal hypothyroxinemia on macrosomia. J Clin Endocrinol Metab. 2021; 106(7): e2639-e2646.
10.1210/clinem/dgab171
PubMed Web of Science® Google Scholar
24Wang N, Ding Y, Wu J. Effects of pre-pregnancy body mass index and gestational weight gain on neonatal birth weight in women with gestational diabetes mellitus. Early Hum Dev. 2018; 124: 17-21.
10.1016/j.earlhumdev.2018.07.008
PubMed Web of Science® Google Scholar
25Mossayebi E, Arab Z, Rahmaniyan M, Almassinokiani F, Kabir A. Prediction of neonates' macrosomia with maternal lipid profile of healthy mothers. Pediatr Neonatol. 2014; 55(1): 28-34.
10.1016/j.pedneo.2013.05.006
PubMed Web of Science® Google Scholar
26Xi F, Chen H, Chen Q, et al. Second-trimester and third-trimester maternal lipid profiles significantly correlated to LGA and macrosomia. Arch Gynecol Obstet. 2021; 304(4): 885-894.
10.1007/s00404-021-06010-0
CAS PubMed Web of Science® Google Scholar
27Paulmichl K, Hatunic M, Hojlund K, et al. Modification and validation of the triglyceride-to-HDL cholesterol ratio as a surrogate of insulin sensitivity in white juveniles and adults without diabetes mellitus: the Single Point Insulin Sensitivity Estimator (SPISE). Clin Chem. 2016; 62(9): 1211-1219.
10.1373/clinchem.2016.257436
CAS PubMed Web of Science® Google Scholar
28Samsuddin S, Arumugam PA, Md Amin MS, et al. Maternal lipids are associated with newborn adiposity, independent of GDM status, obesity and insulin resistance: a prospective observational cohort study. BJOG. 2020; 127(4): 490-499.
10.1111/1471-0528.16031
CAS PubMed Web of Science® Google Scholar
29Segura MT, Demmelmair H, Krauss-Etschmann S, et al. Maternal BMI and gestational diabetes alter placental lipid transporters and fatty acid composition. Placenta. 2017; 57: 144-151.
10.1016/j.placenta.2017.07.001
CAS PubMed Web of Science® Google Scholar
30Larque E, Pagan A, Prieto MT, et al. Placental fatty acid transfer: a key factor in fetal growth. Ann Nutr Metab. 2014; 64(3–4): 247-253.
10.1159/000365028
CAS PubMed Web of Science® Google Scholar
31Delhaes F, Giza SA, Koreman T, et al. Altered maternal and placental lipid metabolism and fetal fat development in obesity: current knowledge and advances in non-invasive assessment. Placenta. 2018; 69: 118-124.
10.1016/j.placenta.2018.05.011
CAS PubMed Web of Science® Google Scholar
32Lewis RM, Hanson MA, Burdge GC. Umbilical venous-arterial plasma composition differences suggest differential incorporation of fatty acids in NEFA and cholesteryl ester pools. Br J Nutr. 2011; 106(4): 463-467.
10.1017/S0007114511000377
CAS PubMed Web of Science® Google Scholar
33Dereke J, Palmqvist S, Nilsson C, Landin-Olsson M, Hillman M. The prevalence and predictive value of the SLC30A8 R325W polymorphism and zinc transporter 8 autoantibodies in the development of GDM and postpartum type 1 diabetes. Endocrine. 2016; 53(3): 740-746.
10.1007/s12020-016-0932-7
CAS PubMed Web of Science® Google Scholar
34Lin Z, Wang Y, Zhang B, Jin Z. Association of type 2 diabetes susceptible genes GCKR, SLC30A8, and FTO polymorphisms with gestational diabetes mellitus risk: a meta-analysis. Endocrine. 2018; 62(1): 34-45.
10.1007/s12020-018-1651-z
CAS PubMed Web of Science® Google Scholar
35Cnop M, Igoillo-Esteve M, Hughes SJ, Walker JN, Cnop I, Clark A. Longevity of human islet alpha- and beta-cells. Diabetes Obes Metab. 2011; 13(Suppl 1): 39-46.
10.1111/j.1463-1326.2011.01443.x
CAS PubMed Web of Science® Google Scholar
36Cooray SD, Wijeyaratne LA, Soldatos G, Allotey J, Boyle JA, Teede HJ. The unrealised potential for predicting pregnancy complications in women with gestational diabetes: a systematic review and critical appraisal. Int J Environ Res Public Health. 2020; 17(9): 3048.
10.3390/ijerph17093048
PubMed Web of Science® Google Scholar
37Kushtagi P, Arvapally S. Maternal mid-pregnancy serum triglyceride levels and neonatal birth weight. Int J Gynaecol Obstet. 2009; 106(3): 258-259.
10.1016/j.ijgo.2009.03.004
PubMed Web of Science® Google Scholar

Volume15, Issue4

April 2023

Pages 338-348

Development and validation of risk prediction models for large for gestational age infants using logistic regression and two machine learning algorithms

使用Logistic回归和两种机器学习算法开发和验证大于胎龄儿风险预测模型

Abstract

Background

Methods

Results

Conclusion

摘要

1 INTRODUCTION

2 METHODS

2.1 Data sources

2.2 Diagnoses and outcome

2.3 Definitions and stratification of clinical indexes

2.4 Data collection and plasma biochemical parameters detection

2.5 Statistical analysis

2.6 Multiple imputation

2.7 The logistic regression modeling strategies

2.8 The machine learning (ML) algorithms

3 RESULTS

3.1 Baseline data

3.2 LGA risk prediction model based on logistic regression analysis

3.3 LASSO logistic regression and cross-validation of LASSO logistic regression for the selection of variables

3.4 Nomograph

3.5 Validation of the prediction model

3.6 Machine learning models of the LGA prediction model

4 DISCUSSION

AUTHOR CONTRIBUTIONS

ACKNOWLEDGEMENTS

FUNDING INFORMATION

CONFLICT OF INTEREST STATEMENT

CONSENT TO PARTICIPATE

Open Research

DATA AVAILABILITY STATEMENT

Supporting Information

REFERENCES

Figures

References

Related

Information