RESEARCH ARTICLE

A search for factors associated with reduced carbohydrate intake and NTD risk in two population-based studies

Gary M. Shaw (Corresponding Author)
orcid.org/0000-0001-7438-4914
Department of Pediatrics, Division of Neonatology, Stanford University School of Medicine, Stanford, California, USA
Correspondence: Gary M. Shaw, Department of Pediatrics, Stanford University, 453 Quarry Road, Palo Alto, CA 94304, USA. Email: [email protected]

Wei Yang
Department of Pediatrics, Division of Neonatology, Stanford University School of Medicine, Stanford, California, USA

Kari A. Weber
Department of Epidemiology, Fay W. Boozman College of Public Health, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA

Andrew F. Olshan
orcid.org/0000-0001-9115-5128
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA

Tania A. Desrosiers
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA

The National Birth Defects Prevention Study

First published: 07 March 2024

https://doi.org/10.1002/bdr2.2328

Abstract

Background

Two population-based case–control studies have reported an increased risk of neural tube defect (NTD)-affected pregnancies among women with a low carbohydrate diet in the periconceptional period. Because only two studies have investigated this association, it is unclear to what degree the findings could be affected by residual confounding. Here, we further interrogated both studies with the objective of identifying, from a much larger set of candidate factors, any that might explain the association.

Methods

By employing a machine learning algorithm (random forest), we investigated a baseline set of over 200 variables. These analyses produced the top 10 variables in each data set for cases and controls that predicted periconceptional low carbohydrate intake.

Results

Examining those prediction variables with logistic regression modeling, we did not observe any particular variable that substantially contributed to the NTD-low carbohydrate association in either data set.

Conclusions

If there are underlying factors that explain the association, our findings suggest that none of the 200+ variables we examined was sufficiently correlated with the true explanatory exposure. Alternatively, our findings may suggest that other unidentified factor(s) are at play, or that the association observed in two independent data sets is directly related to low carbohydrate intake.

1 INTRODUCTION

Two recent studies observed that women's low carbohydrate diet (≤5th percentile) in the periconceptional period was associated with increased odds of neural tube defect (NTD)-affected pregnancies. Desrosiers et al. (2018) observed that women's low carbohydrate diet in the year before conception was associated with a modestly increased risk (adjusted OR = 1.3) of NTD-affected pregnancies. Their investigation was conducted in the National Birth Defects Prevention Study (NBDPS) and covered the years following US mandatory folic acid fortification of grains. Their novel findings, however, could not disentangle whether the increased NTD risk observed with restricted carbohydrate intake was a consequence of lower folate intake or of other factors related to low carbohydrate intake, because in this study period folic acid and carbohydrate measures would derive from similar foods (e.g., enriched grains such as breads and pastas). The association was recapitulated in data from an NTD study conducted in California (CA-NTD Study) prior to folic acid fortification, which showed a 2-fold increased NTD risk among women reporting low carbohydrate intake (Shaw & Yang, 2019). This subsequent finding suggested that the observed increase in NTD risk was attributable, at least in part, to something other than reduced folic acid intake.

If the low carbohydrate-NTD association is explained (at least in part) by something other than folic acid intake, it is unclear what those factor(s) could be. Each of the original studies investigated only a limited set of suspected confounding factors, chosen on substantive grounds using directed acyclic graphs (e.g., glycemic index, elevated body mass index), and no factor was identified as a major source of confounding. However, there may be residual confounding by other factor(s) not considered in the original studies. Given the unknown mechanism underlying the observed association between low carbohydrate intake and NTDs, and the unknown degree to which that association may be due to residual confounding, our objective was to use machine learning to further interrogate both NBDPS and CA-NTD Study data and investigate over 200 additional, previously unconsidered nutritional, demographic, and behavioral factors that might explain the observed association.

2 METHODS

2.1 NBDPS data

Details of the population-based case–control National Birth Defects Prevention Study (NBDPS), conducted at nine US centers, can be found elsewhere (Reefhuis et al., 2015). Briefly, NBDPS included data from women with pregnancies affected by selected birth defects as well as women who had pregnancies without birth defects (controls), with estimated dates of delivery between October 1997 and December 2011. Cases included for this analysis were livebirths, stillbirths, and terminations with an NTD, specifically spina bifida or anencephaly. These phenotypes were ascertained from the prenatal period through 1 year after delivery, using diagnostic criteria based on physical examination, surgery, imaging, or autopsy. Diagnoses were further reviewed for eligibility by clinical geneticists (Rasmussen et al., 2003). NTDs occurring in conjunction with chromosomal abnormalities or single gene disorders were not included. Controls were selected randomly from the same area and time period as cases.

Interviews were completed by 1959 women with NTD-affected pregnancies (66%) and 11,829 women who delivered control infants (64%). The structured, computer-assisted telephone interview lasted approximately an hour, was conducted in English or Spanish, and was completed between 6 weeks and 24 months after the woman's estimated delivery date. Women were queried on demographic factors, their medical history, and myriad lifestyle and behavioral factors. Usual consumption of foods in the year before pregnancy was obtained using a previously validated, modified 58-item food frequency questionnaire (Willett et al., 1987). Additional questions elicited information on cereal consumption for the 3 months prior to conception. Intake amounts of dietary nutrients, including carbohydrates from food and nonalcoholic beverages (e.g., sodas and juices), were estimated using the USDA National Nutrient Database for Standard Reference, Version 27, which contains nutrient values for numerous food items (Pehrsson et al., 2015; USDA Agricultural Research Service, 2007).

Among interviewed participants, we further restricted to (i) cases (n = 1870) and controls (n = 11,452) that were singletons; (ii) cases (n = 1831) and controls (n = 11,370) where the pregnancy was without pregestational diabetes; (iii) cases (n = 1730) and controls (n = 10,691) with energy intake ranging from 500 to 5000 kcal and fewer than two missing food frequency questionnaire items, as a quality control; and (iv) cases (n = 1726) and controls (n = 10,649) whose interview data had fewer than 10% missing questions. These 1726 cases and 10,649 controls served as the analytic base, with the case group phenotype further classified in some analyses as anencephaly (n = 561) or spina bifida (n = 1165). This analytic sample resembled that in the Desrosiers et al. (2018) study, except that here we also included pregnancies with a date of conception before April 1, 1998, as well as cases and controls from the New Jersey center, which participated early in the conduct of the NBDPS. These additions were considered appropriate because they derived from the larger NBDPS and were relevant to our research questions.

2.2 CA-NTD study data

Details of this population-based case–control study can be found elsewhere (Shaw et al., 1995). Briefly, 653 infants or fetuses with an NTD (cases) were ascertained by reviewing medical records, including ultrasonography, at all hospitals and clinics for those infants/fetuses delivered in select California counties between February 1989 and January 1991. Singleton, live born infants and fetuses with NTDs, including those prenatally diagnosed and electively terminated, were considered as cases. Controls (n = 644) were selected randomly from each area hospital in proportion to the hospital's estimated contribution to the total population of singleton infants born alive in the same time period as cases. We excluded women who spoke only languages other than English or Spanish, or who had a previous NTD-affected pregnancy, leaving 613 cases and 611 controls.

Interviews were conducted in person with mothers of 538 (88%) cases and 539 (88%) controls, an average of 5 months from the actual or projected date of term delivery. Interviews obtained information on pregnancy and lifestyle factors, including vitamin/mineral supplements women used in the period 3 months before conception and in each trimester of pregnancy. Information about average daily intake of food items was obtained by administering a well-established, 100-item semiquantitative food frequency questionnaire (Block et al., 1986). Women were asked to estimate usual frequency and portion size of food items consumed during the 3 months before conception. Of the 1077 women who completed an interviewer-administered questionnaire, 1007 completed a food frequency questionnaire, and 916 (454 cases and 462 controls) contained suitable data based on error checks built into the associated analytic software. We further restricted analyses to 449 case and 458 control women without pregestational diabetes. Cases were further classified as anencephaly (n = 175) or spina bifida (n = 253) for some analyses. This is the same analytic sample as used by Shaw and Yang (2019).

2.3 Analyses

We defined low carbohydrate diet as intake in the 5th percentile or lower among control women in each data set separately, corresponding to an estimated carbohydrate intake of approximately 95 g/day in NBDPS data and 122 g/day in CA-NTD Study data. This percentile cutoff was employed in the two previous studies that demonstrated the association between restricted carbohydrate intake and NTDs (Desrosiers et al., 2018; Shaw & Yang, 2019).
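As a minimal sketch of this exposure definition, the cutoff can be computed as the 5th percentile among controls and then applied to cases and controls alike. The intake values below are simulated and are not study data; the function names and the choice of percentile convention are assumptions made for illustration, and the studies' own software may have used a different convention.

```python
import random
import statistics


def low_carb_cutoff(control_intakes):
    """Return the 5th percentile of carbohydrate intake (g/day) among controls."""
    # quantiles(n=20) returns 19 cut points; the first one is the 5th percentile.
    return statistics.quantiles(control_intakes, n=20, method="inclusive")[0]


def flag_low_carb(intakes, cutoff):
    """Flag each participant whose intake is at or below the cutoff."""
    return [x <= cutoff for x in intakes]


random.seed(0)
# Simulated intakes in g/day (hypothetical distributions, not study data).
controls = [max(20.0, random.gauss(250, 90)) for _ in range(1000)]
cases = [max(20.0, random.gauss(240, 90)) for _ in range(200)]

# The cutoff is derived from controls only, then applied to both groups.
cutoff = low_carb_cutoff(controls)
control_flags = flag_low_carb(controls, cutoff)
case_flags = flag_low_carb(cases, cutoff)

print(f"cutoff = {cutoff:.1f} g/day")
print(f"controls flagged low: {sum(control_flags) / len(controls):.1%}")
print(f"cases flagged low: {sum(case_flags) / len(cases):.1%}")
```

Deriving the cutoff from controls alone means roughly 5% of controls are classified as low by construction, while the proportion among cases is free to differ.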

Using a random forest approach, we assessed >200 variables from interview responses. Random forest, a data mining algorithm, grows a set of decision trees on random subsets of the data and combines them into an aggregate prediction model from which variable importance can be derived (Strobl et al., 2009). This method accounts for interactions and nonlinear associations among many factors simultaneously to determine how well they predict a variable of interest (i.e., low carbohydrate intake) (Strobl et al., 2009). Importance for each potential predictor was estimated using the “varimp” function (party package in R software) to obtain the metric mean decrease accuracy (MDA), specifying “ntree = 2500” and “mtry = 15” as the number of trees and the number of predictor variables selected per split, respectively. The MDA of a variable is the decrease in the model's accuracy when that variable's values are randomly permuted; the more a permutation decreases the accuracy, the more important the variable is to the model. For example, the MDA of irrelevant variables should vary randomly around zero because permuting them does not decrease the accuracy (Strobl et al., 2009). Accordingly, variables were considered “important” when the magnitude of their MDA exceeded the magnitude of the most negative MDA observed.
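The permutation idea behind the MDA can be illustrated with a small, self-contained sketch. This is not the party package's varimp implementation and not a random forest: it assumes a simple nearest-centroid classifier as a stand-in model, a held-out test set, and simulated data in which two variables (labeled thiamin and magnesium purely for illustration) carry signal and two are noise. All names and values are hypothetical.

```python
import random


def nearest_centroid_fit(X, y):
    """Return per-class feature means (a simple stand-in for a fitted forest)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, row):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == lab for x, lab in zip(X, y)) / len(y)


def mean_decrease_accuracy(centroids, X, y, col, rng, repeats=20):
    """Baseline accuracy minus the mean accuracy after permuting one column."""
    base = accuracy(centroids, X, y)
    drops = []
    for _ in range(repeats):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
        drops.append(base - accuracy(centroids, X_perm, y))
    return sum(drops) / repeats


rng = random.Random(42)
names = ["thiamin", "magnesium", "noise_a", "noise_b"]


def make_row(low_carb):
    """Simulate one participant; informative variables are shifted down if low_carb."""
    shift = -1.0 if low_carb else 0.0
    return [
        rng.gauss(2.0 + shift, 0.3),  # informative
        rng.gauss(3.0 + shift, 0.5),  # informative
        rng.gauss(0.0, 1.0),          # irrelevant
        rng.gauss(0.0, 1.0),          # irrelevant
    ]


y = [i % 2 == 0 for i in range(600)]
X = [make_row(lab) for lab in y]
X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]

model = nearest_centroid_fit(X_train, y_train)
mda = {n: mean_decrease_accuracy(model, X_test, y_test, j, rng)
       for j, n in enumerate(names)}

# Heuristic from the text: keep variables whose MDA exceeds the magnitude
# of the most negative MDA (or zero if no MDA is negative).
threshold = abs(min(min(mda.values()), 0.0))
important = [n for n, v in mda.items() if v > threshold]

for n in names:
    print(f"{n}: MDA = {mda[n]:+.3f}")
print("important:", important)
```

In this sketch the noise variables should show MDA values scattered near zero, which is what makes the most negative value a usable yardstick for separating signal from chance.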

The >200 variables assessed in each study represented demographic (e.g., maternal age and education), behavioral (e.g., cigarette smoking and cannabis use), medical (e.g., epilepsy and diabetes), and dietary (e.g., supplements used and food intake) factors (see Appendix 1 for details). Each was assessed for its “prediction” of low carbohydrate intake separately in the case and control groups, which ensured that prediction variables in either group were identified and included in analyses. Only variables derived from interview questions that had a frequency >0.1% in the NBDPS and >1% in the CA-NTD Study (i.e., to allow about 10 participants answering “Yes” or reporting being exposed in each study) were considered for further analyses. Values for all time-varying variables were taken for the period from 1 month before through 2 months after conception (NBDPS) or from 3 months before through 3 months after conception (CA-NTD Study), to capture the etiologically relevant window for neural tube development.

The random forest algorithm provided a ranked list of variables that “predict” low carbohydrate diet in NTD cases as well as in controls. We performed additional analyses using the top 10 variables (in case or control groups) identified as predicting low carbohydrate intake (selecting the top 20 did not offer further information).

The association (ORs and 95% CIs) between low carbohydrate intake and NTDs overall, as well as spina bifida and anencephaly separately, was estimated with logistic regression in each data set. To explore what might explain the associations observed between low carbohydrate intake and NTDs, we calculated effect measure estimates stratified on each of the top predictor variables. Stratification variables measured on a continuous scale were categorized as low (≤25th percentile of the reported within-sample distribution) versus not low (>25th percentile). Heterogeneity of the stratified ORs was assessed (Wald Chi-squared test) by adding an interaction term between low carbohydrate intake and each predictor in logistic regression models.
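For a binary stratifier and no other covariates, the Wald test on the interaction term reduces to comparing the two stratum-specific log odds ratios, which the following sketch illustrates. The cell counts are hypothetical (the tables report only the low carbohydrate counts per stratum), the function names are invented for this example, and the published p-values came from logistic regression in SAS rather than from this shortcut.

```python
import math


def log_or_and_var(a, b, c, d):
    """Log odds ratio and its variance from a 2x2 table.

    a: exposed cases, b: unexposed cases,
    c: exposed controls, d: unexposed controls.
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def wald_heterogeneity(stratum_1, stratum_2):
    """Wald chi-squared (1 df) and p-value for a difference in log odds ratios."""
    l1, v1 = log_or_and_var(*stratum_1)
    l2, v2 = log_or_and_var(*stratum_2)
    chi2 = (l1 - l2) ** 2 / (v1 + v2)
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts: (low-carb cases, other cases, low-carb controls, other controls)
low_stratum = (40, 360, 60, 940)     # predictor <= 25th percentile
high_stratum = (10, 590, 20, 1980)   # predictor > 25th percentile

for label, counts in [("<=25th", low_stratum), (">25th", high_stratum)]:
    log_or, _ = log_or_and_var(*counts)
    print(f"{label} percentile stratum: OR = {math.exp(log_or):.2f}")

chi2, p = wald_heterogeneity(low_stratum, high_stratum)
print(f"Wald chi-squared = {chi2:.3f}, p = {p:.3f}")
```

A large p-value here means the stratum-specific ORs are statistically compatible with a single common OR; with sparse strata the test has little power, which is the precision caveat raised in the Results.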

Random forest analyses were performed in R software (version 4.1.3) using the party package. All other analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC). Analytic activities associated with this project were approved by the institutional review boards of each collaborating center as well as the California Department of Public Health for both studies.

3 RESULTS

3.1 NBDPS data

The random forest analysis yielded the ranked variables for all NTD cases, anencephaly, spina bifida, and controls shown in Figure 1. The top 10 variables were nearly identical between all NTD cases and controls yielding a combined list of 11 variables that were subjected to further analysis. All 11 variables were related to dietary intake: magnesium, copper, thiamin, glycemic index, caffeine from soda, vitamin B6, iron, diet quality index, dietary folate equivalents, vitamin C, and riboflavin.

FIGURE 1. Random forest analysis of variables predicting the risk of low carbohydrate intake among NTDs and controls. Shown are random forest analyses for NTDs (upper-left, accuracy = 96.3%, sensitivity = 40.0%, specificity = 100%), anencephaly (upper-right, accuracy = 94.5%, sensitivity = 8.8%, specificity = 100%), spina bifida (lower-left, accuracy = 96.2%, sensitivity = 38.0%, specificity = 100%), and controls (lower-right, accuracy = 97.5%, sensitivity = 50.4%, specificity = 100%). The x-axis is the value of mean decrease in accuracy and the y-axis is a list of the top 10 predictors. Of note, random forest analysis identifies the relative importance of predictive variables but does not indicate magnitude or direction of potential associations with NTDs. NBDPS 1997–2011.

The association between low carbohydrate intake and NTDs overall, as well as spina bifida and anencephaly separately, is shown in Table 1. The OR was 1.2 for each grouping, with 95% CIs of 1.0–1.5 for NTDs overall, 1.0–1.6 for spina bifida, and 0.9–1.9 for anencephaly. The stratified ORs and 95% CIs between low versus not low carbohydrate intake and NTDs for each stratum of each of the top ranked variables (from either cases or controls) are presented in Table 2. ORs for anencephaly and spina bifida are presented in Tables S1 and S2, respectively. We did not observe much evidence for heterogeneity in the overall OR of 1.2 across strata of any of the “prediction” variables, with the possible exception of glycemic index, although statistical precision may have been low for some comparisons. Low versus not low strata of glycemic index showed some statistical evidence of heterogeneity for the association between low carbohydrate intake and anencephaly (Table S1). The direction of this finding indicated that the combination of low glycemic index and low carbohydrate intake was associated with the highest OR. This may have arisen by chance owing to the number of comparisons made; nevertheless, it is opposite to what we would expect a priori, because NTD odds have been observed to increase, not decrease, with increasing glycemic index values (Shaw et al., 2003).

TABLE 1. Odds ratios for NTDs overall, spina bifida, and anencephaly associated with periconceptional low carbohydrate intake, NBDPS 1997–2011.

Case type	Carbohydrate intake^a	Case	Control	Odds ratio (95% CI)
All NTD	Low	105	532	1.2 (1.0, 1.5)
All NTD	Not low	1621	10,117	Referent
Anencephaly	Low	34	532	1.2 (0.9, 1.9)
Anencephaly	Not low	527	10,117	Referent
Spina Bifida	Low	71	532	1.2 (1.0, 1.6)
Spina Bifida	Not low	1094	10,117	Referent

^a Low intake defined as carbohydrate intake ≤5th percentile, that is, approximately 95 g/day, determined among controls; not low intake considered >95 g/day.

TABLE 2. Odds ratios for associations between NTDs overall comparing low with not low carbohydrate intake across strata of variables identified as important predictors in random forest analyses, NBDPS 1997–2011.

Stratification predictor variable	NTD cases with low carbohydrate intake^a (of n = 1726)	Controls with low carbohydrate intake^a (of n = 10,649)	OR (95% CI) for low carbohydrate intake	p-value for interaction term^c
Magnesium (mg/day)
≤ 25th percentile	97	504	1.1 (0.8, 1.4)	0.2
> 25th percentile	8	28	1.8 (0.8, 4.1)
Copper (mg/day)
≤ 25th percentile	97	497	1.2 (0.9, 1.5)	0.7
> 25th percentile	8	35	1.4 (0.7, 3.1)
Thiamin (mg/day)
≤ 25th percentile	100	496	1.1 (0.9, 1.4)	0.7
> 25th percentile	5	36	0.9 (0.4, 2.3)
Glycemic index
≤ 25th percentile	36	151	1.5 (1.0, 2.2)	0.2
> 25th percentile	69	381	1.1 (0.9, 1.5)
Caffeine from soda (mg/day)
≤ 25th percentile	38	239	1.2 (0.9, 1.8)	0.8
> 25th percentile	67	293	1.3 (1.0, 1.7)
Vitamin B6 (mg/day)
≤ 25th percentile	92	472	1.2 (0.9, 1.5)	0.7
> 25th percentile	13	60	1.4 (0.8, 2.5)
Iron (mg/day)
≤ 25th percentile	86	445	1.2 (0.9, 1.6)	0.7
> 25th percentile	19	87	1.4 (0.8, 2.3)
Diet quality index
≤ 25th percentile	97	490	1.1 (0.9, 1.4)	0.8
> 25th percentile	8	42	1.2 (0.6, 2.6)
Folate DFE (μg/day)
≤ 25th percentile	88	451	1.1 (0.9, 1.4)	0.5
> 25th percentile	17	81	1.4 (0.8, 2.3)
Vitamin C (mg/day)
≤ 25th percentile	85	404	1.2 (0.9, 1.6)	0.5
> 25th percentile	20	128	1.0 (0.6, 1.6)
Riboflavin (mg/day)
≤ 25th percentile	81	425	1.0 (0.8, 1.3)	0.2
> 25th percentile	24	107	1.5 (0.9, 2.3)
Any predictor^b
≤ 25th percentile	105	532	1.2 (1.0, 1.5)	NA
> 25th percentile	0	0	NA

Abbreviation: DFE, Dietary folate equivalents.
^a Low intake defined as carbohydrate intake ≤5th percentile, that is, approximately 95 g/day, determined among controls; not low intake considered >95 g/day.
^b ≤25th percentile in any predictors listed above versus not.
^c p-value from Wald Chi-squared statistic.

We also compared participants who were low (≤25th percentile) on any of the prediction variables with those who were not low (>25th percentile) on all of them, to determine whether the predictors as a set contributed to the observed OR of 1.2 (95% CI: 1.0, 1.5) for low carbohydrate intake. These analyses gave an OR of 1.2 (Table 2 and Tables S1 and S2), indicating that the collection of prediction variables (11 for overall NTD cases, spina bifida, and controls; 12 for anencephaly and controls) identified by random forest contributed to the association, either directly or indirectly through some unknown variable correlated with some or all of them. Of note, no cases or controls in the low carbohydrate intake group were also not low on all predictors. This suggests that the random forest algorithm did identify the strongest predictors.
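The "any predictor" grouping can be sketched as follows, assuming each participant is a record of predictor values and that the 25th percentile cutoffs come from the within-sample distribution. The records, predictor subset, and function names are hypothetical and serve only to show how the composite indicator is formed.

```python
import statistics


def quartile_cutoffs(records, predictors):
    """25th percentile of each predictor within the sample."""
    return {
        p: statistics.quantiles([r[p] for r in records], n=4, method="inclusive")[0]
        for p in predictors
    }


def any_low(record, cutoffs):
    """True when at least one predictor is at or below its 25th percentile."""
    return any(record[p] <= cut for p, cut in cutoffs.items())


# Hypothetical participants (not study data).
records = [
    {"thiamin": 0.8, "iron": 9.0, "low_carb": True},
    {"thiamin": 1.6, "iron": 8.0, "low_carb": False},
    {"thiamin": 1.9, "iron": 15.0, "low_carb": False},
    {"thiamin": 2.2, "iron": 18.0, "low_carb": False},
    {"thiamin": 2.4, "iron": 21.0, "low_carb": False},
]

cutoffs = quartile_cutoffs(records, ["thiamin", "iron"])
groups = [any_low(r, cutoffs) for r in records]

# Cross-tabulate the composite indicator against low carbohydrate status.
low_carb_and_any_low = sum(g and r["low_carb"] for g, r in zip(groups, records))
low_carb_and_none_low = sum((not g) and r["low_carb"] for g, r in zip(groups, records))
print("low carb, low on any predictor:", low_carb_and_any_low)
print("low carb, not low on all predictors:", low_carb_and_none_low)
```

When every low carbohydrate participant falls in the "any low" group, as reported in Table 2, the "not low on all" stratum contributes no exposed participants and its OR cannot be estimated.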

3.2 CA-NTD study

ORs for low carbohydrate and NTDs, OR = 2.0 (95% CI: 1.2, 3.4) overall, as well as spina bifida, OR = 2.1 (95% CI: 1.2, 3.7) and anencephaly, OR = 1.8 (95% CI: 0.9, 3.5) are shown in Table 3.

TABLE 3. Odds ratios for NTDs overall, spina bifida, and anencephaly associated with periconceptional low carbohydrate intake, CA-NTD Study 1989–1991.

Case type	Carbohydrate intake^a	Case	Control	Odds ratio (95% CI)
All NTD	Low	43	23	2.0 (1.2, 3.4)
All NTD	Not low	406	435	Referent
Anencephaly	Low	15	23	1.8 (0.9, 3.5)
Anencephaly	Not low	160	435	Referent
Spina Bifida	Low	25	23	2.1 (1.2, 3.7)
Spina Bifida	Not low	228	435	Referent

^a Low intake defined as carbohydrate intake ≤5th percentile, that is, approximately 122 g/day, determined among controls; not low intake considered >122 g/day.
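The odds ratios in Table 3 can be checked from the cell counts using the cross-product ratio and a Wald interval on the log scale. This is a sketch that assumes unadjusted estimates; the function name is hypothetical, and the published values were obtained with logistic regression in SAS.

```python
import math


def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Odds ratio with a Wald 95% confidence interval from a 2x2 table."""
    or_ = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    se = math.sqrt(1 / exp_cases + 1 / unexp_cases
                   + 1 / exp_controls + 1 / unexp_controls)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# Counts from Table 3: (low-carb cases, not-low cases, low-carb controls, not-low controls)
table3 = {
    "All NTD": (43, 406, 23, 435),
    "Anencephaly": (15, 160, 23, 435),
    "Spina bifida": (25, 228, 23, 435),
}

results = {name: odds_ratio_ci(*counts) for name, counts in table3.items()}
for name, (or_, lower, upper) in results.items():
    print(f"{name}: OR = {or_:.1f} (95% CI: {lower:.1f}, {upper:.1f})")
```

Because the same control counts serve as the referent for every case group, differences among the three ORs reflect only the case distributions.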

Random forest results predicting low carbohydrate intake are shown as ranked variables for all NTD cases, anencephaly, spina bifida, and controls in Figure 2. In this data set, the top 10 variables were similar for all NTD cases and controls, producing a combined list of 14 variables to be considered for further analysis. As observed in the NBDPS analysis, the top selected “prediction” variables were all based on dietary intake: total calorie intake; phosphorus; potassium; thiamin; protein; cysteine; sodium; calcium; niacin; sucrose; glucose; fiber; daily grams of bread, cereal, rice, and pasta; and grains for fiber.

The stratified ORs and 95% CIs between low versus not low carbohydrate intake and NTDs overall for each stratum of each of these prediction variables are presented in Table 4, and in Tables S3 and S4 for anencephaly and spina bifida. We did not observe evidence for heterogeneity in the overall OR of ~2.0 across strata of any of the “prediction” variables individually, with the possible exceptions of niacin and calcium, although statistical precision may have been low for some comparisons. As in the NBDPS data set, we explored whether being low on any of the prediction variables, relative to not low on all predictors, contributed to the observed association. These results similarly showed an OR of ~2.0, indicating that the prediction variables (14 for overall NTDs, spina bifida, and controls; 13 for anencephaly and controls) identified by random forest collectively contributed to the association, either directly or indirectly through some unknown variable correlated with some or all of them. That is, these prediction variables, considered as any versus none (not low on all variables), are related to the low carbohydrate association, but none in isolation accounts for it.

TABLE 4. Odds ratios for associations between NTDs overall comparing low with not low carbohydrate intake across strata of variables identified as important predictors in random forest analyses, CA-NTD Study 1989–1991.

Stratification predictor variable	NTD cases with low carbohydrate intake^a (of n = 449)	Controls with low carbohydrate intake^a (of n = 458)	OR (95% CI) for low carbohydrate intake	p-value for interaction term^b
Total calories (kcal/day)
≤ 25th percentile	43	23	1.9 (1.0, 3.4)	NA
> 25th percentile	0	0	NA
Phosphorus (mg/day)
≤ 25th percentile	42	22	1.7 (1.0, 3.1)	0.8
> 25th percentile	1	1	1.1 (0.1, 18.1)
Potassium (mg/day)
≤ 25th percentile	43	22	1.7 (0.9, 3.0)	0.9
> 25th percentile	0	1	NA
Thiamin (mg/day)
≤ 25th percentile	41	21	1.9 (1.0, 3.4)	0.6
> 25th percentile	2	2	1.1 (0.2, 7.9)
Protein (g/day)
≤ 25th percentile	41	23	1.4 (0.8, 2.4)	0.9
> 25th percentile	2	0	NA
Cysteine (mg/day)
≤ 25th percentile	39	23	1.5 (0.8, 2.6)	0.9
> 25th percentile	4	0	NA
Sodium (mg/day)
≤ 25th percentile	43	23	1.5 (0.9, 2.7)	NA
> 25th percentile	0	0	NA
Calcium (mg/day)
≤ 25th percentile	41	17	2.3 (1.2, 4.3)	0.1
> 25th percentile	2	6	0.4 (0.1, 1.9)
Niacin (mg/day)
≤ 25th percentile	41	20	2.2 (1.2, 4.1)	0.2
> 25th percentile	2	3	0.7 (0.1, 4.3)
Sucrose (g/day)
≤ 25th percentile	37	21	2.6 (1.4, 4.9)	0.9
> 25th percentile	6	2	3.0 (0.6, 14.9)
Fiber (g/day)
≤ 25th percentile	40	23	1.7 (0.9, 3.0)	1.0
> 25th percentile	3	0	NA
Bread, cereal, rice and pasta (g/day)
≤ 25th percentile	39	20	2.0 (1.1, 3.6)	0.7
> 25th percentile	4	3	1.5 (0.3, 6.6)
Grains for fiber (g/day)
≤ 25th percentile	38	21	1.9 (1.0, 3.5)	0.7
> 25th percentile	5	2	2.7 (0.5, 14.0)
Glucose (g/day)
≤ 25th percentile	35	21	1.9 (1.0, 3.5)	0.3
> 25th percentile	8	2	4.2 (0.9, 20.1)
Any predictor^c
≤ 25th percentile	43	23	1.9 (1.1, 3.3)	NA
> 25th percentile	0	0	NA

^a Low intake defined as carbohydrate intake ≤5th percentile, that is, approximately 122 g/day, determined among controls; not low intake considered >122 g/day.
^b p-value from Wald Chi-squared statistic.
^c ≤25th percentile in any predictors listed above versus not.

4 DISCUSSION

The objective of this work was to further interrogate two data sets from population-based case–control studies that previously demonstrated an association between low carbohydrate intake and NTDs, to determine whether another variable or set of variables, other than low carbohydrate intake, might underlie the association. To aid this endeavor we employed the machine learning algorithm random forest to identify potential factors from among more than 200 variables reflecting nutrition, demographics, and behaviors, an approach that was agnostic to variable selection and not hindered by data features such as collinearity. Although the machine learning algorithm could choose from >200 variables in each data set, no single variable was observed to substantially contribute to the low carbohydrate association in either data set. Analyses comparing participants who were low on any (vs. not low on all) of the top prediction variables chosen by the algorithm (in either cases or controls) in each data set did, however, reflect the elevated OR for low carbohydrate intake. These analyses indicated that the prediction variables identified by random forest collectively contributed to the association, either directly or indirectly through some unknown variable correlated with some or all of them. We could not further discriminate what that variable relationship might be with these two data sets. Of note, thiamin was observed to be an important predictor of low carbohydrate intake in both data sets. Lowered thiamin has been observed as a potential, but only modest, risk factor for NTDs (anencephaly) in the NBDPS (Chandler et al., 2012) and in the CA-NTD Study among nonusers of vitamin supplements (Shaw et al., 1999).

The meaning of a low carbohydrate diet is complex (e.g., it may reflect the absence of important nutrients or serve as a surrogate marker for an at-risk subset of the population with metabolic disease). Given that only two studies have investigated the association between low carbohydrate intake and NTD risk, and that we did not identify another variable in these studies to “explain” the consistently observed association, we believe it is premature to speculate on what these findings may mean in terms of recommendations for periconceptional diet choices.

Our failure to find a single variable that specifically contributed to the previously observed association with low carbohydrate intake in these two data sets may suggest that (a) the selected machine learning algorithm was not sufficiently sensitive, although it did identify that a “low” value in any of the important predictor variables, in composite, reproduced the elevated odds ratio; (b) none of the 200+ variables is sufficiently correlated with the true exposure, if something other than low carbohydrate intake does underlie the association; (c) at least one of the 200+ variables does explain the association directly or indirectly, but is measured with insufficient specificity or subject to exposure misclassification in these studies; or (d) the association observed in two independent data sets is unbiased by residual confounding and is directly related to low carbohydrate intake. The latter explanation will require investigation in other data that may be better able to quantify intake of specific carbohydrates.

ACKNOWLEDGMENTS

This work was partially supported by the Centers for Disease Control and Prevention Centers of Excellence (U01DD001226 and U01DD001302). The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the California Department of Public Health.

Open Research

DATA AVAILABILITY STATEMENT

The data from the two studies interrogated here (a CDC-funded multi-site study and a CA-only study) are not publicly available at this time due to privacy or ethical restrictions.

REFERENCES

Block, G., Hartman, A. M., Dresser, C. M., Carroll, M. D., Gannon, J., & Gardner, L. (1986). A data-based approach to diet questionnaire design and testing. American Journal of Epidemiology, 124(3), 453–469. https://doi.org/10.1093/oxfordjournals.aje.a114416
Chandler, A. L., Hobbs, C. A., Mosley, B. S., Berry, R. J., Canfield, M. A., Qi, Y. P., Siega-Riz, A. M., Shaw, G. M., & National Birth Defects Prevention Study. (2012). Neural tube defects and maternal intake of micronutrients related to one-carbon metabolism or antioxidant activity. Birth Defects Research. Part A, Clinical and Molecular Teratology, 94, 864–874. https://doi.org/10.1002/bdra.23068
Desrosiers, T. A., Siega-Riz, A. M., Mosley, B. S., & Meyer, R. E. (2018). Low carbohydrate diets may increase risk of neural tube defects. Birth Defects Research, 110(11), 901–909. https://doi.org/10.1002/bdr2.1198
Pehrsson, P., Patterson, K., Haytowitz, D., & Phillips, K. (2015). Total carbohydrate determinations in USDA's National Nutrient Database for Standard Reference. The FASEB Journal, 29(1), S740.6. https://doi.org/10.1096/fasebj.29.1_supplement.740.6
Rasmussen, S. A., Olney, R. S., Holmes, L. B., Lin, A. E., Keppler Noreuil, K. M., & Moore, C. A. (2003). Guidelines for case classification for the National Birth Defects Prevention Study. Birth Defects Research. Part A, Clinical and Molecular Teratology, 67(3), 193–201. https://doi.org/10.1002/bdra.10012
Reefhuis, J., Gilboa, S. M., Anderka, M., Browne, M. L., Feldkamp, M. L., Hobbs, C. A., Jenkins, M. M., Langlois, P. H., Newsome, K. B., Olshan, A. F., Romitti, P. A., Shapira, S. K., Shaw, G. M., Tinker, S. C., Honein, M. A., & The National Birth Defects Prevention Study. (2015). The National Birth Defects Prevention Study: A review of the methods. Birth Defects Research. Part A, Clinical and Molecular Teratology, 103(8), 656–669. https://doi.org/10.1002/bdra.23384
Shaw, G. M., Quach, T., Nelson, V., Carmichael, S. L., Schaffer, D. M., Selvin, S., & Yang, W. (2003). Neural tube defects associated with maternal periconceptional dietary intake of simple sugars and glycemic index. The American Journal of Clinical Nutrition, 78(5), 972–978. https://doi.org/10.1093/ajcn/78.5.972
Shaw, G. M., Schaffer, D., Velie, E. M., Morland, K., & Harris, J. A. (1995). Periconceptional vitamin use, dietary folate, and the occurrence of neural tube defects. Epidemiology, 6(3), 219–226. https://doi.org/10.1097/00001648-199505000-00005
Shaw, G. M., Todoroff, K., Schaffer, D. M., & Selvin, S. (1999). Periconceptional nutrient intake and risk for neural tube defect-affected pregnancies. Epidemiology, 10, 711–716. https://doi.org/10.1097/00001648-199911000-00011
Shaw, G. M., & Yang, W. (2019). Women's periconceptional lowered carbohydrate intake and NTD-affected pregnancy risk in the era of prefortification with folic acid. Birth Defects Research, 111(5), 248–253. https://doi.org/10.1002/bdr2.1466
Strobl, C., Malley, J., & Tutz, G. (2009). An introduction to recursive partitioning: Rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological Methods, 14(4), 323–348. https://doi.org/10.1037/a0016973
USDA Agricultural Research Service. (2007). USDA National Nutrient Database for Standard Reference, Release 20. http://www.ars.usda.gov/ba/bhnrc/ndl
Willett, W. C., Reynolds, R. D., Cottrell-Hoehner, S., Sampson, L., & Browne, M. L. (1987). Validation of a semi-quantitative food frequency questionnaire: Comparison with a 1-year diet record. Journal of the American Dietetic Association, 87(1), 43–47.

Volume 116, Issue 3, March 2024, e2328