The authors thank two anonymous reviewers and the editor, Brian Roe, for comments and suggestions that greatly strengthened the paper. Research for this study was supported by USDA-ERS Cooperative Agreement No. 58-4000-1-0025. The views expressed here are those of the authors and cannot be attributed to the Economic Research Service or the U.S. Department of Agriculture. All remaining errors are our own.

Abstract

This article measures changes in the distribution of dietary quality among adults in the United States over the period 1989–2008. Diet quality is a direct input to health, is often used as a proxy for well-being, and is an outcome variable for a wide variety of economic interventions. For the population as a whole, we find significant improvements across all levels of diet quality. Further, we find improvements for low-income and higher-income individuals alike. Counterfactual distributions of dietary quality are constructed to investigate the extent to which observed improvements can be attributed to changes in the nutritional content of foods and to changes in population characteristics. We find that 63% of the improvement for all adults can be attributed to changes in food formulation and demographics. Changes in food formulation account for a substantially larger percentage of the dietary improvement within the lower-income population (19.6%) than within the higher-income population (6.4%).

Poor nutrition is a contributing factor to four of ten major causes of death in the United States: coronary heart disease, cancer, stroke, and type 2 diabetes (Jemal et al. 2008). Poor diet quality is associated with increased risks of coronary heart disease, stroke, and diabetes (Chiuve et al. 2012), cardiovascular disease (Nicklas, O'Neil, and Fulgoni 2012), breast cancer (Shahril), colorectal cancer (Reedy et al. 2008), and prostate cancer (Bosire et al. 2013). Moreover, diet quality is often used as a measure of well-being in developing countries (Ravallion 1996) and developed countries (Strauss and Thomas 1998). In this article, we study how the distribution of adult diet quality in the United States has evolved over the last two decades.

Improving dietary quality has long been a focus of government policy because of its direct impact on human health, particularly among the poor. Specific interventions have included increasing the resources available to households to buy food (e.g., Supplemental Nutrition Assistance Program [SNAP])¹ and providing healthy foods directly to individuals (e.g., the School Breakfast Program, School Lunch Program, Special Supplemental Nutrition Program for Women, Infants, and Children [WIC], and Fresh Fruit and Vegetable Program). Policies have also aimed at increasing the information available to individuals about what constitutes a healthy diet: the Food Guide Pyramid was released in 1992 and subsequently updated in 2005 as MyPyramid and in 2011 as MyPlate; federally approved SNAP education programs grew from seven active states in 1992 to 50 in 2004; mandatory nutrition labeling was enacted in 1994; and mandatory calorie postings in restaurants were introduced in 2011. Current policy proposals seek to improve diet quality by restricting the range of foods eligible for purchase under SNAP and changing the relative prices of foods via taxes or subsidies.

In this article, we use stochastic dominance to compare the distribution of dietary quality over time and between income groups. Stochastic dominance is frequently used in the economics literature to analyze the distribution of income or wealth. This empirical approach allows us to completely characterize the nature of the changes in dietary quality over time, paying close attention to low-income individuals, whose diets are of particular concern to policymakers. Stochastic dominance is particularly well suited to studying diet quality, where exact thresholds between “good” diets and “poor” diets are fuzzy.

Further, we construct counterfactual distributions of dietary quality to investigate the extent to which observed improvements can be attributed to changes in the nutritional content of foods and to changes in demographics. In short, we ask: how would the distribution of dietary quality change if food in 1989 were formulated as it was in 2008? Further, what would the distribution of dietary quality have looked like in 1989 had the demographic landscape of 2008 prevailed?

When comparing the observed distributions of dietary quality, we find a statistically significant and economically meaningful improvement across the entire population over the period 1989–2008. Improvements occur for individuals in households above and below our chosen poverty threshold. Counterfactual estimates indicate that 53.3% of the dietary improvement in the U.S. population can be attributed to changes in demographics (i.e., an aging, more educated, and ethnically diverse population) and an additional 10.1% of the improvement is attributed to changes in food composition (e.g., decreases in saturated fats, sugars, and sodium). The residual 36.6% is unexplained by either changes in demographics or food composition.

The article proceeds as follows. We begin by describing a widely used measure of dietary quality—the Healthy Eating Index 2005 (HEI-2005)—that forms the basis of our analysis. We then turn to a description of our primary data sources, the National Health and Nutrition Examination Survey (NHANES) and the earlier Continuing Survey of Food Intakes by Individuals (CSFII); we extend the HEI-2005 to the earlier study period of 1989–91. We then motivate our empirical approach by providing a brief overview of stochastic dominance. After the presentation of results, we discuss the economic and policy implications in the final section.

Diet Quality

The healthfulness of an individual's diet depends on two factors: energy balance and dietary quality. Energy balance is the relationship between calories consumed and energy expended, which results in body weight management (Hall et al. 2012). Dietary quality can be expressed as a per calorie metric that measures the degree to which a diet complies with a set of criteria (here, the Dietary Guidelines for Americans via the HEI). In this article, we focus on dietary quality, and we note that there is evidence that higher quality diets are associated with decreased obesity rates (i.e., improved energy balance) (Epstein et al. 2001, 2008).

We use the HEI, developed in 1995 to measure compliance with the U.S. government's recommendations for healthful eating, as our measure of diet quality. The HEI has been widely used and evaluated as a valid measure of diet quality (Guenther et al. 2008). In the medical literature, the HEI has been found to be a significant predictor of medical outcomes, notably of all-cause mortality, mortality due to malignant neoplasms (Ford et al. 2011), and overweight and obesity (Guo et al. 2004). Further, the HEI has been extensively used by economists to measure the outcome of policy interventions, for example, welfare reform (Kramer-LeBlanc, Basiotis, and Kennedy 1997), School Breakfast Program (Bhattacharya, Currie, and Haider 2006), food stamps and WIC (Wilde, McNamara, and Ranney 1999), nutrition labeling (Kim, Nayga, and Capps 2001), and unusually cold weather (Bhattacharya et al. 2004). Finally, it has also been found to be associated with food insecurity (Bhattacharya, Currie, and Haider 2004) and has been proposed as a possible indicator of food deserts (Bitler and Haider 2011).

Every five years, the Dietary Guidelines for Americans are revised by the USDA and Health and Human Services based on the advice of an expert advisory panel. These guidelines are the U.S. government's official recommendations for healthful eating and form the basis for information provided to consumers. Many of the USDA's food-assistance programs must be in compliance with the Dietary Guidelines for Americans. The HEI was updated in 2005 to reflect the 2005 Dietary Guidelines for Americans (frequently called the HEI-2005; see Guenther, Reedy, and Krebs-Smith 2008).² Because the HEI-2005 was constructed with the 2005 Dietary Guidelines for Americans as its basis, one can think of using this index as a consistent measure of dietary quality, with 2005 defined as the base period.

The HEI (henceforth, HEI refers to the HEI-2005) is the sum of twelve components based on consumption of various foods or nutrients. Each component assigns a score ranging from 0 to 5 (total fruit, whole fruit, total vegetables, dark green/orange vegetables and legumes, total grains, whole grains), 0 to 10 (milk, meats and beans, oils, saturated fat, sodium), or 0 to 20 for the percentage of calories from solid fats, alcoholic beverages, and added sugars, creating a maximum score of 100. Table 1 provides exact details of the scoring.

Table 1. Healthy Eating Index 2005 Standards for Scoring

*Source*: Recreated from Guenther et al. (2007).

There is debate among nutritionists about how a given HEI score maps into the notion of “healthy” versus “unhealthy” diet quality. One generally accepted rule of thumb is that total scores of more than 80 are considered “good,” scores of 51–80 are considered “needs improvement,” and scores of less than 51 are considered “poor.” Characterizing a diet based on a single cutoff is difficult (analogous to characterizing what it means to be poor based on a poverty line). A key advantage of the stochastic dominance methods used in this research is that they allow general statements about improvements in dietary quality over time or between subpopulations without having to define a specific threshold.
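The scoring structure and rule-of-thumb categories described above can be sketched in a few lines of Python. This is only an illustration: the component key names and the function `classify_hei` are hypothetical labels chosen here, and the per-component scoring standards from table 1 are not reproduced.

```python
# Maximum points per HEI-2005 component, as described in the text.
# Key names are illustrative labels, not official identifiers.
HEI_MAX_POINTS = {
    "total_fruit": 5,
    "whole_fruit": 5,
    "total_vegetables": 5,
    "dark_green_orange_veg_legumes": 5,
    "total_grains": 5,
    "whole_grains": 5,
    "milk": 10,
    "meat_and_beans": 10,
    "oils": 10,
    "saturated_fat": 10,
    "sodium": 10,
    "sofaas_calories": 20,  # solid fats, alcoholic beverages, added sugars
}


def classify_hei(total_score: float) -> str:
    """Apply the rule of thumb: >80 good, 51-80 needs improvement, <51 poor."""
    if not 0 <= total_score <= 100:
        raise ValueError("HEI total score must lie in [0, 100]")
    if total_score > 80:
        return "good"
    if total_score >= 51:
        return "needs improvement"
    return "poor"


# The twelve component maxima add up to the 100-point total.
print(len(HEI_MAX_POINTS), sum(HEI_MAX_POINTS.values()))
```

Any such classification depends on where the cutoffs are placed, which is the sensitivity that the dominance approach is meant to sidestep.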

It is worth repeating that the components of the HEI are density based (the ratio of an individual's component intake to their total calorie intake) rather than quantity based. By design, the HEI measures the relative quality of foods consumed, independent of total calories (and of energy expenditure). We use the total HEI score as the underlying metric of interest in this study for two reasons. First, the HEI score has been extensively validated and tested as a measure of diet quality (Guenther et al. 2008). Second, joint tests of dominance are limited in practice to two or three dimensions, rather than the dozen component scores that make up the HEI.³

Data

Our sample uses nationally representative, repeated cross-sectional, individual food intake data from two surveys: the CSFII (1989–91 and 1994–96), and the continuous waves of the NHANES (2001–08). In both surveys, respondents report 24-hour dietary intakes and demographic information, including income and household size.⁴ Each survey wave is an independently drawn sample, which is representative of the United States, with the USDA overseeing the food intake component in both surveys. Finally, for consistency across samples, we focus on adults aged 20 years and older.

The HEI is calculated by linking the USDA's MyPyramid Equivalents Database (MPED) to food intake surveys. The MPED decomposes individual foods into MyPyramid guideline equivalents so that each HEI component can be computed as shown in table 1. As noted, because there is no officially released MPED for the 1989–91 CSFII, the HEI has not been previously computed for surveys before 1994. Of the 3,953 unique foods reported by adults aged 20 and older on day 1 in the 1989–91 CSFII, 3,907 (98.8%) of these foods are also reported in the 1994–96 CSFII. We therefore use the 1994–96 MPED to calculate the HEI for individuals in 1989–91.⁵

We classify individuals as low-income if household income falls below 185% of the Federal Poverty Guidelines. This is a policy-relevant threshold that serves as an upper bound on the cutoff for many federal nutrition assistance programs. During our sample period, the cutoff for SNAP was 130%, and the cutoff for WIC was 185%. The Federal Poverty Guidelines are also used as an eligibility criterion for the National School Lunch Program, School Breakfast Program, Child and Adult Care Food Program, and the Expanded Food and Nutrition Education Program.⁶ Table 2 reports the mean HEI scores for the population as a whole and for individuals above and below 185% of the poverty line for each of the periods in our sample.⁷

Table 2. Healthy Eating Index 2005 Summary Statistics

Population	1989–91	1994–96	2001–04	2005–08
U.S. population	50.16 (13.97)^a	51.10 (13.88)^a	51.50 (11.91)	52.46 (12.49)
	[10.09, 96.42]	[10.69, 97.47]	[13.52, 99.46]	[8.78, 95.38]
No.	9,498	9,867	8,640	9,258
Low-income	48.96 (19.83)^a	49.36 (15.45)^a	49.65 (13.29)^a	51.37 (14.99)
	[10.09, 90.25]	[10.69, 93.81]	[15.08, 99.46]	[8.78, 94.60]
No.	4,965	3,433	3,551	3,857
Higher-income	50.56 (11.19)^a,^b	51.73 (13.16)^a,^b	52.36 (11.09)^b	52.92 (11.29)^b
	[11.51, 96.42]	[13.63, 97.47]	[13.52, 93.97]	[10.00, 95.38]
No.	4,533	6,434	5,089	5,401

Note: Standard deviations are given in parentheses. Minima and maxima are given in brackets.
^a Within-population mean is significantly lower than that in 2005–08 at the 5% level.
^b Within-year higher-income mean is significantly different from low-income mean at the 5% level.

Table 2 shows a consistent pattern of increasing dietary quality across groups over time. For the population at large, the mean HEI score in 2005–08 is significantly higher (at the 5% level) than in 1989–91 and 1994–96. Low-income individuals appear to have a stagnant HEI score over the period 1989–2004, and then a significant increase in the period 2005–08. We also compare low- and higher-income individuals within each period and find that higher-income individuals have significantly higher mean HEI scores in every period, although the mean HEI gap between groups is smallest in the final period.
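The income gap in mean HEI scores can be checked directly from the means reported in table 2. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
# Mean HEI-2005 scores by period, copied from table 2.
PERIODS = ["1989-91", "1994-96", "2001-04", "2005-08"]
LOW_INCOME_MEAN = [48.96, 49.36, 49.65, 51.37]
HIGHER_INCOME_MEAN = [50.56, 51.73, 52.36, 52.92]

# Gap = higher-income mean minus low-income mean, rounded to two decimals.
gaps = {
    period: round(high - low, 2)
    for period, low, high in zip(PERIODS, LOW_INCOME_MEAN, HIGHER_INCOME_MEAN)
}

for period, gap in gaps.items():
    print(f"{period}: {gap:.2f}")

# Period in which the two group means are closest.
smallest_gap_period = min(gaps, key=gaps.get)
```

The gaps are 1.60, 2.37, 2.71, and 1.55 points, so the 2005–08 gap is the smallest, although only slightly below the 1989–91 gap.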

Stochastic Dominance

We have seen that mean HEI scores have increased for all groups over the interval 1989–2008. But does the mean HEI obscure variation in dietary quality across individuals? For example, is the increase in diet quality due to general improvements across the population at a steady rate or due to larger improvements among those with the lowest (or highest) diet quality? To address these possibilities, we study the entire distribution of dietary quality for groups of interest using an approach common in the study of income and well-being: stochastic dominance.⁸

Definitions

Consider two distributions of HEI scores with cumulative distribution functions (CDFs) F_A(z) and F_B(z), for a population of interest in two distinct time periods or, alternatively, for two mutually exclusive subpopulations within a single time period. We say that distribution B first-order stochastically dominates distribution A if

$F_B(z) \le F_A(z) \quad \text{for all } z,$

with strict inequality for some z. In other words, no matter where the threshold for “healthy” is set, a greater share of the population characterized by distribution B have a “healthy” diet. This relationship is illustrated in the left panel of figure 1.
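As a descriptive illustration of this definition (not a statistical test), the Python sketch below compares two empirical CDFs at every observed score. The function names and the HEI scores are made up for the example.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function z -> share of observations in `sample` that are <= z."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda z: bisect_right(ordered, z) / n


def first_order_dominates(sample_b, sample_a):
    """True if the ECDF of B lies weakly below that of A everywhere, strictly somewhere.

    Both ECDFs are step functions that only change at observed values, so it is
    enough to compare them at the pooled set of observations.
    """
    f_a, f_b = ecdf(sample_a), ecdf(sample_b)
    grid = sorted(set(sample_a) | set(sample_b))
    weakly_below = all(f_b(z) <= f_a(z) for z in grid)
    strictly_somewhere = any(f_b(z) < f_a(z) for z in grid)
    return weakly_below and strictly_somewhere


# Made-up HEI scores for illustration only.
scores_a = [35.0, 42.0, 50.0, 58.0, 66.0]
scores_b = [38.0, 47.0, 53.0, 61.0, 70.0]
print(first_order_dominates(scores_b, scores_a))
```

In sampled data the two curves will often touch or cross by chance, which is why formal inference is needed before drawing conclusions.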

Distributional studies of well-being often look to higher orders of stochastic dominance, notably second-order stochastic dominance (SOSD). Whereas first-order stochastic dominance (FOSD) counts the number of individuals falling below a given healthy diet threshold (which would, in turn, determine the headcount ratio), SOSD captures the depth or severity of inadequate diets. SOSD is sensitive to the extent to which diets fall in the lower tails of the distribution.

To formally define SOSD, let $D_A^1(z) = F_A(z)$, and likewise for B, so that FOSD of B over A can be written as $D_B^1(z) \le D_A^1(z)$ for all z. F_B second-order stochastically dominates F_A if

$D_B^2(z) = \int_0^z D_B^1(y)\,dy \le \int_0^z D_A^1(y)\,dy = D_A^2(z) \quad \text{for all } z,$

with a strict inequality for some value of z. This relationship is illustrated in the right panel of figure 1, which shows that the CDFs cross, thereby ruling out FOSD over the entire range of HEI. The integrated difference between F_A and F_B, shown in the subpanel, is strictly positive, and thus F_B second-order stochastically dominates F_A. More generally, dominance at order s of B over A is then defined as $D_B^s(z) \le D_A^s(z)$ for all z, where

$D_j^s(z) = \int_0^z D_j^{s-1}(y)\,dy, \quad j \in \{A, B\},$

with a strict inequality for some value of z.

Stochastic dominance maps into social welfare under fairly standard assumptions about the utility derived from a healthy diet (Deaton 1997). For example, if B first-order stochastically dominates A, then for any social welfare function W defined on the distribution of diet quality F(z) such that $W(F) = \int U(z)\,dF(z)$, where U is any monotonically nondecreasing utility function of z (U^′≥0), it must be true that social welfare derived from distribution B will be at least as high as the welfare derived from A. We can extend the mapping of social welfare to SOSD by requiring U to be nondecreasing and concave in z (U^′≥0,U^′′≤0). Note that because dominance of order s implies dominance of order s+1, it follows that the latter is a less stringent condition. Thus, welfare implications are the strongest in the first-order case. Finally, we also make the standard assumption of anonymity so that each individual is weighted equally in the social welfare function.

Estimation

A useful expression for $D^s(z)$ in empirical analyses is (Davidson and Duclos 2000):

$D^s(z) = \frac{1}{(s-1)!} \int_0^z (z-y)^{s-1}\,dF(y).$ (1)

Integrating the empirical analogue of (1) by parts leads to a natural estimator of $D_j^s(z)$:

$\hat{D}_j^s(z) = \frac{1}{N_j\,(s-1)!} \sum_{i=1}^{n_j} \theta_i\,(z-y_i)^{s-1}\,I(y_i \le z),$ (2)

where y_i is individual i's HEI score, we account for complex survey design (e.g., CSFII and NHANES) by letting θ_i be an individual's sample weight, $N_j = \sum_{i=1}^{n_j} \theta_i$ is the population size in distribution j (with corresponding sample size n_j), and I(⋅) is the indicator function. The first-order case leads to the empirical CDF

$\hat{F}_j(z) = \hat{D}_j^1(z) = \frac{1}{N_j} \sum_{i=1}^{n_j} \theta_i\,I(y_i \le z),$ (3)

and the statistic for the second-order case follows directly.
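A weighted estimator of this kind can be sketched in Python as follows. The sketch assumes the estimator takes the Davidson and Duclos (2000) form, a weighted average of (z − y_i)^(s−1) over observations at or below z, divided by (s−1)!; the function name `dominance_curve` and the scores and weights are hypothetical.

```python
from math import factorial


def dominance_curve(scores, weights, z, s=1):
    """Weighted estimate of the order-s dominance curve at threshold z.

    Computes sum_i w_i * (z - y_i)^(s-1) * 1[y_i <= z] / ((s-1)! * sum_i w_i).
    With s=1 this is the weighted empirical CDF.
    """
    if s < 1:
        raise ValueError("order s must be a positive integer")
    total_weight = sum(weights)
    acc = 0.0
    for y, w in zip(scores, weights):
        if y <= z:
            acc += w * (z - y) ** (s - 1)
    return acc / (factorial(s - 1) * total_weight)


# Made-up scores and sample weights for illustration only.
scores = [40.0, 50.0, 60.0]
weights = [2.0, 1.0, 1.0]
cdf_at_50 = dominance_curve(scores, weights, z=50.0, s=1)   # (2 + 1) / 4
area_at_50 = dominance_curve(scores, weights, z=50.0, s=2)  # 2 * (50 - 40) / 4
print(cdf_at_50, area_at_50)
```

The second-order value equals the area under the weighted CDF up to z, which is what makes it sensitive to how far scores fall below the threshold rather than only to how many do.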

Inference

We are interested in testing the hypothesis that the distribution of dietary quality in one time period dominates the distribution in another time period. For example, allowing distribution F_B to be the more recent time period, the null hypothesis of an increase in dietary quality at order s∈{1,2} is,

$H_0^{s,+}: D_B^s(z) \le D_A^s(z) \ \text{for all } z \quad \text{versus} \quad H_1^{s,+}: D_B^s(z) > D_A^s(z) \ \text{for some } z,$

where the positive superscript denotes the hypothesis of dietary improvement, whereas in testing the null hypothesis that dietary quality has decreased (denoted by $H_0^{s,-}$), the signs would be reversed. One could also posit a null of equality, but notice that rejection of both $H_0^{s,+}$ and $H_0^{s,-}$ implies rejecting equality.

Bishop, Formby, and Thistle (1989) propose a multiple testing procedure by hypothesizing dominance in both directions, that is, testing the null of $H_0^{s,+}$ and $H_0^{s,-}$ versus their respective alternatives and drawing inferences from the combined acceptance/rejection. A variety of approaches to drawing inferences based on the Bishop, Formby, and Thistle (1989) procedure have been proposed, such as multiple comparison tests (Anderson 1996; Davidson and Duclos 2000) or Kolmogorov–Smirnov-type tests (Barrett and Donald 2003; Bennett 2010; Linton, Maasoumi, and Whang 2005; McFadden 1989). Multiple comparison approaches are based on arbitrarily chosen ordinates, which can lead to test inconsistency (Barrett and Donald 2003; Davidson and Duclos 2000). Therefore, in this study we use a Kolmogorov–Smirnov-type statistic that compares all objects within the support of the two distributions.

Let Z be defined as the union of the supports of A and B. Define the following functionals for each order s:

$d_s^{+} = \sup_{z \in Z} \left[ D_B^s(z) - D_A^s(z) \right],$ (4)

$d_s^{-} = \sup_{z \in Z} \left[ D_A^s(z) - D_B^s(z) \right].$ (5)

Notice that the null hypotheses can be rewritten in terms of these functionals. That is, the null of increased diet quality (F_B dominating F_A) at order s is simply $d_s^{+} \le 0$, and similarly for decreased diet quality (F_A dominating F_B) using $d_s^{-} \le 0$. When the distributions are mutually independent, Kolmogorov–Smirnov-type tests based on $\hat{d}_s^{+}$ and $\hat{d}_s^{-}$, the sample analogues of (4) and (5), are consistent (McFadden 1989).⁹ Test statistics are calculated using

$\hat{K}_s^{+} = \sqrt{\frac{n_A n_B}{n_A + n_B}}\, \sup_{z \in Z} \left[ \hat{D}_B^s(z) - \hat{D}_A^s(z) \right], \qquad \hat{K}_s^{-} = \sqrt{\frac{n_A n_B}{n_A + n_B}}\, \sup_{z \in Z} \left[ \hat{D}_A^s(z) - \hat{D}_B^s(z) \right].$ (6)

Because there are infinitely many F_A(z) satisfying the null such that F_B(z)≤F_A(z), the limiting null distribution is not uniquely defined and depends on the underlying unknown distributions of F_A and F_B. We follow Barrett and Donald (2003) and use the least favorable configuration to construct the null distribution. The least favorable configuration is the point in the null distribution that is least favorable to the alternative hypothesis (i.e., F_A=F_B). As a result, the test is conservative; rejection of the null under the least favorable configuration implies rejection at any point in the null distribution. We construct a bootstrap distribution of the test statistics $\hat{K}_s^{\pm}$ in (6) to simulate the p values.

We use a recentering bootstrap approach, which has been shown to perform well against alternative methods (see Barrett and Donald (2003); Linton, Maasoumi, and Whang (2005)). Let $\hat{D}_j^{s*}(z)$ be defined as above from (2) but computed on a random bootstrap sample drawn with replacement from distribution j.¹⁰ The statistic is recentered by the observed values so that we have $\hat{D}_j^{s*}(z) - \hat{D}_j^s(z)$. We can then define recentered bootstrap functionals $\hat{d}_s^{\pm *}$ by replacing $D_j^s(z)$ with $\hat{D}_j^{s*}(z) - \hat{D}_j^s(z)$ in (4) and (5). The recentered bootstrap t statistics are

$\hat{K}_s^{\pm *} = \sqrt{\frac{n_A n_B}{n_A + n_B}}\;\hat{d}_s^{\pm *}.$ (7)

We approximate p values from the distribution of bootstrapped test statistics by

$p_s^{\pm} \approx \frac{1}{R} \sum_{r=1}^{R} I\left( \hat{K}_{s,r}^{\pm *} > \hat{K}_s^{\pm} \right),$ (8)

where R is the number of bootstrap replications and $\hat{K}_{s,r}^{\pm *}$ is the statistic from replication r. The p values allow for a test of stochastic dominance at order s based on the rule “reject $H_0^{s,\pm}$ if $p_s^{\pm} < \alpha$,” where α represents the conventional levels of statistical significance. Thus under the Bishop, Formby, and Thistle (1989) procedure, rejection of the null $H_0^{s,-}$ in favor of $H_1^{s,-}$ coupled with a failure to reject $H_0^{s,+}$ is viewed as statistical evidence in favor of F_B dominating F_A at order s.
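The bootstrap logic of this subsection can be sketched in Python for the first-order case. This is a simplified illustration under stated assumptions, not the procedure as implemented in the study: it uses unweighted resampling (ignoring survey weights and design), evaluates the supremum on the pooled observed scores, and scales by the square root of n_A n_B/(n_A + n_B) as in Barrett and Donald (2003). All function names and data are hypothetical.

```python
import random
from bisect import bisect_right
from math import sqrt


def ecdf_on_grid(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    ordered = sorted(sample)
    n = len(ordered)
    return [bisect_right(ordered, z) / n for z in grid]


def ks_statistic(cdf_a, cdf_b, n_a, n_b):
    """Scaled sup of (F_B - F_A): large values are evidence against 'B dominates A'."""
    scale = sqrt(n_a * n_b / (n_a + n_b))
    return scale * max(fb - fa for fa, fb in zip(cdf_a, cdf_b))


def bootstrap_p_value(sample_a, sample_b, reps=500, seed=0):
    """Recentered-bootstrap p value for the null that B first-order dominates A."""
    rng = random.Random(seed)
    grid = sorted(set(sample_a) | set(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    cdf_a = ecdf_on_grid(sample_a, grid)
    cdf_b = ecdf_on_grid(sample_b, grid)
    observed = ks_statistic(cdf_a, cdf_b, n_a, n_b)

    exceed = 0
    for _ in range(reps):
        boot_a = ecdf_on_grid(rng.choices(sample_a, k=n_a), grid)
        boot_b = ecdf_on_grid(rng.choices(sample_b, k=n_b), grid)
        # Recenter each bootstrap CDF by its observed counterpart.
        rec_a = [b - o for b, o in zip(boot_a, cdf_a)]
        rec_b = [b - o for b, o in zip(boot_b, cdf_b)]
        if ks_statistic(rec_a, rec_b, n_a, n_b) > observed:
            exceed += 1
    return observed, exceed / reps


# Made-up scores: B is A shifted up by 10 points.
scores_a = [30 + 0.1 * i for i in range(400)]
scores_b = [s + 10 for s in scores_a]
stat, p_value = bootstrap_p_value(scores_a, scores_b, reps=200)
print(f"statistic={stat:.3f}, p={p_value:.3f}")
```

Swapping the two arguments tests the opposite null, so calling the function in both directions gives the pair of one-sided p values that the combined decision rule relies on.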

Robustness Check for FOSD

To determine the stochastic rankings of two distributions, we must distinguish between four possible true states of nature: the distributions are equal, A lies above B, A lies below B, or the curves cross. The Bishop, Formby, and Thistle (1989) procedure described above distinguishes between these four states by conducting two one-sided tests. The result is lower power in detecting a crossing of the CDFs, which could lead to overclassification of dominance (Dardanoni and Forcina 1999; Gastwirth and Nayak 1999). This is at least partially due to the fact that rejection of $H_0^{1,+}$ or $H_0^{1,-}$ by itself is consistent with both FOSD and a crossing; hence the use of two one-sided tests to rule out the crossing under the Bishop, Formby, and Thistle (1989) procedure.

A second drawback to the Bishop, Formby, and Thistle (1989) procedure is how the total error probability α is apportioned to each one-sided test (Dardanoni and Forcina 1999). As is typical with standard hypothesis testing, the one-sided critical value c(α) is based on ensuring that the probability of committing a type I error (i.e., rejecting $H_0^{1,+}$, $H_0^{1,-}$, or both when they are true) is less than the nominal level α. But as noted by Dardanoni and Forcina (1999), the Bishop, Formby, and Thistle (1989) procedure does not allow one to control how the total error probability α is allocated to each classification (equality, dominance in one direction, dominance in the opposite direction, or a crossing).

Bennett (2013) improves on the Bishop, Formby, and Thistle (1989) procedure by writing it as a two-stage test that allows one to test for a crossing while giving the researcher flexibility in allocating the total error rate to each stage. Let α and β denote a pair of prespecified significance levels for the first and second stage, respectively. The first stage is to posit a null of equality (F_A=F_B) and determine rejection or acceptance based on the critical value a(α). If we accept the null, then we infer that the distributions are indistinguishable. Upon rejection, however, the second stage determines the state of nature among the three alternatives (A dominates B, B dominates A, or they cross) using the critical value b(α,β). This allows β to be the portion of the total error probability α allocated to a crossing (i.e., αβ) and the remaining α(1−β) is split evenly between dominance in either direction.
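The allocation of the total error rate described above is simple arithmetic, sketched here in Python. The function name and the value β = 0.5 are hypothetical choices for illustration; the text does not state which β is used.

```python
def allocate_error(alpha: float, beta: float) -> dict:
    """Split the total error rate alpha across the three second-stage outcomes."""
    if not (0 < alpha < 1 and 0 <= beta <= 1):
        raise ValueError("need 0 < alpha < 1 and 0 <= beta <= 1")
    crossing = alpha * beta                  # share allocated to a crossing
    each_direction = alpha * (1 - beta) / 2  # remainder split evenly
    return {
        "crossing": crossing,
        "A dominates B": each_direction,
        "B dominates A": each_direction,
    }


# Hypothetical beta; alpha = 0.10 is one of the two levels used in the text.
shares = allocate_error(alpha=0.10, beta=0.5)
print(shares)
```

With these inputs, 0.05 is allocated to a crossing and 0.025 to dominance in each direction, and the three shares always sum to α.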

Bennett (2013) tabulates asymptotic critical values of a(α) and b(α,β) for frequently used significance levels. In the applications below, we wish to calculate the asymptotic p values. To do so, we need to preset the total error rate α, the level at which we are controlling for falsely rejecting equality. We use two levels of significance (10% and 1%) so that the second stage is robust to our choice of α. The associated a(α) critical values are a(0.1)=1.2239 and a(0.01)=1.6277 (see table 1 in Bennett 2013).

The maximum of the one-sided test statistics found in (6) is used in the first stage, and the minimum is used in the second stage. To simplify notation, let these statistics be $K_{\max}$ and $K_{\min}$, respectively. We are interested in the distribution of K_min conditional on rejecting equality. In other words, if K_min is large enough (i.e., larger than the second stage critical value b(α,β)) conditional on K_max>a(α), then we reject the null in favor of a crossing. Asymptotically, as shown in proposition 2.6 in Bennett (2013), if F_A=F_B and b<a then

$urn:x-wiley:00029092:equation:ajaeajaeaat104-math-0013$ (9)

where¹¹

$urn:x-wiley:00029092:equation:ajaeajaeaat104-math-0014$ (10)

$urn:x-wiley:00029092:equation:ajaeajaeaat104-math-0015$ (11)

$urn:x-wiley:00029092:equation:ajaeajaeaat104-math-0016$ (12)

The two-stage p values (denoted $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0053$ ) are calculated from (9) where we use two levels of α. Thus, a $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0054$ value below conventional levels of significance is evidence that the distributions cross. Put differently, larger p values are consistent with the null hypothesis that the distributions do not cross.
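The two-stage classification described in this section can be sketched as a small decision function in Python. The sign convention (which one-sided statistic corresponds to which direction of dominance), the function name, and the second-stage critical value used in the example are assumptions made for illustration; actual values of b(α,β) are tabulated in Bennett (2013) and are not reproduced here.

```python
def two_stage_ranking(k_plus: float, k_minus: float, a: float, b: float) -> str:
    """Classify two distributions from two one-sided statistics.

    k_plus:  statistic for the null that B dominates A (large values reject it).
    k_minus: statistic for the null that A dominates B (large values reject it).
    a, b:    first- and second-stage critical values, with b < a.
    """
    if not b < a:
        raise ValueError("second-stage critical value b must be below a")
    k_max, k_min = max(k_plus, k_minus), min(k_plus, k_minus)
    if k_max <= a:
        return "equal"     # stage 1: cannot reject F_A = F_B
    if k_min > b:
        return "crossing"  # stage 2: both directions show violations
    # Only one direction is violated; the surviving null names the ranking.
    return "B dominates A" if k_minus > k_plus else "A dominates B"


A_10 = 1.2239        # a(0.1), as reported in the text
B_HYPOTHETICAL = 0.5  # placeholder only, not a tabulated critical value

print(two_stage_ranking(0.1, 3.5, A_10, B_HYPOTHETICAL))
```

The example call has one large and one small statistic, so equality is rejected in the first stage and the second stage does not find a crossing.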

Results

Our main results are summarized in tables 3 and 4 and depicted in figures 2–4. The tables report the bootstrapped p values for tests of increases and decreases in diet quality (Barrett and Donald 2003), as well as the asymptotic two-stage p values (Bennett 2013). The final column summarizes the inferred ranking of distributions based on these tests. In short, we find that there has been a statistically significant and economically important improvement in the HEI scores over the period under study. For any threshold of dietary quality, a larger share of Americans scored above it in 2005–08 than in 1989–91. However, there are differences between income groups with regard to when and where the improvements occurred (table 5).

Table 3. Tests of Stochastic Dominance among U.S. Adults

Distribution		Bootstrap Tests				Two-Stage
A	B	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0055$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0056$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0057$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0058$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0059$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0060$	Inferred Ranking
1989–91	1994–96	0.007	0.010	0.900	0.660	0.507	0.362	A≺₁B^***
	2001–04	0.028	0.002	1.000	0.937	0.999	0.999	A≺₁B^**
	2005–08	0.002	0.000	1.000	0.877	0.981	0.966	A≺₁B^***
1994–96	2001–04	0.129	0.149	0.383	0.925	0.003	0.001	ND
	2005–08	0.010	0.003	0.999	0.863	0.981	0.965	A≺₁B^***
2001–04	2005–08	0.133	0.051	0.991	0.790	0.972	0.950	A≺₂B^*

Notes: The $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0061$ values refer to one-sided tests of the null hypothesis $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0062$ using equation (8). The asymptotic $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0063$ values are calculated from (9), where α=0.1, 0.01. The notation A≺_sB reads, “Distribution B dominates distribution A at order s,” while ND indicates no dominance at order 1 or 2. Inferred ranking is based on statistical significance levels of 1% (indicated by a triple asterisk), 5% (indicated by a double asterisk), and 10% (indicated by a single asterisk).

Table 4. Tests of Stochastic Dominance among U.S. Adults by Income Group

Distribution		Bootstrap Tests				Two-stage
A	B	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0064$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0065$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0066$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0067$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0068$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0069$	Inferred Ranking
Low income
1989–91	1994–96	0.218	0.257	0.570	0.654	0.027	0.009	ND
	2001–04	0.290	0.140	0.977	0.952	0.927	0.878	ND
	2005–08	0.008	0.000	1.000	0.882	0.998	0.996	A≺₁B^***
1994–96	2001–04	0.332	0.300	0.575	0.889	0.034	0.012	ND
	2005–08	0.006	0.003	0.991	0.855	0.984	0.971	A≺₁B^***
2001–04	2005–08	0.031	0.033	0.997	0.818	0.998	0.996	A≺₁B^**
Higher income
1989–91	1994–96	0.007	0.006	0.880	0.684	0.429	0.289	A≺₁B^***
	2001–04	0.004	0.000	0.999	0.897	0.999	0.999	A≺₁B^***
	2005–08	0.002	0.000	1.000	0.906	0.985	0.973	A≺₁B^***
1994–96	2001–04	0.083	0.106	0.553	0.932	0.032	0.011	ND
	2005–08	0.007	0.010	0.991	0.886	0.957	0.926	A≺₁B^***
2001–04	2005–08	0.135	0.137	0.913	0.697	0.555	0.410	ND

^a Notes: The $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0070$ values refer to one-sided tests of the null hypothesis $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0071$ using equation (8). The asymptotic $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0072$ values are calculated from (9), where α=0.1,0.01. The notation A≺_sB reads, “Distribution B dominates distribution A at order s,” while ND indicates no dominance at order 1 or 2. Inferred ranking is based on statistical significance levels of 1% (indicated by a triple asterisk), 5% (indicated by a double asterisk), and 10% (indicated by a single asterisk).

Table 5. Location and Time-Path of Dietary Improvement

	Between Period			Total
HEI range	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0073$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0074$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0075$	$urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0076$
All adults
0–40	2.80	12.61	6.21	21.64
40–50	5.65	6.83	9.54	22.01
50–60	12.41	1.86	10.25	24.52
60–100	20.38	−3.64	15.10	31.83
0–100	41.23	17.66	41.09	100.00
Low income
0–40	2.70	8.49	8.33	19.41
40–50	−1.91	6.99	14.73	19.85
50–60	4.40	0.81	20.57	25.80
60–100	11.35	−4.18	27.84	34.94
0–100	16.54	12.11	71.47	100.00
Higher income
0–40	2.98	15.14	4.49	22.70
40–50	8.58	8.46	6.04	23.07
50–60	15.24	4.36	4.40	24.00
60–100	23.27	−0.99	7.92	30.23
0–100	50.07	26.97	22.85	100.00

^a Note: Numbers represent the percentage of the twenty-year improvement coming from the area bounded by the HEI range and the two distributions. $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0077$ ; $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0078$ ; $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0079$ ; and $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0080$ .

Between Periods

Figure 2 shows the empirical CDFs for the U.S. adult population in each period. Distributions shift systematically to the right over time, in other words toward a more nutritious diet. Because the shifts are relatively small, in this and subsequent figures, we present the estimated difference between the earliest period (1989–91) and the latest period (2005–08) in a subpanel. The area under the difference curve in the subpanel is equal to the area between the distributions. We can see the twenty-year improvement was positive and pointwise statistically significant for the empirically relevant range of HEI scores.

From the first three rows of table 3, we see that in comparing the period 1989–91 to all subsequent periods, the null of decreasing dietary quality is strongly rejected and in no case do we reject the null of an increase in diet quality. In comparing the periods 1994–96 and 2001–04, we are unable to order the distributions in either the first- or second-order case. We do find strong evidence that the period 2005–08 first-order stochastically dominates the period 1994–96, but the results are fairly weak with regard to an ordering of the periods 2005–08 and 2001–04.

Some care is required in interpreting the last two columns of table 3 because they report results from Bennett's two-stage test. As noted, these p values are for the null hypothesis that the CDFs do not cross, as determined by both K_max and K_max being statistically large. Loosely speaking, these can be interpreted as the (conditional) probability of rejecting the hypothesis of no crossing. Thus, a p value below conventional levels of significance can be interpreted as evidence that the distributions cross. Bennett's two-stage test supports the main findings in that there is no statistical evidence that the 1989–91 distribution crosses the distribution of any later period.
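
To make the notation A≺_sB concrete, the following Python sketch checks first- and second-order dominance descriptively on two weighted samples. It only compares empirical CDFs on a grid; it is not the bootstrap test of equation (8) or the two-stage procedure, and it carries no statistical inference. The function names, the grid, and the tolerance are illustrative assumptions rather than part of the original analysis.

```python
from bisect import bisect_right

def weighted_ecdf(values, weights):
    """Return F(y): the weighted share of observations with value <= y."""
    pairs = sorted(zip(values, weights))
    xs = [v for v, _ in pairs]
    total = sum(w for _, w in pairs)
    cum, running = [], 0.0
    for _, w in pairs:
        running += w
        cum.append(running / total)

    def F(y):
        k = bisect_right(xs, y)
        return cum[k - 1] if k > 0 else 0.0

    return F

def dominance_order(a, wa, b, wb, grid, tol=1e-12):
    """Return 1 or 2 if sample B dominates sample A at that order on the grid, else None."""
    Fa, Fb = weighted_ecdf(a, wa), weighted_ecdf(b, wb)
    # Positive values mean B has less mass at or below y than A.
    diff = [Fa(y) - Fb(y) for y in grid]
    if all(d >= -tol for d in diff) and any(d > tol for d in diff):
        return 1
    # Second order: the running integral of the CDF difference must stay non-negative.
    integ, acc = [], 0.0
    for i in range(1, len(grid)):
        acc += diff[i - 1] * (grid[i] - grid[i - 1])  # left Riemann sum
        integ.append(acc)
    if all(v >= -tol for v in integ) and any(v > tol for v in integ):
        return 2
    return None  # the sample CDFs cross in a way that rules out both orders

# Toy data (hypothetical HEI scores with equal weights).
grid = list(range(0, 101))
print(dominance_order([30, 40, 50, 60], [1, 1, 1, 1], [35, 45, 55, 65], [1, 1, 1, 1], grid))
```

In the toy call, every score in B exceeds its counterpart in A, so the check reports order 1. With survey data, the weights would be the sampling weights, and a ranking would still require the formal tests reported in table 3.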

Between Income Groups

We now turn our attention to direct comparisons of individuals above and below 185% of the poverty guideline. As noted, we choose 185% of the poverty line as our cutoff because it is an upper limit on the threshold for many federal nutrition assistance programs.¹² Panel (a) of figure 3 presents the empirical CDFs and the difference between the periods 1989–91 and 2005–08 for low-income individuals; panel (b) likewise for higher-income individuals. Table 4 presents results from statistical tests of dominance by income group. For both groups, we find strong evidence that the distribution of dietary quality in the period 2005–08 first-order stochastically dominates the distribution in the earliest period, with no evidence of a crossing.¹³

Results support the observation in table 2 that a significant portion of dietary improvement among low-income individuals occurred over the period 2001–08. For example, in comparing the period 1989–91 to the period 1994–96, we find no evidence of a partial ordering according to the bootstrap results, and the two-stage test confirms this by finding significant evidence of a crossing. In comparing the period 1989–91 to the period 2001–04, again the bootstrap results are silent on the ordering, as is the asymptotic test, indicating no dominance at orders 1 or 2. However, in comparing the most recent time period 2005–08 to any of the earlier distributions, all tests show a statistically significant, first-order improvement in dietary quality, with no evidence of a crossing.

Comparing distributions among higher-income individuals, we can see that the period 1989–91 is first-order dominated by each subsequent period, with no evidence of a crossing. Comparing the period 1994–96 to the period 2001–04, the bootstrap results indicate a weak rejection of $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0082$ , which could lead one to infer a partial ordering. However, when consulting the asymptotic two-stage test, we find significant evidence of a crossing, thereby ruling out first-order dominance. We do see that the period 2005–08 first-order stochastically dominates the period 1994–96, but we cannot rank the two most recent time periods.

We can compare the total twenty-year improvements in each income group by examining subpanels (a) and (b) in figure 3. We see that low-income individuals experienced relatively smaller increases over the bottom tail of the relevant range of HEI compared with their higher-income counterparts. We can more formally investigate this finding by taking the difference (between above and below 185% of the poverty line) in the differences (between the earlier and later periods). Figure 4 superimposes the subfigures in panels (a) and (b) of figure 3 in the top panel and then plots the difference between the two in the bottom panel; that is, in the subpanel of figure 4 we plot:

$urn:x-wiley:00029092:equation:ajaeajaeaat104-math-0017$ (13)

As shown in figure 4, considering lower levels of dietary quality below a HEI of 45, we find higher-income individuals experienced a greater improvement over the period 1989–2008 than low-income individuals. At higher levels of the HEI distribution, by contrast, low-income individuals experienced greater increases in dietary quality. In other words, we find some evidence that within the poor dietary quality population, low-income individuals experienced less improvement over the 20-year period than higher-income individuals.
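
The difference in differences described above can be sketched in a few lines of Python. The sketch assumes a sign convention of higher-income improvement minus low-income improvement, with improvement measured as the early-period CDF minus the late-period CDF at each HEI value; the convention, the function names, and the toy data are assumptions made for illustration.

```python
def ecdf(sample, y):
    """Weighted share of observations with HEI <= y; sample is (values, weights)."""
    values, weights = sample
    return sum(w for v, w in zip(values, weights) if v <= y) / sum(weights)

def improvement(early, late, y):
    """F_early(y) - F_late(y): positive where the later CDF lies to the right."""
    return ecdf(early, y) - ecdf(late, y)

def diff_in_diff(high_early, high_late, low_early, low_late, grid):
    """Higher-income improvement minus low-income improvement at each grid point."""
    return [
        improvement(high_early, high_late, y) - improvement(low_early, low_late, y)
        for y in grid
    ]

# Toy samples (hypothetical scores, equal weights).
high_early, high_late = ([40, 50], [1, 1]), ([50, 60], [1, 1])
low_early, low_late = ([40, 50], [1, 1]), ([40, 60], [1, 1])
print(diff_in_diff(high_early, high_late, low_early, low_late, [45, 55]))
```

Under this convention, a positive value at a given HEI score indicates that the higher-income group improved more at that point of the distribution, and a negative value indicates the reverse.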

Rate and Location of Change

Given the differential gains in dietary quality noted above, we now investigate when in time and where in the distribution of dietary quality these improvements took place. For consistency and cross-sample/population comparisons, we focus on fixed portions of the distribution of dietary quality. An obvious choice is to use quartiles, which are all roughly segmented by HEI scores of 40, 50, and 60.¹⁴ Table 5 measures the amount of dietary improvement occurring in a particular quartile between two time periods as the percentage of total improvement $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0083$ . That is, we measure the area bounded by the two empirical CDFs within each quartile range of the HEI scores. For example, the percentage of improvement in the United States over the 20-year period that occurred in the bottom quartile (<40) between 1989–91 and 1994–96 was 2.8%. The last column of table 5 measures the overall improvements over the period 1989–2008 within each quartile of the distribution of dietary quality.

For the U.S. adult population, improvements below the median (HEI <50) occurred steadily over the period 1989–2008. In the upper range of dietary quality (HEI above 50), however, virtually all of the gains occurred over the periods 1989–96 and 2001–08. Overall, there were slightly higher gains in the upper quartiles compared to the lower quartiles for the U.S. population.

Comparing the between-period improvements by income group, we see that 71.5% of the total improvement in the diets of the low-income population occurred more recently over the period 2001–08. This is in contrast to the higher-income population, which saw the majority of their improvements occurring over the period 1989–2001 (77.1%). Improvements in the lower quartiles for the higher-income population have been relatively steady over the 20-year period, whereas most of the improvement in low-income diets within the lower quartiles occurred more recently over the period 1994–2008. In other words, at the lower end of the distribution of dietary quality, low-income individuals have seen comparatively limited or lagging improvements.

Table 5 emphasizes the reasons for targeting the most vulnerable group at risk of poor diets—the low-income, low–dietary quality population. This is best seen by examining the last column of table 5, which measures the total gains over the 20-year period within each quartile. The higher-income population has had almost proportional gains across all levels of HEI, whereas the low-income population has seen less improvement in the lower quartiles of diet quality.
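
The area-based shares in table 5 can be illustrated with a short Python sketch. It approximates the area between two weighted empirical CDFs within each HEI range by a midpoint rule and expresses each piece as a percentage of the total first-to-last improvement. The integration step, the function names, and the toy samples are assumptions; the original computations may differ in detail.

```python
def ecdf(sample, y):
    """Weighted share of observations with HEI <= y; sample is (values, weights)."""
    values, weights = sample
    return sum(w for v, w in zip(values, weights) if v <= y) / sum(weights)

def area_between(early, late, lo, hi, step=0.1):
    """Midpoint-rule approximation of the integral of F_early - F_late over [lo, hi]."""
    n = round((hi - lo) / step)
    total = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * step
        total += (ecdf(early, y) - ecdf(late, y)) * step
    return total

def share_table(periods, cuts=(0, 40, 50, 60, 100)):
    """Percent of the first-to-last improvement, by HEI range and adjacent-period pair."""
    overall = area_between(periods[0], periods[-1], cuts[0], cuts[-1])
    return {
        (a, b): [100 * area_between(p, q, a, b) / overall
                 for p, q in zip(periods, periods[1:])]
        for a, b in zip(cuts, cuts[1:])
    }

# Three toy periods (hypothetical scores, equal weights), in chronological order.
early = ([35, 45, 55, 65], [1, 1, 1, 1])
mid = ([40, 50, 60, 70], [1, 1, 1, 1])
late = ([45, 55, 65, 75], [1, 1, 1, 1])
for hei_range, shares in share_table([early, mid, late]).items():
    print(hei_range, [round(s, 2) for s in shares])
```

Summing the entries across all ranges and period pairs recovers 100%, which mirrors the bottom-right cell of each panel in table 5.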

Counterfactual Analysis

We now explore whether factors that evolve gradually over time within the population can help explain observed improvements in the distribution of HEI scores between 1989 and 2008. We focus on two factors in particular: changes in food formulation and changes in the demographic landscape.¹⁵ In the figures below, we focus on the differences between the observed 2005–08 distribution and the 1989–91 counterfactual distributions.

Food Reformulation

The composition of the food supply has changed considerably over the last twenty years in response to changes in policy, regulation, technology, and consumer tastes. For example, Vesper et al. (2012) found that levels of trans fats in the population declined after new labeling requirements were put in place in 2003. We now investigate how much of the improvement in dietary quality can be attributed to changes in food composition.

To identify foods and food mixtures that have undergone food reformulation (e.g., changes in the type of fat used in processed foods), we use the USDA Food and Nutrient Database for Dietary Studies (FNDDS). FNDDS consists of a series of databases updated every two years in conjunction with the continuous waves of NHANES to reflect the current state of food formulation and packaging. We combine these databases to cover the period 1994–2008. We briefly explain the method here, with more details in the online supplementary appendix.

To construct the distribution of dietary quality in the period 1989–91 as if food were formulated in the period 2005–08, we first identify all foods coded as reformulated in the 1994–2008 FNDDS. We then replace the nutrient values for these food items in the 1989–91 CSFII with the reformulated values found in the FNDDS. We also replace the MPED values of the 1989–91 reformulated foods with their 2005–08 values. We then construct a new HEI score based on updated nutrient and MPED values for each respondent in the 1989–91 sample. Figure 5 displays the results from the reformulation counterfactual, as well as results from the next section.

The distribution of HEI that accounts for reformulation lies everywhere to the right of the original 1989–91 distribution over the relevant range of the HEI. The implication is that, holding food choices constant, changes in food composition could be a contributing factor to dietary improvement. In figure 5, the indicated shaded area represents the change in the empirical CDFs attributed to reformulation. The ratio of this area to the total area provides a scalar measure of change. Here, improvements attributed to reformulation represent about 10.1% of the total difference between the period 1989–91 and the period 2005–08.
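
As a worked statement of this scalar measure, the ratio of areas can be written as follows. The symbols are introduced here for illustration only and are not the article's notation: $\hat F_{89}$ and $\hat F_{08}$ denote the observed empirical CDFs in 1989–91 and 2005–08, $\hat F_{89}^{R}$ denotes the 1989–91 counterfactual CDF with reformulated foods, and $S_R$ denotes the share attributed to reformulation.

```latex
\[
S_R \;=\;
\frac{\displaystyle\int_{0}^{100} \bigl[\hat F_{89}(y) - \hat F_{89}^{R}(y)\bigr]\,dy}
     {\displaystyle\int_{0}^{100} \bigl[\hat F_{89}(y) - \hat F_{08}(y)\bigr]\,dy}
\;\approx\; 0.101
\]
```

Both integrands are non-negative wherever the counterfactual and 2005–08 distributions lie to the right of the original 1989–91 distribution, so the ratio can be read as the fraction of the total rightward shift reproduced by reformulation alone.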

An important caveat is that this exercise captures partial equilibrium effects, and some care must be taken interpreting these results. Our counterfactual analysis cannot account for the fact that individuals in the period 1989–91 might have chosen different foods had their foods been formulated as they were in the period 2005–08. Nevertheless, it shows how food reformulation, all else equal, can play an important role in changing dietary quality.

Demographic Changes

The United States of 2005–08 is an older, more diverse, and better educated country than the United States of 1989–91. To the extent that these factors are correlated with healthy eating, they may explain some of the improvements in dietary quality. Table 6 illustrates demographic changes using data from our sample and from the U.S. Census. There is a clear decrease in the population aged 30–44 years and a concomitant rise in the population aged 45–64 years. The decrease in the non-Hispanic white share of the population is mirrored by increases in the shares of the Hispanic and other race/ethnicity groups. Finally, the overall educational attainment in the population has also increased.

Table 6. U.S. Population Characteristics, Adults Aged 20 Years and Older

Demographic	CSFII	NHANES	Census^a	Census^b
	1989–91	2005–08	1990	2005–07
Aged 20–29 years^c	21.7	19.4	22.7	19.1
Aged 30–44 years	35.9	28.3	33.5	29.1
Aged 45–64 years	26.2	35.4	26.1	35.0
Aged ≥65 years	16.3	16.8	17.6	16.8
Non-Hispanic white	78.8	71.9	78.4	69.5
Non-Hispanic black	10.8	11.3	10.6	11.3
Hispanic	7.7	11.6	7.6	12.8
Other race/ethnicity	2.7	5.2	3.4	6.4
Did not attend high school	8.5	6.0	9.6	6.1
High school, no college	46.2	37.8	44.5	39.7
Attended college	45.2	56.2	45.9	54.2
Total No.	9,377	9,257

^a U.S. Census Bureau, General Population Characteristics (CP-1, 3-4).
^b U.S. Census Bureau, 2005–07 American Community Survey three-year sample.
^c All numbers are expressed as the percentage of the adult population aged 20 and older.

To investigate the effect of evolving population characteristics, we construct counterfactual distributions of HEI scores following an approach proposed by DiNardo, Fortin, and Lemieux (1996). We ask, “What would the distribution of HEI scores look like had the demographic landscape of 2005–08 prevailed in 1989–91?” We focus on age, race/ethnicity, and educational attainment, all of which have been found to be correlated with diet healthfulness (Popkin, Siega-Riz, and Haines 1996). The intuition is to adjust each individual's sampling weight in the base period 1989–91 conditional on a set of demographics such that it captures the relative probability that the individual would be represented in the more recent 2005–08 sample.

To briefly describe the DiNardo, Fortin, and Lemieux (1996) methodology, let each individual observation be a vector (y, h, t), where y is HEI, h is a vector of demographic characteristics, and t is time. Thus, all individuals belong to the joint distribution F(y,h,t). The static joint distribution of HEI and demographics in time t is F(y,h|t). The density of HEI at any point in time f_t(y) can be written as the integral of the HEI density conditional on a set of demographics f(y|h,t_y) at a specific date t_y, over the distribution of demographics F(h|t_h) at date t_h:

$f_t(y) = \int_{h \in \Omega_h} f(y \mid h, t_y = t)\, dF(h \mid t_h = t) \equiv f(y;\, t_y = t,\, t_h = t)$ (14)

where Ω_h is the domain of individual demographics. Therefore, our question posed earlier can be written with the above notation as the density of HEI scores in 1989–91 had the 2005–08 demographic landscape prevailed: f(y;t_y=89,t_h=08). This density is written as

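$f(y;\, t_y = 89,\, t_h = 08) = \int_{h \in \Omega_h} f(y \mid h, t_y = 89)\, dF(h \mid t_h = 08) = \int_{h \in \Omega_h} f(y \mid h, t_y = 89)\, \psi(h)\, dF(h \mid t_h = 89)$ (15)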

where ψ(h) is a reweighting function defined as ψ(h)=dF(h|t_h=08)/dF(h|t_h=89). Applying Bayes's rule to the function, we can rewrite ψ(h) as

$\psi(h) = \frac{\Pr(t_h = 08 \mid h)}{\Pr(t_h = 89 \mid h)} \cdot \frac{\Pr(t_h = 89)}{\Pr(t_h = 08)}$ (16)

To obtain an estimate $\hat{\psi}(h)$, notice the conditional probabilities Pr(t_h=t|h) can be estimated using a probit model by pooling the data and estimating the probability an individual is observed in time t conditional on a set of characteristics. Because we only compare two dates, the unconditional probabilities Pr(t_h=t) are simply the weighted sums of individuals in period t_h over the weighted sums of individuals in both periods. Because we are interested in applying the above methodology to tests of stochastic dominance, we replace an individual's sampling weight θ_i with $\hat{\psi}(h_i)\theta_i$ in equation (2).
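
The reweighting step can be illustrated with a minimal Python sketch. Note the substitution: rather than the probit model described above, the sketch estimates ψ(h) directly as the ratio of weighted cell shares, which is feasible only when h takes a small number of discrete values. The function name, the cell labels, and the toy weights are assumptions made for illustration.

```python
from collections import defaultdict

def dfl_weights(base, target):
    """Reweight base-period sampling weights to match the target-period mix of h.

    base, target: lists of (cell, weight) pairs, where cell is a hashable
    label for the demographic characteristics h.
    Returns the new base-period weights psi(h_i) * theta_i.
    """
    def cell_shares(sample):
        total = sum(w for _, w in sample)
        acc = defaultdict(float)
        for h, w in sample:
            acc[h] += w
        return {h: s / total for h, s in acc.items()}

    p_base, p_target = cell_shares(base), cell_shares(target)
    # psi(h) = dF(h | target period) / dF(h | base period), estimated by cell shares.
    return [w * p_target.get(h, 0.0) / p_base[h] for h, w in base]

# Toy example: the population ages between the base and target periods.
base = [("young", 1.0), ("young", 1.0), ("old", 2.0)]
target = [("young", 1.0), ("old", 3.0)]
print(dfl_weights(base, target))
```

After reweighting, the base-period sample reproduces the target-period shares of each cell while every individual keeps the HEI score observed in the base period, which is what the counterfactual distribution requires.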

Although long-run demographic changes such as sex, age, and race/ethnicity are plausibly exogenous, the claim that education is uncorrelated with omitted factors that affect diet quality is less plausible. However, we are interested in how changes in the distribution of education affect changes in diet quality, rather than how education affects diet quality. In other words, the conditional independence assumption E[ε|h]=0 is unnecessary for our decompositional analysis. Rather, we only need the weaker assumption of ignorability (also called unconfoundedness or selection on observables) to compute the aggregate compositional effects of all demographics. Ignorability asserts that the correlation between education (or any variable in h) and the error term is the same in both periods.¹⁶

Because of the aggregate decompositional nature of the DiNardo, Fortin, and Lemieux methodology (as opposed to a Oaxaca-style decomposition), the reweighting function ψ(h) does not distinguish between individual variables in the vector h. In the interests of transparency, we construct the counterfactual distributions in two stages. First, we construct a counterfactual distribution accounting for purely demographic changes (sex, age, and race/ethnicity) and denote this reweighting function by ψ_h. We then construct a counterfactual distribution accounting for changes in demographics and changes in education levels, denoted by ψ_h,e.¹⁷ We investigate the effects of this ordering in the Robustness section.

Figure 5 decomposes the change in the distribution of HEI into four main parts: improvements attributed to reformulation (as shown in the previous section); additional improvements attributed to changes in demographics, with and without education; and, finally, the residual change. As noted above, 10.1% of total improvement can be attributed to changes in food composition. Here we find that roughly equal proportions of the total improvement in HEI scores can be attributed to changes in sex, age, and race/ethnicity (26.6%) and education (26.7%) over the twenty-year period.¹⁸ This leaves 36.6% of the improvement unexplained by reformulation and demographics (i.e., the residual improvement). The residual improvement encompasses many competing factors, such as changes in tastes, relative food prices, scientific discovery, and attitudes toward food in general.

As above, care must be taken in interpreting these results. One important limitation of the partial equilibrium nature of the counterfactual analysis is the implicit assumption that food choices in the counterfactual population would not affect the set of foods made available by food manufacturers. Although this assumption is economically unappealing, the exercise provides insight into the effects of changing demographics on diet quality via clear and tractable analytical techniques.

Counterfactuals by Income Group

The counterfactual analyses above suggest that an important part of the improvement in dietary quality can be attributed to changes in food composition and demographics. Given that improvements occurred at different rates for different parts of the HEI distribution for lower-income versus higher-income individuals, we now ask whether changes in food composition and demographics account for differing amounts of improvement by income group. Results are presented in figure 6.

Changes in food composition account for a substantially larger percentage of the dietary improvement for lower-income individuals (19.6%) compared with their higher-income counterparts (6.4%). This is consistent with the observation that low-income individuals eat more processed foods (Drewnowski and Barratt-Fornell 2004), where much of the reformulation is occurring. Changes in sex, age, and race/ethnicity account for a similar share of the improvement for low-income (25.3%) and higher-income individuals (26.8%). Changes in educational attainment account for half as large a share of the improvement for low-income individuals as for higher-income individuals (13.5% versus 27.0%). The remaining residual share of the twenty-year improvement is larger within the low-income population (41.6%) than within the higher-income population (39.8%). This suggests that further research into the determinants of diet quality of low-income individuals may be warranted.

Robustness

The order in which we construct counterfactual distributions using the DiNardo, Fortin, and Lemieux (1996) approach can influence the results. To investigate the robustness of our findings to ordering, we estimate the model using an alternative ordering for each of the three population groups of interest (total population, low-income, higher-income). Note that because reformulation is not estimated, but rather derived from data, it does not matter in which order it is considered. Furthermore, the total aggregate effect (ψ_h,e) remains the same as well. For example, in either case, all demographics account for 53.3% of the total improvement within the U.S. population.

Table 7 provides estimates for the original order as presented above, as well as an alternative ordering where we first consider educational attainment ψ_e and then use ψ_h,e as before. The result places bounds on the magnitude for each set of demographics. For example, the effect of education ranges between 15.6% and 26.7% for the total population, 5.0% and 13.5% for the low-income group, and 16.0% and 27.0% for the higher-income group. Although point estimates change, relative comparisons remain substantively the same: changes in education appear to account for a larger share of the improvement for the higher-income group relative to the lower-income group. We note that the bounds are relatively large, and credibly point-identifying each effect remains a task for future work.

Table 7. DiNardo, Fortin, and Lemieux (1996) Counterfactual Improvements

Order	U.S. population	Low income	Higher income
Original order
1. Reformulation	10.1	19.6	6.4
2. Sex, age, race/ethnicity	26.6	25.3	26.8
3. Education	26.7	13.5	27.0
Total: reformulation & demographics	63.4	58.4	60.2
Alternative order
1. Reformulation	10.1	19.6	6.4
2. Education	15.6	5.0	16.0
3. Sex, age, race/ethnicity	37.7	33.7	37.7
Total: reformulation & demographics	63.4	58.4	60.2

^a Note: Numbers represent the percentage of total improvement and may not sum accordingly due to rounding.
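
The bounds and residual shares discussed in the text follow from simple arithmetic on table 7, which the Python sketch below reproduces. The figures are copied from the table; the dictionary layout and key names are illustrative assumptions.

```python
# Shares of total improvement (percent), copied from table 7.
table7 = {
    "U.S.":          {"reform": 10.1, "demo_first": 26.6, "educ_second": 26.7,
                      "educ_first": 15.6, "demo_second": 37.7},
    "Low income":    {"reform": 19.6, "demo_first": 25.3, "educ_second": 13.5,
                      "educ_first": 5.0, "demo_second": 33.7},
    "Higher income": {"reform": 6.4, "demo_first": 26.8, "educ_second": 27.0,
                      "educ_first": 16.0, "demo_second": 37.7},
}

def summarize(row):
    """Bounds implied by the two orderings, plus the residual under the original order."""
    explained = row["reform"] + row["demo_first"] + row["educ_second"]
    return {
        "education": (min(row["educ_first"], row["educ_second"]),
                      max(row["educ_first"], row["educ_second"])),
        "demographics": (min(row["demo_first"], row["demo_second"]),
                         max(row["demo_first"], row["demo_second"])),
        "residual": round(100 - explained, 1),
    }

summary = {group: summarize(row) for group, row in table7.items()}
for group, stats in summary.items():
    print(group, stats)
```

The residuals computed this way match the unexplained shares reported in the text for each group. The alternative-order columns may differ from the stated totals by 0.1 because the table entries are rounded.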

Discussion and Conclusion

Conventional wisdom maintains that the quality of the American diet has been deteriorating for at least the past two decades.¹⁹ In contrast, we document a previously unknown pattern of improvement in U.S. dietary quality. We find statistically significant improvements for all adults over the period 1989–2008, at all levels of dietary quality.

An important caveat is that the HEI measures diet quality on a per calorie basis and does not account for excess calorie consumption. To the best of our knowledge, few studies have examined the quantity–quality isoquant of food in health production, and those that have generally do so within the context of specific foods in an experimental framework. In a series of dietary intervention experiments, Epstein et al. (2001, 2008) found that increasing healthy food consumption reduced obesity to a greater degree than reducing unhealthy food consumption. Moreover, in Epstein et al. (2008) individuals in the increase-healthy-food group showed no relapse in weight gain in a two-year follow-up. The implication is that a shift toward a healthier diet could have additional positive impacts on health outcomes driven by quantity, such as obesity. The mechanism is generally thought to be a higher level of satiation, which in turn leads to a reduction in overall calories consumed.

Although we find that higher-income individuals consistently have higher dietary quality than low-income individuals, we also find some evidence that the gap is shrinking over the sample period. An important caution is that the diets of low-income individuals in the lowest portion of the diet quality distribution continue to lag.

We also show that most of the improvement in dietary quality can be attributed to changes in food formulation and changes in demographics. Moreover, we find that changes in food formulation help explain considerably more of the improvement in dietary quality for low-income individuals than for higher-income individuals. These findings suggest that the direct and indirect effects of policy on food composition may represent understudied policy levers.

How large are these results? In a prospective study that roughly covers our sample period, Chiuve et al. (2012) found significantly lower risks of major chronic diseases across the entire distribution of HEI scores for both women (over the period 1984–2008) and men (over the period 1986–2008) who were free of chronic disease at baseline. For example, those in the second quintile were 7% less likely to report a chronic disease than those in the lowest quintile, all else equal. One way to assess the magnitude of changes in HEI over time is to see how many individuals move from low to moderate levels of dietary quality over the period under study. In the period 1989–91, the twentieth percentile of the HEI distribution was 37.3. In 2005–08, a HEI value of 37.3 represented the 15.4th percentile of the HEI distribution. In other words, 4.6% of individuals moved out of this higher-risk category between 1989 and 2008 because of improvements in diet quality.
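
The percentile comparison above amounts to evaluating the later-period CDF at the earlier-period cutoff: 20.0 − 15.4 = 4.6 percentage points. A minimal Python sketch of the same calculation on weighted samples follows; the function names and the toy data are assumptions, and the quantile rule shown is one simple convention among several.

```python
def weighted_quantile(sample, q):
    """Smallest value whose cumulative weighted share reaches q; sample is (values, weights)."""
    values, weights = sample
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    running = 0.0
    for v, w in pairs:
        running += w
        if running / total >= q:
            return v
    return pairs[-1][0]

def share_at_or_below(sample, y):
    """Weighted share of observations with value <= y."""
    values, weights = sample
    return sum(w for v, w in zip(values, weights) if v <= y) / sum(weights)

def moved_out(early, late, q=0.20):
    """Cutoff at the early-period q-quantile and the percentage-point drop in the share below it."""
    cutoff = weighted_quantile(early, q)
    drop = 100 * (share_at_or_below(early, cutoff) - share_at_or_below(late, cutoff))
    return cutoff, drop

# Toy samples (hypothetical scores, equal weights): the later sample is shifted up by one.
early = (list(range(1, 11)), [1.0] * 10)
late = (list(range(2, 12)), [1.0] * 10)
print(moved_out(early, late))
```

With the article's figures, the cutoff would be a HEI of 37.3 and the drop would be the 4.6 percentage points reported above.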

Findings of a small but statistically significant increase in dietary quality should not overshadow the fact that there is still considerable room for improvement. Moreover, an important residual share of the change in dietary quality over the period remains unexplained, especially in the tails of the distributions. Because of the sheer number of overlapping and time-varying policy initiatives—particularly those that target the poor—credibly identifying effects of specific policies remains a challenging task for future work.

Supplementary material

Supplementary material is available at http://oxfordjournals.org/our_journals/ajae/online.

1 The Food and Nutrition Act of 2008 specifically aims “to provide for improved levels of nutrition among low-income households.”

2 For a comprehensive review of dietary indices see Kant (1996) and Kourlaba and Panagiotakos (2009).

3 Duclos, Sahn, and Younger (2006) provide a thorough discussion of multidimensional orderings using stochastic dominance. Alkire and Foster (2011) propose an alternative “counting” method that enables one to examine many dimensions with the caveat of having to choose a threshold a priori.

4 For all surveys but the 2001–02 NHANES, a second day of dietary intake was obtained. In keeping with standard practice, we analyze the first day of intake. One alternative is to average day 1 and 2 intakes where available. Another approach is to estimate models of usual intake (see Dodd et al. 2006). Assuming that measurement bias and within-person variation, if present, are consistent across survey waves, our results are invariant to usual intake methods. As shown in the online supplementary appendix, results are robust to using two days of intake.

5 The online supplementary appendix contains a description of how to map the MPED for the 1994–96 CSFII to the 1989–91 CSFII in greater detail.

6 Federal Poverty Guidelines are updated each year to reflect changes in the Consumer Price Index for urban consumers (CPI-U) and are a function of household income and size (U.S. Department of Health and Human Services 2013).

7 There are various ways to calculate the HEI score for a population of interest (see Freedman et al. 2008, 2010 for in-depth discussions). Because we are interested in the number and depth of individuals below a particular HEI score, we use the mean score of individuals instead of an alternative measure score of the population ratio. The mean score is computed by calculating each individual's HEI score and then averaging over the population, whereas the score of the population ratio is calculated as the population's total component intake over total calorie intake and then calculating each score from this population ratio.

8 Stochastic dominance approaches have also been used to study changes in body mass index (Madden 2012) and environmental quality (Maasoumi and Millimet 2005) and extended to qualitative health measures (Allison and Foster 2004).

9 The independence assumption seems reasonable given that our data are repeated cross-sections in which sampling units are independently drawn in each survey (see Bhattacharya 2007 for a more detailed discussion) and that some surveys were separated by nearly 20 years in time. In the Counterfactual Analysis section, we relax this assumption to ignorability (also called conditional independence or unconfoundedness) to construct aggregate counterfactual decompositions.

10 Our samples are constructed using multistage stratification where each stratum is clustered by two primary sampling units. Test statistics based on a simple random bootstrap sample drawn with replacement would be biased and inconsistent. Under the CSFII and NHANES survey design, Rao, Wu, and Yue (1992) show that bootstrap replicate weights can be obtained by randomly picking one primary sampling unit within each stratum and internally rescaling the sample weights. We use the user-written Stata package bsweights (Kolenikov 2010) to automate the rescaling process to create B=1,000 balanced replicate weights $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0086$ for each sample individual. These weights are used in equation (2) to create the bootstrap distribution of $urn:x-wiley:00029092:media:ajaeajaeaat104:ajaeajaeaat104-math-0087$ .

11 The G(⋅) functions were taken from an earlier version of the two-stage test (Bennett 2010). G₁(b) is derived from the Kolmogorov–Smirnov distribution. We present G₂(a) as a numerical approximation to the Kolmogorov–Smirnov cumulative distribution. As referenced in Bennett (2013), Billingsley (p. 85) shows the closed-form expression of G(a,b).

12 As pointed out by a referee, the health-education gradient is also of considerable interest. Although our main focus here is on income, we present dominance results by education in the online supplementary appendix for interested readers.

13 As a sensitivity check, we also considered poverty thresholds of 75% to 250% of the Federal Poverty Guidelines in 25% increments. In all cases, both the low- and higher-income group, as defined by the various thresholds, exhibited a first-order dietary improvement over the 20-year period at less than 5% significance levels.

14 Quartile estimates for the U.S., low-, and higher-income populations when samples are pooled across the 20-year period reveal cutoffs of (40.4, 50.7, 61.6), (39.0, 48.8, 60.0), and (40.9, 51.4, 62.2), respectively.

15 Educational attainment is missing for 121 individuals in the 1989–91 CSFII (61 low-income and 60 higher-income) and one higher-income person in 2005–08. These individuals are dropped from all counterfactual analyses. The preceding analysis is robust to their exclusion.

16 If we believe this assumption does not hold, then we can sign the bias. For example, if we believe that more highly educated individuals use their stock of knowledge more efficiently in 2005–08 than in 1989–91, then we have a positive bias. However, there is no a priori evidence to suggest a change in the correlation of education and the error term, let alone as to its direction.

17 The conditional probability model includes a dummy for sex, 16 cells of race/ethnicity fully interacted with age dummies, and three education dummies, all as described in table 6. Results of the model are available in the online supplementary appendix.
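
To illustrate how a conditional probability model translates into counterfactual weights, the sketch below computes reweighting factors in the spirit of DiNardo, Fortin, and Lemieux (1996) for the special case of a saturated model, where each observation belongs to one demographic cell. This is a simplified stand-in under stated assumptions, not the model estimated here: it ignores survey weights, treats covariates as a single cell label, and the choice of which period is the base and which is the target is hypothetical.

```python
from collections import Counter


def dfl_cell_weights(base_cells, target_cells):
    """Reweighting factor for each base-period observation.

    With a saturated (cell-frequency) probability model, the factor
    [P(target | x) / P(base | x)] * [P(base) / P(target)] reduces to
    (share of cell x in the target sample) / (share of cell x in the base sample).
    """
    base_counts = Counter(base_cells)
    target_counts = Counter(target_cells)
    n_base, n_target = len(base_cells), len(target_cells)
    return [
        (target_counts[c] / n_target) / (base_counts[c] / n_base)
        for c in base_cells
    ]
```

Applying these factors to the base-period observations makes their cell shares match those of the target period, so any remaining difference in the outcome distribution is not attributable to the demographic mix captured by the cells.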

18 See the online supplementary appendix for dominance results between each counterfactual distribution and the observed 2005–08 distribution.

19 See, for example, Gregory, Smith, and Wendt (2011).

References

Alkire, S., and J. E. Foster. 2011. Counting and Multidimensional Poverty Measurement. Journal of Public Economics 95 (7): 476–87. doi:10.1016/j.jpubeco.2010.11.006
Allison, R. A., and J. E. Foster. 2004. Measuring Health Inequality using Qualitative Data. Journal of Health Economics 23 (3): 505–24. doi:10.1016/j.jhealeco.2003.10.006
Anderson, G. 1996. Nonparametric Tests for Stochastic Dominance in Income Distributions. Econometrica 64 (5): 1183–93. doi:10.2307/2171961
Barrett, G. F., and S. G. Donald. 2003. Consistent Tests for Stochastic Dominance. Econometrica 71 (1): 71–104. doi:10.1111/1468-0262.00390
Bennett, C. 2010. On Bidirectional Tests for Stochastic Dominance.
Bennett, C. 2013. Inference for Dominance Relations. International Economic Review 54 (4): 1309–1328. doi:10.1111/iere.12038
Bhattacharya, D. 2007. Inference on Inequality from Household Survey Data. Journal of Econometrics 137 (2): 674–707. doi:10.1016/j.jeconom.2005.09.003
Bhattacharya, J., J. Currie, and S. Haider. 2004. Poverty, Food Insecurity and Nutritional Outcomes in Children and Adults. Journal of Health Economics 23 (4): 839–62. doi:10.1016/j.jhealeco.2003.12.008
Bhattacharya, J., J. Currie, and S. Haider. 2006. Breakfast of Champions?: The School Breakfast Program and the Nutrition of Children and Families. Journal of Human Resources 41 (3): 445–66. doi:10.3368/jhr.XLI.3.445
Bhattacharya, J., T. Deleire, S. Haider, and J. Currie. 2003. Heat or Eat?: Cold-Weather Shocks and Nutrition in Poor American Families. American Journal of Public Health 93 (7): 1149–54. doi:10.2105/AJPH.93.7.1149
Billingsley, P. 1968. Convergence of Probability Measures. New York: John Wiley & Sons.
Bishop, J. A., J. P. Formby, and P. D. Thistle. Statistical Inference, Income Distributions, and Social Welfare. In Research on Economic Inequality, ed. D. J. Slottje, 49–82. Greenwich, CT: JAI Press.
Bitler, M., and S. J. Haider. 2011. An Economic View of Food Deserts in the United States. Journal of Policy Analysis and Management 30 (1): 153–76. doi:10.1002/pam.20550
Bosire, C., M. J. Stampfer, A. F. Subar, Y. Park, S. I. Kirkpatrick, S. E. Chiuve, A. R. Hollenbeck, and J. Reedy. 2013. Index-based Dietary Patterns and the Risk of Prostate Cancer in the NIH-AARP Diet and Health Study. American Journal of Epidemiology 177 (6): 504–13. doi:10.1093/aje/kws261
Chiuve, S. E., T. T. Fung, E. B. Rimm, F. B. Hu, M. L. McCullough, M. Wang, M. J. Stampfer, and W. C. Willett. 2012. Alternative Dietary Indices Both Strongly Predict Risk of Chronic Disease. Journal of Nutrition 142 (6): 1009–18. doi:10.3945/jn.111.157222
Dardanoni, V., and A. Forcina. 1999. Inference for Lorenz Curve Orderings. Econometrics Journal 2 (1): 49–75. doi:10.1111/1368-423X.00020
Davidson, R., and J. Y. Duclos. 2000. Statistical Inference for Stochastic Dominance and for the Measurement of Poverty and Inequality. Econometrica 68 (6): 1435–64. doi:10.1111/1468-0262.00167
Deaton, A. 1997. The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Baltimore: Johns Hopkins University Press. doi:10.1596/0-8018-5254-4
DiNardo, J. A., N. M. Fortin, and T. Lemieux. 1996. Labor Market Institutions and the Distribution of Wages, 1973–1992: A Semiparametric Approach. Econometrica 64 (5): 1001–44. doi:10.2307/2171954
Dodd, K. W., P. M. Guenther, L. S. Freedman, A. F. Subar, V. Kipnis, D. Midthune, J. A. Tooze, and S. M. Krebs-Smith. 2006. Statistical Methods for Estimating Usual Intake of Nutrients and Foods: A Review of the Theory. Journal of the American Dietetic Association 106 (10): 1640–50. doi:10.1016/j.jada.2006.07.011
Drewnowski, A., and A. Barratt-Fornell. 2004. Do Healthier Diets Cost More? Nutrition Today 39 (4): 161–168. doi:10.1097/00017285-200407000-00006
Duclos, J. Y., D. E. Sahn, and S. D. Younger. 2006. Robust Multidimensional Poverty Comparisons. Economic Journal 116 (514): 943–68. doi:10.1111/j.1468-0297.2006.01118.x
Epstein, L. H., C. C. Gordy, H. A. Raynor, M. Beddome, C. K. Kilanowski, and R. Paluch. 2001. Increasing Fruit and Vegetable Intake and Decreasing Fat and Sugar Intake in Families at Risk for Childhood Obesity. Obesity Research 9 (3): 171–78. doi:10.1038/oby.2001.18
Epstein, L. H., R. A. Paluch, M. D. Beecher, and J. N. Roemmich. 2008. Increasing Healthy Eating vs. Reducing High Energy-dense Foods To Treat Pediatric Obesity. Obesity 16 (2): 318–26. doi:10.1038/oby.2007.61
Food and Nutrition Act of 2008. 7 U.S.C. Section 2011. 2008.
Ford, E. S., G. Zhao, J. Tsai, and C. Li. 2011. Low-Risk Lifestyle Behaviors and All-Cause Mortality: Findings from the National Health and Nutrition Examination III Mortality Study. American Journal of Public Health 101 (10): 1922–29. doi:10.2105/AJPH.2011.300167
Freedman, L. S., P. M. Guenther, S. M. Krebs-Smith, K. W. Dodd, and D. Midthune. 2010. A Population's Distribution of Healthy Eating Index-2005 Component Scores can be Best Estimated When More Than One 24-Hour Recall is Available. Journal of Nutrition 140 (8): 1529–34. doi:10.3945/jn.110.124594
Freedman, L. S., P. M. Guenther, S. M. Krebs-Smith, and P. S. Kott. 2008. A Population's Mean Healthy Eating Index-2005 Scores are Best Estimated by the Score of the Population Ratio when One 24-Hour Recall is Available. Journal of Nutrition 138 (9): 1725–9. doi:10.1093/jn/138.9.1725
Gastwirth, J. L., and T. K. Nayak. 1999. Comments on Tests of Significance for Lorenz Partial Orders by J. A. Bishop and J. P. Formby. In Handbook of Income Inequality Measurement, ed. J. Silber, 336–9. Amsterdam: Springer.
Gregory, C., T. A. Smith, and M. Wendt. 2011. How Americans Rate their Diet Quality: An Increasingly Realistic Perspective.
Guenther, P. M., J. Reedy, and S. M. Krebs-Smith. 2008. Development of the Healthy Eating Index-2005. Journal of the American Dietetic Association 108 (11): 1896–1901. doi:10.1016/j.jada.2008.08.016
Guenther, P. M., J. Reedy, S. M. Krebs-Smith, and B. B. Reeve. 2008. Evaluation of the Healthy Eating Index-2005. Journal of the American Dietetic Association 108 (11): 1854–64. doi:10.1016/j.jada.2008.08.011
Guenther, P. M., J. Reedy, S. M. Krebs-Smith, B. B. Reeve, and P. P. Basiotis. 2007. Development and Evaluation of the Healthy Eating Index-2005: Technical Report.
Guo, X., B. A. Warden, S. Paeratakul, and G. A. Bray. 2004. Healthy Eating Index and Obesity. European Journal of Clinical Nutrition 58 (12): 1580–6. doi:10.1038/sj.ejcn.1601989
Hall, K. D., S. B. Heymsfield, J. W. Kemnitz, S. Klein, D. A. Schoeller, and J. R. Speakman. 2012. Energy Balance and its Components: Implications for Body Weight Regulation. American Journal of Clinical Nutrition 95 (4): 989–994. doi:10.3945/ajcn.112.036350
Jemal, A., R. Siegel, E. Ward, Y. Hao, J. Xu, T. Murray, and M. J. Thun. 2008. Cancer Statistics, 2008. CA: A Cancer Journal for Clinicians 58 (2): 71–96.
Kant, A. 1996. Indexes of Overall Diet Quality: A Review. Journal of the American Dietetic Association 96 (8): 785–91. doi:10.1016/S0002-8223(96)00217-9
Kim, S. Y., R. M. Nayga, and O. Capps Jr. 2001. Food Label Use, Self-selectivity, and Diet Quality. Journal of Consumer Affairs 35 (2): 346–63. doi:10.1111/j.1745-6606.2001.tb00118.x
Kolenikov, S. 2010. Resampling Variance Estimation for Complex Survey Data. Stata Journal 10 (2): 165–99. doi:10.1177/1536867X1001000201
Kourlaba, G., and D. B. Panagiotakos. 2009. Dietary Quality Indices and Human Health: A Review. Maturitas 62 (1): 1–8. doi:10.1016/j.maturitas.2008.11.021
Kramer-LeBlanc, C. S., P. P. Basiotis, and E. T. Kennedy. 1997. Maintaining Food and Nutrition Security in the United States with Welfare Reform. American Journal of Agricultural Economics 79 (5): 1600–1607. doi:10.2307/1244388
Linton, O., E. Maasoumi, and Y. J. Whang. 2005. Consistent Testing for Stochastic Dominance under General Sampling Schemes. Review of Economic Studies 72 (3): 735–765. doi:10.1111/j.1467-937X.2005.00350.x
Maasoumi, E., and D. L. Millimet. 2005. Robust Inference Concerning Recent Trends in Environmental Quality. Journal of Applied Econometrics 20 (1): 55–77. doi:10.1002/jae.759
Madden, D. 2012. A Profile of Obesity in Ireland: 2002–2007. Journal of the Royal Statistical Society: Series A 175 (4): 893–914. doi:10.1111/j.1467-985X.2011.01020.x
McFadden, D. 1989. Testing for Stochastic Dominance. In Studies in the Economics of Uncertainty: In Honor of Josef Hadar, ed. T. B. Fomby and T. K. Seo, 113–134. New York: Springer. doi:10.1007/978-1-4613-8922-4_7
Nicklas, T. A., C. E. O'Neil, and V. L. Fulgoni III. 2012. Diet Quality Is Inversely Related to Cardiovascular Risk Factors in Adults. Journal of Nutrition 142 (12): 2112–8. doi:10.3945/jn.112.164889
Popkin, B. M., A. M. Siega-Riz, and P. S. Haines. 1996. A Comparison of Dietary Trends among Racial and Socioeconomic Groups in the United States. New England Journal of Medicine 335 (10): 715–20. doi:10.1056/NEJM199609053351006
Rao, J. N. K., C. F. J. Wu, and K. Yue. 1992. Some Recent Work on Resampling Methods for Complex Surveys. Survey Methodology 18 (2): 209–17.
Ravallion, M. 1996. Issues in Measuring and Modelling Poverty. Economic Journal 106 (438): 1328–43. doi:10.2307/2235525
Reedy, J., P. N. Mitrou, S. M. Krebs-Smith, E. Wirfält, A. Flood, V. Kipnis, M. Leitzmann, T. Mouw, A. Hollenbeck, A. Schatzkin, and A. F. Subar. 2008. Index-based Dietary Patterns and Risk of Colorectal Cancer: The NIH-AARP Diet and Health Study. American Journal of Epidemiology 168 (1): 38–48. doi:10.1093/aje/kwn097
Shahril, M. R., S. Sulaiman, S. H. Shaharudin, and S. N. Akmal. 2013. Healthy Eating Index and Breast Cancer Risk among Malaysian Women. European Journal of Cancer Prevention 22 (4): 342–7. doi:10.1097/CEJ.0b013e32835b37f9
Strauss, J., and D. Thomas. 1998. Health, Nutrition, and Economic Development. Journal of Economic Literature 36 (2): 766–817.
U.S. Department of Health and Human Services. 2013. Poverty Guidelines, Research, and Measurement.
Vesper, H. W., H. C. Kuiper, L. B. Mirel, C. L. Johnson, and J. L. Pirkle. 2012. Levels of Plasma Trans-Fatty Acids in Non-Hispanic White Adults in the United States in 2000 and 2009. Journal of the American Medical Association 307 (6): 562–3. doi:10.1001/jama.2012.112
Wilde, P. E., P. E. McNamara, and C. K. Ranney. 1999. The Effect of Income and Food Programs on Dietary Quality: A Seemingly Unrelated Regression Analysis with Error Components. American Journal of Agricultural Economics 81 (4): 959–71. doi:10.2307/1244338

Volume 96, Issue 3, April 2014, Pages 769–789

Is Diet Quality Improving? Distributional Changes in the United States, 1989–2008
