Visualization and interpretation of birth defects data using linked micromap plots†
Presented at the Urban and Regional Information Systems Association's (URISA) GIS in Public Health Conference, May 20–23, 2007, New Orleans, Louisiana.
Abstract
BACKGROUND: Many states have implemented birth defects surveillance systems to monitor and disseminate information regarding birth defects. However, many of these states rely on tabular methods to disseminate statistical birth defects summaries. An innovative presentation technique for birth defect data that portrays the information in a joint geographical and statistical context is the linked micromap (LM) plot. METHODS: LM plots were generated for oral cleft data at two geographical resolutions: US states and counties of Utah. The LM plots also included demographic and behavioral risk data. RESULTS: A LM plot for the US reveals spatial patterns indicating higher oral cleft occurrence in the southwest and the midwest and lower occurrence in the east. The LM plot also indicates relationships between oral cleft occurrence and maternal smoking rates and the proportion of American Indians and Alaskan Natives. In particular, the five states with the highest oral cleft occurrence had a higher proportion of American Indians and Alaskan Natives. Among the 15 states with the highest oral cleft occurrence, nine had a smoking rate of 16% or higher, while among the 15 states with the lowest oral cleft occurrence only one state had a smoking rate greater than 16%. The LM plot for the state of Utah shows no clear geographic pattern, due perhaps to a relatively small number of cases in a limited geographic area. CONCLUSIONS: LM plots are effective in representing complex, large-volume birth defects data. Integrating them into birth defects surveillance systems will improve both presentation and interpretation. Birth Defects Research (Part A), 2008. © 2007 Wiley-Liss, Inc.
INTRODUCTION
Birth defects are one of the leading causes of infant mortality and childhood morbidity in the US, where they account for 21% of all infant mortality (CDC, 1998). Most birth defects result in a range of disabilities for which the economic cost of medical treatment is substantial: according to Waitzman et al. (1994), the estimated lifetime cost of care for US children born with the 18 most common birth defects exceeds $8 billion per year. In addition to the economic effects, children who are born with such birth defects often experience long-lasting psychological and physical burdens. For many years, there has been a continuous and concentrated effort to monitor birth defects through data collection and surveillance systems, with the ultimate goal of establishing prevention strategies. As a result, many states have developed monitoring systems that collect data on the occurrence of birth defects along with other crucial information; the underlying objectives are to catalogue and disseminate information regarding the prevalence of birth defects (Sever, 2004). These data are essential for monitoring the occurrence of birth defects and their trends. Moreover, historical characteristics of the data are particularly useful in planning public health services, implementing prevention strategies such as allocating finite resources to the most affected areas, and improving health care access for affected children and families. Furthermore, these data are essential in a scientific sense, as they are often used to generate hypotheses about the risk factors that may be associated with birth defects.
As a component of public health surveillance, states collect data on 45 major birth defects and related information (National Birth Defects Prevention Network [NBDPN], 2005). In addition, these states and many US public health agencies (e.g., CDC and NBDPN) play an important role in making birth defects data accessible to the public through differing media. However, they tend to depend on tables to disseminate the birth defects information. For example, in its role to inform the public, the NBDPN published birth defects data for the period of 1998–2002 (NBDPN, 2005). The published report contains estimates for each birth defect by state, by race/ethnicity, and, for some birth defects, by age of mother. All of these estimates are presented in tables whose rows and columns run through many pages.
Publishing large statistical datasets in tabular form is an important way of managing data but is not particularly informative from an interpretative standpoint. It may be difficult and frustrating for a reader to observe trends, relationships, and anomalies that may be present in the data. A user is forced to scan through many pages of tables and to try to build a visual picture that permits an integrative understanding of the numbers, for example, which state has the maximum number of cases in a particular year. Admittedly, tabular data are especially useful to researchers who are interested in using the raw data to conduct research; however, researchers likewise need bulk tabular data converted into a visual framework, both to understand the structure of the data and to facilitate its analysis. Furthermore, there is value in reporting to the public in an informative way while at the same time presenting data so that policy makers can make informed and timely decisions. These circumstances suggest that converting tabular data into a visual and ordered context can reveal patterns and relationships that would otherwise be elusive and, in the most practical sense, can be an efficient vehicle for disseminating information to the public and decision makers.
Visualization techniques offer a set of tools that can be used to simplify large and complex datasets into more comprehensible forms. They offer the ability to transform large public health datasets such as birth defects data into a more meaningful representation of the underlying epidemiological information in a revealing way without overwhelming the reader. Using visualization, trends, relationships, and anomalies that were not at first obvious in the tables can be revealed quickly. Moreover, visualization increases the effectiveness of communicating information to the public and further enables users to do a critical evaluation of the data while at the same time likely reducing errors in its interpretation but maintaining consistency.
Over many years, many visualization tools have been developed to convert tabular information into visual graphs or plots (e.g., Carr, 1994), but a fairly recent development in the field highlights the use of linked micromap (LM) plots (Carr and Pierson, 1996) as a way of displaying geographically indexed data. LM plots use multiple small maps (called micromaps) to visualize complex data structures in a geographical context. LM plots have already been used in many fields, including environmental science (Carr and Pierson, 1996; Carr et al., 1998; Symanzik et al., 1999), ecology (Carr et al., 2000a), epidemiology (Carr et al., 2000b; Symanzik et al., 2003), and federal statistical summaries (Hurst et al., 2003). However, LM plots have not been specifically applied to birth defects data. The purpose of this article is to highlight and examine the use of LM plots in presenting geographically indexed birth defects data. Specifically, it will demonstrate the use of LM plots to graphically represent statistical summaries and their associated uncertainties for oral cleft occurrences (oral clefts are defined as cleft lip and cleft palate birth defects, and the occurrence of oral clefts reported here is prevalence at birth). This is done at two geographical resolutions: (1) at the state level for the US, and (2) at the county level for the state of Utah. Furthermore, LM plots are used to graphically relate oral cleft occurrence estimates to associated demographic and behavioral data collected at the same geographical resolutions.
A final important point is that of ensuring confidentiality. All the data used in the construction of the LM plots were aggregated values and so an individual's information is kept strictly confidential. In fact, LM plots are not designed to show specific data at a particular location but more to group information into manageable units such as a statistical summary that by its very nature removes the individual from the picture.
MATERIALS AND METHODS
Data Sources, Breakdown, and Aggregation
Birth defects and other variables of interest (including data on demographics and behavioral risk factors) were obtained from various sources. National data for oral cleft occurrence and livebirths for the period of 1998–2002 were obtained from the NBDPN (2005) as issued in Birth Defects Research (Part A). Thirty-five states participated in reporting up to 45 major birth defects and, of these, 31 states contributed oral cleft occurrence. The relevant data for oral cleft occurrence were compiled for each state. Next, the occurrence of oral clefts in each state was computed per 10,000 livebirths (NBDPN, 2005) for the same period.
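The occurrence computation described above is a simple ratio scaled to 10,000 livebirths. The following minimal sketch illustrates it, together with the log10 transformation on which the statistical panels are later reported. The function name and the case and livebirth counts are hypothetical and chosen only so that the result matches the reported national average of 17.7 per 10,000.

```python
import math


def prevalence_per_10000(cases: int, livebirths: int) -> float:
    """Birth prevalence expressed as cases per 10,000 livebirths."""
    if livebirths <= 0:
        raise ValueError("livebirths must be positive")
    return 10_000 * cases / livebirths


# Hypothetical counts (not taken from the NBDPN report):
# 177 oral cleft cases among 100,000 livebirths.
rate = prevalence_per_10000(177, 100_000)

# The same value on the log10 scale used by the LM plot panels.
log_rate = math.log10(rate)
```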
Oral cleft occurrence for the state of Utah was obtained from a case-control study of oral cleft occurrence undertaken by the Center for Epidemiologic Studies, Utah State University, that covered the period of 1995–2004. The cases used in the study were originally obtained from the Utah Birth Defect Network, a statewide surveillance program that monitors and detects birth defects in Utah. All cases had street address information of the mother's residence at the time of birth. The street address information was transformed (geocoded) to map coordinates and then aggregated to the county level. The live births at the county level for the same period (1995–2004) were obtained from US census data (http://quickfacts.census.gov/qfd/index.html) after which the oral cleft occurrence in each county was computed per 10,000 livebirths for the period of 1995–2004.
Details of the geocoding process are as follows. Case addresses were geocoded using the ArcView geocoding utility and Dynamap/2000 (Version 14.3). Street File Network information for the state of Utah was obtained from Geographic Data Technology, Inc. (GDT, 2004). Of the total cases, 96.6% were geocoded automatically or interactively with minor editing for spelling, street aliases, and acronyms. Certain addresses (0.5%) were unmatched and were geocoded manually with the assistance of internet mapping services such as Yahoo Maps, MapQuest, and Google Maps. A number of the cases (2.7%) did not have a geocodable address and were geocoded to either the city or the zip code centroid. The geographic centroids were obtained from a 2004 Municipalities shapefile or a zip code shapefile available from the Utah Automated Geographic Reference Center (AGRC, 2006). The remaining cases (0.2%) were excluded from the analysis because no address could be resolved or the location lay outside the state of Utah.
Second, prenatal smoking is underreported on birth certificates. Underreporting might be related to the wording of the smoking question, the timing of the data collection (e.g., during prenatal care versus after the live birth), and the stigma associated with smoking during pregnancy, particularly in cases of poor birth outcome. However, despite underreporting, the trends and variations in smoking derived from birth certificate data have been confirmed with data from other sources (e.g., National Survey of Family Growth and Pregnancy Risk Assessment Monitoring System). (p. 913)
In addition, demographic factors, that is, race and ethnicity, are also understood to be risk factors in oral cleft occurrence. For example, the risk is particularly high in the American Indian and Alaskan Native (AIAN) population (Coddington and Hisnanick, 1996). Therefore, data on the proportion of AIAN in the population for the year 2000 were obtained from the U.S. Census Bureau (2000), accessible at http://www.census.gov/prod/2002pubs/c2kbr01-15.pdf.
Visualization Technique
The graphical visualization technique presented in this article is referred to as LM plots. LM plots provide an alternative to traditional choropleth maps (for a comparative discussion of the relative merits of choropleth maps, see Symanzik and Carr, 2008) for displaying geographically indexed statistical summaries (e.g., oral cleft occurrence for each state or for counties within a state) in a corresponding spatial context (Carr and Pierson, 1996; Carr et al., 1998). LM plots combine an exploratory analysis capability with traditional statistical graphics while maintaining the geographical context.
Before LM plots are programmed and subsequently displayed (using the statistical software package S-plus or on the web), they require a generalized map to work from, that is, a smoothed or simplified boundary defining a geographical region. However, such boundaries (e.g., state or county) that exist as Geographic Information System (GIS) data layers often consist of many more vertices than are required for micromap depiction on the display. Therefore, it is necessary to remove redundant vertices from a polygon, but only to the point of maintaining the essential shape and neighborhood relationships of the polygons that comprise the micromap. A generalized map for the US is available online at ftp://galaxy.gmu.edu/pub/dcarr/newsletter/micromap/. To produce a generalized map for the state of Utah, a boundary shapefile was obtained from the AGRC (2006). The boundaries were then generalized using ArcGIS, a desktop GIS package. The generalization routine applied is based on the Douglas-Peucker line simplification algorithm (Douglas and Peucker, 1973). Finally, after generalizing the boundaries, LM plots for the US and the state of Utah were created using the S-plus statistical software package. The sample S-plus code for creating LM plots is also obtainable from Dan Carr's ftp site at ftp://galaxy.gmu.edu/pub/dcarr/newsletter/micromap/.
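To illustrate the kind of vertex reduction involved, the following is a minimal sketch of the Douglas-Peucker line simplification idea in plain Python. It is not the ArcGIS routine used for the maps in this article; the function names, the tolerance, and the sample polyline are hypothetical. The algorithm keeps the two endpoints of a line, finds the intermediate vertex farthest from the chord joining them, and recurses on both halves only if that distance exceeds the tolerance.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b.

    Falls back to the distance from p to a when a and b coincide,
    as happens for a closed ring whose first and last vertex are equal.
    """
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def douglas_peucker(points, tolerance):
    """Return a simplified copy of a polyline given as (x, y) tuples."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], first, last)
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= tolerance:
        # Every intermediate vertex is close enough to the chord: drop them.
        return [first, last]
    # Keep the farthest vertex and simplify each side separately.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical boundary fragment: the vertex (1, 0.05) is nearly redundant.
boundary = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
simplified = douglas_peucker(boundary, tolerance=0.1)
```

A larger tolerance removes more vertices. For micromaps, the tolerance would have to be chosen so that shapes and neighborhood relationships remain recognizable, which this sketch does not check.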
RESULTS
Template for LM Plots
A typical template of a LM plot consists of four key features (Carr and Pierson,1996). Figure 1 shows a hypothetical LM plot. The first feature is three or more sequence panels in parallel linked by location. In the hypothetical case, Figure 1 shows five parallel sequences of panels. The first (leftmost) sequence of panels is the micromap panel itself that typically contains small caricatures of map outlines of a region. The caricature map maintains the shape and neighborhood relationship while making the small subregions more visible. The second (from the left) sequence of panels is the label panel that provides the names of the geographical subregions (here, Region 1 through Region 10). The third through the fifth (from the left) sequence of panels display the statistical summaries. These panels may represent many forms of statistical summaries including box-plots, dot-plots (as shown in Fig. 1), time series plots, CIs, and so forth. Sorting the geographic subregions based on the statistical variable(s) of interest is the second feature. Sorting improves perception between consecutive panels from the top to the bottom of the display. The third feature is the partitioning of the regions into perceptual groups of size five or less to allow the viewer's attention to focus on explicit areas at a time. The fourth feature is color and location that links corresponding elements within the parallel sequence panels, that is, the color red in the topmost panels relates to the geographic subregion in the northeast of the map, the subregion name (Region 5), and a red dot in each of the three statistical panels. The color red is reused in the next consecutive set of panels for Region 2, but there is no relationship between Region 5 and Region 2 as one might at first assume. Simply, there do not exist enough distinguishable colors to populate an entire display (with, say, 50 different subregions) such that colors have to be reused in different panels.

Figure 1. Hypothetical LM plot illustrating the main features of such plots: the leftmost sequence of map panels, the second (from the left) sequence of label panels, and the third through the fifth (from the left) sequence of statistical panels.
In the hypothetical Figure 1, the rows are sorted by decreasing values with respect to the statistical panel 2. The statistical data displayed in the statistical panels 1 and 2 show a strong positive association (the correlation r calculated as 0.99), expressed in the almost parallel behavior of the dots and lines representing the values for these two variables. In contrast, the statistical data in panel 3 and 1 (as well as 3 and 2) show a strong negative association (the correlation r calculated as −0.94 for 3 and 1 and as −0.92 for 3 and 2). This negative association is seen in the movement of the dots and lines in opposite directions for these variables. Moreover, the data in panel 3 show an unusual outlier, the value for Region 1. It is this outlier that considerably reduces the almost perfect negative association otherwise present in this data. Just a simple numerical calculation of r might not be able to reveal the influence of a single subregion on the overall relationship.
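The influence of a single subregion on a correlation coefficient can be checked numerically by recomputing r with that subregion left out. The sketch below does this for two made-up variables. The values are hypothetical and are not the data behind Figure 1; they only mimic a strong negative association disturbed by one outlier.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for ten subregions; the last one is the outlier.
panel_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
panel_b = [10, 9, 8, 7, 6, 5, 4, 3, 2, 9]

r_all = pearson_r(panel_a, panel_b)
r_without_outlier = pearson_r(panel_a[:-1], panel_b[:-1])
```

With these made-up numbers, dropping the single outlying subregion moves r from a moderate to an almost perfect negative association, which is the effect the LM plot makes visible without any calculation.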
The map panels of the LM plot in Figure 1 exhibit a strong geographic pattern: highest occurrences with respect to the statistical panels 1 and 2 can be found in the north and in the east; lowest occurrences can be found in the west and in the south. Additional features of LM plots exist and are described in more detail in Symanzik and Carr (2008).
US Level LM Plots
Figure 2 shows the LM plot for the 31 US states that reported on oral cleft occurrence. The figure shows five vertical columns that are linked by geographic location. The first column shows the generalized outline of the US, within which the map caricatures of the states are drawn. In particular, Alaska and Hawaii are modified in size and shifted towards the 48 contiguous states. In addition, the island to the east of Virginia represents Washington, D.C., which would otherwise not be visible. Note that redundant details of a state's boundaries are left out; however, the essential part that defines the boundary shape and neighborhood relationships is preserved (other than for Washington, D.C.), while at the same time small states such as Rhode Island are magnified so that their assigned color is evident on the map. The second column shows the state names along with a dot in the linking color. The last three columns illustrate three statistical variables. In this particular example, dot-plots represent the three variables oral cleft occurrence, maternal smoking rate, and the AIAN proportion in the population. All the corresponding micromaps, labels, and statistical panels are linked through the same color designation. Note that five distinct colors are used to distinguish the states within a particular micromap frame.

Figure 2. LM plot showing oral cleft occurrence by state for the period of 1998–2002. Oral cleft occurrence was available for only 31 of the 50 US states, and only these are displayed here. Smoking rates for California were not available. The red lines show the national average (i.e., mean) of oral cleft occurrence (17.7 per 10,000), smoking rate (16%), and AIAN proportion (1.3%). Note that Texas had the median oral cleft occurrence among the 31 states for which data were available.
The data in Figure 2 are sorted by oral cleft occurrence from largest to smallest. The micromaps are further divided into two main blocks with Texas in the middle: Texas defines the median occurrence and is plotted (and identified) in black between those states that lie above and below this median. The data are further partitioned into six micromaps, each containing a group of five states. Such sorting (here descending) and breaking of a long list of states into smaller groups presents the data in discrete visual units and so draws the viewer's attention to a few subregions at a time. These LM plots provide a viewer with considerably more information than would otherwise have been provided by a series of tables or an overall map representation (e.g., a choropleth map) alone. Viewers can now easily navigate through the LM plot to a place of interest in order to review oral cleft occurrence and related statistics without having to leaf backward and forward through a collection of tables or, for that matter, a series of maps. Moreover, viewers can more easily compare the oral cleft occurrence of a particular state with a benchmark oral cleft occurrence or with other states. For example, it is immediately clear from the LM plot that Alaska (ranked 1st) exhibits a much higher oral cleft occurrence than Utah (ranked 2nd). The LM plot also reveals states that had oral cleft occurrence above, below, or equal to the median and shows states that surpassed the national average (which is 17.7 per 10,000 livebirths, i.e., 1.25 on a log10 scale). The national average is indicated with a vertical red line.
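The sorting and grouping described for Figure 2 can be sketched in a few lines. The function below is an illustrative assumption, not the S-plus code used for the figures: it sorts regions by value in descending order, sets the median region aside when the number of regions is odd, and splits the regions above and below it into perceptual groups of at most five. The region names and values are hypothetical.

```python
def partition_around_median(values, group_size=5):
    """Group region names for a Figure 2 style layout.

    values: dict mapping region name to the sorting variable.
    Returns a list of groups (lists of names) in descending order of value.
    When the number of regions is odd, the median region forms its own group.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    n = len(ranked)
    half = n // 2
    upper = ranked[:half]
    middle = ranked[half : n - half]  # one name if n is odd, else empty
    lower = ranked[n - half :]

    def chunks(names):
        return [names[i : i + group_size] for i in range(0, len(names), group_size)]

    groups = chunks(upper)
    if middle:
        groups.append(middle)
    groups.extend(chunks(lower))
    return groups


# Hypothetical data: 31 regions with distinct made-up values.
example = {f"Region {i}": float(i) for i in range(1, 32)}
groups = partition_around_median(example)
group_sizes = [len(g) for g in groups]
```

For 31 regions this yields three groups of five, the median region on its own, and three more groups of five, matching the six micromaps plus median row described above. Other layouts, for example when the halves are not divisible by five, need a different rule.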
Figure 2 also provides a viewer with a quick overview of any spatial patterns present in oral cleft occurrence. The LM plot is very effective in revealing spatial trends. The immediate impression about spatial patterns observed in Figure 2 is of a few small groups of states that certainly raise questions about oral cleft occurrence similarities. For example, there is a noticeable elevation in the west (including Alaska) as compared to an observable low occurrence in the east-northeast.
A closer look at the series of micromaps in Figure 2 reveals further details in the spatial patterns. For example, light gray shading is used as a foreground to distinguish states above the median occurrence (i.e., that of Texas) from those states below the median occurrence. The light gray shading draws attention to higher oral cleft occurrence in the upper half of the plot and lower oral cleft occurrence in the lower half of the plot. The state with the median occurrence (Texas) is shaded in all individual micromaps. The use of such shading provides additional spatial detail. As one can see in Figure 2, high oral cleft occurrence is found primarily in the west and the midwest, with the exception of California, while the east coast states show up as a broad area of lower oral cleft occurrence.
LM plots can also display multiple variables simultaneously and this allows the viewer to explore the relationships between these variables. As shown in Figure 2, viewers can observe the relationship between oral cleft occurrence and maternal smoking—as mentioned earlier no data on maternal smoking rates were collated for California. The map shows that 9 of the 15 states that are above the median oral cleft occurrence have smoking rates above 16% (1.2 on a log10 scale) compared to only 1 of the 14 states that are below the median oral cleft occurrence. This difference is statistically significant (p = .0052) as tested through a two-tailed Fisher's exact test. This is consistent with the smoking-cleft association that is well established and noted previously.
The rightmost statistical panel reveals a positive relationship between oral cleft occurrence and the proportion of AIAN in the population. In fact, 7 of the 15 states above the median oral cleft occurrence have an AIAN population equal to or above 1.3% (0.114 on a log10 scale), while none of the 15 states below the median oral cleft occurrence reaches the same AIAN population level. This difference is also statistically significant (p = .00632, two-tailed Fisher's exact test).
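The two Fisher's exact tests can be reproduced from the counts given in the text. The sketch below is a plain-Python implementation written for illustration, not the software the authors used. It sums the hypergeometric probabilities of all 2x2 tables with the observed margins that are no more likely than the observed table, which is one common definition of the two-tailed p value. The function name is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def weight(x):
        # Number of arrangements that put x of the col1 items in row 1.
        return comb(row1, x) * comb(row2, col1 - x)

    low = max(0, col1 - row2)
    high = min(row1, col1)
    observed = weight(a)
    # Integer weights avoid floating-point ties when comparing tables.
    extreme = sum(w for w in (weight(x) for x in range(low, high + 1)) if w <= observed)
    return extreme / total_tables


# Smoking rate above 16%: 9 of 15 states above the median occurrence
# versus 1 of 14 states below it (California has no smoking data).
p_smoking = fisher_exact_two_sided(9, 6, 1, 13)

# AIAN proportion at or above 1.3%: 7 of 15 states above the median
# versus 0 of 15 states below it.
p_aian = fisher_exact_two_sided(7, 8, 0, 15)
```

Rounded, the two results agree with the reported values of p = .0052 and p = .00632.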
Utah Level Analysis
Figure 3 illustrates a LM plot for oral cleft occurrence by county for the state of Utah. The overall design of the LM plot in Figure 3 follows the LM template: it shows five sequence columns, the first column being a map demarking the counties of Utah, while the second column contains the county name with associated color labels. The next three columns show the statistical panels for three variables for each county: oral cleft occurrence (counts divided by number of live births), counts, and livebirths. The counties are ranked according to oral cleft occurrence from highest to lowest and are partitioned into seven micromaps. The number of counties in Utah is 29 and therefore is not evenly divisible by five. Symanzik and Carr (2009) provide suggestions on how to partition subregions into micromaps when the number of geographic units (in this case the counties of Utah) within a LM plot is not evenly divisible by the number of geographic units to be displayed in a single micromap. Here, the first three micromaps and the last three micromaps each display four counties, while the fourth (middle) micromap displays five counties. Note that in this representation of the LM plot the median is not explicitly drawn, but the first three micromaps outline counties above the median while the last three micromaps outline counties below the median. The county with the median occurrence (Garfield) is shaded in all individual micromaps. No additional counties are outlined in the fourth (middle) micromap (other than the five counties that constitute this micromap).

Figure 3. LM plot of oral cleft occurrence for the state of Utah by county for the period of 1995–2004. Only oral cleft occurrence for 24 out of the 29 counties in Utah was available.
One supplementary statistical representation included in Figure 3 is the addition of CIs as part of the statistical oral cleft occurrence panel. The CIs indicated by connected small dots correspond to the 95% lower and upper confidence limits. The larger colored dots refer to, as before, the oral cleft occurrence in each county. The 95% CI was calculated for each occurrence using an exact Poisson distribution (Leslie,1992). A viewer can now appreciate the fact that the oral cleft occurrence of each county is not quite the “true” (actual) oral cleft occurrence and that the CIs describe the uncertainties of the occurrence estimates, that is, the true value of the occurrence falls most likely somewhere between the limits of the CIs. Moreover, readers can also observe that counties where the occurrence is calculated from limited data (i.e., are more uncertain) have wider CIs and vice versa. As an example, consider how Daggett County (ranked 1st) with an oral cleft occurrence of 102.5 per 10,000 (resulting in a value of 2.01 on a log10 scale) compares to Salt Lake County (ranked 18th) that has an oral cleft occurrence of 12.7 per 10,000 (resulting in a value of 1.1 on a log10 scale). Upon initial examination of the occurrence information alone, one might be tempted to infer that Daggett County has a higher oral cleft occurrence when compared to Salt Lake County. However, the conclusion is somewhat different when one takes the CI information of both counties into account: it is evident that Daggett County has a wide 95% CI, compared to Salt Lake County, which has a narrow 95% CI. The implication that one should take from the additional information is that the oral cleft occurrence for Daggett County is less reliable, while one may consider the occurrence for Salt Lake County to be more representative/reliable. This is justified by the data displayed in the counts and livebirths statistical panels.
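The following sketch shows one common construction of an exact 95% Poisson confidence interval for a count and its conversion to a rate per 10,000 livebirths. The source cites Leslie (1992) for its procedure, which may differ in detail. The limits are found by bisection on the Poisson distribution function, so the code is meant for the small to moderate counts typical of county data. All function names and the two example counties are hypothetical.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def _root_of_decreasing(f, lo, hi, iterations=200):
    """Bisection for a function that decreases from positive to negative."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_ci(count, alpha=0.05):
    """Exact two-sided confidence limits for a Poisson mean.

    Lower limit L solves P(X >= count | L) = alpha / 2 (0 when count is 0);
    upper limit U solves P(X <= count | U) = alpha / 2.
    Intended for counts up to a few hundred (exp(-mu) underflows beyond that).
    """
    bracket = count + 10 * math.sqrt(count + 1) + 10
    upper = _root_of_decreasing(
        lambda mu: poisson_cdf(count, mu) - alpha / 2, 0.0, bracket
    )
    if count == 0:
        return 0.0, upper
    lower = _root_of_decreasing(
        lambda mu: poisson_cdf(count - 1, mu) - (1 - alpha / 2), 0.0, bracket
    )
    return lower, upper


def rate_ci_per_10000(count, livebirths, alpha=0.05):
    """Confidence limits for the occurrence per 10,000 livebirths."""
    lower, upper = exact_poisson_ci(count, alpha)
    return 10_000 * lower / livebirths, 10_000 * upper / livebirths


# Hypothetical counties: a sparsely populated one and a populous one.
small_county = rate_ci_per_10000(count=1, livebirths=100)
large_county = rate_ci_per_10000(count=200, livebirths=160_000)
small_width = small_county[1] - small_county[0]
large_width = large_county[1] - large_county[0]
```

In this made-up comparison the sparsely populated county has a far wider interval than the populous one, mirroring the contrast between Daggett County and Salt Lake County discussed above.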
The addition of counts and livebirths into the statistical panels in Figure 3 provides a viewer with a more complete picture of the statistical assessment of oral cleft data at the county level. Certainly, viewers can appreciate the importance of these two variables by just comparing the oral clefts occurrence and counts in the statistical panels. As indicated in Figure 3, counties such as Salt Lake, Utah, Cache, Davis, Weber, and Box Elder have sizeable numbers in the counts and livebirths categories (a direct result of these counties being more heavily populated). This translates to narrow CIs. In contrast, counties such as Daggett, Garfield, Kane, Millard, and Sanpete correspondingly exhibit wide CIs—a direct result of a sparser population base. Overall, this demonstrates the interdependence of occurrence, counts, and livebirth numbers and implies that both the number of counts and livebirths determine the reliability of the oral cleft occurrence.
DISCUSSION
This article demonstrates the use of LM plots for the display of geographically indexed oral cleft occurrence at two geographical levels, the state and the county level. It is important to note that there are inherent limitations in the data used in this article. To begin with, all birth defects data (including oral clefts) were collected at the state level as compiled by the NBDPN; that is, the NBDPN only maintains the network of state and population-based programs for birth defects. Thus, there may be differences among states in the standards used when gathering birth defects data and in the level of ascertainment; these differences may inflate or distort the variability of oral cleft occurrence among states and so obscure the true differences. Maternal smoking and AIAN data are also not without limitations, as they were only available for a single year, that is, 2002 and 2000, respectively, and do not cover the same period as the oral cleft data. We acknowledge these differences between the state and census data but surmise that the limitations are not so extreme as to preclude the visualization and analysis presented here. The data can still provide important insights into patterns and relationships in birth defects. However, readers should be alert to these limitations and use caution when interpreting the results derived from these data. Hence, our intent is not to draw definitive conclusions from the LM plots but rather to show how the visualizations can order the data such that an easier interpretation is possible. From experience in the use of micromapping, the authors believe that LM plots may well have an important role to play in birth defect surveillance because of the many advantages a LM plot offers over tabular or other graphical methods of representation. These advantages are elucidated in the following paragraphs.
The first advantage is that LM plots provide an improved way of viewing and communicating information about birth defects. By sorting and breaking the datasets into a series of micromaps, LM plots simplify the visual appearance by encouraging selective focus. Viewers can immediately spot their home state or county for review of the status of birth defects, and in this manner, they can engage in meaningful discussion. Moreover, LM plots allow viewers to make rapid and meaningful comparisons between different regions. Viewers can compare the rate of a particular state of interest with benchmark values (median or national average) or with other states in a stratified environment. This kind of profiling of states or counties (above or below a central tendency) is valuable information for planning public health services and their subsequent decision criteria like that of resource allocation.
The second advantage of LM plots is that they present statistical summaries and estimates of birth defects in a spatial context. Unlike traditional statistical graphical methods, LM plots combine both exploratory analysis and traditional statistical graphics while maintaining the spatial context; this is very important in birth defects epidemiology because of the intrinsic spatial nature of the events. LM plots are also very effective at describing the spatial elements of the oral cleft occurrence, that is, the varied geographical distribution of the oral clefts as well as their spatial clustering. LM plots are particularly effective in highlighting subregions in a series of micromaps, and in doing so, they reveal detailed spatial patterns that otherwise might not have been detected from data tables alone. As was illustrated in Figure 2, as one moved from the high to median oral cleft occurrence and from the median to the low oral cleft occurrence, a spatial pattern emerged. High oral cleft occurrence tended to be in the western and midwestern states, while the east coast (especially the northeast) revealed a region of low oral cleft occurrence. Such insights are as valuable for hypothesis generation as for identifying areas of high or low risk.
A third advantage of LM plots is the efficacy of micromapping in handling multiple variables. It is well known that the causes of birth defects are, by nature, multivariate, which argues for linking birth defects data with potential risk factors so that underlying patterns and relationships can be investigated. LM plots facilitate this by displaying multiple variables alongside one another. This capability allows readers to quickly view associations between variables and to pinpoint any anomalous relationships that may exist between them. Figure 2 illustrates this by displaying maternal smoking and AIAN alongside the oral cleft occurrence. In particular, the association observed between oral cleft occurrence and AIAN was immediately evident for the 15 states in the top three micromaps when compared with the remaining states.
Also shown was the capability of LM plots to display the uncertainty of the oral cleft occurrence estimates. Reporting uncertainty along with the occurrence is particularly helpful to the viewer because it permits an assessment of the reliability of the data. Viewers can appreciate that the big dots (Fig. 3) are point estimates rather than the true values, and that the CIs indicate a range within which the true occurrence is likely to fall. The viewer can also note that states or counties with small case counts and few live births produce less reliable estimates of occurrence, as shown by wider CIs, while states or counties with large case counts and many live births produce more reliable estimates, as shown by narrower CIs.
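The relationship between sample size and CI width can be illustrated with a short Python sketch. The interval method here is an assumption: the sketch uses the Wilson score interval for a binomial proportion, which is not necessarily the method behind the CIs in Figure 3. The function name and the case and live birth counts are hypothetical.

```python
import math


def wilson_interval(cases, live_births, z=1.96, per=10_000):
    """Approximate 95% Wilson score interval for an occurrence rate.

    Returns (lower, upper) expressed per `per` live births. The choice
    of the Wilson method is an assumption made for this sketch.
    """
    if live_births <= 0 or not 0 <= cases <= live_births:
        raise ValueError("need 0 <= cases <= live_births and live_births > 0")
    p = cases / live_births
    denom = 1 + z ** 2 / live_births
    center = (p + z ** 2 / (2 * live_births)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / live_births + z ** 2 / (4 * live_births ** 2)
    ) / denom
    return (center - half_width) * per, (center + half_width) * per


# Two hypothetical regions with the same observed rate (15 per 10,000)
# but very different numbers of live births.
small_lo, small_hi = wilson_interval(cases=3, live_births=2_000)
large_lo, large_hi = wilson_interval(cases=300, live_births=200_000)

print(f"small region: {small_lo:.1f} to {small_hi:.1f} per 10,000")
print(f"large region: {large_lo:.1f} to {large_hi:.1f} per 10,000")
```

Both regions share the same point estimate, yet the region with fewer live births has a much wider interval, which is the pattern a viewer reads from the CI bars in an LM plot.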
In addition to the LM plot templates described earlier, an ample set of templates is available that offers readers considerable flexibility in visualizing their data via LM plots. For example, the statistical panels of LM plots can take many different forms, such as box plots, bar plots, histograms, or time series plots. These alternative statistical plots offer additional avenues for querying the underlying structure of the data and for examining patterns and relationships in the data. For example, Carr et al. (1998) used LM plots to effectively depict time series data for per capita carbon dioxide emissions. One could imagine a similar time series LM plot that would examine the trend of NTDs before and after mandatory fortification of cereal grain products with folic acid. One can also change the color scheme by using a different set of colors or hues. Furthermore, LM plots are not limited to static representations of summary statistics; web-based LM plots can provide users with real-time data and let them interactively and dynamically query, sort, and compare different regions at different resolutions, for example, at the state or county level. Such web-based LM plots also permit dynamic links between databases and automatic updates of data. In this capacity, Symanzik et al. (1999) developed web-based interactive LM plots for the US Environmental Protection Agency, and in a similar fashion, Wang et al. (2002) developed web-based LM plots for the National Cancer Institute micromap website (National Cancer Institutes, 2003), accessible at http://statecancerprofiles.cancer.gov/micromaps/.
A final interesting aspect of the national cleft data concerns the eastern states: the oral cleft occurrences in these states all fall below the median occurrence. This is notable because the northeastern states generally have high cancer rates (Hao et al.,2006), and several authors (Zhu et al.,2002; Mili et al.,1993a,b; Windham et al.,1985) have suggested that cancer and birth defects may share common causes linked to location. These data, at least for clefts, do not support that notion.
In conclusion, LM plots couple a constructive geographic representation with a statistical visualization tool that also has an exploratory capability. With respect to integrating LM plots into the monitoring of birth defects, there is certainly room for, if not considerable advantage in, using LM plots to augment the presentation of birth defect data. Further, the application of LM plots has distinct merit in enhancing data analysis, generating scientific hypotheses, and integrating data of various forms (e.g., census and environmental data). These aspects, when linked together, can facilitate the planning of public health services toward such aims as targeting limited resources to places with the greatest need.
Acknowledgements
We would like to thank Sara H. Riordan, Genetic Counselor, from Arizona Teratology Information Program, for providing us with the Birth Defects Research (Part A) issue 73(10). We are also grateful to Sam LeFevre, Environmental Epidemiologist, from the Utah Department of Health, for geocoding the oral cleft occurrence for the state of Utah as well as Marcia Feldkamp and Amy Nance (both from the Utah Birth Defects Network, Utah Department of Health) for their assistance.