POPULATION SIZE CHANGES RESHAPE GENOMIC PATTERNS OF DIVERSITY
Abstract
Elucidating the forces responsible for genomic variation is critical for understanding evolution. Under standard conditions, X-linked diversity is expected to be three-quarters the level of autosomal diversity. Empirical data often deviate from this prediction, but the reasons for these departures are unclear. We demonstrate that population size changes can greatly alter relative levels of X-linked and autosomal variation: population size reductions lead to particularly low X-linked diversity, whereas growth elevates X-linked relative to autosomal diversity. Genetic variation from a diverse array of taxa supports an important role for this effect in accounting for population differences in the ratio of X-linked to autosomal diversity. Consideration of this effect may improve the inference of population history and other evolutionary processes.
A fundamental enigma in evolutionary biology concerns the relative levels of X chromosome and autosome diversity. Because X chromosomes are present in two copies in females but only in one copy in males, the effective population size (and level of diversity) for X chromosomes is expected to be three-fourths of that of the autosomes. However, departures from this ratio are frequently observed in natural populations. A number of processes have been suggested to account for such deviations, including sex-biased mutation (reviewed in Li et al. 2002), sex-specific variance in reproductive success (Caballero 1995; Charlesworth 2001), sex-biased migration (Laporte and Charlesworth 2002), the effect of linked negative selection (Charlesworth 1996), and positive selection (e.g., Aquadro et al. 1994).
The quest to detect molecular signatures of adaptation has been a strong motivation for comparing X-linked and autosomal variation, particularly in Drosophila. Recessive beneficial mutations at low frequency can be “seen” by selection in males if the locus is X-linked, whereas similar autosomal alleles must first drift to a high enough frequency to be present as homozygotes. If beneficial mutations tend to be recessive on average, hitchhiking may be more frequent on the X chromosome (Charlesworth et al. 1987) and the X chromosome may have disproportionately lower variation than the autosomes (Aquadro et al. 1994).
In Drosophila melanogaster, interest has centered on the comparison between ancestral range populations from sub-Saharan Africa and more recently founded populations from elsewhere in the world. Andolfatto (2001) examined X-linked and autosomal sequence data in D. melanogaster, finding that cosmopolitan populations had a considerably lower X-to-autosome (X/A) diversity ratio than sub-Saharan populations. Kauer et al. (2002, 2003) confirmed this pattern with a large microsatellite dataset, arguing that a disproportionate reduction of cosmopolitan X-linked variation should not result from a founder event bottleneck, but could reflect a higher rate of hitchhiking as cosmopolitan populations adapted to new environments outside Africa.
It is commonly assumed that historical changes in population size (such as founder event bottlenecks) should have similar effects on X-linked and autosomal variation, and thus should have little or no impact on the X/A diversity ratio. However, a few studies have provided clues that population size changes may have contrasting effects on chromosomes with different modes of inheritance. For example, Fay and Wu (1999) and Hey and Harris (1999) found that mitochondrial and autosomal loci can differ in their allele frequency distributions after a population bottleneck. Wall et al. (2002) simulated the effect of a population bottleneck on X-linked and autosomal diversity, focusing primarily on linkage disequilibrium. Although in this case the prebottleneck X/A diversity ratio was assumed to be less than 0.75, the postbottleneck X/A ratio was found to be slightly lower still. Lastly, Lawson Handley et al. (2006) simulated haplotype diversity for mitochondrial, Y-linked, and X-linked loci (in the absence of intragenic recombination), finding that haplotype diversity recovered more quickly for the uniparentally inherited chromosomes than for the X chromosome.
Although the studies cited above have suggested that demographic history may have distinct effects on genetic markers with differing modes of inheritance, the magnitude of this effect is unclear, and its potential influence on chromosomal diversity differences is often ignored. Here we use theoretical predictions to show that population size changes can profoundly alter the X/A diversity ratio. Due to the X chromosome's smaller effective population size, X-linked variation will converge faster to its new equilibrium after a size change. Therefore, we find that reductions in population size lead to lower X/A diversity ratios, whereas population growth yields the opposite effect. We show that empirical data from diverse taxa consistently support an important role for this process. Finally, we suggest that jointly considering patterns of X-linked and autosomal variation may improve demographic inference and assist in differentiating population history from other processes, such as positive selection.
Theoretical Models
The effect of changing population sizes on the distribution of coalescence times has been extensively treated in the literature (e.g., Slatkin and Hudson 1991; Rogers and Harpending 1992; Polanski et al. 1998) and is well understood. Here we rederive expressions for the expected coalescence time for a pair of sequences with a specific inheritance factor (h). Assuming equal effective male and female population sizes, h equals 1 for autosomal markers, 0.75 for X-linked markers, and 0.25 for mtDNA or Y-linked markers. These inheritance factors will allow us to compare the expected nucleotide diversity in different types of markers.
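As a hedged sketch of what such an expression can look like (an illustration under stated assumptions, not a reproduction of the numbered equations of this paper), consider a population of N diploid individuals that changed to size fN exactly g generations ago. Assuming discrete generations and a per-generation pairwise coalescence probability of 1/(2hfN) since the change, the expected coalescence time T_h, the diversity π_h, and the diversity ratio for two marker types can be written as:

```latex
% Assumed single-size-change model; q_h is the per-generation probability
% that a pair of sequences does NOT coalesce in the recent epoch.
\begin{align*}
q_h &= 1 - \frac{1}{2hfN} \\
E[T_h] &= 2hfN\left(1 - q_h^{\,g}\right) + q_h^{\,g}\,2hN
        = 2hN\left[f + (1-f)\,q_h^{\,g}\right] \\
\pi_h &= 2\mu_h\,E[T_h] \\
\frac{\pi_1}{\pi_2} &= \frac{h_1\mu_1}{h_2\mu_2}\cdot
  \frac{f + (1-f)\,q_{h_1}^{\,g}}{f + (1-f)\,q_{h_2}^{\,g}}
\end{align*}
```

With no size change (f = 1) the bracketed term equals 1 and the ratio reduces to h1μ1/(h2μ2), which is 0.75 for X-linked versus autosomal loci with equal mutation rates.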







Predictions of the Models
From the model given in equation (4) (using h1= 0.75, h2= 1, and μ1=μ2), we find that reductions in population size lead to disproportionately reduced X-linked variation (Fig. 1A). Although weaker population size reductions can lead to prolonged decreases in X/A diversity ratios, the strongest departures are caused by recent, severe size reductions. For example, if a population of initial size 10,000 suffered a 500-fold reduction in size 163 generations ago, the expected X/A diversity ratio is 0.248. Thus, population size reductions compound the difference in effective population size between X chromosomes and autosomes, as genetic drift causes X-linked variation to be lost more quickly. Conversely, population growth can generate X/A diversity ratios closer to unity (but not exceeding 1.0; Fig. 1B). Following growth, new X-linked and autosomal variation accumulates at more similar rates, and this effect can be quite long lasting.
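The worked example above can be checked numerically. The following is a minimal sketch that assumes a discrete-generation model with a single size change (size N until g generations ago, size fN since); the function names are hypothetical and the formula is an assumed form, not the equation (4) referred to in the text. Under these assumptions the reduction example comes out close to the 0.248 quoted above.

```python
def size_change_term(N, f, g, h):
    """Expected pairwise coalescence time relative to its pre-change value 2*h*N.

    Assumed model: the population had size N until g generations ago and
    size f*N since then, with a per-generation coalescence probability of
    1/(2*h*f*N) in the recent epoch.
    """
    q = 1.0 - 1.0 / (2.0 * h * f * N)  # chance a pair stays distinct for one generation
    return f + (1.0 - f) * q ** g


def xa_ratio(N, f, g, h_x=0.75, h_a=1.0, mu_ratio=1.0):
    """X/A diversity ratio g generations after a single size change."""
    term_x = size_change_term(N, f, g, h_x)
    term_a = size_change_term(N, f, g, h_a)
    return mu_ratio * (h_x / h_a) * term_x / term_a


# Reduction example from the text: N = 10,000, 500-fold reduction, 163 generations ago.
reduction = xa_ratio(N=10_000, f=1 / 500, g=163)

# Hypothetical growth example (10-fold growth 5,000 generations ago; values
# chosen only for illustration).
growth = xa_ratio(N=10_000, f=10, g=5_000)

print(f"reduction: {reduction:.4f}  growth: {growth:.4f}")
```

The mutation rate ratio enters only as a constant multiplier (mu_ratio), so it shifts the ratio without changing how the size change affects it.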

Figure 1. Predictions of the population size change model for X/A diversity ratios. Predicted ratio of X chromosome to autosome diversity for a population with initial N= 10,000 following (A) population reduction, (B) population growth, or (C) population bottleneck lasting 100 generations. In each plot, the x-axis indicates the number of generations since this event (g), the y-axis indicates the magnitude of the size change (in terms of f for population growth, or 1/f for population reductions and bottlenecks; see equations for details), and color depicts the X/A diversity ratio (see scale).
Using equation (7), we find that population bottlenecks (reductions with subsequent recovery) initially reduce X/A diversity ratios, but with time they may produce X/A ratios greater than the initial value (Fig. 1C). This pattern can also be seen in Figure 2, which depicts the recovery of variation after a bottleneck (with initial size N= 10,000, reduction factor f= 0.004, and bottleneck duration g2= 100 generations): the X chromosome initially loses more of its variation after the bottleneck, but then recovers more quickly than the autosomes.

Figure 2. Recovery of variation after a population bottleneck. Expected diversity through time following a population bottleneck (relative to prebottleneck levels) is shown separately for autosomal, X-linked, and uniparentally inherited chromosomes (mt/Y). In this bottleneck, a population of N= 10,000 experiences a population size reduction to f= 0.004 times this size, lasting for g2= 100 generations until recovery to the initial size.
Figure 2 also shows the recovery of variation for uniparentally inherited markers (such as mitochondria and the Y chromosome), which have an inheritance factor of h= 0.25. The pattern observed for these haploid markers, as compared to the autosomes, is even more dramatic than for the X chromosome: nearly all variation is lost during the bottleneck, but recovery occurs much more quickly. Importantly, this more rapid recovery of variation is entirely due to the lower effective population sizes of the uniparentally inherited chromosomes, and would not be altered by mutation rate differences between chromosomes.
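The qualitative pattern described for Figure 2 can be illustrated with a minimal sketch. It assumes a discrete-generation, three-epoch model (size N, then fN for the bottleneck, then N again) and uses the parameter values given in the Figure 2 legend. The function name and the formula are assumptions made for illustration, not the equation (7) referred to in the text.

```python
def relative_diversity(N, f, g_bottleneck, g_since, h):
    """Expected diversity relative to the pre-bottleneck level.

    Assumed three-epoch model: size N, then f*N for g_bottleneck generations,
    then back to N for the most recent g_since generations.
    """
    q_small = 1.0 - 1.0 / (2.0 * h * f * N)  # per-generation non-coalescence during the bottleneck
    q_large = 1.0 - 1.0 / (2.0 * h * N)      # same quantity at the recovered size
    # Fraction of diversity missing at the moment the bottleneck ends.
    lost = (1.0 - f) * (1.0 - q_small ** g_bottleneck)
    # The deficit decays geometrically as new variation accumulates.
    return 1.0 - lost * q_large ** g_since


markers = {"autosome": 1.0, "X": 0.75, "mt/Y": 0.25}  # inheritance factors h
for g_since in (0, 5_000, 20_000):
    row = {
        name: round(relative_diversity(10_000, 0.004, 100, g_since, h), 3)
        for name, h in markers.items()
    }
    print(g_since, row)
```

Because the recovery rate in this sketch depends only on 2hN, markers with a smaller inheritance factor lose more during the bottleneck and close the deficit sooner afterward, and the mutation rate cancels out of the relative values.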
Comparing Predictions to Empirical Data
We obtained published data from every species that was found to have X-linked and autosomal sequence polymorphism data from at least two populations with differing levels of nucleotide diversity. Data of this type were found for humans (Yu et al. 2002), chimpanzees and orangutans (Kaessmann et al. 2001; Fischer et al. 2006), three subspecies of the house mouse Mus musculus (Baines and Harr 2007), D. melanogaster (Andolfatto 2001; Haddrill et al. 2005; Ometto et al. 2005), and Drosophila simulans (Andolfatto 2001). Additional details concerning the data can be found in the online Supplementary Material.
To test whether the empirical data are consistent with our model, we compared X/A diversity ratios between populations for each species/subspecies. To the extent that differences in variability between these populations reflect population size changes subsequent to their divergence (such as founder event bottlenecks leading to the formation of new populations), our model predicts that the less variable population should have a reduced X/A diversity ratio. Indeed, this pattern was observed in every case (Fig. 3). For example, as the ancestors of Asian and European human populations migrated out of Africa, they experienced a population bottleneck that reduced genetic variation, and as predicted by the size change model, the X/A diversity ratio is lower in non-African (0.68) than in African (0.84) populations.

Figure 3. Reduced X/A diversity ratios in less variable populations. Bars indicate the X/A diversity ratios of more genetically diverse (gray) and less genetically diverse (black) populations of Homo sapiens (Human), Pan troglodytes (Chimp), Pongo pygmaeus (Orang), Mus musculus castaneus (Mmc), Mus musculus domesticus (Mmd), Mus musculus musculus (Mmm), Drosophila melanogaster (Dmel), and Drosophila simulans (Dsim). Information regarding the data can be found in the online Supplementary Material.
In addition to the taxa shown in Figure 3, we are aware of single nucleotide polymorphism data showing reduced X/A diversity ratios in the less variable population of rhesus macaque (Rhesus Macaque Genome Sequencing and Analysis Consortium 2007), along with microsatellite polymorphism data showing lower X/A ratios in less diverse, recently founded populations of Drosophila pseudoobscura (Reiland et al. 2002) and Drosophila subobscura (Pascual et al. 2007). The probability that the less variable population of all 11 of the above taxa would have the lower X/A diversity ratio by chance is less than 0.0005.
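The quoted probability is consistent with a simple one-sided sign test. The sketch below assumes that the 11 taxa are independent and that, by chance alone, the less variable population is equally likely to have the higher or the lower X/A ratio. The text does not name the test used, so this is an assumption.

```python
# Eight taxa in Figure 3 plus rhesus macaque, D. pseudoobscura, and D. subobscura.
n_taxa = 11

# Probability that all taxa show the same pre-specified direction by chance,
# assuming independence and a 50% chance per taxon.
p_all_lower = 0.5 ** n_taxa

print(p_all_lower)  # 0.00048828125
```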
Discussion
Because the size change model invokes no phenomena beyond founder event bottlenecks or other population size changes, it represents a parsimonious explanation for the observed population differences in X/A diversity ratios (Fig. 3). Clearly, we cannot exclude the influence of other mutational, demographic, and selective processes in contributing to these X/A diversity ratios, particularly with regard to the differences between species. For example, male-biased mutation rates may influence X/A diversity ratios in mammalian species to differing degrees (Li et al. 2002), whereas no such effect has been detected in Drosophila (e.g., Bauer and Aquadro 1997). However, as shown in equations (4) and (7), mutation rate differences between X-linked and autosomal loci only affect X/A diversity ratios as constant factors, independently of the size change effect. Therefore, male-biased mutation cannot account for the population differences shown in Figure 3.
By comparing X/A diversity ratios between closely related populations, we restrict our focus to evolutionary processes that might differ between a given pair of populations. For example, if sex-specific variance in reproductive success is higher for males in the more variable population, but higher for females in the less variable population, a reduced X/A diversity ratio for the latter population would be expected (Charlesworth 2001). However, it seems unlikely that such a shift in reproductive variances in this direction would have occurred in all or most of the species examined here.
It has also been suggested that accelerated rates of adaptive evolution for recently founded populations in new environments may lead to reduced X/A diversity ratios, under the logic that the X chromosome's hemizygosity in males renders selection more efficient. However, the X chromosome will actually have a lower rate of adaptation than the autosomes if most beneficial alleles come from standing variation (Orr and Betancourt 2001), as may often be the case for populations expanding into new environments. Also, at least in the case of Drosophila-like recombination (none in males), hitchhiking should not reduce X/A diversity ratios unless most beneficial mutations are recessive (Betancourt et al. 2004). The fact that Thornton et al. (2006) detected no “faster-X” effect in Drosophila (i.e., X-linked loci do not have higher rates of protein evolution) might indicate that the majority of beneficial mutations do not fit the above criteria (at least in Drosophila).
While three of the species examined here have expanded from tropical Africa into temperate habitats (i.e., humans, D. melanogaster, and D. simulans), for the remaining species we are comparing populations from less drastically different environments. Although such environments may still have very important ecological differences, it is difficult to imagine why, for example, Bornean orangutans should have a vastly higher rate of adaptation than Sumatran orangutans. It is also noteworthy that a reduced X/A diversity ratio was observed in North American D. subobscura (Pascual et al. 2007), given that this population was established only about 25 years prior to the study (roughly 125 generations; Prevosti et al. 1988), which is perhaps too short an interval for a genome-wide effect of hitchhiking to be expected.
Further studies will be needed to determine whether a demographic model of historical size changes can adequately account for patterns of genetic variation in any given species. One example of such an approach is given by Baines and Harr (2007), who fit demographic models for M. musculus populations using autosomal data, and then tested whether the best-fitting model could account for X-linked polymorphism as well. Although this strategy may not formally exclude the possibility that other demographic scenarios are consistent with both X-linked and autosomal diversity, it certainly represents a step toward fully accounting for the influence of population history.
For simplicity, we have assumed throughout this study that X/A diversity ratios before a population size change were equal to 0.75, but clearly the models presented above are not limited to this case. Different mutation rates for X-linked and autosomal loci can easily be substituted into the equations given here. If additional factors are suspected to influence X/A diversity ratios in a particular species, the X-linked and autosomal inheritance factors from our equations could be adjusted to account for the predicted impact of processes such as sex-specific variance in reproductive success (Charlesworth 2001) or background selection (Charlesworth 1996) on the effective population size of X-linked and autosomal loci. Under some scenarios (for example, if males have an extremely low probability of reproductive success) the effective population size of X-linked loci may even exceed that of autosomal loci (Charlesworth 2001), and the population size change effect would then be in the opposite direction from that presented above. As an alternative to inferring such processes, if the pre-size-change X/A diversity ratio can be estimated directly (e.g., by using data from a population that recently diverged from the study population but is not thought to have undergone a recent size change, or from ancient DNA samples), one can focus on the predicted change from initial X-linked and autosomal diversity levels by multiplying these values by the size change terms of equations (2) and (6) (the terms in parentheses).
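The last approach can be sketched as follows: start from directly estimated pre-change diversity levels and scale each by its own size change term. The sketch assumes a discrete-generation model with a single size change. The function names, the diversity values, and the demographic parameters are all hypothetical, and the inheritance factors are left as arguments so that they can be adjusted as discussed above.

```python
def size_change_term(N, f, g, h):
    """Multiplier on pre-change diversity g generations after a change from N to f*N.

    Assumed discrete-generation form; h is the inheritance factor.
    """
    q = 1.0 - 1.0 / (2.0 * h * f * N)  # per-generation non-coalescence probability
    return f + (1.0 - f) * q ** g


def predicted_diversity(pi_initial, N, f, g, h):
    """Scale an estimated pre-change diversity by the size change term."""
    return pi_initial * size_change_term(N, f, g, h)


# Hypothetical pre-change estimates (initial X/A ratio of 0.9, not 0.75).
pi_x0, pi_a0 = 0.009, 0.010

# Hypothetical 10-fold reduction 1,000 generations ago; h values are assumptions
# and could be replaced by adjusted inheritance factors.
pi_x = predicted_diversity(pi_x0, N=10_000, f=0.1, g=1_000, h=0.75)
pi_a = predicted_diversity(pi_a0, N=10_000, f=0.1, g=1_000, h=1.0)

print(f"initial X/A: {pi_x0 / pi_a0:.3f}  predicted X/A: {pi_x / pi_a:.3f}")
```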
We have stated our findings mainly in terms of X-linked and autosomal diversity, but the models given here are equally applicable to comparisons involving mitochondrial or Y-linked variation (e.g., Hey 1997; Fay and Wu 1999; Hey and Harris 1999; Lawson Handley et al. 2006). At first glance, the predictions shown in Figure 2 might suggest that comparisons involving uniparentally inherited chromosomes would be very powerful for detecting population size changes. We caution, however, that because nonrecombining markers represent only a single realization of the evolutionary process, they may be particularly sensitive to stochastic variation (and the effects of positive and negative selection). Comparisons involving these markers must account for such uncertainty before demographic inferences can be made.
In conclusion, we suggest that population history may be an important determinant of chromosomal variability in many species. Joint consideration of the diversity level of chromosomes with differing modes of inheritance—along with chromosomal differences in allele frequency spectra (Fay and Wu 1999; Hey and Harris 1999) and linkage disequilibrium (Wall et al. 2002)—should offer new insights into the relative importance of demographic, mutational, and selective forces in shaping genetic diversity.
Associate Editor: D. Presgraves
ACKNOWLEDGMENTS
We thank B. Charlesworth and two anonymous reviewers for helpful comments on earlier versions of this manuscript. This research was supported by an N.I.H. Kirschstein-NRSA postdoctoral fellowship (1 F32 HG004182-01) to JEP, and by grants from Danmarks Grundforskningsfond and the N.I.H. (U01HL084706) to RN.