Anticipation and repeat expansion in bipolar disorder
Abstract
Anticipation is the phenomenon whereby a disease becomes more severe and/or presents with earlier onset as it is transmitted down through generations of a family. The only known mechanism for true anticipation is a class of mutations containing repetitive sequences exemplified by the pathogenic trinucleotide repeat. Studies of bipolar disorder (BPD) are consistent with the presence of anticipation and, by inference, the possibility that trinucleotide repeats contribute to this disorder, although it is possible that these data are the result of methodological problems. On the assumption that anticipation in BPD may be real, several surveys of the genome of BPD probands for large trinucleotide repeats have been conducted, as have studies of many repeat-containing candidate genes. No pathogenic triplet repeat has yet been unambiguously implicated. © 2003 Wiley-Liss, Inc.
INTRODUCTION
Despite the fact that the large body of family, twin, and, to a lesser extent, adoption studies point conclusively to the importance of genetic factors in the etiology of bipolar disorder (BPD) [Craddock and Jones, 2001], molecular genetic studies have thus far failed to unequivocally identify any of the specific genes involved. This is a position familiar to many researchers involved in the genetic analysis of common phenotypes, be they psychiatric or otherwise, and arises because the relationship between genes and susceptibility to disease in most cases is complex. While genetic factors clearly play an important role, for most of the individual genes involved, the correlation between risk variants and the presence of the disorder is at best modest. Instead, individual risk of disease represents the combined effects of multiple genetic and environmental risk factors. When, as in BPD, weak correlation is combined with ignorance about the number of disease loci, the sizes of the effects at each, the disease allele frequencies, and the mechanisms by which disease alleles combine with each other (e.g., additively or multiplicatively) and with the unknown environmental risk factors, the result is a formidable obstacle to finding genes that has yet to be overcome.
While we are no closer to understanding the true modes of inheritance of BPD, one intriguing hypothesis that emerged in the early to mid-1990s is that a specific type of mutation known as a pathogenic trinucleotide repeat is involved.
In this review, we discuss the basis for that hypothesis and summarize the results of the molecular genetic investigations it has spawned.
ANTICIPATION, TRINUCLEOTIDE REPEATS, AND DYNAMIC MUTATIONS
There are a number of inherited disorders that appear to manifest themselves progressively with either an earlier age at onset or increased severity as the disease is transmitted down an affected lineage. Perhaps the most classic example of this is myotonic dystrophy, where both age at onset and severity show intergenerational change from a late-onset mild phenotype characterized perhaps by cataracts, to an earlier-onset classic adult form, and finally to a severe congenital form with learning disability [Harper, 1989]. Collectively, these phenomena of earlier age at onset and increased severity in successive generations of a pedigree are known as genetic anticipation. Among several other diseases that unambiguously display anticipation, the example that is likely to be familiar to general readers is Huntington's disease (HD), although here the change is most evident in age at onset rather than severity.
The first clue to the molecular mechanism that remains the only known cause of anticipation came from studies of classic fragile X syndrome, which is caused by expansion of the trinucleotide sequence CGG in the FMR-1 gene [Fu et al., 1991]. Normal individuals carry up to 50 copies of this sequence in the 5′UTR region of the FMR-1 gene, whereas affected individuals carry more than 200 copies [Fu et al., 1991]. Although strictly speaking fragile X syndrome does not display anticipation, it does have an equally puzzling and bizarre pattern of inheritance known as Sherman's paradox [Sherman et al., 1985]. The paradox is that the male siblings of a normal male who carries an affected X chromosome have a much lower probability of developing fragile X syndrome than the male offspring of the carrier's daughters. This is despite the fact that the proportion of both groups who carry the affected chromosome is the same (50%). The explanation is that the unaffected male carriers carry a moderately expanded FMR-1 repeat of between 50 and 200 repeat units, and these do not typically cause the full syndrome. However, repeats of this size are very prone to intergenerational expansion into the pathogenic range. The increased penetrance in the carrier's grandchildren who carry the same affected chromosome is therefore explained by change in the size of the pathogenic repeat [Fu et al., 1991].
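The size ranges quoted above can be written out as a simple classification. The following is a minimal sketch only: the function name, the category labels, the handling of the exact boundaries at 50 and 200 copies, and the repeat counts in the example lineage are illustrative assumptions, not data from any study.

```python
def classify_fmr1(cgg_repeats):
    """Classify an FMR-1 allele by its CGG copy number.

    Thresholds follow the ranges quoted in the text; how the exact
    boundary values (50, 200) are assigned is an assumption.
    """
    if cgg_repeats <= 50:
        return "normal"
    if cgg_repeats <= 200:
        return "moderate expansion"  # carrier, typically without the full syndrome
    return "full expansion"          # pathogenic range

# Hypothetical lineage in which the repeat expands between generations.
lineage = [
    ("unaffected carrier male", 80),
    ("his daughter", 110),
    ("her son", 400),
]
for person, repeats in lineage:
    print(f"{person}: {repeats} copies -> {classify_fmr1(repeats)}")
```

In this invented lineage the same chromosome moves from the moderate range to the pathogenic range over two transmissions, which is the pattern that resolves Sherman's paradox.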
A similar mechanism is now known to be responsible for anticipation in myotonic dystrophy and HD, which are caused by pathogenic unstable CTG and CAG repeats, respectively, as are several other neurodegenerative ataxias. In many of these disorders, repeat size correlates with severity and inversely with age at onset rather than penetrance. As the repeats tend to expand during transmission between generations, so the age at onset tends to decrease and the severity increase [Harley et al., 1993; Snell et al., 1993]. This instability has led to the description of pathogenic repeat sequences as dynamic mutations.
ANTICIPATION IN BPD
Because unstable DNA is the only known cause of anticipation, anticipation is a clue to the possible role of unstable DNA in other disorders displaying the same phenomena. Interestingly, genetic anticipation was first described in the 19th century in the context of major psychiatric disorders, but was thought most likely to be due to ascertainment bias [McInnis, 1996]. However, the discovery of unstable mutations as a cause for anticipation has led to a reevaluation of this view. Several modern studies have sought evidence for anticipation in families affected with BPD (Table I). The majority are consistent with anticipation, but there are a number of ascertainment biases that can create the illusion of this phenomenon, particularly when the families have been ascertained for linkage rather than epidemiological analysis. For a pedigree to be sampled with affected members in multiple generations, affected members in all generations will generally be required to still be alive. This results in bias for early onset in more recent generations, as they must be affected at the time of sampling. Simultaneously, for any disorder that reduces life expectancy or reproductive fitness, there is a bias for relatively late onset in the older generations because people affected early are less likely to be alive at the time of study or to have had children.
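The sampling bias described above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a model of any published data set: both generations draw age at onset (AOO) from the same uniform distribution, so there is no true anticipation, and every numeric range, the survival cut-off, and the function name are hypothetical. Only the requirement that both members of a pair are affected and alive at the time of study is modeled; reduced reproductive fitness is not.

```python
import random
import statistics

def simulate_ascertained_pairs(n_pairs=20000, seed=0):
    """Mean AOO of parents and offspring among ascertained pairs.

    Both generations share one AOO distribution, so any difference
    in the returned means is produced by ascertainment alone.
    All numeric ranges are illustrative assumptions.
    """
    rng = random.Random(seed)
    parent_aoo, child_aoo = [], []
    for _ in range(n_pairs):
        p_onset = rng.uniform(15, 60)          # parent lifetime AOO
        c_onset = rng.uniform(15, 60)          # offspring AOO, same distribution
        child_age = rng.uniform(15, 50)        # offspring age at time of study
        parent_age = child_age + rng.uniform(20, 40)
        # A pair is sampled only if both members are already affected
        # and the parent is still alive (assumed: aged 85 or less).
        if c_onset <= child_age and p_onset <= parent_age and parent_age <= 85:
            parent_aoo.append(p_onset)
            child_aoo.append(c_onset)
    return statistics.mean(parent_aoo), statistics.mean(child_aoo), len(parent_aoo)

mean_parent, mean_child, n = simulate_ascertained_pairs()
print(f"pairs ascertained: {n}")
print(f"mean AOO, parents: {mean_parent:.1f}; offspring: {mean_child:.1f}")
```

Because offspring with late onset have not yet become ill when the study is conducted, the ascertained offspring show a lower mean AOO than their parents even though nothing has changed between generations.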
Reference | Sample | Ascertainment | Measure of anticipation | Result | Comments |
---|---|---|---|---|---|
McInnis et al. [1993] | 34 unilineal BPAD families | Linkage study | AOO and episode frequency | Positive | Significant earlier AOO (8.9–13.5 years) and increase in illness severity (1.8–3.4 times greater) in G2. 95% of families displayed anticipation |
O'Neill et al. [1993] | 24 BP/UP families | | AOO | Positive | Significant earlier AOO (9.5 years) in G2. Parents with older AOO had offspring with earlier AOO but younger onset parents had offspring with older AOO |
Nylander et al. [1994] | 14 unilineal BPAD families | Lithium clinics | AOO and episode frequency | Positive | Significant earlier AOO (10 years) and increase in illness severity (2 times greater) in G2 |
Heiden et al. [1995] | 12 unilineal BPAD | Linkage study | AOO and episode frequency | Negative | No evidence of anticipation |
Lipp et al. [1995] | 16 BPAD family pairs | Linkage study | AOO and episode frequency | Positive | Anticipation observed in 75% of pairs |
Kumar et al. [1996] | 30 BPAD families (19 unilineal) | Multiplex families | AOO and episode frequency | Positive | Significant earlier AOO but lower episode frequency in G2 |
Souery et al. [1996] | 18 BPAD pairs | Linkage study | AOO and episode frequency | Positive | Significant earlier AOO and increase in episode frequency in G2 |
Huang and Vieland [1997] | Reanalysed data on 25 families reported in McInnis et al. [1993] | See McInnis et al. [1993] | See McInnis et al. [1993] | Positive | Significant earlier AOO (9.4 years) in G2. New statistical test increased the P value from 0.0001 to 0.014 |
Mendlewicz et al. [1997] | 29 unilineal BPAD pairs | 29 multiplex families | AOO and episode frequency | Positive | Significant earlier AOO and increased episode frequency in G2 |
Grigoroiu-Serbanescu et al. [1997] | 36 unilineal BPAD families | Hospitalized patients | AOO (first symptom) | Positive | Significant earlier AOO (6–10 years) in G2, mainly in families with paternal transmission |
Ohara et al. [1998] | 12 unilineal offspring (BP)/parent (UP) families | Local hospitals, multiplex families | AOO, clinical severity score, episode frequency, hospitalizations, suicide attempts | Positive | Significant earlier AOO and increased episode frequency in G2 |
Alda et al. [2000] | 161 related individuals with BPAD + 320 unrelated BPAD individuals selected to create an artificial '2 generation' sample | Multiplex families and large pool of unrelated individuals | AOO | Positive and negative | Significant difference in the AOO between G1 and G2 in both samples. No significant difference in the magnitude of the difference in AOO between the 2 samples |
Mérette et al. [2000] | 96 related individuals with BP spectrum disorder | Linkage study | AOO + 5 severity indices | Positive but bias | Significant decrease in AOO (7.5 years) between G1 and G2. Result possibly due to quality of information differences in generations |
Visscher et al. [2001] | 27 BP/UP families | | AOO (affecteds)/age last seen (unaffecteds) | Uncertain | Increased risk and decreased AOO in subsequent generations. Unable to distinguish between anticipation, cohort effects and other biases |
- AOO, age of onset; G1, first generation; G2, second generation.
Another bias is preferential selection of affected offspring with early onset because this is often associated with increased severity. Severe cases are more striking and more likely to be ascertained. Yet another is that awareness of a familial disorder may lead to its earlier detection and presentation to a doctor in subsequent generations. Improved diagnostics or public education will have much the same effect. Anticipation can also be mimicked by changes in environmental risk factors over time that may modify age at onset or severity depending on the year of birth (a birth cohort effect). Examples of environmental factors that may be relevant to mental disorder include patterns of alcohol and drug consumption, unemployment rates, and rates of survival after obstetric complications.
However, we should be wary of casually attributing the results in Table I to bias, particularly since anticipation was previously erroneously discounted in HD and myotonic dystrophy. When attempts have been made to minimize the effects of bias by performing restricted analyses [e.g., McInnis et al., 1993], or by applying corrective statistical methods [Huang and Vieland, 1997], the observations remain consistent with anticipation.
However, an interesting study by Alda et al. [2000] suggests that bias can account for a great deal, possibly all, of the apparent anticipation in BPD pedigrees. This group compared the degree to which age at onset changed in a genuine sample of BPD pedigrees with that in artificial pedigrees constructed by simulation from unrelated BPD probands ascertained epidemiologically. Both groups displayed a similar degree of anticipation, suggesting that anticipation could simply be the result of bias, at least in this sample. The last two studies in Table I also cast doubt upon true anticipation in BPD. The Canadian study [Mérette et al., 2000] suggested that apparent anticipation in their pedigrees was due to systematic differences in the available clinical information, while the other study [Visscher et al., 2001] was unable to distinguish between anticipation, birth cohort effects, and other biases.
To summarize, the vast majority of studies are generally consistent with the existence of anticipation in BPD, but none have been able to entirely eliminate potential ascertainment biases. Probably no study ever can unless it is conducted prospectively and epidemiologically, and spans multiple generations. Given that such studies are unlikely, we must be guided by studies that have attempted to control for at least some of the many biases. Unfortunately, these have produced mixed results, so firm conclusions are not yet possible. Future research will need to continue to develop techniques to compensate for those biases that are impractical to avoid, and will preferably be based upon epidemiological principles.
REPEAT EXPANSION DETECTION
If we accept that clinical studies at least suggest the possibility of anticipation in BPD, then current knowledge of the molecular mechanism implies that molecular genetic approaches targeting trinucleotide repeats may offer a shortcut to gene identification. One such method that has been fairly widely applied to BPD is the repeat expansion detection (RED) technique [Schalling et al., 1993]. RED allows the measurement of the size of the largest trinucleotide repeat of a defined sequence (CAG, CCG, etc.) in a sample of DNA without knowledge of the repeat's position in the genome or of its flanking sequence. The basis for its application is that if large repeats are involved in a disease, RED will detect large repeats more commonly in DNA from cases than in DNA from controls, with the proviso that the pathogenic repeats are larger than the largest innocent repeat present in most controls.
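The logic of RED, including the proviso about innocent repeats, can be sketched in a few lines. This is a conceptual illustration only and does not model the laboratory assay: the function names, the repeat lengths, and the threshold of 40 used in the example are hypothetical values chosen for illustration.

```python
def red_product(repeat_lengths):
    """Length of the longest repeat of the probed motif in one DNA sample.

    RED reports only this maximum, not which locus carries it.
    """
    return max(repeat_lengths)

def fraction_above(samples, threshold):
    """Proportion of samples whose RED product exceeds `threshold` repeats."""
    return sum(red_product(s) > threshold for s in samples) / len(samples)

# Hypothetical CAG/CTG repeat lengths at several loci per individual.
case_detectable = [22, 25, 60]         # putative pathogenic 60-repeat allele is the largest
case_masked = [22, 70, 60]             # an innocent 70-repeat allele hides the 60
control_large_innocent = [22, 70, 30]  # control carrying the same innocent allele
control_small = [22, 25, 30]

for name, sample in [
    ("case, detectable", case_detectable),
    ("case, masked", case_masked),
    ("control, large innocent repeat", control_large_innocent),
    ("control, small repeats", control_small),
]:
    print(f"{name}: RED product = {red_product(sample)}")

print(fraction_above([case_detectable, case_masked], 40))
print(fraction_above([control_large_innocent, control_small], 40))
```

The masked case and the control with the large innocent repeat give the same RED product, so a pathogenic repeat smaller than a common innocent repeat would not separate cases from controls.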
Virtually all the studies of BPD have focused upon CAG/CTG repeats, as this class of repeat is responsible for the majority of trinucleotide repeat disorders that display anticipation. In 1995, two groups reported that people with BPD did indeed carry larger CAG/CTG repeats than controls [Lindblad et al., 1995; O'Donovan et al., 1995], suggesting that this particular type of repeat is implicated in BPD. These findings were subsequently replicated, but not by everyone (Table II).
Reference | BPAD probands | Controls | Triplet repeat | Result | P value |
---|---|---|---|---|---|
Lindblad et al. [1995] | 123 | 274 | CTG/CAG | Positive | P = 0.0006 |
O'Donovan et al. [1995] | 49 | 74 | CTG/CAG | Positive | P = 0.003 |
Vincent et al. [1996] | 52 | 52 | CTG/CAG | Negative | P = 0.21; P = 0.36 |
O'Donovan et al. [1996b] | 143 | 160 | CTG/CAG | Positive | P = 0.0006 |
Li et al. [1998] | 43 | 61 | CTG/CAG | Negative | P = 0.37 |
Oruc et al. [1997] | 40 | 79 | CTG/CAG | Negative | P > 0.30 |
Oruc et al. [1997] | 15 with FH | 79 | CTG/CAG | Positive | P = 0.0075 |
Lindblad et al. [1998] | 53 relatives from 12 families | 123 unaffected relatives | CTG/CAG | Positive | P < 0.0007 |
Zander et al. [1998] | 119 | 88 | CTG/CAG | Negative | P = 0.38 |
Verheyen et al. [1999] | 10 families | | CTG/CAG | Negative | |
Vincent et al. [1999] | 91 | 91 | CTG/CAG | Negative | P = 0.98 |
Pato et al. [2000] | 24 relatives from 9 families showing anticipation | 53 | CTG/CAG | Positive | P < 0.0055 |
Jin et al. [2001] | 149 | 120 | CTG/CAG | Negative | P > 0.1 |
This pattern of inconsistency between studies is a familiar one, with many possible explanations [O'Donovan and Owen, 1999]. Inadequate matching between cases and controls seemed a particularly plausible explanation for some of the RED studies after a report that large repeats might be associated with health [O'Donovan et al., 1996a]. If correct, the positive associations might reflect differences in the general health of the cases and controls rather than affected status for BPD. However, a follow-up study where bipolar and schizophrenic probands were also selected for health did not support this [Cardno et al., 1998]. Another possible explanation is that clinical differences in the samples reflect genetic heterogeneity between studies. Success or failure to detect association might simply depend upon the mix of cases. It does not appear that age at onset or severity, which, given the rationale for RED studies, may be a priori the most important variables, are related to RED product size [O'Donovan et al., 1996b; Craddock et al., 1997], but it remains possible that other clinical characteristics may be relevant.
One issue that has not been widely commented on is laboratory error. It is unclear to what extent if any error can be blamed for the results, positive or negative. We suspect that genotyping error rates with RED are probably much greater than they are for more routine genetic procedures, where, as is now appreciated, they are not always trivial. Finally, another important factor is power. Power estimation of RED analysis is hampered by the fact that RED is not a single locus test, but nevertheless, inspection of Table II reveals that most RED studies would now be considered underpowered. On the other hand, unless major publication bias is operating, the number of positive findings is greater than one would expect simply by chance. On balance, we conclude that the RED data in BPD still tentatively support the view that expanded CAG/CTG repeats may be implicated in BPD, but studies of a thousand or more cases and controls are now required to settle the matter.
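The statement that positive findings outnumber chance expectation can be made concrete with a binomial tail probability. The following is a rough sketch under simplifying assumptions that are not part of the reviewed studies: the 12 reports in Table II are treated as independent tests at a nominal significance level of 0.05 (some samples may in fact overlap), the subgroup result of Oruc et al. [1997] is not counted, and publication bias is ignored. The function name is hypothetical.

```python
from math import comb

def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_studies = 12   # reports listed in Table II
n_positive = 5   # whole-sample results listed as positive
alpha = 0.05     # assumed nominal false-positive rate per study

prob = binomial_tail(n_studies, n_positive, alpha)
print(f"P(at least {n_positive} of {n_studies} positive by chance) = {prob:.1e}")
```

Under these assumptions the probability is well below 0.001, which is why chance alone is an unlikely explanation; the calculation says nothing about publication bias or laboratory error.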
As RED consumes several hundred times more DNA and is technically more demanding than most molecular genetics techniques, we suspect such studies will not be forthcoming. It should also be noted that the RED method cannot detect polymorphism in CAG/CTG repeat size below 40 repeats, as this is the minimum repeat number generally observed in the population. Thus, even if all RED studies were negative, pathogenic repeats of less than this size would not be excluded. Moreover, only CAG/CTG repeats have been investigated extensively by RED, and it remains possible that other expanding repeats, trinucleotide or otherwise, can explain the apparent anticipation in BPD [Margolis et al., 1999].
INDIVIDUAL REPEAT LOCI
As RED provides no information concerning either the position of the associated repeats in the genome or the adjacent unique sequence, RED studies must be followed up by tests of specific loci. A large number of triplet repeat sequences have been tested in BPD, and a full discussion of them all is beyond the scope of this review. Our own group analyzed around 200 CAG/CTG repeats scattered across the genome using a method we reported on the first 50 [Guy et al., 1997]. Other studies have focused on CAG/CTG repeats that map to regions of interest derived from linkage studies, including chromosomes 4 [Speight et al., 1997], 12 [Franks et al., 1999], 18 [Goossens et al., 2000], and 22 [Saleem et al., 2001]. So far, these approaches have not been rewarded by evidence for association.
Others have adopted an approach based on cloning large repeats rather than scanning repeats in candidate genes or candidate regions. Three loci have been identified that can explain a substantial proportion of the repeats detected by RED. The first of these, CTG18.1, maps to 18q21.1 and is located within an intron of the SEF2-1B gene [Breschel et al., 1997], a gene that regulates gene transcription and that now goes under the name of transcription factor 4 (TCF4). The second locus, ERDA1 [Nakamoto et al., 1997], is also known as Dir 1 [Ikeuchi et al., 1998] and maps to 17q21.3. The function of this repeat, if any, is unknown. The third is a polymorphic CTA/TAG and CAG/CTG composite repeat locus, TGC13-7a, which maps to 13q21.2-q21.31 [Vincent et al., 1999]. It has been reported that expansion of the untranslated repeat causes spino-cerebellar ataxia type 8 (SCA8), but this remains in dispute.
As a high proportion of the large CAG/CTG repeats detected by RED can be explained by repeat size at the first two of these loci [Lindblad et al., 1998], if the RED associations are correct, one would predict that repeat size at one or more of these specific loci is likely to be individually associated with BPD. One of the original groups to report positive RED findings reported a modest association between the repeat in TCF4 and BPD [Lindblad et al., 1998]. A number of further case-control studies in samples of bipolar probands have been negative [Breschel et al., 1997; Guy et al., 1999; Vincent et al., 1999; Verheyen et al., 1999; McInnis et al., 2000; Meira-Lima et al., 2001; Jin et al., 2001], but the case for TCF4 is not closed because the largest study yet undertaken (∼403 cases/484 controls) reported modest evidence, although this was only after stratification for positive family history and severity [Del-Favero et al., 2002]. One study of 24 bipolar families exhibiting anticipation found a single family in which expansions at this gene segregated with disease. This in itself is not strong support because this locus maps to a possible region of linkage [Pato et al., 2000].
The ERDA1 expansion accounts for the highest proportion of the large RED products. Association between large repeats at this locus and BPD has been reported [Verheyen et al., 1999], but it is not clear that probands and controls were well matched for ethnicity in this study, which may be particularly important as large differences in repeat size have been demonstrated in different ethnic groups [Deka et al., 1999]. Moreover, the vast majority of other studies now suggest that ERDA1 is not involved in BPD [Vincent et al., 1999; Guy et al., 1999; Lindblad et al., 1998; Meira-Lima et al., 2001], including a large multicenter European study [Del-Favero et al., 2002].
Finally, large repeats at the SCA8 locus have been reported to be more common in patients with major mental illness, including BPD [Vincent et al., 2000b]. It is not clear how much weight to lend to this study because although the sample sizes are large, they were derived from multiple populations, some relatively isolated, and contained a mixture of multiple individuals from the same family, as well as unrelated individuals. Moreover, the large alleles were rare, with frequencies of 0.7–1.25%. More extensive studies in homogeneous populations will be required to confirm or reject the involvement of SCA8.
To summarize, despite the fact that the above three loci explain a high proportion of large repeats detected by RED, none are convincingly associated with susceptibility to BPD.
The findings with TCF4 and SCA8 are interesting and deserve further attention in large samples. Even if they are associated, the effect is almost certainly of insufficient magnitude to explain all the RED data [Verheyen et al., 1999], and the discrepancy between RED findings and the tests of the individual loci is disquieting. In two samples (including our own), the three loci only explain around half of large CAG/CTG repeats detected by RED [Vincent et al., 1999; Guy et al., 1999]. There are two plausible explanations for the difference between those studies where most of the large RED repeats are explained by the three loci and those where they are not. The first is genuine differences in the genetic architecture underlying the RED data; the second is methodological differences in the interpretation of RED data. In order to investigate these competing explanations, an independent group that was able to explain its own RED findings by the individual repeat sizes at the known loci has reanalyzed a small number of the authors' samples. The findings thus far suggest the presence of at least one other unknown CAG/CTG locus, although insufficient numbers have been tested to determine if this fully explains the discrepancies between studies (R. Margolis, unpublished findings).
SEARCH FOR POLYGLUTAMINE SEQUENCES
In most known disorders associated with expanded CAG/CTG repeats, the pathogenic repeat encodes the amino acid glutamine. This leads to a simple hypothesis. If expanded CAG repeats are involved in the pathogenesis of BPD, large polyglutamine sequences may be evident in protein extracts from tissues obtained from individuals with BPD. Four studies have addressed this, but none have identified the postulated large polyglutamine repeats [Jones et al., 1997; Zander et al., 1998; Schurhoff et al., 1997; Turecki et al., 1999]. In myotonic dystrophy, and also possibly SCA8, the pathogenic repeat is an untranslated CTG, so these data do not allow full rejection of the hypothesis that CAG/CTG repeats are involved in BPD. They do, with certain caveats [Schurhoff et al., 1997], suggest that if CAG/CTG repeats are pathogenic in BPD, they do not exert their effects by means of a large polyglutamine sequence.
CONCLUSIONS
The hypothesis that trinucleotide repeats, and in particular CAG/CTG repeats, are involved in the etiology of BPD has provoked a major branch of genetic research into this disorder. We have reviewed the positive evidence for the existence of anticipation in bipolar families, but discussed the methodological and statistical problems that make it difficult to draw firm conclusions regarding this phenomenon. A number of case-control studies employing the RED technique have found evidence of longer repeats in bipolar probands, but other studies have been negative. Although some positive findings have emerged, no triplet repeat polymorphisms have been unequivocally found to play a role in susceptibility to BPD. We have identified a few obvious lines of research that may help clarify some of the ambiguities. At the present time it is impossible to predict whether such genes will be found or whether, like most hypotheses in science, the trinucleotide repeat hypothesis of BPD will pass into historical obscurity.