The inherited ataxias: Genetic heterogeneity, mutation databases, and future directions in research and clinical diagnostics†
For the Databases in Neurogenetics Special Issue
Abstract
The inherited cerebellar ataxias are a diverse group of clinically and genetically heterogeneous neurodegenerative disorders. Inheritance patterns of these disorders can be complex with autosomal dominant, autosomal recessive, X-linked, and mitochondrial inheritance demonstrated by one or more ataxic syndromes. The broad range of mutation types found in inherited ataxia contributes to the complex genetic etiology of these disorders. The majority of inherited ataxias are caused by repeat expansions; however, conventional mutations are important causes of the rarer dominant and recessive ataxias. Advances in sequencing technology have allowed for much broader testing of these rare ataxia genes. This is relevant to the Human Variome Project, which aims to collate and store gene variation data through mutation databases. Variant data are currently located in a range of public and commercial resources. Few locus-specific databases have been created to catalogue variation in the dominant ataxia genes although there are several databases for some recessive genes. Developing these resources will facilitate a better understanding of the complex genotype–phenotype relationships in these disorders and assist interpretation of gene variants as testing for rarer ataxia genes becomes commonplace. Hum Mutat 33:1324–1332, 2012. © 2012 Wiley Periodicals, Inc.
Introduction
The inherited cerebellar ataxias are a group of neurodegenerative disorders in which the dominant feature is progressive cerebellar degeneration, resulting in impairment of balance, gait, coordinated limb movements, and speech. Ataxia may present as an isolated cerebellar syndrome or more often is associated with a broad spectrum of neurological manifestations including pyramidal, extrapyramidal, sensory, and cognitive dysfunction. Given the significant clinical heterogeneity of these disorders and the complexity of the cerebellum and its associated connections, it is unsurprising that there exists significant genetic heterogeneity.
The prevalence of genetic forms of ataxia has been difficult to accurately determine and several reviewers suggest that previous data have underestimated the extent of the problem [Klockgether, 2011]. Worldwide prevalence of autosomal dominant ataxia has been reported to be between 1.2 and 41:100,000 [Wardle and Robertson, 2007].
Inheritance patterns of these disorders can be complex with autosomal dominant, autosomal recessive, X-linked, and mitochondrial inheritance demonstrated by one or more ataxic syndromes. The broad range of mutation types found in the inherited ataxias contributes to the complex genetic etiology of these disorders. Many of the more common ataxias are caused by a range of polynucleotide repeat expansions including trinucleotide, pentanucleotide, and hexanucleotide repeats; however, point mutations, deletions, and duplications are also represented. Dominant cerebellar ataxia genetic loci are designated with the spinocerebellar ataxia (SCA) prefix, with SCA36 being the most recent locus to be reported. The precise number of loci is open to interpretation as the list of approved SCA designations has been “polluted” with a variety of other related syndromes (SCA18, a predominantly sensory ataxia; SCA29, a congenital nonprogressive ataxia), allelic disorders (SCA15/16), and a SCA designation without a reported locus (SCA9). Setting these discrepancies aside, there are a total of 30 separate SCA loci with the associated genetic variant identified in 21 of these. In addition, there are a number of other dominantly inherited neurological syndromes in which ataxia is a prominent feature including Huntington disease, dentatorubral-pallidoluysian atrophy (DRPLA), Alexander disease, and Gerstmann–Sträussler–Scheinker disease. The recessive ataxias are even more diverse with nearly 100 genes having been identified (Washington Neuromuscular Website: www.neuromuscular.wustl.edu/ataxia/recatax). Although many of these are exceedingly rare, the commonest inherited ataxia worldwide is caused by recessively inherited mutations in the frataxin (FXN) gene, resulting in Friedreich ataxia (FRDA).
This article will provide an overview of the genetics of the inherited ataxias with a focus on the more rare spinocerebellar ataxia genes caused by conventional mutations. In the second part of the article, the current situation with regard to mutation databases in ataxia will be discussed.
Genetics of the Autosomal Dominant Ataxias
Due to Repeat Expansions
The majority of the dominantly inherited ataxias are caused by repeat expansions in either coding or noncoding parts of the relevant genes [Dueñas et al., 2006]. Polyglutamine (CAGn) expansions are the most common of these and comprise SCAs 1, 2, 3, 6, 7, and 17 and DRPLA. Genotype–phenotype correlations of these disorders are well described [Schöls et al., 2004] with the disease manifesting above a threshold of CAG repeats. The noncoding expansion SCAs comprise SCA8 (CTGn), SCA10 (ATTCTn), SCA12 (CAGn), SCA31 (TGGAAn), and SCA36 (GGCCTGn). Larger repeat numbers generally result in an earlier age of onset and more severe phenotype (genetic anticipation). Anticipation has been demonstrated to a varying degree by all of the repeat expansion SCAs. A summary of the dominant ataxia genes is provided in Table 1.
Name | OMIM # | Locus | Gene | Protein | Mutation | % of ADCA | Geographical distribution | Characteristic features |
---|---|---|---|---|---|---|---|---|
Repeat Expansions: coding | ||||||||
SCA1 | 164400 | 6p22.3 | ATXN1 | Ataxin 1 | CAG repeat | 6–27% | Common: South Africa, Japan, India, Italy, Australia | Hyperreflexia, sensory neuropathy, mild cognitive impairment |
SCA2 | 183090 | 12q24.12 | ATXN2 | Ataxin 2 | CAG repeat | 13–18% | Common: United States, Spain, India, Mexico, Italy | Polyneuropathy, parkinsonism, dysphagia |
SCA3 | 109150 | 14q32.12 | ATXN3 | Ataxin 3 | CAG repeat | 20–50% | Most common worldwide | Spasticity, polyneuropathy, dystonia, parkinsonism |
SCA6 | 183086 | 19p13.2 | CACNA1A | Calcium channel, voltage dependent, P/Q type, α1A subunit | CAG repeat | 13–15% | Common: United States, Germany, Australia, Taiwan | Late onset, pure ataxia |
SCA7 | 164500 | 3p14.1 | ATXN7 | Ataxin 7 | CAG repeat | 3–5% | Finland, Mexico, South Africa | Retinal degeneration |
SCA17 | 607136 | 6q27 | TBP | TATA box-binding protein | CAG repeat | Rare | United Kingdom, Belgium, France, Germany, Japan | Dementia |
DRPLA | 125370 | 12p13.31 | ATN1 | Atrophin 1 | CAG repeat | 0.8:100,000 (Japan); rare worldwide | Japan, Portugal, United States | Dementia, epilepsy |
HD | 143100 | 4p16.3 | HTT | Huntingtin | CAG repeat | 3–7:100,000 | Worldwide | Chorea, dementia |
Repeat Expansions: noncoding | ||||||||
SCA8 | 608768 | 13q21.33 | ATXN8OS | Ataxin 8 opposite strand | CTG repeat | 3% | Common: Finland | Pure ataxia |
SCA10 | 603516 | 22q13.31 | ATXN10 | Ataxin 10 | ATTCT repeat | Unknown | Mexico, Brazil | Seizures |
SCA12 | 604326 | 5q32 | PPP2R2B | Protein phosphatase 2, regulatory subunit B, β | CAG repeat | Rare worldwide; 7% in India | Common: India | Tremor, polyneuropathy |
SCA31 | 117210 | 16q21 | BEAN | Brain expressed associated with NEDD4 | TGGAA repeat | 8–40% in Japan; rare worldwide | Japan, esp. Nagano prefecture | Spasmodic torticollis |
SCA36 | 614153 | 20p13 | NOP56 | Nuclear protein 56 | GGCCTG repeat | 6.3% in Galicia, Spain; 9 Japanese families | Spain, Japan | Motor neurone involvement |
Conventional mutations | ||||||||
SCA5 | 600224 | 11q13 | SPTBN2 | Beta 3 Spectrin | Deletions, missense mutations | Rare | United States, Germany, France | Pure ataxia, facial myokymia, gaze palsy |
SCA11 | 611695 | 15q15.2 | TTBK2 | Tau tubulin kinase 2 | Nonsense, frameshift deletions/insertions | Rare | United Kingdom, France, Germany | Pure ataxia |
SCA13 | 605259 | 19q13.3-q13.4 | KCNC3 | Potassium channel, voltage-gated, Shaw-related subfamily, member 3 | Missense | 1% (France) | France, Philippines | Early onset, mental retardation |
SCA14 | 176980 | 19q13.4 | PRKCG | Protein kinase C gamma | Missense, deletion | 2% (France) | United Kingdom, France, Netherlands, United States, Japan, Australia | Myoclonus, dystonia |
SCA15 | 606658 | 3p26-p25 | ITPR1 | Inositol 1,4,5-trisphosphate receptor type 1 | Multi-exon or whole gene deletion, missense | 1.8% (France); 0.3% (Japan) | United Kingdom, France | Pure ataxia |
SCA20 | 608687 | 11q12 | – | – | 260kb duplication | Rare | Australia | Dentate calcification, bulbar symptoms |
SCA23 | 610245 | 20p13 | PDYN | Prodynorphin | Missense | Rare | Netherlands | Pyramidal signs |
SCA27 | 609307 | 13q33.1 | FGF14 | Fibroblast growth factor 14 | Missense | Rare | Netherlands | Onset with tremor, psychiatric episodes |
SCA28 | 610246 | 18p11.21 | AFG3L2 | ATPase family gene 3-like 2 | Missense, deletion | 3% | Italy, France, United Kingdom | Ophthalmoplegia, spasticity |
SCA35 | 613908 | 20p13 | TGM6 | Transglutaminase 6 | Missense | Rare | China | Pure ataxia |
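To make the idea of a repeat count concrete, the following minimal sketch counts the longest uninterrupted tandem run of a given motif (for example CAG or GGCCTG from Table 1) in a DNA string. It is an illustration only: the function name and the toy sequences are assumptions made for this example, and it is not intended to represent how diagnostic laboratories size expanded alleles, nor does it encode any pathogenic threshold.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of uninterrupted tandem copies of `motif`.

    Hypothetical helper for illustration; both arguments are plain DNA strings.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    sequence = sequence.upper()
    motif = motif.upper()
    # A lookahead tests every start position, so runs are found even when
    # the motif can overlap with itself (e.g. GGCCTG starts and ends with G).
    pattern = re.compile(f"(?=((?:{re.escape(motif)})+))")
    best = 0
    for match in pattern.finditer(sequence):
        copies = len(match.group(1)) // len(motif)
        best = max(best, copies)
    return best


# Toy sequences, not real alleles.
toy_cag = "TTCAGCAGCAGCAATCAGCAG"
toy_hexa = "ggcctgggcctgggcctg"
print(longest_repeat_run(toy_cag, "CAG"))
print(longest_repeat_run(toy_hexa, "GGCCTG"))
```

An interruption in the repeat tract (here the CAA in the first toy sequence) ends the run, so the function reports the longest pure stretch rather than the total number of motif copies.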
Due to Conventional Mutations
A minority of the dominant ataxia syndromes (SCAs 5, 11, 13, 14, 15, 20, 23, 27, 28, and 35) is caused by conventional mutations. In a French ataxia series, conventional mutations accounted for 6% of all dominant ataxia, repeat expansions accounted for 45% with the remaining 48% being genetically undiagnosed [Durr et al., 2009]. Genotype–phenotype correlations are much harder to determine in this group, owing to the limited number of families affected by these mutations. Functional analysis of potassium channels (EA1, SCA13) and calcium channels (SCA6, EA2) has demonstrated a correlation between the degree of functional impairment and the severity of the phenotype. In contrast to the repeat expansion SCAs, these disorders often have a “purer” cerebellar phenotype (ADCAIII), with a slower rate of progression.
SCA5 is caused by a mutation in the SPTBN2 gene, which encodes β-III spectrin [Ikeda et al., 2006]. Missense mutations and in-frame deletions have been described, resulting in a pure cerebellar syndrome with onset between 15 and 50 years. The first SCA5 kindred was reported in 1994 with 56 affected individuals over 10 generations who were descendants of the paternal grandparents of Abraham Lincoln. SCA5 has also been reported in French and German pedigrees [Zühlke et al., 2007].
SCA11, initially reported in two British families, is caused by stop mutations, frameshift insertions or deletions in the TTBK2 gene, resulting in a pure cerebellar syndrome with normal life expectancy [Houlden et al., 2007]. Pathogenic variants in TTBK2 have also been reported in French and German families [Bauer et al., 2010].
SCA13 was initially reported in French and Filipino families and is caused by missense mutations in KCNC3, which encodes a voltage-gated potassium channel [Figueroa et al., 2010]. There is a wide phenotypic spectrum that correlates with different missense mutations. The childhood-onset form, in which motor and mental developmental delay is a common feature, has been associated with two variants, g.10693G>A (p.Arg423His) and g.10767T>C (p.Phe448Leu), described in European and Filipino families, respectively [Figueroa et al., 2011]. Females are more frequently affected and, of the two missense mutations reported, the p.Phe448Leu variant results in the more severe phenotype. The p.Arg423His variant has also been reported in a Caucasian family in the United States.
SCA14 is caused by mutations in PRKCG [Yabe et al., 2003], resulting in a variable ataxic phenotype, which may include myoclonus, dystonia, or peripheral neuropathy. The onset is usually in adulthood. The majority of mutations (missense) have been reported in exons 4, 5, 10, and 18. It has been reported in more than 20 families from Europe, Japan, and Australia [Klebe et al., 2005].
SCA15/16 is caused by heterozygous deletions of the 5′ part of the ITPR1 gene [van de Leemput et al., 2007] although a missense mutation (c.1480G>A p.V494I) has been reported. The ITPR1 protein is highly expressed in cerebellar Purkinje cells and is an important modulator of intracellular calcium signaling. SCA15/16 is characterized by a mild cerebellar ataxia with slow disease progression. In a French ataxia series, SCA15 was identified in 1.8% of patients [Marelli et al., 2011]. SCA15/16 shares a locus with SCA29, raising the possibility that they are allelic disorders.
SCA20 has been described in a single Australian family of Anglo-Celtic descent and is the result of a 260 kb duplicated region comprising >12 genes at 11q12 [Knight et al., 2004]. Bulbar symptoms including dysphonia and spasmodic cough in addition to dentate nucleus calcification are characteristic of this condition.
SCA23 is due to missense mutations of PDYN [Bakalkin et al., 2010], which encodes prodynorphin protein, an opioid neuropeptide precursor. This causes a relatively pure cerebellar syndrome with a late onset (43–73 years) and slow progression. The disease has been reported in only a single large Dutch ataxia family and was not identified on screening a large German ataxia series [Schicks et al., 2011].
SCA27 causes an early-onset ataxia [Brusse et al., 2006] with associated cognitive deficits, head or limb tremor, and dyskinesia that can be exacerbated by stress or exercise. The causal gene, fibroblast growth factor 14 (FGF14), was identified in a large Dutch kindred, with missense and nonsense mutations having been reported. There is normal life expectancy; however, most affected patients are unable to walk by the seventh to eighth decade. The disease has also been reported in a German ataxia patient.
SCA28 is caused by a mutation in AFG3L2, which encodes a mitochondrially located metalloprotease [Di Bella et al., 2010]. Missense mutations have been reported which are commonly located in the proteolytic domain of the protein with a mutation hotspot in exons 15–16. SCA28 has a typically early onset between 12 and 36 years and is characterized by a slowly progressive cerebellar ataxia with ophthalmoparesis and lower limb hyperreflexia. The disease is estimated to account for 1.5% of European ADCA cases [Cagnoli et al., 2010].
SCA35 is caused by mutations in the cerebral transglutaminase TGM6 and was the first dominant ataxia gene to be identified through exome sequencing [Wang et al., 2010]. Missense mutations were reported in two Chinese families with a late-onset cerebellar syndrome and associated upper motor neuron involvement. There was moderate progression with patients commonly using a wheelchair 20 years after disease onset.
Episodic Ataxias
The episodic ataxias are a group of heterogeneous channel disorders characterized by attacks of ataxia, which may be associated with a range of other neurological manifestations including myokymia, migraine, seizures, or chorea. Eight episodic ataxia syndromes have been described: EA 1–7 and episodic ataxia with paroxysmal choreoathetosis and spasticity (CSE). EA 1 and 2 are the most common and best characterized of these. The genes for EA 1, 2, 5, and 6 (Table 2) have been identified with linkage loci mapped in EA 3, 7, and CSE. Episodic ataxia is rare with a combined incidence of <1:100,000.
Name | OMIM # | Locus | Gene | Protein | Mutation | Geographical distribution | Characteristic features |
---|---|---|---|---|---|---|---|
EA1 | 160120 | 12p13.32 | KCNA1 | Potassium channel, voltage gated, Shaker-related subfamily, member 1 | Missense | Worldwide | Attacks last seconds–minutes; myokymia |
EA2 | 108500 | 19p13.2 | CACNA1A | Calcium channel, voltage dependent, P/Q type, α1A subunit | Missense, nonsense, large deletions | Worldwide | Attacks last hours; allelic with SCA6, familial hemiplegic migraine |
EA5 | 613855 | 2q23.3 | CACNB4 | Calcium channel, voltage dependent, β-4 Subunit | Missense | French-Canadian family | Attacks last hours–days; late onset, seizures |
EA6 | 612656 | 5p13.2 | SLC1A3 | Solute carrier family 1 (glial high affinity glutamate transporter), member 3 | Missense | United States, Netherlands | Alternating hemiplegia, seizures |
EA1 is primarily due to missense mutations in KCNA1 [Browne et al., 1994] although truncation mutations have been reported. The disease is characterized by brief periods of ataxia (seconds to minutes) and interictal myokymia. The degree of channel impairment correlates with the severity of the phenotype… Mutations associated with severe phenotypes that may be poorly treatment responsive or associated with seizures or neuromyotonia show the most significant impairment of potassium channel function.
EA2 is due to a range of mutations in CACNA1A [Ophoff et al., 1996], which include missense, nonsense, aberrant splicing, and nucleotide insertions and deletions. EA2 is typified by longer periods of ataxia lasting several hours with baseline nystagmus and progressive ataxia. There is a wide spectrum of phenotypes associated with mutations in CACNA1A. EA2 is allelic with SCA6 and familial hemiplegic migraine (FHM). Most of the mutations that cause EA2 disrupt the open reading frame, whereas FHM is caused primarily by missense mutations.
EA5 has been described in a single French-Canadian family that was heterozygous for a missense mutation in the CACNB4 gene, resulting in a phenotype similar to EA2 [Escayg et al., 2000]. The precise functional effects of this mutation are not clear as the same mutation was identified in a German family with generalized epilepsy but no ataxia.
EA6 was initially reported in a patient from the United States presenting with characteristic episodes of hemiplegia, seizures, and ataxia. A de novo mutation was identified in the SLC1A3 gene, which results in complete loss of function of the protein EAAT1—a glutamate transporter localized to astrocytes. Other cases have been reported in the Netherlands with the p.C186S variant that resulted in a milder phenotype without the manifestations of seizures or alternating hemiplegia [de Vries et al., 2009].
Recessive Ataxias
The recessive ataxias are a particularly diverse group of disorders that are generally early onset with significant variation in clinical phenotype, which is variably associated with neuropathy, ophthalmological disturbance, seizures, and a range of other neurological and non-neurological manifestations. These disorders are discussed in detail in other reviews; a nonexhaustive summary of recessive ataxia genes is listed in Table 3. FRDA is the most common recessive ataxia worldwide [Palau and Espinós, 2006] and is mainly due to homozygous GAA expansions in the FXN gene, although a few patients show compound heterozygosity for a point mutation and the GAA-repeat expansion. Some common pathological pathways have been described in the recessive ataxias including DNA repair dysfunction, mitochondrial dysfunction, defects in lipoprotein metabolism, and protein chaperone dysfunction. There is significant overlap of clinical phenotype with a range of metabolic ataxias (Table 4), which are invariably complex multisystem disorders that can result in severe disability despite dietary modification where possible.
Disease | OMIM # | Gene | Protein | Mutation | Incidence/carrier frequency | Geographic distribution | Characteristic features |
---|---|---|---|---|---|---|---|
Friedreich ataxia | 606829 | FXN | Frataxin | GAA repeat expansions; point mutations in compound heterozygotes | Incidence: 1:30,000–50,000; Carrier frequency: 0.9–1.6% | Worldwide except natives to: Far East, sub-Saharan Africa, Australia, America | Spasticity, neuropathy, cardiac involvement |
Ataxia-telangiectasia | 607585 | ATM | Ataxia telangiectasia mutated | Deletions: splice-site related; nonsense; missense | Incidence: 1:400,000–450,000 live births; Carrier frequency: 0.35–1% | Reported in many worldwide populations | Oculomotor apraxia; extrapyramidal features; increased cancer risk/radiosensitivity |
Ataxia-telangiectasia like disorder (ATLD) | 604391 | MRE11A | Meiotic recombination 11, S. cerevisiae, homolog of | Missense | 25 reported cases worldwide | Saudi Arabia (15 cases), Japan (4 cases), UK (4 cases), Italy (2 cases) | Similar to ATM but milder phenotype |
Ataxia-oculomotor apraxia type 1 | 208920 | APTX | Aprataxin | Insertion, deletion, missense | Rare worldwide—More common in Portuguese and Japanese populations | Portugal, Japan, France, Tunisia | Oculomotor apraxia, peripheral neuropathy |
Cerebellar ataxia with muscle coenzyme Q10 deficiency | 607426 | APTX | Aprataxin | Missense | Rare | Single Italian family | Low coenzyme Q10 levels; late-onset hypergonadotrophic hypogonadism |
Ataxia-oculomotor apraxia type 2 | 606002 | SETX | Senataxin | Nonsense, missense | Carrier frequency: 2.1–3.5%; Incidence: 1:400,000 (Alsace) | Commoner in French-Canadian populations | Oculomotor apraxia (variable); extrapyramidal features; peripheral neuropathy |
Spastic ataxia of Charlevoix-Saguenay (ARSACS) | 270550 | SACS | Sacsin | Stop-gain deletions and point mutations most common | Carrier frequency (Quebec): 4.5%; Incidence: 1:1930 | Most common in Quebec. Tunisian, Turkish, Italian, Japanese families reported | Myelinated retinal fibers; prominent lower limb spasticity |
Cerebellar ataxia, seizures and ubiquinone deficiency | 612016 | ADCK3 | aarF domain containing kinase 3 | Missense, splice site, frameshift, deletion | Rare | French, Dutch, British families reported | Mental retardation, seizures, low coenzyme Q10 levels |
Spinocerebellar ataxia with axonal neuropathy (SCAN1) | 607250 | TDP1 | Tyrosyl DNA phosphodiesterase 1 | Missense | Rare | Saudi Arabian family | Axonal neuropathy |
Autosomal recessive spinocerebellar ataxia type 8 | 610743 | SYNE1 | Synaptic nuclear envelope protein 1 | Splice site, intronic | Rare worldwide; 3rd most common ARCA in Quebec | Canada | Hyperreflexia |
Autosomal recessive spinocerebellar ataxia type 10 | 613728 | ANO10 | Anoctamin 10 | Missense, splice site, deletion | Rare | French, Dutch, Serbian families | Tortuous conjunctival vessels |
Disease | OMIM # | Gene | Protein | Mutation | Incidence/carrier frequency | Geographic distribution | Characteristic features |
---|---|---|---|---|---|---|---|
Ataxia with selective vitamin E deficiency | 277460 | TTPA | Tocopherol transfer protein alpha | Frameshift, missense | Prevalence: 0.55–3.5:1,000,000 | British, French, Italian, Moroccan, Japanese families reported | Low vitamin E; resembles FRDA |
Abetalipoproteinemia | 200100 | MTTP | Microsomal triglyceride transfer protein | Missense, nonsense | Prevalence: <1:1,000,000 | Global | Acanthocytosis; pigmentary retinal degeneration; polyneuropathy |
Refsum disease | 266500 | PHYH | Phytanoyl-CoA hydroxylase | Missense, nonsense, deletions, splice site mutations | Prevalence: 1:1,000,000 | Global | Deafness, retinitis pigmentosa, ichthyosis, demyelinating polyneuropathy |
Cerebrotendinous xanthomatosis | 213700 | CYP27A1 | Cytochrome P450 subfamily XXVIIA, polypeptide 1 | Missense, deletions, splice site mutations | Prevalence (Moroccan Jews): 1:108; United States: 1:50,000 | More common in Moroccan Jews | Widespread cholesterol deposits: tendons, brain, lungs; cataracts; dementia |
Niemann Pick Type C | 607623 | NPC1 | NPC1 protein | Deletions, point mutations | Prevalence: 1:100,000–150,000 | Global | Extrapyramidal features, seizures, dementia |
Wilson's disease | 277900 | ATP7B | ATPase, Cu(2+)-Transporting, beta polypeptide | Point mutations, nonsense mutations | Prevalence: 1:10,000–30,000 | Global—higher incidence in China, Japan, Sardinia | Extrapyramidal features; liver disease |
Future Directions in Ataxia Research and Diagnostics
The rate of discovery of new ataxia genes has accelerated enormously in recent years, commensurate with rapid advances in next-generation sequencing (NGS) technologies [Bamshad et al., 2011]. Exome sequencing in particular has demonstrated its utility in the identification of causal genes in a variety of Mendelian disorders including ataxia [Montenegro et al., 2011; Ng et al., 2010; Pierson et al., 2011]. Although exome sequencing is a useful tool in new gene discovery, its cost and the significant data storage burden associated with processing exome samples prohibit its routine use in a diagnostic setting. A modification of the technology, targeted enrichment and sequencing [Mertes et al., 2011; Schlipf et al., 2011], will allow focused panels of relevant genes to be sequenced in highly multiplexed and extremely cost-effective runs. This approach is much more likely to replace traditional Sanger sequencing in diagnostic laboratories as the technology becomes cheaper and more widespread, and it has the potential to transform the capabilities of diagnostic laboratories and research groups worldwide.
The provision of diagnostic tests for the inherited ataxias is generally limited by cost and technical considerations. In the United Kingdom, the following tests are available to most patients (depending on the clinical presentation and inheritance pattern): SCA1, 2, 3, 6, 7, 12, 17, HD, DRPLA, PRNP, FXN, ATM, AOA1/2. Testing for rarer genes is often available from specific international diagnostic laboratories or on a research basis from interested research groups.
It seems likely that in future, diagnostic laboratories will be able to offer relatively low-cost screening for all known ataxia genes using targeted NGS techniques, which are already being employed to screen for genetic conditions including nonsyndromic deafness [Walsh et al., 2010] and hypertrophic cardiomyopathy [Voelkerding et al., 2010].
Despite the advantages of NGS in gene discovery and diagnostics over conventional methods, significant challenges remain with the interpretation and storage of the wealth of data generated by these NGS applications. Exome sequencing identifies on average between 20,000 and 24,000 single-nucleotide variants per sample [Bamshad et al., 2011]. Most analysis pipelines for NGS variant data include a step to filter sequenced variants against control datasets such as dbSNP (www.ncbi.nlm.nih.gov/projects/SNP), 1000genomes (www.1000genomes.org), and the Washington Exome Variant Server (http://evs.gs.washington.edu/EVS/). These datasets should be used with caution, however, as there have been reports of “contamination” of some datasets, including dbSNP, with rare pathogenic disease mutations that are not sufficiently annotated [Walsh et al., 2010]. This can lead to the exclusion of potentially pathogenic variants on the basis of their presence in these datasets.
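The control-dataset filtering step, and the caution attached to it, can be sketched as follows. This is a minimal illustration under stated assumptions: the `Variant` record, the function name, and the toy coordinates are hypothetical, and the in-memory sets stand in for lookups against real resources such as dbSNP. The optional `known_pathogenic` set is one possible safeguard, so that a variant present in a control dataset is not discarded if it is also a documented disease mutation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record: chromosome, position, alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_against_controls(
    candidates: Iterable[Variant],
    control_variants: Set[Variant],
    known_pathogenic: Set[Variant] = frozenset(),
) -> List[Variant]:
    """Drop variants seen in controls unless they are documented as pathogenic."""
    kept = []
    for variant in candidates:
        in_controls = variant in control_variants
        rescued = variant in known_pathogenic
        if in_controls and not rescued:
            continue  # treated as a likely benign polymorphism
        kept.append(variant)
    return kept


# Toy data only; the coordinates do not refer to real variants.
controls = {Variant("1", 100, "A", "G"), Variant("2", 200, "C", "T")}
pathogenic = {Variant("2", 200, "C", "T")}
sample = [
    Variant("1", 100, "A", "G"),
    Variant("2", 200, "C", "T"),
    Variant("3", 300, "G", "A"),
]
naive = filter_against_controls(sample, controls)
cautious = filter_against_controls(sample, controls, pathogenic)
print(len(naive), len(cautious))
```

In this toy run the naive filter discards the variant that also appears in the pathogenic set, which is exactly the failure mode described above; the cautious call retains it for review.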
Subsequent filtering steps take into account specific inheritance patterns (e.g., exclusion of all but homozygous variants in recessive disorders) and may make use of linkage data or homozygosity mapping, where available, to further refine the list of variants. Variants can also be stratified according to the impact of the variant on protein structure and function and the degree of evolutionary conservation. A range of bioinformatics tools are available to enable this including the commonly used SIFT (Sorting Intolerant from Tolerant) and PolyPhen2. Often multiple tools are used for in silico analyses of pathogenicity. A study by Thusberg et al. (2011), investigating the performance of pathogenicity prediction methods, found MutPred and SNPs&GO to be the best performing tools; however, they noted that no single method performed optimally according to their specified parameters. Functional analysis of variants is usually limited to research laboratories and is generally not practicable in a diagnostic setting.
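As a rough sketch of these refinement steps for a recessive disorder, the example below keeps only homozygous calls, optionally restricts them to a linked or homozygous interval, and ranks what remains by a predicted-impact score. All names are hypothetical, the interval is simplified to positions on a single chromosome, and the `score` field is a placeholder rather than the output of SIFT, PolyPhen2, or any other named tool.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Call(NamedTuple):
    """Hypothetical variant call on a single chromosome."""
    gene: str
    pos: int
    genotype: str  # "hom" or "het"
    score: float   # placeholder impact score; higher means more damaging


def refine_recessive(
    calls: Iterable[Call],
    region: Optional[Tuple[int, int]] = None,
) -> List[Call]:
    """Keep homozygous calls, optionally within a region, ranked by score."""
    kept = [call for call in calls if call.genotype == "hom"]
    if region is not None:
        start, end = region
        kept = [call for call in kept if start <= call.pos <= end]
    # Highest predicted impact first.
    return sorted(kept, key=lambda call: call.score, reverse=True)


# Toy calls with invented gene labels and scores.
calls = [
    Call("GENE_A", 150, "hom", 0.20),
    Call("GENE_B", 400, "het", 0.90),
    Call("GENE_C", 250, "hom", 0.80),
    Call("GENE_D", 900, "hom", 0.95),
]
all_hom = refine_recessive(calls)
in_region = refine_recessive(calls, region=(100, 300))
print([call.gene for call in all_hom])
print([call.gene for call in in_region])
```

A strict homozygous-only filter of this kind would miss compound heterozygotes, so in practice the genotype rule would need to be relaxed for families in which that mechanism is plausible.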
Equally significant in advancing such research will be the extensive sharing of next-generation datasets and associated phenotypic information for the benefit of national and international collaborations. The Genome Variant Database for Neuromuscular Diseases (hihg.med.miami.edu/gvd-nmd) was recently launched by the Miller School of Medicine at the University of Miami. The aim of the resource is to share genomic data on patients and families with neuromuscular disorders including Charcot–Marie–Tooth disease, hereditary spastic paraplegia, and amyotrophic lateral sclerosis. Complete variant data determined through exome sequencing are provided for a range of families in which the pathogenic mutation is currently unknown. Although ataxia families are not currently represented, this is a good model for NGS data sharing in investigating neurological disorders.
Neurogenetics Databases
Although the challenges in developing mutation databases for ataxia genetics are by no means unique, they are particularly well aligned to those outlined by the Human Variome Project (HVP) and its neurogenetics consortium, which aims to develop a global collaboration for the collection, storage, interpretation, and sharing of genetic variation [Cotton et al., 2009]. Recent meetings of the HVP neurogenetics consortium [Haworth et al., 2010] have determined that global access to comprehensive repositories of genetic variant data was particularly apposite for neurogenetics due to the large number of disease genes, significant genetic heterogeneity, clinical variability, and complex genotype–phenotype relationships in neurological disorders. The consortium also noted significant shortcomings in the current provision of databases for genes relevant to neurogenetics.
Ataxia Gene Variants within General Mutation Databases
Although a number of publicly available, well-curated neurological databases exist, including those for Charcot–Marie–Tooth disease (Inherited Peripheral Neuropathies Database: www.molgen.ua.ac.be/cmtmutations), Parkinson disease (PDGene database: www.pdgene.org), and Alzheimer disease (AD & FTD Mutation Database: www.molgen.ua.ac.be/admutations), ataxia genes are poorly served by comprehensive databases. Before detailing the currently available ataxia-specific resources, it is first worth considering the general mutation databases in which the majority of ataxia gene variation has been deposited. Both the Online Mendelian Inheritance in Man database (OMIM: www.ncbi.nlm.nih.gov/omim) and the Human Gene Mutation Database (HGMD: www.hgmd.org) contain variant information on ataxia genes curated from the medical literature. OMIM is a publicly available resource accessed through the National Center for Biotechnology Information website and provides, where available, detailed clinical information albeit with a limited selection of reported variants. The HGMD attempts to curate all known published gene mutations responsible for human disease through automated searches of medical literature and also includes variants reported in locus-specific databases (LSDBs). Access to the full HGMD database requires a commercial license although a limited version is publicly available. Unlike OMIM, however, the HGMD eschews detailed phenotypic information.
For the more widely tested ataxia genes (SCAs 1, 2, 3, 6, 7, 12, 17, DRPLA, FXN, ATM, AOA1/2), diagnostic laboratories hold a wealth of legacy data on pathogenic fragment lengths, other deleterious mutations, and nonpathogenic polymorphisms. Variant information from diagnostic laboratories is usually reported through medical literature and not through direct submission to online databases. This may in part reflect the seemingly laborious process of data submission; however, concerns about future data ownership, patient consent, and confidentiality issues are equally relevant. In the United Kingdom, the Diagnostic Mutation Database, established in 2005 by the National Genetic Reference Laboratory, is intended as a repository of diagnostic variant data, to support the diagnostic process in UK genetic testing laboratories. This resource is primarily aimed at UK diagnostic laboratories although a large number of international laboratories, including those in China, Canada, and New Zealand, have signed up to the service. No ataxia genes are currently represented on the database; this is likely because the majority of the ataxia gene mutations commonly tested in diagnostic laboratories are of the repeat expansion type, for which interpretation and genotype–phenotype correlation depend mainly on expanded allele length and for which the pathogenic ranges are generally well documented in the medical literature. Advances in sequencing technology are likely to soon result in a much broader range of ataxia genes being tested in a diagnostic setting. This will strengthen the need to employ such databases in order to share variant data, to support the interpretation of new variants, and to improve the quality and consistency of diagnoses.
Locus- and Disease-Specific Databases
It has been widely argued that the best way to share data is through publicly available, open-access databases and that LSDBs are a viable solution to meeting this need [Samuels and Rouleau, 2011]. LSDBs are well suited to high-penetrance monogenic disorders, typified by the various inherited ataxias. Although some exist for a handful of ataxia genes, the list is far from comprehensive and does not reflect the extent of variant data that has been collected on these genes by research and diagnostic laboratories worldwide. The most widely available platform for the creation of LSDBs is the Leiden Open Variation Database (LOVD, www.lovd.nl), supported by the European Community's Seventh Framework Programme under the GEN2PHEN project (www.gen2phen.org). Although the creation of these databases is straightforward, there are well-reported limitations relevant to the ongoing curation of variant data and maintenance of the database [Cotton et al., 2008]. Detailed recommendations for the curation of LSDBs have previously been published [Celli et al., 2011], and while it is acknowledged that expertly curated, up-to-date LSDBs offer significant benefits for patients and the research community, continued funding of these projects is a not inconsiderable challenge.
Only one truly ataxia disease-specific LSDB is currently listed on the Human Genome Variation Society mutation database list (www.hgvs.org/dblist/glsdb.html), although there are a number of specific genes curated within other more general gene collections (see Table 5). Most of the known ataxia genes are represented in some form on various LOVD installations; however, the majority of these have merely been identified by the LOVD team as being in need of a curator and have no variant submissions to date. None of the conventional mutation SCA genes are represented in LSDBs, although most of the reported mutations in these genes can be found within HGMD. A number of the recessive genes are listed on LOVD, including those for several of the metabolic ataxias.
Name | Website | Genes listed | Phenotypic information | Unique variants | Last updated | Format |
---|---|---|---|---|---|---|
SCA-LSVD | http://miracle.igib.res.in/ataxia | ATXN1, ATXN2, ATXN3, ATXN8OS, PPP2R2B, ATN1, ATXN7, CACNA1A, ATXN10, TBP, FXN | Yes | 612 (repeat size only) | February 2009 | LOVD |
CACNA1A (EA2/FHM) | http://www.LOVD.nl/CACNA1A | CACNA1A | Yes | 120 | February 2011 | LOVD |
SETX | http://www.LOVD.nl/SETX | SETX | Yes | 19 | June 2011 | LOVD |
SETX | http://149.142.212.78/LOVD/home.php | SETX | Yes | 97 | July 2011 | LOVD |
SACSIN database | http://www.medgen.mcgill.ca/SACSIN | SACS | No | 49 | July 2008 | Excel |
ATM | http://www.LOVD.nl/ATM | ATM | Yes | 430 | January 2012 | LOVD |
Human DNA POLG Mutation Database | http://tools.niehs.nih.gov/polg | POLG | Yes | ∼230 | Unknown | HTML |
Cerebrotendinous xanthomatosis | www.lovd.nl/CYP27A1 | CYP27A1 | No | 57 | May 2010 | LOVD |
Refsum disease | http://www.dbpex.org/home.php | PEX7 | Yes | 39 | February 2008 | LOVD |
Niemann–Pick type C disease gene variation database | http://npc.fzk.de | NPC1 | Yes | 244 | May 2011 | Web-form |
Niemann–Pick type C disease gene variation database | http://npc.fzk.de | NPC2 | Yes | 18 | May 2011 | Web-form |
SCA-LSVD is a LOVD installation that was created to deposit repeat-oriented variant information on 400 SCA families identified from a tertiary referral center in north India between 1998 and 2007 [Faruq et al., 2009]. Data on the repeat size of SCAs 1, 2, 3, 6, 7, 8, 12, 17, and FXN are reported on the database together with detailed phenotypic information on the individuals screened. As the data were all collected through fragment length analysis, no additional nonpathogenic polymorphism data were submitted. The study authors report that they were in the process of curating variations on all ataxia-related genes and while this aim is consistent with the intended function of such databases, it is worth noting that no submissions have been made to the database for approximately 2 years. As a means to share variant information obtained in epidemiological studies, LOVD installations are undoubtedly convenient but it would perhaps be more useful to the wider research community if disease-specific databases such as this were curated on an ongoing basis with a gene list that reflects the current population of known disease genes.
Ataxia Disease Registries
It has been suggested that a strategy for the development of neurological LSDBs should be initiated by international, multidisciplinary, disease-centered networks. One of the achievements of the EUROSCA project (www.eurosca.org), funded by the European Commission, was to establish the world's largest DNA registry of SCA patients, together with detailed clinical information. Data were collected on over 3,000 patients affected by dominantly inherited ataxia, including those both with and without a genetic diagnosis. This Internet-based registry is available to participating investigators and is arguably one of the largest collections of ataxia gene variant information.
EFACTS (www.e-facts.eu) is a project funded under the EU FP7 framework that has engaged a network of European collaborators to adopt a translational research strategy for FRDA. One of the primary aims of this project is to populate a pan-European FRDA database linked to biobanks of patient material. Like EUROSCA, this registry will be available only to participating investigators, and it is not clear whether the FXN gene variant data collated in this project will be made publicly available.
It is not clear which organizations are best suited to meeting the challenges of developing comprehensive variant databases for ataxia genes linked to detailed phenotypic information. Although significant financial, technical, and ethical issues regarding the use of large patient datasets are yet to be fully addressed, robust guidance for tackling these issues has been provided by a number of interested parties. Given the rapid advances in sequencing technology, it is imperative that ataxia research groups worldwide adopt a coherent strategy for meeting these challenges.