Volume 33, Issue 9 pp. 1324-1332
Review

The inherited ataxias: Genetic heterogeneity, mutation databases, and future directions in research and clinical diagnostics

Joshua Hersheson

Corresponding Author

Department of Molecular Neuroscience, Institute of Neurology, University College London, London, United Kingdom

Department of Molecular Neuroscience, Institute of Neurology, University College London, Queen Square, London WC1N 3BG, United Kingdom.
Andrea Haworth

Department of Molecular Neuroscience, Institute of Neurology, University College London, London, United Kingdom

Henry Houlden

Department of Molecular Neuroscience, Institute of Neurology, University College London, London, United Kingdom

MRC Centre for Neuromuscular Diseases, Institute of Neurology and The National Hospital for Neurology and Neurosurgery, Queen Square, London WC1N 3BG, UK

First published: 11 June 2012
Citations: 88

For the Databases in Neurogenetics Special Issue

Abstract

The inherited cerebellar ataxias are a diverse group of clinically and genetically heterogeneous neurodegenerative disorders. Inheritance patterns of these disorders can be complex with autosomal dominant, autosomal recessive, X-linked, and mitochondrial inheritance demonstrated by one or more ataxic syndromes. The broad range of mutation types found in inherited ataxia contributes to the complex genetic etiology of these disorders. The majority of inherited ataxias are caused by repeat expansions; however, conventional mutations are important causes of the rarer dominant and recessive ataxias. Advances in sequencing technology have allowed for much broader testing of these rare ataxia genes. This is relevant to the aims of the Human Variome Project, which aims to collate and store gene variation data through mutation databases. Variant data is currently located in a range of public and commercial resources. Few locus-specific databases have been created to catalogue variation in the dominant ataxia genes although there are several databases for some recessive genes. Developing these resources will facilitate a better understanding of the complex genotype–phenotype relationships in these disorders and assist interpretation of gene variants as testing for rarer ataxia genes becomes commonplace. Hum Mutat 33:1324–1332, 2012. © 2012 Wiley Periodicals, Inc.

Introduction

The inherited cerebellar ataxias are a group of neurodegenerative disorders in which the dominant feature is progressive cerebellar degeneration, resulting in impairment of balance, gait, coordinated limb movements, and speech. Ataxia may present as an isolated cerebellar syndrome or more often is associated with a broad spectrum of neurological manifestations including pyramidal, extrapyramidal, sensory, and cognitive dysfunction. Given the significant clinical heterogeneity of these disorders and the complexity of the cerebellum and its associated connections, it is unsurprising that there exists significant genetic heterogeneity.

The prevalence of genetic forms of ataxia has been difficult to accurately determine and several reviewers suggest that previous data have underestimated the extent of the problem [Klockgether, 2011]. Worldwide prevalence of autosomal dominant ataxia has been reported to be between 1.2 and 41:100,000 [Wardle and Robertson, 2007].

Inheritance patterns of these disorders can be complex, with autosomal dominant, autosomal recessive, X-linked, and mitochondrial inheritance demonstrated by one or more ataxic syndromes. The broad range of mutation types found in the inherited ataxias contributes to the complex genetic etiology of these disorders. Many of the more common ataxias are caused by a range of polynucleotide repeat expansions including trinucleotide, pentanucleotide, and hexanucleotide repeats; however, point mutations, deletions, and duplications are also represented. Dominant cerebellar ataxia genetic loci are designated with the spinocerebellar ataxia (SCA) prefix, with SCA36 being the most recent locus to be reported. The precise number of loci is open to interpretation as the list of approved SCA designations has been “polluted” with a variety of other related syndromes (SCA18, a predominantly sensory ataxia; SCA29, a congenital nonprogressive ataxia), allelic disorders [SCA15/16], and an SCA designation without a reported locus [SCA9]. Setting these discrepancies aside, there are a total of 30 separate SCA loci, with the associated genetic variant identified in 21 of these. In addition, there are a number of other dominantly inherited neurological syndromes in which ataxia is a prominent feature, including Huntington disease, dentatorubral-pallidoluysian atrophy (DRPLA), Alexander disease, and Gerstmann–Sträussler–Scheinker disease. The recessive ataxias are even more diverse, with nearly 100 genes having been identified (Washington Neuromuscular Website: www.neuromuscular.wustl.edu/ataxia/recatax). Although many of these are exceedingly rare, the commonest inherited ataxia worldwide is caused by recessively inherited mutations in the frataxin (FXN) gene, resulting in Friedreich ataxia (FRDA).

This article will provide an overview of the genetics of the inherited ataxias with a focus on the rarer spinocerebellar ataxia genes caused by conventional mutations. In the second part of the article, the current situation with regard to mutation databases in ataxia will be discussed.

Genetics of the Autosomal Dominant Ataxias

Due to Repeat Expansions

The majority of the dominantly inherited ataxias are caused by repeat expansions in either coding or noncoding parts of the relevant genes [Dueñas et al., 2006]. Polyglutamine (CAGn) expansions are the most common of these and comprise SCAs 1, 2, 3, 6, 7, and 17 and DRPLA. Genotype–phenotype correlations of these disorders are well described [Schöls et al., 2004] with the disease manifesting above a threshold of CAG repeats. The noncoding expansion SCAs comprise SCA8 (CTGn), SCA10 (ATTCTn), SCA12 (CAGn), SCA31 (TGGAAn), and SCA36 (GGCCTGn). Larger repeat numbers generally result in an earlier age of onset and more severe phenotype (genetic anticipation). Anticipation has been demonstrated to a varying degree by all of the repeat expansion SCAs. A summary of the dominant ataxia genes is provided in Table 1.
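The idea of a repeat-length threshold can be illustrated with a minimal Python sketch that counts the longest uninterrupted run of a repeat unit in a sequence string. The function names, the toy sequence, and the threshold value passed in are hypothetical and chosen only for illustration; they are not pathogenic ranges for any gene, and diagnostic sizing of expansions relies on dedicated laboratory assays rather than string matching.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    seq = sequence.upper()
    unit = motif.upper()
    # A lookahead lets the search start at every position, so a long run is
    # still found when an earlier, shorter match overlaps its start.
    pattern = re.compile(f"(?=((?:{re.escape(unit)})+))")
    best = 0
    for match in pattern.finditer(seq):
        best = max(best, len(match.group(1)) // len(unit))
    return best


def exceeds_threshold(sequence: str, motif: str, threshold: int) -> bool:
    """True when the longest run is above a caller-supplied (hypothetical) threshold."""
    return longest_repeat_run(sequence, motif) > threshold


# Toy sequence (not real data): 12 CAG units, one CAA interruption, 3 more CAG units.
toy_read = "GGT" + "CAG" * 12 + "CAA" + "CAG" * 3 + "TTC"
print(longest_repeat_run(toy_read, "CAG"))      # 12
print(exceeds_threshold(toy_read, "CAG", 10))   # True
```

Counting only uninterrupted runs is a design choice of this sketch; whether interrupted tracts should be counted as a single allele depends on the locus and is not addressed here.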

Table 1. Autosomal Dominant Ataxias
Name | OMIM # | Locus | Gene | Protein | Mutation | % of ADCA | Geographical distribution | Characteristic features
Repeat expansions: coding
SCA1 | 164400 | 6p22.3 | ATXN1 | Ataxin 1 | CAG repeat | 6–27% | Common: South Africa, Japan, India, Italy, Australia | Hyperreflexia, sensory neuropathy, mild cognitive impairment
SCA2 | 183090 | 12q24.12 | ATXN2 | Ataxin 2 | CAG repeat | 13–18% | Common: United States, Spain, India, Mexico, Italy | Polyneuropathy, parkinsonism, dysphagia
SCA3 | 109150 | 14q32.12 | ATXN3 | Ataxin 3 | CAG repeat | 20–50% | Most common worldwide | Spasticity, polyneuropathy, dystonia, parkinsonism
SCA6 | 183086 | 19p13.2 | CACNA1A | Calcium channel, voltage dependent, P/Q type, α1A subunit | CAG repeat | 13–15% | Common: United States, Germany, Australia, Taiwan | Late onset, pure ataxia
SCA7 | 164500 | 3p14.1 | ATXN7 | Ataxin 7 | CAG repeat | 3–5% | Finland, Mexico, South Africa | Retinal degeneration
SCA17 | 607136 | 6q27 | TBP | TATA box-binding protein | CAG repeat | Rare | United Kingdom, Belgium, France, Germany, Japan | Dementia
DRPLA | 125370 | 12p13.31 | ATN1 | Atrophin 1 | CAG repeat | 0.8:100,000 (Japan); rare worldwide | Japan, Portugal, United States | Dementia, epilepsy
HD | 143100 | 4p16.3 | HTT | Huntingtin | CAG repeat | 3–7:100,000 | Worldwide | Chorea, dementia
Repeat expansions: noncoding
SCA8 | 608768 | 13q21.33 | ATXN8OS | Ataxin 8 opposite strand | CTG repeat | 3% | Common: Finland | Pure ataxia
SCA10 | 603516 | 22q13.31 | ATXN10 | Ataxin 10 | ATTCT repeat | Unknown | Mexico, Brazil | Seizures
SCA12 | 604326 | 5q32 | PPP2R2B | Protein phosphatase 2, regulatory subunit B, β | CAG repeat | Rare worldwide; 7% in India | Common: India | Tremor, polyneuropathy
SCA31 | 117210 | 16q21 | BEAN | Brain expressed associated with NEDD4 | TGGAA repeat | 8–40% in Japan; rare worldwide | Japan, esp. Nagano prefecture | Spasmodic torticollis
SCA36 | 614153 | 20p13 | NOP56 | Nuclear protein 56 | GGCCTG repeat | 6.3% (Galicia, Spain); 9 Japanese families | Spain, Japan | Motor neurone involvement
Conventional mutations
SCA5 | 600224 | 11q13 | SPTBN2 | Beta 3 spectrin | Deletions, missense mutations | Rare | United States, Germany, France | Pure ataxia, facial myokymia, gaze palsy
SCA11 | 611695 | 15q15.2 | TTBK2 | Tau tubulin kinase 2 | Nonsense, frameshift deletions/insertions | Rare | United Kingdom, France, Germany | Pure ataxia
SCA13 | 605259 | 19q13.3-q13.4 | KCNC3 | Potassium channel, voltage-gated, Shaw-related subfamily, member 3 | Missense | 1% (France) | France, Philippines | Early onset, mental retardation
SCA14 | 176980 | 19q13.4 | PRKCG | Protein kinase C gamma | Missense, deletion | 2% (France) | United Kingdom, France, Netherlands, United States, Japan, Australia | Myoclonus, dystonia
SCA15 | 606658 | 3p26-p25 | ITPR1 | Inositol 1,4,5-triphosphate receptor type 1 | Multi-exon or whole gene deletion, missense | 1.8% (France); 0.3% (Japan) | United Kingdom, France | Pure ataxia
SCA20 | 608687 | 11q12 | - | - | 260 kb duplication | Rare | Australia | Dentate calcification, bulbar symptoms
SCA23 | 610245 | 20p13 | PDYN | Prodynorphin | Missense | Rare | Netherlands | Pyramidal signs
SCA27 | 609307 | 13q33.1 | FGF14 | Fibroblast growth factor 14 | Missense | Rare | Netherlands | Onset with tremor, psychiatric episodes
SCA28 | 610246 | 18p11.21 | AFG3L2 | ATPase family gene 3-like 2 | Missense, deletion | 3% | Italy, France, United Kingdom | Ophthalmoplegia, spasticity
SCA35 | 613908 | 20p13 | TGM6 | Transglutaminase 6 | Missense | Rare | China | Pure ataxia

Due to Conventional Mutations

A minority of the dominant ataxia syndromes (SCAs 5, 11, 13, 14, 15, 20, 23, 27, 28, and 35) is caused by conventional mutations. In a French ataxia series, conventional mutations accounted for 6% of all dominant ataxia, repeat expansions accounted for 45% with the remaining 48% being genetically undiagnosed [Durr et al., 2009]. Genotype–phenotype correlations are much harder to determine in this group, owing to the limited number of families affected by these mutations. Functional analysis of potassium channels (EA1, SCA13) and calcium channels (SCA6, EA2) has demonstrated a correlation between the degree of functional impairment and the severity of the phenotype. In contrast to the repeat expansion SCAs, these disorders often have a “purer” cerebellar phenotype (ADCAIII), with a slower rate of progression.

SCA5 is caused by a mutation in the SPTBN2 gene, which encodes β-III spectrin [Ikeda et al., 2006]. Missense mutations and in-frame deletions have been described, resulting in a pure cerebellar syndrome with onset between 15 and 50 years. The first SCA5 kindred was reported in 1994 with 56 affected individuals over 10 generations who were descendants of the paternal grandparents of Abraham Lincoln. SCA5 has also been reported in French and German pedigrees [Zühlke et al., 2007].

SCA11, initially reported in two British families, is caused by stop mutations, frameshift insertions or deletions in the TTBK2 gene, resulting in a pure cerebellar syndrome with normal life expectancy [Houlden et al., 2007]. Pathogenic variants in TTBK2 have also been reported in French and German families [Bauer et al., 2010].

SCA13 was initially reported in French and Filipino families and is caused by missense mutations in KCNC3, which encodes a voltage-gated potassium channel [Figueroa et al., 2010]. There is a wide phenotypic spectrum that correlates with different missense mutations. The childhood-onset form, in which motor and mental developmental delay is a common feature, has been associated with two variants: (g.10693G>A p.Arg423His) and (g.10767T>C p.Phe448Leu) described in European and Filipino families, respectively [Figueroa et al., 2011]. Females are more frequently affected and, of the two missense mutations reported, the p.Phe448Leu variant results in the more severe phenotype. The p.Arg423His variant has also been reported in a Caucasian family in the United States.

SCA14 is caused by mutations in PRKCG [Yabe et al., 2003], resulting in a variable ataxic phenotype, which may include myoclonus, dystonia, or peripheral neuropathy. The onset is usually in adulthood. The majority of mutations (missense) have been reported in exons 4, 5, 10, and 18. SCA14 has been reported in more than 20 families from Europe, Japan, and Australia [Klebe et al., 2005].

SCA15/16 is caused by heterozygous deletions of the 5′ part of the ITPR1 gene [van de Leemput et al., 2007] although a missense mutation (c.1480G>A p.V494I) has been reported. The ITPR1 protein is highly expressed in cerebellar Purkinje cells and is an important modulator of intracellular calcium signaling. SCA15/16 is characterized by a mild cerebellar ataxia with slow disease progression. In a French ataxia series, SCA15 was identified in 1.8% of patients [Marelli et al., 2011]. SCA15/16 shares a locus with SCA29, raising the possibility that they are allelic disorders.
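The variants cited in this section use two styles of protein-level notation: three-letter residue codes (p.Arg423His, p.Phe448Leu) and one-letter codes (p.V494I). When such labels are collated from different reports, they need to be brought into one form before they can be compared. The Python sketch below shows one way this might be done for simple single-residue substitutions only. The function and constant names are hypothetical, and the sketch does not attempt to cover the full variant nomenclature (stop codons, frameshifts, deletions, and so on).

```python
import re

# Standard three-letter to one-letter amino acid codes (20 canonical residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_LETTER_CODES = set(THREE_TO_ONE.values())

# Matches only simple substitutions such as p.Arg423His or p.V494I.
_SUBSTITUTION = re.compile(r"^p\.([A-Za-z]+)(\d+)([A-Za-z]+)$")


def _to_one_letter(residue: str) -> str:
    """Convert a one- or three-letter residue code to its one-letter form."""
    if len(residue) == 1 and residue.upper() in ONE_LETTER_CODES:
        return residue.upper()
    three = residue.capitalize()
    if three in THREE_TO_ONE:
        return THREE_TO_ONE[three]
    raise ValueError(f"unrecognised residue code: {residue!r}")


def normalise_protein_substitution(label: str) -> str:
    """Return a substitution label in one-letter form, e.g. 'p.R423H'."""
    match = _SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"not a simple protein substitution: {label!r}")
    ref, position, alt = match.groups()
    return f"p.{_to_one_letter(ref)}{int(position)}{_to_one_letter(alt)}"


for text in ("p.Arg423His", "p.Phe448Leu", "p.V494I"):
    print(text, "->", normalise_protein_substitution(text))
```

Anything the pattern does not recognise raises an error rather than being guessed at, which seems the safer behaviour when labels come from heterogeneous sources.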

SCA20 has been described in a single Australian family of Anglo-Celtic descent and is the result of a 260 kb duplicated region comprising >12 genes at 11q12 [Knight et al., 2004]. Bulbar symptoms including dysphonia and spasmodic cough in addition to dentate nucleus calcification are characteristic of this condition.

SCA23 is due to missense mutations of PDYN [Bakalkin et al., 2010], which encodes prodynorphin protein, an opioid neuropeptide precursor. This causes a relatively pure cerebellar syndrome with a late onset (43–73 years) and slow progression. The disease has been reported in only a single large Dutch ataxia family and was not identified on screening a large German ataxia series [Schicks et al., 2011].

SCA27 causes an early-onset ataxia [Brusse et al., 2006] with associated cognitive deficits, head or limb tremor, and dyskinesia that can be exacerbated by stress or exercise. The causal gene, fibroblast growth factor 14 (FGF14), was identified in a large Dutch kindred, and both missense and nonsense mutations have been reported. There is normal life expectancy; however, most affected patients are unable to walk by the seventh to eighth decade. The disease has also been reported in a German ataxia patient.

SCA28 is caused by a mutation in AFG3L2, which encodes a mitochondrially located metalloprotease [Di Bella et al., 2010]. Missense mutations have been reported which are commonly located in the proteolytic domain of the protein with a mutation hotspot in exons 15–16. SCA28 has a typically early onset between 12 and 36 years and is characterized by a slowly progressive cerebellar ataxia with ophthalmoparesis and lower limb hyperreflexia. The disease is estimated to account for 1.5% of European ADCA cases [Cagnoli et al., 2010].

SCA35 is caused by mutations in the cerebral transglutaminase TGM6 and was the first dominant ataxia gene to be identified through exome sequencing [Wang et al., 2010]. Missense mutations were reported in two Chinese families in which a late-onset cerebellar syndrome with associated upper motor neuron involvement was reported. There was moderate progression, with patients commonly using a wheelchair 20 years after disease onset.

Episodic Ataxias

The episodic ataxias are a group of heterogeneous channel disorders characterized by attacks of ataxia, which may be associated with a range of other neurological manifestations including myokymia, migraine, seizures, or chorea. Eight episodic ataxia syndromes have been described: EA 1–7 and episodic ataxia with paroxysmal choreoathetosis and spasticity (CSE). EA 1 and 2 are the most common and best characterized of these. The genes for EA 1, 2, 5, and 6 (Table 2) have been identified with linkage loci mapped in EA 3, 7, and CSE. Episodic ataxia is rare with a combined incidence of <1:100,000.

Table 2. Episodic Ataxias
Name | OMIM # | Locus | Gene | Protein | Mutation | Geographical distribution | Characteristic features
EA1 | 160120 | 12p13.32 | KCNA1 | Potassium channel, voltage gated, Shaker-related subfamily, member 1 | Missense | Worldwide | Attacks last seconds–minutes; myokymia
EA2 | 108500 | 19p13.2 | CACNA1A | Calcium channel, voltage dependent, P/Q type, α1A subunit | Missense, nonsense, large deletions | Worldwide | Attacks last hours; allelic with SCA6, familial hemiplegic migraine
EA5 | 613855 | 2q23.3 | CACNB4 | Calcium channel, voltage dependent, β-4 subunit | Missense | French-Canadian family | Attacks last hours–days; late onset, seizures
EA6 | 612656 | 5p13.2 | SLC1A3 | Solute carrier family 1 (glial high affinity glutamate transporter), member 3 | Missense | United States, Netherlands | Alternating hemiplegia, seizures

EA1 is primarily due to missense mutations in KCNA1 [Browne et al., 1994] although truncation mutations have been reported. The disease is characterized by brief periods of ataxia (seconds to minutes) and interictal myokymia. The degree of channel impairment correlates with the severity of the phenotype… Mutations associated with severe phenotypes that may be poorly treatment responsive or associated with seizures or neuromyotonia show the most significant impairment of potassium channel function.

EA2 is due to a range of mutations in CACNA1A [Ophoff et al., 1996], which include missense, nonsense, aberrant splicing, and nucleotide insertions and deletions. EA2 is typified by longer periods of ataxia lasting several hours, with baseline nystagmus and progressive ataxia. There is a wide spectrum of phenotypes associated with mutations in CACNA1A. EA2 is allelic with SCA6 and familial hemiplegic migraine (FHM). Most of the mutations that cause EA2 disrupt the open reading frame, whereas FHM is caused primarily by missense mutations.

EA5 has been described in a single French-Canadian family that was heterozygous for a missense mutation in the CACNB4 gene, resulting in a phenotype similar to EA2 [Escayg et al., 2000]. The precise functional effects of this mutation are not clear as the same mutation was identified in a German family with generalized epilepsy but no ataxia.

EA6 was initially reported in a patient from the United States presenting with characteristic episodes of hemiplegia, seizures, and ataxia. A de novo mutation was identified in the SLC1A3 gene, which results in complete loss of function of the protein EAAT1—a glutamate transporter localized to astrocytes. Other cases have been reported in the Netherlands with the p.C186S variant that resulted in a milder phenotype without the manifestations of seizures or alternating hemiplegia [de Vries et al., 2009].

Recessive Ataxias

The recessive ataxias are a particularly diverse group of disorders that are generally early onset with significant variation in clinical phenotype, which is variably associated with neuropathy, ophthalmological disturbance, seizures, and a range of other neurological and non-neurological manifestations. These disorders are discussed in detail in other reviews; a nonexhaustive summary of recessive ataxia genes is listed in Table 3. FRDA is the most common recessive ataxia worldwide [Palau and Espinós, 2006] and is mainly due to homozygous GAA expansions in the FXN gene, although a few patients show compound heterozygosity for a point mutation and the GAA-repeat expansion. Some common pathological pathways have been described in the recessive ataxias including DNA repair dysfunction, mitochondrial dysfunction, defects in lipoprotein metabolism, and protein chaperone dysfunction. There is significant overlap of clinical phenotype with a range of metabolic ataxias (Table 4), which are invariably complex multisystem disorders that can result in severe disability despite dietary modification where possible.

Table 3. Autosomal Recessive Ataxias
Disease | OMIM # | Gene | Protein | Mutation | Incidence/carrier frequency | Geographic distribution | Characteristic features
Friedreich ataxia | 606829 | FXN | Frataxin | GAA repeat expansions; point mutations in compound heterozygotes | Incidence: 1:30,000–50,000; carrier frequency: 0.9–1.6% | Worldwide except natives to: Far East, sub-Saharan Africa, Australia, America | Spasticity, neuropathy, cardiac involvement
Ataxia-telangiectasia | 607585 | ATM | Ataxia telangiectasia mutated | Deletions: splice-site related; nonsense; missense | Incidence: 1:400,000–450,000 live births; carrier frequency: 0.35–1% | Reported in many worldwide populations | Oculomotor apraxia; extrapyramidal features; increased cancer risk/radiosensitivity
Ataxia-telangiectasia like disorder (ATLD) | 604391 | MRE11A | Meiotic recombination 11, S. cerevisiae, homolog of | Missense | 25 reported cases worldwide | Saudi Arabia (15 cases), Japan (4 cases), UK (4 cases), Italy (2 cases) | Similar to ATM but milder phenotype
Ataxia-oculomotor apraxia type 1 | 208920 | APTX | Aprataxin | Insertion, deletion, missense | Rare worldwide; more common in Portuguese and Japanese populations | Portugal, Japan, France, Tunisia | Oculomotor apraxia, peripheral neuropathy
Cerebellar ataxia with muscle coenzyme Q10 deficiency | 607426 | APTX | Aprataxin | Missense | Rare | Single Italian family | Low coenzyme Q10 levels; late-onset hypergonadotrophic hypogonadism
Ataxia-oculomotor apraxia type 2 | 606002 | SETX | Senataxin | Nonsense, missense | Carrier frequency: 2.1–3.5%; incidence: 1:400,000 (Alsace) | Commoner in French-Canadian populations | Oculomotor apraxia (variable); extrapyramidal features; peripheral neuropathy
Spastic ataxia of Charlevoix-Saguenay (ARSACS) | 270550 | SACS | Sacsin | Stop-gain deletions and point mutations most common | Carrier frequency (Quebec): 4.5%; incidence: 1/1930 | Most common in Quebec. Tunisian, Turkish, Italian, Japanese families reported | Myelinated retinal fibers; prominent lower limb spasticity
Cerebellar ataxia, seizures and ubiquinone deficiency | 612016 | ADCK3 | aarF domain containing kinase 3 | Missense, splice site, frameshift, deletion | Rare | French, Dutch, British families reported | Mental retardation, seizures, low coenzyme Q10 levels
Spinocerebellar ataxia with axonal neuropathy (SCAN1) | 607250 | TDP1 | Tyrosyl DNA phosphodiesterase 1 | Missense | Rare | Saudi Arabian family | Axonal neuropathy
Autosomal recessive spinocerebellar ataxia type 8 | 610743 | SYNE1 | Synaptic nuclear envelope protein 1 | Splice site, intronic | Rare worldwide; 3rd most common ARCA in Quebec | Canada | Hyperreflexia
Autosomal recessive spinocerebellar ataxia type 10 | 613728 | ANO10 | Anoctamin 10 | Missense, splice site, deletion | Rare | French, Dutch, Serbian families | Tortuous conjunctival vessels
Table 4. Metabolic Ataxias
Disease | OMIM # | Gene | Protein | Mutation | Incidence/carrier frequency | Geographic distribution | Characteristic features
Ataxia with selective vitamin E deficiency | 277460 | TTPA | Tocopherol transfer protein alpha | Frameshift, missense | Prevalence: 0.55–3.5:1,000,000 | United Kingdom, French, Italian, Moroccan, Japanese families reported | Low vitamin E; resembles FRDA
Abetalipoproteinemia | 200100 | MTTP | Microsomal triglyceride transfer protein | Missense, nonsense | Prevalence: <1:1,000,000 | Global | Acanthocytosis; pigmentary retinal degeneration; polyneuropathy
Refsum disease | 266500 | PHYH | Phytanoyl-CoA hydroxylase | Missense, nonsense, deletions, splice site mutations | Prevalence: 1:1,000,000 | Global | Deafness, retinitis pigmentosa, ichthyosis, demyelinating polyneuropathy
Cerebrotendinous xanthomatosis | 213700 | CYP27A1 | Cytochrome P450 subfamily XXVIIA, polypeptide 1 | Missense, deletions, splice site mutations | Prevalence (Moroccan Jews): 1:108; United States: 1:50,000 | More common in Moroccan Jews | Widespread cholesterol deposits: tendons, brain, lungs; cataracts; dementia
Niemann Pick Type C | 607623 | NPC1 | NPC1 protein | Deletions, point mutations | Prevalence: 1:100,000–150,000 | Global | Extrapyramidal features, seizures, dementia
Wilson's disease | 277900 | ATP7B | ATPase, Cu(2+)-transporting, beta polypeptide | Point mutations, nonsense mutations | Prevalence: 1:10,000–30,000 | Global; higher incidence in China, Japan, Sardinia | Extrapyramidal features; liver disease

Future Directions in Ataxia Research and Diagnostics

The rate of discovery of new ataxia genes has accelerated enormously in recent years, commensurate with rapid advances in next-generation sequencing (NGS) technologies [Bamshad et al., 2011]. Exome sequencing in particular has demonstrated its utility in the identification of causal genes in a variety of Mendelian disorders including ataxia [Montenegro et al., 2011; Ng et al., 2010; Pierson et al., 2011]. Although exome sequencing is a useful tool in new gene discovery, its cost and the significant data storage burden associated with processing exome samples prohibit its routine use in a diagnostic setting. A modification of the technology, targeted enrichment and sequencing [Mertes et al., 2011; Schlipf et al., 2011], will allow focused panels of relevant genes to be sequenced in highly multiplexed and extremely cost-effective runs. This approach is much more likely to replace traditional Sanger sequencing in diagnostic laboratories as the technology becomes cheaper and more widespread, and it has the potential to transform the capabilities of diagnostic laboratories and research groups worldwide.

The provision of diagnostic tests for the inherited ataxias is generally limited by cost and technical considerations. In the United Kingdom, the following tests are available for most patients (depending on the clinical presentation and inheritance pattern): SCA1, 2, 3, 6, 7, 12, 17, HD, DRPLA, PRNP, FXN, ATM, AOA1/2. Testing for rarer genes is often available from specific international diagnostic laboratories or on a research basis by various interested research groups.

It seems likely that in future, diagnostic laboratories will be able to offer relatively low-cost screening for all known ataxia genes using targeted NGS techniques, which are already being employed to screen for genetic conditions including nonsyndromic deafness [Walsh et al., 2010] and hypertrophic cardiomyopathy [Voelkerding et al., 2010].

Despite the advantages of NGS in gene discovery and diagnostics over conventional methods, significant challenges remain with the interpretation and storage of the wealth of data generated by these NGS applications. Exome sequencing identifies on average between 20,000 and 24,000 single-nucleotide variants per sample [Bamshad et al., 2011]. Most analysis pipelines for NGS variant data include a step to filter sequenced variants against control datasets such as dbSNP (www.ncbi.nlm.nih.gov/projects/SNP), 1000genomes (www.1000genomes.org), and the Washington Exome Variant Server (http://evs.gs.washington.edu/EVS/). These datasets should be used with caution, however, as there have been reports of “contamination” of some datasets, including dbSNP, with rare pathogenic variants and disease mutations that are not sufficiently annotated [Walsh et al., 2010]. This can lead to the exclusion of potentially pathogenic variants on the basis of their presence in these datasets.
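The caution about control datasets can be made concrete with a minimal Python sketch. It removes candidate variants that appear in a control set, but sets aside, rather than silently discarding, any that are also on a list of reported disease mutations. The `Variant` tuple, the function name, and the toy coordinates are hypothetical; the sketch assumes the control and disease lists have already been loaded into memory and does not query dbSNP, 1000genomes, or any other real resource.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Minimal variant key: chromosome, position, reference and alternate allele."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_against_controls(
    candidates: Iterable[Variant],
    control_variants: Iterable[Variant],
    known_pathogenic: Iterable[Variant],
) -> Tuple[List[Variant], List[Variant]]:
    """Split candidates into (kept, flagged_for_review).

    kept    -- variants absent from the control set
    flagged -- variants present in the control set but also reported as
               disease mutations, so they should be reviewed, not dropped
    Variants found only in the control set are excluded.
    """
    controls = set(control_variants)
    pathogenic = set(known_pathogenic)
    kept: List[Variant] = []
    flagged: List[Variant] = []
    for variant in candidates:
        if variant not in controls:
            kept.append(variant)
        elif variant in pathogenic:
            flagged.append(variant)
    return kept, flagged


# Toy data (hypothetical coordinates, for illustration only).
candidates = [
    Variant("19", 1000, "G", "A"),
    Variant("3", 2000, "C", "T"),
    Variant("12", 3000, "T", "C"),
]
controls = [Variant("3", 2000, "C", "T"), Variant("12", 3000, "T", "C")]
reported_disease = [Variant("12", 3000, "T", "C")]

kept, flagged = filter_against_controls(candidates, controls, reported_disease)
print("kept:", kept)
print("flagged for review:", flagged)
```

Keeping a separate review list is only one possible design; a pipeline might instead filter on allele frequency thresholds, which this sketch does not model.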

Subsequent filtering steps take into account specific inheritance patterns (e.g., exclusion of all but homozygous variants in recessive disorders) and may make use of linkage data or homozygosity mapping, where available, to further refine the list of variants. Variants can also be stratified according to the impact of the variant on protein structure and function and the degree of evolutionary conservation. A range of bioinformatics tools is available for this purpose, including the commonly used SIFT (Sorting Intolerant from Tolerant) and PolyPhen2. Often multiple tools are used for in silico analyses of pathogenicity. A study by Thusberg et al. (2011), investigating the performance of pathogenicity prediction methods, found MutPred and SNPs&GO to be the best performing tools; however, they noted that no single method performed optimally according to their specified parameters. Functional analysis of variants is usually limited to research laboratories and is generally not practicable in a diagnostic setting.
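The two steps described above, restricting to homozygous calls under a recessive model and then stratifying by several in silico predictions, might be sketched in Python as follows. The `Call` record, the field names, and the tool labels "tool_a" and "tool_b" are hypothetical placeholders; the predictions are assumed to have been computed elsewhere, and no real prediction tool is called.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Call:
    """One variant call in one sample (hypothetical minimal record)."""
    variant_id: str
    genotype: str  # "hom" or "het"
    # Tool name -> whether that tool predicted the variant to be damaging.
    predictions: Dict[str, bool] = field(default_factory=dict)


def recessive_candidates(calls: List[Call]) -> List[Call]:
    """Keep only homozygous calls, as in the recessive example in the text."""
    return [call for call in calls if call.genotype == "hom"]


def rank_by_consensus(calls: List[Call]) -> List[Call]:
    """Order calls by how many tools predict damage, most agreement first."""
    return sorted(calls, key=lambda call: sum(call.predictions.values()), reverse=True)


# Toy calls with precomputed, made-up predictions.
calls = [
    Call("v1", "het", {"tool_a": True, "tool_b": True}),
    Call("v2", "hom", {"tool_a": False, "tool_b": True}),
    Call("v3", "hom", {"tool_a": True, "tool_b": True}),
]

shortlist = rank_by_consensus(recessive_candidates(calls))
print([call.variant_id for call in shortlist])  # ['v3', 'v2']
```

A strict homozygous-only filter would discard compound heterozygotes, so a real recessive analysis would probably also need to consider pairs of heterozygous variants in the same gene; that case is left out of this sketch.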

Equally significant in advancing such research will be the extensive sharing of next-generation datasets and associated phenotypic information for the benefit of national and international collaborations. The Miller School of Medicine at the University of Miami recently launched the Genome Variant Database for Neuromuscular Diseases (hihg.med.miami.edu/gvd-nmd). The aim of the resource is to share genomic data on patients and families with neuromuscular disorders including Charcot–Marie–Tooth disease, hereditary spastic paraplegia, and amyotrophic lateral sclerosis. Complete variant data determined through exome sequencing are provided for a range of families in which the pathogenic mutation is currently unknown. Although ataxia families are not currently represented, this is a good model for NGS data sharing in investigating neurological disorders.

Neurogenetics Databases

Although the challenges in developing mutation databases for ataxia genetics are by no means unique, they are particularly well aligned to those outlined by the Human Variome Project (HVP), and its neurogenetics consortium, which aims to develop a global collaboration for the collection, storage, interpretation, and sharing of genetic variation [Cotton et al., 2009]. Recent meetings of the HVP neurogenetics consortium [Haworth et al., 2010] have determined that global access to comprehensive repositories of genetic variant data was particularly apposite for neurogenetics due to the large number of disease genes, significant genetic heterogeneity, clinical variability, and complex genotype–phenotype relationships in neurological disorders. The meetings also noted significant shortcomings in the current situation with regard to databases for genes relevant to neurogenetics.

Ataxia Gene Variants within General Mutation Databases

Although a number of publicly available, well-curated neurological databases exist, including those for Charcot–Marie–Tooth disease (Inherited Peripheral Neuropathies Database, www.molgen.ua.ac.be/cmtmutations), Parkinson disease (PDGene database, www.pdgene.org), and Alzheimer disease (AD & FTD Mutation Database, www.molgen.ua.ac.be/admutations), ataxia genes are poorly represented in comprehensive databases. Before detailing the currently available ataxia-specific resources, it is first worth considering the general mutation databases in which the majority of ataxia gene variation has been deposited. Both the Online Mendelian Inheritance in Man database (OMIM, www.ncbi.nlm.nih.gov/omim) and the Human Gene Mutation Database (HGMD, www.hgmd.org) contain variant information on ataxia genes curated from the medical literature. OMIM is a publicly available resource accessed through the National Center for Biotechnology Information website and provides, where available, detailed clinical information, albeit with a limited selection of reported variants. The HGMD attempts to curate all known published gene mutations responsible for human disease through automated searches of medical literature and also includes variants reported in locus-specific databases (LSDBs). Access to the full HGMD database requires a commercial license, although a limited version is publicly available. Unlike OMIM, however, the HGMD eschews detailed phenotypic information.

For the more widely tested ataxia genes (SCAs 1, 2, 3, 6, 7, 12, 17, DRPLA, FXN, ATM, AOA1/2), diagnostic laboratories hold a wealth of legacy data on pathogenic fragment lengths, other deleterious mutations, and nonpathogenic polymorphisms. Most variant information from diagnostic laboratories is reported through medical literature and not through direct submission to online databases. This may in part reflect the seemingly laborious process of data submission; however, concerns about future data ownership, patient consent, and confidentiality issues are equally relevant. In the United Kingdom, the Diagnostic Mutation Database, established in 2005 by the National Genetic Reference Laboratory, is intended as a repository of diagnostic variant data, to support the diagnostic process in UK genetic testing laboratories. This resource is primarily aimed at UK diagnostic laboratories, although a large number of international laboratories, including those in China, Canada, and New Zealand, have signed up to the service. No ataxia genes are currently represented on the database. This is likely because the majority of the ataxia gene mutations commonly tested in diagnostic laboratories are of the repeat expansion type, for which interpretation and genotype–phenotype correlation depend mainly on expanded allele length and for which the pathogenic ranges are generally well documented in the medical literature. Advances in sequencing technology are likely to soon result in a much broader range of ataxia genes being tested in a diagnostic setting. This will strengthen the need to employ such databases in order to share variant data, to support the interpretation of new variants, and to improve the quality and consistency of diagnoses.

Locus- and Disease-Specific Databases

It has been widely argued that the best way to share data is through publicly available, open-access databases and that LSDBs are a viable means of meeting this need [Samuels and Rouleau, 2011]. LSDBs are well suited to high-penetrance monogenic disorders such as the various inherited ataxias. Although some exist for a handful of ataxia genes, the list is far from comprehensive and does not reflect the extent of variant data that has been collected on these genes by research and diagnostic laboratories worldwide. The most widely available platform for the creation of LSDBs is the Leiden Open Variation Database (LOVD, www.lovd.nl), supported by the European Community's Seventh Framework Programme under the GEN2PHEN project (www.gen2phen.org). Although the creation of these databases is straightforward, there are well-reported limitations relating to the ongoing curation of variant data and maintenance of the database [Cotton et al., 2008]. Detailed recommendations for the curation of LSDBs have previously been published [Celli et al., 2011], and while it is acknowledged that expertly curated, up-to-date LSDBs offer significant benefits for patients and the research community, continued funding of these projects is a considerable challenge.

Only one truly ataxia disease-specific LSDB is currently listed on the Human Genome Variation Society mutation database list (www.hgvs.org/dblist/glsdb.html), although a number of specific genes are curated within other, more general gene collections (see Table 5). Most of the known ataxia genes are represented in some form on various LOVD installations; however, the majority of these have merely been identified by the LOVD team as being in need of a curator and have no variant submissions to date. None of the conventional mutation SCA genes are represented in LSDBs, although most of the reported mutations in these genes can be found within HGMD. A number of the recessive genes are listed on LOVD, including those for several of the metabolic ataxias.

Table 5. Ataxia Mutation Databases
Name | Website | Genes listed | Phenotypic information | Unique variants | Last updated | Format
SCA-LSVD | http://miracle.igib.res.in/ataxia | ATXN1, ATXN2, ATXN3, ATXN8OS, PPP2R2B, ATN1, ATXN7, CACNA1A, ATXN10, TBP, FXN | Yes | 612 (repeat size only) | February 2009 | LOVD
CACNA1A (EA2/FHM) | http://www.LOVD.nl/CACNA1A | CACNA1A | Yes | 120 | February 2011 | LOVD
SETX | http://www.LOVD.nl/SETX | SETX | Yes | 19 | June 2011 | LOVD
 | http://149.142.212.78/LOVD/home.php | SETX | Yes | 97 | July 2011 | LOVD
SACSIN database | http://www.medgen.mcgill.ca/SACSIN | SACS | No | 49 | July 2008 | Excel
ATM | http://www.LOVD.nl/ATM | ATM | Yes | 430 | January 2012 | LOVD
Human DNA POLG Mutation Database | http://tools.niehs.nih.gov/polg | POLG | Yes | ∼230 | Unknown | HTML
Cerebrotendinous xanthomatosis | www.lovd.nl/CYP27A1 | CYP27A1 | No | 57 | May 2010 | LOVD
Refsum disease | http://www.dbpex.org/home.php | PEX7 | Yes | 39 | February 2008 | LOVD
Niemann–Pick type C disease gene variation database | http://npc.fzk.de | NPC1 | Yes | 244 | May 2011 | Web-form
 | | NPC2 | Yes | 18 | |

SCA-LSVD is a LOVD installation that was created to deposit repeat-oriented variant information on 400 SCA families identified from a tertiary referral center in north India between 1998 and 2007 [Faruq et al., 2009]. Data on the repeat size of SCAs 1, 2, 3, 6, 7, 8, 12, 17, and FXN are reported on the database together with detailed phenotypic information on the individuals screened. As the data were all collected through fragment length analysis, no additional nonpathogenic polymorphism data were submitted. The study authors report that they were in the process of curating variations on all ataxia-related genes and while this aim is consistent with the intended function of such databases, it is worth noting that no submissions have been made to the database for approximately 2 years. As a means to share variant information obtained in epidemiological studies, LOVD installations are undoubtedly convenient but it would perhaps be more useful to the wider research community if disease-specific databases such as this were curated on an ongoing basis with a gene list that reflects the current population of known disease genes.

Ataxia Disease Registries

It has been suggested that a strategy for the development of neurological LSDBs should be initiated by international, multidisciplinary, disease-centered networks. One of the achievements of the EUROSCA project (www.eurosca.org), funded by the European Commission, was to establish the world's largest DNA registry of SCA patients, together with detailed clinical information. Data were collected on over 3,000 patients affected by dominantly inherited ataxia, including both those with and those without a genetic diagnosis. This Internet-based registry is available to participating investigators and is arguably one of the largest collections of ataxia gene variant information.

EFACTS (www.e-facts.eu) is a project funded under the EU FP7 framework that has engaged a network of European collaborators to adopt a translational research strategy for FRDA. One of the primary aims of this project is to populate a pan-European FRDA database linked to bio-banks of patient material. Like EUROSCA, this registry will be available only to participating investigators, and it is not clear whether the FXN gene variant data collated in this project will be made publicly available.

It is not clear which organizations are best suited to meeting the challenges of developing comprehensive variant databases for ataxia genes linked to detailed phenotypic information. Although significant financial, technical, and ethical issues regarding the use of large patient datasets are yet to be fully addressed, robust guidance for tackling these issues has been provided by a number of interested parties. Given the rapid advances in sequencing technology, it is imperative that ataxia research groups worldwide adopt a coherent strategy for meeting these challenges.
