Diagnostic Exome Sequencing to Elucidate the Genetic Basis of Likely Recessive Disorders in Consanguineous Families
Communicated by Richard A. Gibbs
Contract grant sponsors: Gebert Ruf Stiftung Foundation; the European Union ERC (FP7-IDEAS-ERC, 249968); Swiss SNF; The von Meissner Foundation; Bodossaki Foundation; Scientific Exchange Program between Switzerland and the New Member States of the European Union.
ABSTRACT
Rare, atypical, and undiagnosed autosomal-recessive disorders frequently occur in the offspring of consanguineous couples. Current routine diagnostic genetic tests fail to establish a diagnosis in many cases. We employed exome sequencing to identify the underlying molecular defects in patients with unresolved but putatively autosomal-recessive disorders in consanguineous families and postulated that the pathogenic variants would reside within homozygous regions. Fifty consanguineous families participated in the study, with a wide spectrum of clinical phenotypes suggestive of autosomal-recessive inheritance, but with no definitive molecular diagnosis. DNA samples from the patient(s), unaffected sibling(s), and the parents were genotyped with a 720K SNP array. Exome sequencing and array CGH (comparative genomic hybridization) were then performed on one affected individual per family. High-confidence pathogenic variants were found in homozygosity in known disease-causing genes in 18 families (36%) (one by array CGH and 17 by exome sequencing), accounting for the clinical phenotype in whole or in part. In the remainder of the families, no causative variant in a known pathogenic gene was identified. Our study shows that exome sequencing, in addition to being a powerful diagnostic tool, promises to rapidly expand our knowledge of rare genetic Mendelian disorders and can be used to establish more detailed causative links between mutant genotypes and clinical phenotypes.
Introduction
The recent focus on the molecular basis of apparently Mendelian disorders has been successful in identifying thousands of pathogenic variants in protein-coding genes causing these phenotypes. Since the discovery of the first human gene mutations responsible for α- and β-thalassemia in the late 1970s [Orkin et al., 1978; Chang and Kan, 1979], a total of >3,100 protein-coding genes have been causatively linked to 3,684 inherited disorders in OMIM (Online Mendelian Inheritance in Man; http://www.omim.org), and a total of more than 150,000 different pathogenic variants have been identified [Cooper et al., 2010]. The recent rapid evolution of sequencing technologies and our increasing knowledge of the structure and function of the human genome have together served to accelerate these discoveries, thereby providing accurate diagnoses as well as a detailed understanding of the pathophysiology of a considerable number of inherited disorders. This notwithstanding, the majority of patients with rare autosomal-recessive disorders remain undiagnosed even after performing available imaging, biochemical, pathological, and genetic testing; thus, the current management of undiagnosed cases is suboptimal and the genetic counseling rather uninformative.
High-throughput sequencing (HTS) has made possible the identification of almost all variants in the coding regions of protein-coding genes. This provides the opportunity to explore the genotype–disease relationship in small nuclear families, as was shown in a proof-of-concept study by Ng et al. (2010).
Consanguineous marriage is practiced in a large number of human populations; rates attain 20%–50% in several countries of the Mediterranean basin and the Middle East as well as in immigrant populations in western European countries [Al-Gazali et al., 2006; Hamamy et al., 2011]. Reports in the literature [Stoll et al., 1999; Stoltenberg et al., 1999] and a recent study in the UK have indicated that children born to consanguineous couples have an increased risk of presenting with congenital anomalies [Sheridan et al., 2013]. Identifying the causative pathogenic gene lesion in consanguineous families seeking genetic counseling is not always successful despite the veritable array of currently available clinical and laboratory investigations. In such instances, prevention of the recurrence of the disease in future offspring is usually not possible. Moreover, the costs of imaging, biochemical, and genetic testing for unresolved cases could easily exceed those of whole-exome sequencing and may ultimately still not provide a precise diagnosis.
In this study, we employed whole-exome sequencing and genotype analysis to screen members of consanguineous families with likely recessive disorders. Our hypothesis was that because of the homozygosity of the causative defect, our diagnostic strategy would be successful in identifying the molecular basis of the disorder in at least a proportion of the participating patients. The minimum inclusion criterion was consanguinity between the parents of the affected individuals, whereas the phenotypic spectrum was unrestricted. Together with a recently published study of nonconsanguineous patients [Yang et al., 2013], we anticipate that the results of this study will have important implications for molecular diagnostics and genetic counseling, and could lead to the broad establishment of exome sequencing as a key diagnostic tool for consanguineous families with undiagnosed genetic disorders outside specialized centers.
Materials and Methods
Patients and Nuclear Families
A total of 50 families participated in this study. The main criteria for inclusion were consanguinity among parents irrespective of the patient's clinical phenotype. Families with at least two affected individuals were preferentially selected. Of these families, 45 were of Arab origin, either living in Arab countries or having immigrated to Greece or Switzerland. Three families were of Greek origin and two of Kurdish origin. Intellectual disability (ID; defined as IQ < 75) or developmental delay was the most common clinical phenotype, occurring in 39 of these families. Clinical findings in the other families included skeletal dysplasias, ataxia, cardiac malformations, thrombocytopenia, visual impairment, and microphthalmia. Table 1 summarizes the clinical data from the patients. More detailed clinical information and family trees can be found in Supp. Table S1 and Supp. Fig S1.
Family | Country of origin | Phenotype | Affected individuals per family | Age of affected individuals^a | Closest relationship between parents of probands |
---|---|---|---|---|---|
Family_1 | Lebanon | Sclerosing bone dysplasia | 2 | 39/36 | First cousins |
Family_2 | Lebanon | Syndromic ID/DD | 2 | 34/33 | First cousins |
Family_3 | Jordan | Syndromic ID/DD | 3 | 1^b | Double first cousins |
Family_4 | Lebanon | Syndromic ID + neurological disorder | 2 | 41/31 | Double second cousins |
Family_5 | Lebanon | Neurological disorder | 2 | 2/1 | First cousins |
Family_6 | Egypt | Syndromic ID/DD | 2 | 12/3 | First cousins |
Family_7 | Egypt | Syndromic ID/DD | 2 | 7/3 | First cousins, second cousins |
Family_8 | Egypt | Skeletal dysplasia | 3 | 4/3/1.5 | First cousins |
Family_9 | Greece | Nonsyndromic ID/DD | 1 | 1 | First cousins |
Family_10 | Greece | Nonsyndromic ID/DD | 3 | 35/28/1.5 | Fourth cousins |
Family_11 | Egypt | Syndromic ID/DD | 2 | 10/5 | Double first cousins |
Family_12 | Egypt | Syndromic ID/DD | 2 | 2/1 | First cousins |
Family_13 | Egypt | Syndromic ID/DD | 3 | 9/1.5/1.5 | First cousins |
Family_14 | Egypt | Syndromic ID/DD | 2 | 20/10 | First cousins |
Family_15 | Jordan | Syndromic ID/DD | 3 | 17/11/9 | First cousins |
Family_16 | Greece | Skeletal dysplasia | 3 | 11/8/5 | First cousins |
Family_17 | Egypt | Skeletal dysplasia | 2 | 19/15 | First cousins |
Family_18 | Iraq | Thrombocytopenia | 3 | 9/7/6 | First cousins |
Family_19 | Morocco | Cardiac malformation | 3 | 10/6/3 | First cousins |
Family_20 | Jordan | Syndromic ID/DD | 2 | 19/5 | First cousins |
Family_21 | UAE | Syndromic ID/DD | 3 | 1.5^b | First cousins |
Family_22 | Greece | Nonsyndromic ID/DD | 2 | 10/2 | Second cousins once removed |
Family_23 | Jordan | Syndromic ID/DD | 2 | 24/5 | First cousins |
Family_24 | Switzerland | Nonsyndromic ID/DD | 3 | 19/17/6 | First cousins |
Family_25 | Jordan | Syndromic ID/DD | 2 | 4/1.5 | First cousins |
Family_26 | Jordan | Syndromic ID/DD | 3 | 7/4/2 | First cousins |
Family_27 | Tunisia | Syndromic ID/DD | 2 | 14/2.5 | First cousins |
Family_28 | Jordan | Nonsyndromic ID/DD | 2 | 13/10 | First cousins |
Family_29 | Jordan | Visual impairment | 2 | 3/0.5 | First cousins |
Family_30 | Iraq | Syndromic ID/DD | 3 | 33/29/20 | First cousins |
Family_31 | Jordan | Syndromic ID/DD | 3 | 8/6/5 | Double first cousins |
Family_32 | Jordan | Syndromic ID/DD | 3 | 15/13/7 | First cousins |
Family_33 | Jordan | Regression/ID | 2 | 7/4 | First cousins |
Family_34 | Jordan | Syndromic ID/DD | 3 | 24/15/1.5 | Second cousins |
Family_35 | Egypt | Syndromic ID/DD | 2 | 13/5 | First cousins |
Family_36 | Egypt | Syndromic ID/DD | 2 | 15/9 | First cousins |
Family_37 | Egypt | Nonsyndromic ID/DD | 2 | 11/10 | First cousins |
Family_38 | Egypt | Nonsyndromic ID/DD | 2 | 7/1 | First cousins |
Family_39 | Egypt | Nonsyndromic ID/DD | 2 | 6/5 | First cousins |
Family_40 | Egypt | Syndromic ID/DD | 2 | 11/2 | First cousins |
Family_41 | Egypt | Nonsyndromic ID/DD | 2 | 5/3 | First cousins |
Family_42 | Egypt | Syndromic ID/DD | 3 | 4/3/3 | First cousins |
Family_43 | Jordan | Syndromic ID/DD | 3 | 19/11/5 | First cousins |
Family_44 | Jordan | Microphthalmia | 3 | 15/14/4 | First cousins |
Family_45 | Jordan | Syndromic ID/DD | 2 | 6/5 | First cousins |
Family_46 | Jordan | Syndromic ID/DD | 2 | 4/1 | First cousins |
Family_47 | Jordan | Syndromic ID/DD | 2 | 9/7 | First cousins |
Family_48 | Egypt | Syndromic ID/DD | 2 | 17/10 | First cousins |
Family_49 | Egypt | Skeletal dysplasia | 2 | 6/3 | First cousins |
Family_50 | Egypt | Progressive hypotonia | 2 | 20/19 | First cousins |
- a Age in years.
- b Two of the affected individuals are deceased.
In 49 of the families included in this study, at least two individuals were affected in the same sibship or in different family members (e.g., cousins). The parents of the patients were invariably unaffected by the disorder observed in their offspring; in 49 families, the parents were up to second-degree cousins. In one case, they were fourth-degree cousins. These selection criteria were designed to potentiate the search for autosomal-recessively inherited pathogenic variants. Detailed clinical descriptions, based on a questionnaire provided to all participating centers, were available from all patients enrolled in the study. A definitive diagnosis had not been reached prior to this study for any of these patients following clinical, imaging, and biochemical laboratory tests including all recommended and available genetic tests.
Each family was studied according to the same research protocol. All family samples were genotyped to identify regions of genomic homozygosity, and to define the critical genomic regions that could harbor the causative pathogenic variant. Exome sequencing was performed in one affected member, and the candidate gene variants within the defined critical regions in each family were identified. If not already performed by the referring physician, array CGH (comparative genomic hybridization) was performed in one affected sample per family to exclude large chromosomal rearrangements responsible for the patient's phenotype and to identify homozygous copy-number variants (CNVs) (duplications and deletions).
The study was approved by the Bioethics Committee of the University Hospitals of Geneva (Protocol number: CER 11-036). All patients and/or parents provided their written informed consent for the analyses performed. In accordance with the recommendation of the Bioethics Committee, variants with potential diagnostic value (i.e., relevant to the phenotype in question) were reported to the patient's referring physicians for genetic counseling and further diagnostic and reproductive action. According to the Committee's decision, no incidental findings have been disclosed to the patients.
Laboratory Techniques
Genotyping to identify homozygous regions
DNA samples from affected family members, their unaffected siblings, and their parents were received between January 2011 and July 2013 and genotyped using the HumanOmniExpress BeadChip by Illumina Inc.® (San Diego, CA). This SNP array tests 720K SNPs with a mean distance of 4 kb between the SNPs. After filtering for quality, the data were used to define the runs of homozygosity (ROH) for every individual using PLINK [Purcell et al., 2007]. We defined as homozygous regions those regions with 50 consecutive homozygous SNPs, irrespective of the total length of the genomic region, allowing for one mismatch; only SNPs with a MAF (minor allele frequency) of >0.3 were included in the analysis. The ROH were further defined as genomic regions demarcated by the first encountered heterozygous SNPs flanking each established homozygous region.
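The homozygous-region criterion described above can be illustrated with a minimal Python sketch. It is not the PLINK procedure used in the study: the record layout `(position, genotype, MAF)`, the function name `find_homozygous_runs`, and the greedy, non-overlapping scan are all assumptions made for illustration, and the extension of each region to its flanking heterozygous SNPs is not modeled.

```python
from typing import List, Tuple

# (position in bp, genotype such as "AG", minor allele frequency);
# a hypothetical record layout chosen for this sketch
Snp = Tuple[int, str, float]


def find_homozygous_runs(snps: List[Snp], min_snps: int = 50,
                         max_mismatches: int = 1,
                         min_maf: float = 0.3) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for each run on one chromosome.

    `snps` must be sorted by position. The scan is greedy and
    non-overlapping: a simplification, not PLINK's sliding window.
    """
    # Keep only informative SNPs (MAF above the cutoff) and reduce each
    # genotype to a homozygous/heterozygous flag.
    informative = [(pos, gt[0] == gt[1]) for pos, gt, maf in snps if maf > min_maf]

    runs: List[Tuple[int, int, int]] = []
    start = None      # index of the first homozygous SNP of the current run
    last_hom = None   # index of the most recent homozygous SNP in the run
    mismatches = 0    # heterozygous SNPs tolerated so far in the run

    def emit(first: int, last: int) -> None:
        n = last - first + 1  # SNPs spanned, including a tolerated mismatch
        if n >= min_snps:
            runs.append((informative[first][0], informative[last][0], n))

    for i, (_, is_hom) in enumerate(informative):
        if is_hom:
            if start is None:
                start, mismatches = i, 0
            last_hom = i
        elif start is not None:
            if mismatches < max_mismatches:
                mismatches += 1          # tolerate this heterozygous call
            else:
                emit(start, last_hom)    # too many mismatches: close the run
                start = None
    if start is not None:
        emit(start, last_hom)
    return runs
```

Because the scan restarts after a second mismatch, a region that would qualify only when counted from the first mismatch onward is missed; a production analysis would rely on PLINK itself.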
Exome sequencing
The exome was captured using the SureSelect Human All Exons v3 (20 patients), v4 (11 patients), and v5 (19 patients) reagents (Agilent Inc.®, Santa Clara, CA). Sequencing was performed in an Illumina HiSeq 2000 instrument. Each exome library was indexed, separated into two equal halves, and sequenced in two different lanes. The raw results were analyzed using a customized pipeline that utilizes published algorithms in a sequential manner (BWA [Li and Durbin, 2009] for mapping the reads, SAMtools [Li et al., 2009] for detection of variants, Pindel [Ye et al., 2009] for the detection of indels, and ANNOVAR [Wang et al., 2010] for the annotation). The entire coding sequence corresponding to the human RefSeq [Pruitt et al., 2007] coding genes was used as the reference for the calculation of coverage and reads on target. All experiments were performed using the manufacturer's recommended protocols without modifications.
Bioinformatic analysis
The ROH coordinates, the genotypes, and the exome results were processed using an in-house algorithm (CATCH v1.1, unpublished). CATCH additionally takes into account the family information and accepts different types of exclusion and inclusion filters. The final result consists of a list of those variants that respect the provided filters and assigns them to a different class according to how well they respect the segregation of the ROH. The filters used were: homozygous exonic and splicing variants (±6 bp from the intron–exon junction) with a minor allele frequency of less than 0.02 in public databases (dbSNP version 137 [http://www.ncbi.nlm.nih.gov/SNP/], 1000 Genomes April 2012 release, [1000 Genomes Project Consortium et al., 2012], and Exome Variant Server [v.0.018; http://evs.gs.washington.edu/EVS/]) and in the local database (which includes 50 individuals with the same ethnic origin). Synonymous variants not affecting the splice site and variants found within segmental duplications were excluded. Only variants respecting the mentioned filters and the ROH segregation were considered for further analysis.
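As an illustration only, the filters listed above could be expressed as follows. CATCH is unpublished, so this is not its implementation: the `Variant` fields, the function names, and the reduction of ROH segregation to a simple "falls inside an ROH" check are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Variant:
    # Hypothetical annotation record; field names are not from CATCH.
    chrom: str
    pos: int
    homozygous: bool
    region: str             # "exonic", "splicing" (within 6 bp of a junction), or other
    consequence: str        # e.g. "missense", "nonsense", "synonymous"
    affects_splicing: bool
    max_maf: float          # highest minor allele frequency across all databases
    in_segdup: bool         # located within a segmental duplication


# ROH intervals per chromosome: {"chr1": [(start_bp, end_bp), ...]}
RohMap = Dict[str, List[Tuple[int, int]]]


def in_roh(v: Variant, roh: RohMap) -> bool:
    """True if the variant lies inside any ROH interval on its chromosome."""
    return any(start <= v.pos <= end for start, end in roh.get(v.chrom, []))


def passes_filters(v: Variant, roh: RohMap, maf_cutoff: float = 0.02) -> bool:
    """Apply the inclusion and exclusion filters in sequence."""
    if not v.homozygous:
        return False
    if v.region not in ("exonic", "splicing"):
        return False
    if v.max_maf >= maf_cutoff:
        return False
    if v.in_segdup:
        return False
    if v.consequence == "synonymous" and not v.affects_splicing:
        return False
    return in_roh(v, roh)
```

The real algorithm also uses the genotypes of relatives to grade how well each candidate segregates; that grading is left out here.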
The final list of variants was further evaluated based on the predicted pathogenicity scores provided by SIFT (http://sift.jcvi.org/, cutoff ≤ 0.05) [Kumar et al., 2009], PolyPhen2 (http://genetics.bwh.harvard.edu/pph2/, HumVar scores, cutoff ≥ 0.447) [Adzhubei et al., 2010], and Mutation Taster (http://www.mutationtaster.org/, cutoff: qualitative prediction as pathogenic) [Schwarz et al., 2010]. Agreement between two of the three tools was required to declare a variant possibly pathogenic or possibly benign. Variants were also evaluated based on their presence in the professional version of HGMD [Stenson et al., 2014] (version: 2013.1) and a literature search focusing on functional data. Evolutionary conservation scores were provided by PhyloP (comparison of 46 species) [Cooper et al., 2005] and GERP++ (http://mendel.stanford.edu/SidowLab/downloads/gerp/) [Davydov et al., 2010].
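The two-out-of-three rule can be sketched as a small voting function. The cutoffs are those stated above; the function name, the use of a boolean for the Mutation Taster call, and the treatment of missing predictions as abstentions are assumptions of this sketch rather than details given in the study.

```python
from typing import Optional


def classify_variant(sift: Optional[float],
                     polyphen2_humvar: Optional[float],
                     mutationtaster_pathogenic: Optional[bool]) -> str:
    """Combine three predictors; two concordant calls decide the label.

    A value of None means the tool gave no prediction and casts no vote
    (an assumption; the study does not describe this case).
    """
    pathogenic = 0
    benign = 0

    if sift is not None:
        if sift <= 0.05:                 # SIFT: low score = deleterious
            pathogenic += 1
        else:
            benign += 1
    if polyphen2_humvar is not None:
        if polyphen2_humvar >= 0.447:    # PolyPhen2 HumVar: high score = damaging
            pathogenic += 1
        else:
            benign += 1
    if mutationtaster_pathogenic is not None:
        if mutationtaster_pathogenic:    # qualitative call
            pathogenic += 1
        else:
            benign += 1

    if pathogenic >= 2:
        return "possibly pathogenic"
    if benign >= 2:
        return "possibly benign"
    return "undetermined"
```

With only three voters, "undetermined" arises solely when at least one tool abstains.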
Initial diagnoses were re-evaluated when variants (either found in HGMD or predicted to be pathogenic) were identified in causative genes known to be responsible for phenotypes corresponding to the clinical presentations of the patients.
In the families in which no variants in known pathogenic genes were identified as being responsible for the clinical phenotype, a complementary approach was used. Each patient was assigned to a broad phenotypic category as they are defined in the Clinical Genomic Database (http://research.nhgri.nih.gov/CGD/, March 2014 version) (Supp. Table S2). The corresponding list of genes was downloaded and all the genes that typically displayed AR or X-linked inheritance (in families where all the affected individuals were male) were interrogated, irrespective of ROH and family segregation. Since none of the parents has been exome sequenced, a search for de novo variants was not possible.
All variants identified by exome sequencing and discussed in this paper were verified by Sanger sequencing in all family members. Only variants that respected the segregation were retained. The variant nomenclature was checked with Mutalyzer (https://mutalyzer.nl/, version 2.0.beta-29). The variants mentioned in this study have been submitted to LOVD (http://databases.lovd.nl/whole_genome/genes).
Array CGH
Unless already performed by the referring clinical group, array CGH (aCGH) was carried out to identify homozygous deletions or duplications and to exclude the presence of a chromosomal aberration. Briefly, a DNA sample from one patient per family was tested using the SurePrint G3 Human CGH Microarray Kit, 2 × 400K (Agilent Technologies, Santa Clara, CA) with 7.2 kb overall median probe spacing. Labeling and hybridization were performed according to the manufacturer's protocol. Data analysis was performed using Agilent Genomic Workbench Lite Edition 6.5.0.58. Probe positions refer to NCBI Build 37 (hg19).
Results
aCGH Results
Nineteen patients for whom no aCGH data existed were tested during this study. On average, every patient had 15.7 (range: 9–26) CNVs, 5.2 (range: 1–13) of which were duplications, whereas 10.5 (range: 5–20) were deletions. The CNVs varied in size from 1 to 2,309 kb and on average every patient had 0.9 (range 0–5) novel CNVs (defined as CNVs not reported in the Database of Genomic Variants) and 1.6 (range 0–4) homozygous CNVs (Supp. Table S3).
During the course of this project, we identified by aCGH a homozygous deletion of 32 kb encompassing the first exon of VLDLR (MIM #192977), known to be responsible for cerebellar hypoplasia and mental retardation with or without quadrupedal locomotion (also known as disequilibrium syndrome [DES]) [Boycott et al., 2005] (MIM #224050). The patients presented with ID and severe ataxia (they have never walked), compatible with the described phenotype of DES; the deletion segregated with the clinical phenotype through the family pedigree (family 4).
ROH and Exome Sequencing
Samples from 50 affected individuals, one from each family, were sequenced after exome capture. Supp. Table S4 provides the number of reads per sample. Supp. Table S5 provides the ROH for the families where a probably causative variant was identified. On average, among the 21,335 exonic and 1,370 splicing variants identified per individual, 50.3% were synonymous (98.6% of which have already been catalogued in dbSNP137; Supp. Table S6).
Using our algorithm, the putatively pathogenic homozygous variant was found in known disease-causing genes in 17 families (Table 2; Supp. Table S7). In one of these families, the patients harbored two high-confidence pathogenic variants, whereas in two further families, high-confidence pathogenic variants explaining part of the phenotype were identified. The follow-up analysis identified one additional homozygous variant in family 12 (Table 2; Supp. Table S7). This variant was missed by the original approach because we made the incorrect assumption that the two affected cousins had the same condition. After identification of the variant, further clinical examination confirmed the genotype–phenotype correlation in the exome-sequenced patient and, in particular, showed that despite a similar clinical picture the affected cousin did not have the brain MRI anomalies. It must be noted that using the same criteria, no compound heterozygous variants or variants in X-linked genes were identified.
Family ID | Gene (OMIM #) | Exon | Variant | dbSNP, frequency^a | Disease (OMIM #) | Variant in literature |
---|---|---|---|---|---|---|
Family_1 | DMP1 (600980) | Exon 2 | NM_004407.3:c.1A>G: p.(?) | rs104893834, NA | Hypophosphatemic rickets (241520) | Feng et al. (2006), Lorenz-Depiereux et al. (2006) |
Family_12 | ARFGEF2 (605371) | Exon 20 | NM_006420.2:c.2776C>T: p.(Arg926*) | NA | Periventricular heterotopia with microcephaly, autosomal recessive (608097) | Novel |
Family_13 | FKTN (607440) | Exon 4 | NM_006731.2:c.218T>C: p.(Phe73Ser) | NA | | Novel |
Family_26 | SEPSECS (613811) | Exon 11 | NM_016955.3:c.1466A>T: p.(Asp489Val) | rs145703544, 0.022% | Pontocerebellar hypoplasia type 2D (613811) | |
Family_29 | GUCY2D (600179) | Exon 13 | NM_000180.3:c.2563C>T: p.(Gln855*) | NA | Cone-rod dystrophy 6 (601777), Leber congenital amaurosis 1 (204000) | El-Shanti et al. (1999) |
Family_30 | BBS4 (600374) | Exon 4 | NM_033028.4:c.157-3C>G | NA | Bardet–Biedl syndrome 4 (209900) | Harville et al. (2010) |
Family_31 | SYNE1 (608441) | Exon 142 | NM_033071.3:c.25597dup: p.(Ser8533Phefs*2) | NA | Emery–Dreifuss muscular dystrophy 4, autosomal-dominant (612998); Spinocerebellar ataxia, autosomal-recessive 8 (610743) | Novel |
Family_32 | POMGNT1 (606822) | Exon 18 | NM_017739.3:c.1539+1G>A | rs138642840, 0.0879% | | Yoshida et al. (2001) |
Family_36^b | MTFMT (611766) | Exon 1 | NM_139242.3:c.17G>C: p.(Arg6Pro) | NA | Combined oxidative phosphorylation deficiency 15 (614947) | Novel |
Family_36^b | MAN1B1 (604346) | Exon 13 | NM_016219.4:c.1990del: p.(Thr664Argfs*64) | NA | Mental retardation, autosomal-recessive 15 (614202) | Novel |
Family_37 | TACO1 (612958) | Exon 3 | NM_016360.3:c.421C>T: p.(Arg141*) | NA | Leigh syndrome due to mitochondrial complex IV deficiency | Novel |
Family_38^c | PYGM (608455) | Exon 20 | NM_005609.2:c.2447G>A: p.(Arg816His) | rs139230055, 0.000439 | McArdle disease (232600) | |
Family_39^c | PRX (605725) | Exon 7 | NM_181882.2:c.3099del: p.(Glu1034Argfs*5) | rs139230055, 0.000439 | Charcot–Marie–Tooth disease, type 4F (614895); Dejerine–Sottas disease, autosomal recessive (145900) | |
Family_43 | TUSC3 (601385) | Exon 4 | NM_006765.3:c.544A>T: p.(Ile182Phe) | NA | Mental retardation, autosomal-recessive 7 (611093) | Novel |
Family_44 | STRA6 (610745) | Exon 19 | NM_022369.3:c.1931C>T: p.(Thr644Met) | rs118203960, 0.00022 | | Pasutto et al. (2007) |
Family_46 | ALDH3A2 (609523) | Exon 4 | NM_000382.2:c.628G>A: p.(Gly210Arg) | NA | Sjogren–Larsson syndrome (270200) | Novel |
Family_48 | RNASET2 (612944) | Exon 27 | NM_003730.4:c.115dup: p.(Met39Asnfs*7) | NA | Leukoencephalopathy, cystic, without megalencephaly (612951) | Novel |
Family_49 | MMP2 (120360) | Exon 4 | NM_004530.4:c.538G>A: p.(Asp180Asn) | NA | Torg–Winchester syndrome (259600) | Novel |
- a The variant's name in dbSNP followed by the reported frequency.
- b This patient possibly carries two pathogenic variants.
- c These variants only partially account for the patient's phenotype.
Families Diagnosed With Known Recessive Genes by Exome Sequencing
In 15 families, we were able to identify variants in genes already known to be involved in recessive disorders and which were compatible with the clinical phenotypes (Table 2).
In families 12, 26, 29, 30, 32, 37, 43, 44, 46, 48, and 49, pathogenic variants were identified in known genes responsible for disorders with a clinical description that correlated with the referred phenotypic description. These highly heterogeneous phenotypes could have been caused by mutations in a number of different genes; exome sequencing allowed the reliable detection of the causative variant and an accurate diagnosis of the syndrome in each patient.
In family 1, the two affected females were homozygous for a known variant (NM_004407:c.1A>G:p.[Met1Val]) [Feng et al., 2006; Lorenz-Depiereux et al., 2006] in DMP1 (MIM #600980) reported to cause autosomal-recessive hypophosphatemic rickets (MIM #241520). The patients’ phenotype, previously described in detail [Chouery et al., 2010], included diffuse hyperostosis and serum phosphorus only slightly below the normal limits (0.79 mmol/l, lower normal limit: 0.81 mmol/l) at the age of diagnosis. Similarly, in one of the reported families [Lorenz-Depiereux et al., 2006] with older patients, osteosclerosis was also reported in the affected individuals, and some of them had serum phosphorus values just under the normal limits.
In family 13, a novel variant was identified in FKTN (MIM #607440), which is responsible for many muscular disorders as shown in Table 2. The family was ascertained with the preliminary diagnosis of familial microcephaly and developmental delay. The three patients have displayed hypotonia since an early age, and brain imaging revealed prominent frontal gyration, kinked corpus callosum, defective myelination along with a retrocerebellar cyst, and mild vermian hypoplasia; all of these findings were compatible with the clinical spectrum described in Fukuyama congenital muscular dystrophy [Kobayashi et al., 1998]. However, all three patients had microcephaly (between −3.5 and −4.5 SD) and no pseudohypertrophy, and two of them were able to walk unaided. Plasma CPK levels measured more than once were within the normal range, in contrast to the markedly increased CPK levels typical of clinical presentations of this disease. The differences in the phenotypic spectrum could be attributed to the nature of the variant and/or unknown modifiers.
In family 31, a novel variant was identified in SYNE1 (MIM #608441). SYNE1 is a large gene with 146 exons and two different phenotypes attributed to it, with different modes of inheritance and different types of pathogenic mutation: the autosomal-dominant Emery–Dreifuss muscular dystrophy 4 (MIM #612998) and the autosomal-recessive spinocerebellar ataxia 8 (MIM #610743). The SYNE1 variant identified in family 31 was found within the coding region that usually harbors mutations causing Emery–Dreifuss syndrome [Zhang et al., 2007]; however, the mutation created a stop codon, in contrast to the missense variants usually identified. Interestingly, the patient's phenotype is that of severe hypotonia since birth, which does not correspond to Emery–Dreifuss dystrophy or spinocerebellar ataxia. A literature search revealed one case report of a consanguineous Palestinian family with a nonsense SYNE1 mutation and a similar clinical picture [Attali et al., 2009]. The description of this family further expands the phenotypic spectrum related to SYNE1 variants.
In family 36, variants in two different genes known to cause monogenic disorders were identified. Both identified variants are found in genes known to cause recessive ID (MTFMT [MIM #611766] and MAN1B1 [MIM #604346]) and the phenotype appears to be a combination of both disorders. Concerning MTFMT, lactic acidosis is not always detected [Tucker et al., 2011] and sampling of cerebrospinal fluid had not been possible. Brain MRI failed to show any brain lesions, but cases with no detected lesions have been reported; on the other hand, the patients present with strabismus and hypotonia, both of which are regularly seen in patients with MTFMT mutations [Haack et al., 2014]. With respect to MAN1B1, there are no published photographs of patients [Rafiq et al., 2011] but the facial description correlates well with that of the patients reported here: wide, arched, sparse eyebrows, prominent nose, flat philtrum, and thin upper lip. Based on this evidence, we consider that the majority of the phenotype is explained by the MAN1B1 variant, whereas at the same time, it seems that the MTFMT variant contributes to a probably lower but unknown extent.
In two families with patients with ID, variants explaining at least part of the phenotype have been identified. However, the molecular cause of ID remains unknown. In family 38, a variant in PYGM (MIM #608455) could potentially account for the marked hypotonia of the patients. In the second, family 39, a homozygous frameshift variant was identified in PRX (MIM #605725) known to cause Charcot–Marie–Tooth type 4F (CMT4F). The patients in family 39 do not show any of the classic CMT4F symptoms (early-onset demyelinating sensory neuropathy) apart from a marked muscular weakness. It is interesting to note that there is a report of a Japanese patient with CMT4F [Tokunaga et al., 2012], harboring a nonsense mutation in an adjacent codon (p.R1070*), with late-onset presentation (in his late 20s).
Discussion
Consanguinity is a known risk factor for the incidence of autosomal-recessive disorders, and the risk to offspring increases with the level of consanguinity between the parents [Hamamy, 2012]. In this study, we examined 50 consanguineous families with unresolved molecular diagnoses with the aim of diagnosing known disorders causing the clinical phenotype in the offspring.
- In 11 families, high-confidence pathogenic variants were identified by exome sequencing in known genes that correlated with the corresponding phenotypes.
- In three families where causative variants were identified in DMP1, FKTN, and SYNE1, respectively, significant new aspects pertaining to the natural history of the diseases/clinical phenotypes have been noted. In family 1 (with the DMP1 mutation), the clinical picture seen in childhood was different from that observed in adulthood. In family 13, the differential diagnosis of the clinical presentation did not include FKTN. In family 31, our results expanded the phenotypic spectrum linked to SYNE1.
- In family 36, homozygous variants were identified in two genes known to cause ID, and the clinical phenotype may be considered due to a combination of the two variants.
- In two families (38 and 39), variants that can partially account for the clinical phenotype were identified. In family 38, the variant in PYGM may account for the patients’ severe hypotonia but not the ID. In family 39, the variant in PRX may be associated with the patients’ muscle weakness and provides the opportunity to better anticipate the sensory neuropathy that could occur later in life. However, the molecular cause of their ID remains unknown.
- In one family, the pathogenic variant (a homozygous 32 kb deletion) was detected by aCGH.
The sample size offered by this study provides the possibility to perform a preliminary prediction regarding the total number of autosomal-recessive disorders in humans. We identified a causative mutation in 36% (18/50) of the families studied; since there are 1,828 molecularly characterized recessive disorders currently listed in OMIM, we predict that at least 5,000 recessive human clinical phenotypes may exist. This crude estimate takes into account neither the fact that some patients may have two or more genetic defects nor the fact that many genes may be responsible for several clinical phenotypes, but it does concur with a recent estimate of a total of 12,000–15,000 monogenic disorders in humans [Cooper et al., 2010].
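The arithmetic behind this estimate can be made explicit in a short sketch. It assumes, as the estimate implicitly does, that the diagnostic yield (18/50) approximates the fraction of all recessive disorders that are already molecularly characterized; the function name is illustrative.

```python
def estimate_total_recessive_disorders(known_disorders: int,
                                       diagnosed_families: int,
                                       total_families: int) -> float:
    """Scale the count of characterized disorders by the inverse of the yield.

    Assumes the diagnostic yield equals the characterized fraction of all
    recessive disorders, which is a crude approximation.
    """
    diagnostic_yield = diagnosed_families / total_families
    return known_disorders / diagnostic_yield


# Values taken from the text: 1,828 disorders in OMIM, 18 of 50 families solved.
estimate = estimate_total_recessive_disorders(1828, 18, 50)
print(round(estimate))  # 5078, i.e. roughly 5,000
```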
In 32 families, no variants in known pathogenic genes were identified. With the exception of variants that were missed because no gene has been attributed to the specific phenotype, additional reasons include: (1) method-related false negatives: (i) protein-coding genes in whole or in part are not captured by the reagents currently employed, (ii) insufficient coverage, and (iii) trinucleotide repeat expansions; (2) analysis false negatives: (i) functional genomic elements other than protein-coding genes were not interrogated, (ii) false-negative variant calling, and (iii) unsuspected problems in the analysis pipeline; (3) hypothesis-driven false negatives: (i) incorrect hypothesis for the mode of inheritance, (ii) affected individuals in the same family may have genetically different disorders, (iii) pathogenic variants with reduced penetrance may have resulted in false-negative conclusions, and (iv) pathogenic variants occurring in genomic regions identical by state between affected and nonaffected family members.
In a recently published study of diagnostic exome sequencing in patients suspected of having genetic disorders, a diagnostic rate of 25% was achieved (62/250 patients) [Yang et al., 2013]. Our slightly higher diagnostic rate (36%) is either due to the requirement for consanguinity, or to the more stringent inclusion criterion of at least two affected offspring or a combination of both.
The methods employed here confirm the feasibility of using HTS as a diagnostic tool [Makrythanasis and Antonarakis, 2012] and could be adapted for researching any autosomal-recessive disorder, whether in consanguineous or nonconsanguineous families, substantially increasing the likelihood of an accurate molecular diagnosis and consequently improving the standard of care.
The HTS of exomes is also potentially useful for the detection of carrier status in consanguineous couples and the estimation of the risk for affected offspring, for family planning and reproductive decision-making purposes [Bell et al., 2011]. Although the detection space would be limited to known disease genes, the prospective identification of risk for an autosomal-recessive disorder in the offspring would be substantially increased [Kingsmore, 2012].
Conclusions
A specific diagnosis of the patients’ main symptoms was successfully reached using exome sequencing and aCGH in 16 out of 50 patients participating in this study, whereas in another two patients, a partial explanation of the clinical phenotype was obtained. This establishes HTS as an excellent first-tier diagnostic procedure. Precise diagnosis of the genetic disorder segregating in the family offers a wide range of future reproductive options including testing for carrier status with premarital and preconception counseling. Consanguineous marriages are culturally favored in a substantial number of human populations. Our study shows that exome sequencing, in addition to being a powerful diagnostic tool, promises to rapidly expand our knowledge of rare genetic Mendelian disorders and establish more detailed causative links between mutant genotypes and clinical phenotypes.
Acknowledgments
We are grateful to the members of all families enrolled in this study. P.M., H.H., and S.E.A. wrote the manuscript; P.M. and M.N. performed the ROH and exome analyses; P.M., H.H., and S.E.A. coordinated the study; F.A.S. conceived the algorithms and wrote the bioinformatics pipelines; M.G. and A.V. performed the HTS and Sanger sequencing; F.B., S.G., and E.S. performed and analyzed the aCGH; S.T., A.Me., A.Ma., M.S.A., M.S.Z., S.F., L.G., A.B., K.A., S.P., S.K.T., H.F., E.K., N.A., A.S., S.A., S.C.E., N.J., L.A., F.A., H.C.B., and E.A. examined the patients, described the phenotypic characteristics, and contributed the DNA samples; M.E.B. and G.S. analyzed exome data; D.N.C. contributed HGMD mutation data and edited the paper; H.H. designed the study and coordinated the patient collection; S.E.A. designed and conceived the general overview of the study. All authors contributed to the manuscript and approved the final version.
Disclosure statement: The authors declare no conflict of interest. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.