Volume 176, Issue 1 pp. 107-115
ORIGINAL ARTICLE

Marked yield of re-evaluating phenotype and exome/target sequencing data in 33 individuals with intellectual disabilities

Bing Xiao (1, 2), Wenjuan Qiu (1, 2, *), Xing Ji (1, 2), Xiaoqing Liu (1, 2), Zhuo Huang (1, 2), Huili Liu (1, 2), Yanjie Fan (1, 2), Yan Xu (1, 2), Yu Liu (1, 2), Hui Yie (1, 2), Wei Wei (1, 2), Hui Yan (1, 2), Zhuwen Gong (1, 2), Lixiao Shen (3), Yu Sun (1, 2, *)

(1) Department of Pediatric Endocrinology/Genetics, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China

(2) Molecular Genetics Group, Shanghai Institute for Pediatric Research, Shanghai, China

(3) Department of Children's Healthcare, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

(*) Corresponding author

Correspondence

Wenjuan Qiu, MD, PhD, and Yu Sun, PhD, Room 803, Sci & Edu Bldg, 1665 Kongjiang Road, 200092, Shanghai, China.

Email: [email protected] (WQ); [email protected] (YS)

First published: 21 November 2017
Wenjuan Qiu and Yu Sun have jointly directed this work.

Abstract

The diagnosis of intellectual disability/developmental delay (ID/DD) benefits from the clinical application of target/exome sequencing. The yield in Mendelian diseases varies from 25% to 68%. The aim of the present study was to identify the genetic causes in 33 ID/DD patients using target/exome sequencing. Recent studies have demonstrated that reanalyzing undiagnosed exomes can yield additional diagnoses. Therefore, in addition to the standard data analysis, a re-evaluation was performed prior to manuscript preparation after updating OMIM annotations, calling copy number variations (CNVs), and reviewing the current literature. A molecular diagnosis was obtained for 14/33 patients in the first round of analysis. Notably, five further patients were diagnosed during the re-evaluation of the geno/phenotypic data, for a total of 19/33. This study confirmed the utility of exome sequencing in the diagnosis of ID/DD. Furthermore, re-evaluation raised the diagnostic yield by 15 percentage points (from 42% to 57%). Thus, to maximize the diagnostic yield of next-generation sequencing (NGS), we recommend periodic re-evaluation of the geno/phenotypic data of undiagnosed individuals by updating the OMIM annotation, applying new algorithms, reviewing the literature, sharing pheno/genotypic data, and re-contacting patients.

1 INTRODUCTION

Intellectual disability (ID) is one of the most common neuropsychiatric disorders, with a prevalence ranging from 1% to 3% (Leonard & Wen, 2002). The extreme genetic heterogeneity of ID poses a challenge for molecular diagnosis. Next-generation sequencing (NGS) is increasingly being employed as a diagnostic tool for Mendelian disorders and has improved the diagnostic yield to 25–68% (Shashi et al., 2014; Stark et al., 2016; Tarailo-Graovac et al., 2016; Yang et al., 2013), with the highest yield in patients with neurological diseases.

With the widespread use of NGS, the pace of disease-causing gene discovery has accelerated over the past several years and continues to rise. Thevenon et al. (2016) found that more than half of all diagnosed disorders (8/14 cases) were caused by mutations in genes first reported to be disease-causing after 2012, indicating that pan-genomic sequencing tests allow a prospective re-assessment of the data in light of future publications and can provide additional diagnostic results without significant extra costs. Similarly, a recent systematic reanalysis of clinical exome and phenotypic data led to four additional diagnoses in 40 previously undiagnosed patients (Wenger, Guturu, Bernstein, & Bejerano, 2017), demonstrating the feasibility and necessity of the periodic reanalysis of undiagnosed exomes.

The aim of this study was to identify the genetic causes in 33 ID/developmental delay (DD) patients using a whole exome sequencing (WES) or targeted sequencing (TS) approach. First-round analysis of the WES/TS data was performed in conjunction with the assessment of clinical manifestations. The genetic and phenotypic data of patients who remained undiagnosed after the initial analysis were then re-evaluated through updated OMIM annotation, literature review, and copy number variation (CNV) detection.

2 SUBJECTS AND METHODS

2.1 Patient recruitment

The present study did not enroll consecutive patients. In 2014, a study to examine the molecular etiology of unexplained ID/DD using NGS was initiated by two clinicians and a molecular geneticist in the Department of Pediatric Endocrinology/Genetics of Xinhua Hospital, affiliated to Shanghai Jiao Tong University School of Medicine. From 2012 to 2016, the two clinicians ordered single nucleotide polymorphism (SNP) array analyses for 212 patients with moderate-to-severe ID. Of these, 146 patients were found to have no pathological CNVs on chromosomal microarray analysis (CMA) and were invited to enroll in this NGS study. Patients with isolated unexplained ID/DD, or ID/DD combined with at least one of the following, were included: (1) facial dysmorphism; (2) congenital anomalies; (3) epilepsy; and (4) a family history of ID/DD. The exclusion criteria were as follows: (i) acquired brain damage; (ii) abnormal metabolic screening; and (iii) recognizable syndromic disorders, as evaluated by two experienced clinical geneticists. In total, 33 patients met the inclusion/exclusion criteria, were contacted and agreed to join this study, and were followed up for their clinical phenotype. The clinical descriptions of the patients are summarized in Table 1.

Table 1. Clinical characteristics of the 33 patients

| Characteristics | Patient number (%) |
|---|---|
| Sex | |
| Male | 25 (76%) |
| Female | 8 (24%) |
| Age group | |
| <6 y | 23 (70%) |
| 6-18 y | 9 (27%) |
| >18 y | 1 (3%) |
| Family structure | |
| Non-consanguineous | 31 (94%) |
| Non-consanguineous, one affected | 26 (84%) |
| Non-consanguineous, two affected | 5 (16%) |
| Consanguineous | 2 (6%) |
| Consanguineous, one affected | 2 (6%) |

| Phenotype | Total | Diagnosed |
|---|---|---|
| ID/DD | 33 | 19 |
| Isolated ID/DD | 2 | 0 |
| +Facial dysmorphism | 20 | 14 |
| +Microcephaly | 13 | 7 |
| +Hypotonia | 9 | 5 |
| +Epilepsy | 8 | 5 |
| +Short stature | 8 | 2 |
| +Cardiac malformation | 6 | 4 |
| +Abnormality on brain imaging | 6/17 assessed | 4 |

Written consent for publication was obtained from the parents of all patients. This study was approved by the ethics committee of Xinhua Hospital (XHEC-D-2014-044).

3 METHODS

3.1 Procedures

As illustrated in Figure 1, the study comprised a first-round and a second-round evaluation. Once a patient was recruited to the study, TS or WES was performed. TS/WES data were analyzed using a standard bioinformatics pipeline, as previously described (Sun et al., 2017). The first-round evaluation was made jointly by a molecular geneticist and a clinician. This stage ended in May 2016. At the beginning of 2017, the data of the undiagnosed patients underwent a second evaluation (re-evaluation). The wet lab experimental protocols are included in the Supplementary Materials (S1).

Figure 1. Workflow of the first- and second-round analyses of the TS/WES and phenotypic data

3.2 First-round evaluation: Variant prioritization and interpretation

The standard bioinformatics pipeline applied GATK to call SNVs and small indels and filtered out high-frequency variants based on several population databases (1000 Genomes Project, ExAC, EVS, and an in-house database containing 150 exomes). Filtered variants were analyzed under different inheritance patterns to generate three candidate gene lists: autosomal recessive (genes with at least two heterozygous variants or one homozygous variant); autosomal dominant (genes with at least one heterozygous variant); and X-linked (genes on the X chromosome with at least one variant). The affected genes were annotated using OMIM (first round ended June 2016). A molecular geneticist and a clinician further evaluated mutations in OMIM genes in the context of the clinical presentation of the patient. After this step-wise evaluation, candidate variants were classified according to ACMG guidelines (Richards et al., 2015).
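The three inheritance-based candidate lists described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the pipeline used in the study: the `Variant` record, the `candidate_gene_lists` function, the 1% allele-frequency cutoff, the decision to keep X-chromosome genes out of the autosomal lists, and the toy frequencies are all assumptions introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    chrom: str         # chromosome name without prefix, e.g. "2" or "X"
    zygosity: str      # "het", "hom", or "hemi"
    max_pop_af: float  # highest allele frequency seen in any population database


def candidate_gene_lists(variants, af_cutoff=0.01):
    """Group rare variants by gene and apply the three inheritance rules.

    af_cutoff is a hypothetical threshold; the source does not state one.
    """
    rare = [v for v in variants if v.max_pop_af < af_cutoff]

    by_gene = defaultdict(list)
    for v in rare:
        by_gene[v.gene].append(v)

    recessive, dominant, x_linked = set(), set(), set()
    for gene, hits in by_gene.items():
        if any(v.chrom == "X" for v in hits):
            # X-linked: any rare variant in a gene on the X chromosome.
            x_linked.add(gene)
            continue  # assumption: X genes are not added to the autosomal lists
        n_het = sum(v.zygosity == "het" for v in hits)
        n_hom = sum(v.zygosity == "hom" for v in hits)
        if n_hom >= 1 or n_het >= 2:
            recessive.add(gene)  # one homozygous or at least two heterozygous
        if n_het >= 1:
            dominant.add(gene)   # at least one heterozygous variant
    return {"AR": recessive, "AD": dominant, "XL": x_linked}


# Toy input: gene names are real, frequencies are invented for illustration.
toy_variants = [
    Variant("UNC80", "2", "het", 0.0),
    Variant("UNC80", "2", "het", 0.0001),
    Variant("TCF4", "18", "het", 0.0),
    Variant("PIGA", "X", "hemi", 0.0),
    Variant("TTN", "2", "het", 0.20),  # common variant, removed by the filter
]
lists = candidate_gene_lists(toy_variants)
for model in ("AR", "AD", "XL"):
    print(model, sorted(lists[model]))
```

Note that a gene with two heterozygous variants lands in both the recessive and the dominant list, which follows directly from the rules as stated; segregation analysis in the parents is what later distinguishes the two models.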

3.3 Second round evaluation: Bioinformatics analysis

In the bioinformatics pipeline, several databases were updated to provide more accurate annotations of the identified variants, including the in-house database (an additional 50 exomes were appended) and the OMIM database (January 2017 version), which allowed for the detection of defects in new ID/DD genes.

XHMM (Fromer et al., 2012) was applied to call CNVs in each sample. For TS, the calculation was performed together with another 600 samples. For WES, another 150 exome datasets were used. The output CNV lists of the samples were compared, and only CNVs unique to a sample were retained. For cases in which no disease-causing mutation was identified by GATK, the CNVs and filtered SNVs/indels were combined for a secondary analysis to detect potential causative mutations.
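The "keep only unique CNVs" step can be illustrated as a post-processing filter over per-sample call lists. The sketch below does not use or reproduce XHMM itself; the `CNV` tuple, the function names, and the 50% reciprocal-overlap rule for deciding that two calls are "the same" CNV are assumptions made for this example, since the source does not say how calls were compared.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str  # "DEL" or "DUP"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Smaller of the two overlap fractions; 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def unique_cnvs(calls_by_sample, sample, min_overlap=0.5):
    """Return calls in `sample` that no other sample shares.

    min_overlap is a hypothetical matching threshold.
    """
    others = [c for s, calls in calls_by_sample.items() if s != sample for c in calls]
    return [
        c for c in calls_by_sample[sample]
        if all(reciprocal_overlap(c, o) < min_overlap for o in others)
    ]


# The ARID1B deletion coordinates come from Table 2; the recurrent
# chr1 deletion and the control samples are invented for illustration.
calls = {
    "patient_29": [CNV("6", 157501988, 157505801, "DEL"),
                   CNV("1", 1000000, 1010000, "DEL")],
    "control_a": [CNV("1", 1000500, 1010000, "DEL")],
    "control_b": [CNV("1", 1000000, 1009000, "DEL")],
}
print(unique_cnvs(calls, "patient_29"))
```

Calls that recur across the comparison cohort are more likely to be common polymorphisms or capture artifacts, which is why only the sample-specific calls are carried forward to interpretation.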

3.4 Second round evaluation: Variant prioritization and interpretation and phenotyping re-evaluation

In the second-round analysis, a literature review was performed for variants of uncertain significance (VUS) in known ID genes. When a new variant-disease association had been reported, the variant was re-classified as pathogenic or likely pathogenic for that disease. All patients found to have causative variants in the second-round analysis were re-contacted by clinicians, and targeted clinical assessments were performed by telephone follow-up or at our clinic to evaluate the causative variants in the context of the patient's clinical features. Once a new diagnosis was achieved, updated clinical reports were issued and the participants were provided with updated genetic counseling.

4 RESULTS

4.1 Patient characteristics

The mean age of the patients was 3 years (range from 6 months to 21 years), and the male/female ratio was 25/8. A total of 31/33 patients presented with other major anomalies (Table 1). The detailed phenotypes of each patient were uploaded to the LOVD database and are summarized in Table S1.

4.2 Diagnostic yield of the first-round analysis of the TS/WES and phenotypic data

A total of 14 patients received a conclusive genetic diagnosis, corresponding to a yield of 42% (14/33). The underlying diseases included six autosomal dominant, five X-linked and three autosomal recessive disorders (Figure 1). In total, 13 disorders were revealed in 14 probands (Table 2).

Table 2. Disease-causing and candidate mutations identified among the 33 ID/DD patients

| Patient ID | Gene | Chromosomal position (hg19) | Accession number | Nucleotide alteration | Amino acid alteration | Classification |
|---|---|---|---|---|---|---|
| Autosomal dominant | | | | | | |
| 1 | TCF4 | chr18:52927178 | NM_001243226.2 | c.1375+2T>G (hetero, de novo) | | Pathogenic |
| 3 | CREBBP | chr16:3779446 | NM_004380.2 | c.5602C>T (hetero, de novo) | p.Arg1868Trp | Likely pathogenic |
| 10 | ZEB2 | chr2:145161578 | NM_001171653.1 | c.640G>T (hetero, de novo) | p.Glu214* | Pathogenic |
| 11 | CREBBP | chr16:3779451_3779453 | NM_004380.2 | c.5595_5597del (hetero, de novo) | p.Met1865_Arg1866delinsIle | Likely pathogenic |
| 12 | NIPBL | chr2:145158840 | NM_015384.4 | c.771+1G>C (hetero, de novo) | p.Val251_Asp257del | Pathogenic |
| 15 | SCN2A | chr2:166229837 | NM_001040142.1 | c.3952T>C (hetero, de novo) | p.Ser1318Pro | Likely pathogenic |
| 21 | MEF2C | chr5:88100569 | NM_001193347.1 | c.104T>C (hetero, de novo) | p.Leu35Pro | Likely pathogenic |
| 24 | KMT2D | chr12:49431030 | NM_003482.3 | c.10108del (hetero, de novo) | p.Gln3370Serfs*22 | Pathogenic |
| 29 | ARID1B | chr6:157501988-157505801 | NM_020732.3 | exon 12–13 del (hetero, de novo) | | Pathogenic |
| X-linked | | | | | | |
| 4 | UBE2A | chrX:118715520_118715521 | NM_001282161.1 | c.104del (hemi, maternal) | p.Pro35Leufs*39 | Pathogenic |
| 7 | PIGA | chrX:15349955 | NM_002641.3 | c.98A>G (hemi, maternal) | p.His33Arg | Likely pathogenic |
| 19 | ATRX | chrX:76889149 | NM_000489.4 | c.4861A>G (hemi, maternal) | p.Thr1621Ala | Pathogenic |
| 25 | RPS6KA3 | chrX:20193408 | NM_004586.2 | c.1103-2A>G (hetero, de novo) | | Pathogenic |
| 33 | KDM6A | chrX:44879885 | NM_001291415.1 | c.474T>G (hetero, de novo) | p.Tyr158* | Pathogenic |
| Autosomal recessive | | | | | | |
| 5 | UNC80 | chr2:210737567 | NM_032504.1 | c.3719G>A (hetero, paternal) | p.Trp1240* | Likely pathogenic |
| | | chr2:210782592 | NM_032504.1 | c.4926_4937del (hetero, maternal) | p.Asn1643_Leu1646del | Likely pathogenic |
| 6 | UNC80 | chr2:210782632 | NM_032504.1 | c.4963C>T (hetero, paternal) | p.Arg1655Cys | Likely pathogenic |
| | | chr2:210837990 | NM_032504.1 | c.8385C>G (hetero, maternal) | p.Tyr2795* | Likely pathogenic |
| 8 | PLA2G6 | chr22:38528838 | NM_003560.2 | c.1077G>A (homo) | p.Met358Ilefs*7 | Pathogenic |
| 18 | TREX1 | chr3:48508247_48508249 | NM_016381.5 | c.358_360dup (hetero, maternal) | p.Asp120dup | Likely pathogenic |
| | | chr3:48508028 | NM_016381.5 | c.139G>A (hetero, paternal) | p.Gly47Ser | VUS |
| 30 | ALDH3A2 | chr11:19568310 | NM_000382.2 | c.1157A>G (homo) | p.Asn386Ser | Likely pathogenic |
| Candidate | | | | | | |
| 16 | MED12 | chrX:70344879 | NM_005120.1 | c.2109G>C (hemi, maternal) | p.Lys703Asn | VUS |

Note: Variants identified in the second-round analysis (patients 3, 5, 6, 11, and 29; see Section 4.3) were underlined in the original table.

4.3 Re-evaluation of both the phenotypic and genotypic data of TS/WES

Prior to the preparation of the manuscript in January 2017, we re-evaluated the genetic and clinical data of the 19 undiagnosed patients. In this phase, we updated the OMIM data, reviewed the most current literature, and analyzed CNVs. As a result, the genetic etiology was identified in another five patients (Figure 1). The disease-causing mutations are underlined in Table 2. The overall yield increased from 42% (14/33) to 57% (19/33). Below, we provide case-by-case descriptions of the patients who were undiagnosed in the first-round evaluation and diagnosed on re-evaluation.
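The yield figures quoted above follow from simple proportions, and it can help to see the two different denominators side by side. The short Python sketch below recomputes them from the counts given in the text; the helper `pct` and the variable names are introduced here for illustration only.

```python
COHORT = 33          # patients enrolled
FIRST_ROUND = 14     # diagnosed in the first-round analysis
RE_EVALUATION = 5    # additional diagnoses on re-evaluation


def pct(numerator, denominator):
    """Percentage of denominator represented by numerator."""
    return 100 * numerator / denominator


undiagnosed_after_first = COHORT - FIRST_ROUND            # 19 patients
first_yield = pct(FIRST_ROUND, COHORT)                    # share of whole cohort
overall_yield = pct(FIRST_ROUND + RE_EVALUATION, COHORT)  # share of whole cohort
gain_points = overall_yield - first_yield                 # percentage points
reanalysis_yield = pct(RE_EVALUATION, undiagnosed_after_first)

print(f"first round:   {first_yield:.1f}% of the cohort")
print(f"overall:       {overall_yield:.1f}% of the cohort")
print(f"gain:          {gain_points:.1f} percentage points")
print(f"re-evaluation: {reanalysis_yield:.1f}% of previously undiagnosed patients")
```

The 15% improvement reported in the text is therefore a gain measured against the whole cohort (5/33); measured against only the 19 patients who entered re-evaluation, the same five diagnoses correspond to roughly a quarter of that group.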

4.3.1 Patients 5 and 6

Two patients with mutations in recently identified ID genes.

Patient 5 showed severe ID, absent speech, growth retardation, infantile hypotonia, and behavioral disturbance (hand biting, self-injury, and sensory hypersensitivities); and patient 6 presented with severe ID, hypotonia, epilepsy, a happy disposition, and sensory hypersensitivities. Compound heterozygous mutations in the UNC80 gene (NM_032504.1) were identified in the two patients after reanalysis of the previous vcf file, c.3719G>A (p.Trp1240*)/c.4926_4937del (p.Asn1643_Leu1646del) in patient 5 and c.4963C>T (p.Arg1655Cys)/c.8385C>G (p.Tyr2795*) in patient 6. Sanger sequencing confirmed that the variants were both inherited from phenotypically normal parents. Homozygous or compound heterozygous UNC80 mutations cause infantile hypotonia with psychomotor retardation and characteristic facies 2 (IHPRF2, OMIM 616801). The phenotypes of both patients were consistent with the clinical description of IHPRF2. IHPRF2 was first published in November 2015. The genetic etiology remained unknown for these two patients when the first-round analysis was performed in August 2015. These two patients remained undiagnosed for more than one year until the reanalysis in January 2017.

4.3.2 Patient 29

Disease-causing mutation was not detected using the GATK pipeline.

A unique 3.8-kb exonic deletion at chr6:157,501,988-157,505,801, covering exons 12 and 13 of the ARID1B gene, was identified in this patient by CNV analysis. The result was subsequently confirmed using qPCR. Truncated ARID1B causes Coffin-Siris syndrome (CSS, OMIM 135900). The phenotype of patient 29 was consistent with CSS. Microarray analysis (CytoScan 750K array, Affymetrix) did not detect this deletion, as no probe is located in this region. In addition, the 3.8-kb deletion is too large for detection using the GATK pipeline, which was designed for SNV/small indel calling. First-round analysis for this patient was performed in June 2016. This patient remained undiagnosed for more than half a year until the reanalysis in January 2017.

4.3.3 Patients 3 and 11

Variants of uncertain significance were re-classified as likely pathogenic.

In these two patients, two CREBBP mutations were detected (Table 2). Patients with de novo loss-of-function mutations in the CREBBP gene show the characteristics of Rubinstein-Taybi syndrome (RSTS, OMIM 180849). RSTS phenotypes include a grimacing smile, broad or angulated thumbs and halluces, and broad distal phalanges of the fingers. However, patient 11, with a c.5595_5597del (p.Met1865_Arg1866delinsIle) in-frame deletion, and patient 3, with a c.5602C>T (p.Arg1868Trp) missense mutation in CREBBP, showed no classical RSTS features. The two variants were classified as VUS in the first-round evaluation performed in November 2015. Subsequently, in an online publication in June 2016, Menke et al. (2016) reported that patients with missense mutations in exon 30/31 of the CREBBP gene showed a phenotype distinct from RSTS. The two patients shared features including severe ID, speech delay, short stature and microcephaly. The facial characteristics of these two patients resembled each other and those of the patients described by Menke et al.: light and sparse hair and eyebrows, blepharophimosis, up-slanting palpebral fissures, low nasal bridge, upturned nares, long philtrum, small mouth and thin lips (Figure 2). The detailed clinical features of the two patients will be described elsewhere (Menke et al., in preparation). Thus, patients 3 and 11 support the novel CREBBP-related entity (Menke et al., 2016).

Figure 2. Photographs of the two patients with CREBBP mutations. (a) Frontal view of patient 3 at age 2 years. Note the light and sparse hair and eyebrows, blepharophimosis, up-slanting palpebral fissures, low nasal bridge, upturned nares, long philtrum, small mouth and thin lips. (b) Frontal view of patient 11 at age 3 years. Note the light and sparse hair, blepharophimosis, up-slanting palpebral fissures, low nasal bridge, upturned nares, long philtrum, small mouth and thin lips.

5 DISCUSSION

The wide use of TS/WES in ID-associated disorders has achieved tremendous success (Najmabadi et al., 2011). The present study confirmed the clinical utility of NGS in the diagnosis of ID/DD patients. The overall diagnostic rate was 57%. This high yield reflects two factors. (1) Most cases were selected for moderate or severe ID/DD, and the proportion of ID/DD patients with facial dysmorphism, epilepsy, microcephaly, or other anomalies was high (31/33). The relatively higher yield in the subgroup with ID/DD and facial dysmorphism was consistent with the fact that most of the patients diagnosed in the present study had syndromic ID (Table 1). (2) We performed a reanalysis of the undiagnosed patients. The findings of a recent study suggested that the reanalysis of exome data in undiagnosed patients at a 2- to 3-year interval could result in a 10% diagnostic yield (Wenger et al., 2017). In the present study, five patients were diagnosed during the re-evaluation, leading to a 15% improvement in the diagnostic yield. Re-evaluation of the phenotype and genotype of patients might focus on the following aspects.

5.1 Update the OMIM gene list for analysis of the recently discovered novel genes

New diagnoses have been attributed primarily to the growing knowledge of gene-disease and variant-disease associations in the literature (Wenger et al., 2017). Novel gene discovery using WES has increased over the past several years (Boycott, Vanstone, Bulman, & MacKenzie, 2013). The number of OMIM disorders with a known molecular basis has steadily increased at an average rate of 266 entries per year (Wenger et al., 2017). Accordingly, Thevenon et al. (2016) emphasized the importance of re-assessing WES data. In a recent study, the prospective re-assessment of WES data led to a 10% improvement in the diagnostic yield, which was primarily attributed to novel gene discovery (Wenger et al., 2017). Similarly, patients 5 and 6 with UNC80 mutations in the present study were diagnosed due to novel ID gene discovery. We completed the WES data analysis of patients 5 and 6 in April 2015, and the first paper on UNC80-associated ID disorder was published online in November 2015. The UNC80-related ID disorder (OMIM 616801) was added to the OMIM database in February 2016.

These findings further highlight the importance of the periodic re-assessment of WES data, focusing on novel OMIM genes, which will provide additional diagnostic results.

5.2 Apply new algorithms to detect variants missed by the previous pipeline

Variants detected by TS/WES are typically small, as the regular analysis pipeline is based on alignment and base-wise comparisons, and the reads are relatively short. With improved algorithms and the availability of large data sets, CNV calling from the same sequencing data has become increasingly accurate. In the present study, XHMM pinpointed the exonic ARID1B deletion in patient 29, which was missed by both the default GATK pipeline and conventional microarray analysis, suggesting the potential of sequencing data for CNV detection.

Both variant types could be easily called by improving the bioinformatics pipeline. This technique would provide additional diagnostic results without significant labor costs.

5.3 Periodic review of the literature on VUS genes

In addition to variants in novel ID genes, WES identifies variants in known ID genes that are classified as VUS because the patient's phenotype does not match the well-described syndrome associated with the gene. These variants may represent unique disease entities, such as the variants in the CREBBP gene. For patients 3 and 11, the variants in CREBBP were classified as VUS, reflecting a discrepancy in clinical features compared with RSTS. With the evidence reported by Menke et al. (2016), the two variants can now be re-classified as likely pathogenic.

Similarly, in patient 16, we identified a VUS in MED12 that was inherited from his mother. MED12 mutations are involved in at least three distinctive XLID phenotypes: FG syndrome (OMIM 305450), Lujan-Fryns syndrome (OMIM 309520) and X-linked Ohdo syndrome (Maat-Kievit-Brunner type; OMIM 300895) (Risheg et al., 2007; Schwartz et al., 2007; Vulto-van Silfhout et al., 2013; Verloes et al., 2006). This patient exhibited no hallmark of these syndromes and presented with a unique phenotype characterized by severe ID and speech delay, hypotonia and ASD. The literature review suggested a fourth MED12-related disorder, characterized by severe ID and absent or deficient language skills (Bouazzi, Lesca, Trujillo, Alwasiyah, & Munnich, 2015; Lesca et al., 2013; Prontera et al., 2016). Therefore, patient 16, together with the previously reported carriers of the MED12 mutation, showed distinct features that may provide evidence for a fourth MED12-related disorder. However, additional similar cases are required to draw this correlation.

These examples highlight the necessity of periodically reviewing the most current literature to understand variant-disease associations.

5.4 Re-contact patients to collect up-to-date phenotypes

Both the initial analysis and the reanalysis of NGS data benefit from detailed phenotypes. During the reanalysis, the SNVs present in an individual's exome can be reclassified, and new algorithms can be used to locate additional variants. At present, improvements in diagnostic yield will mainly come from reanalysis and, most importantly, from re-phenotyping; therefore, as suggested by Hennekam and Biesecker (2012), we have shifted to a post-test diagnostic assessment mode.

If a single highly suspected gene is found during a reanalysis based on an updated database or new algorithms, collecting detailed phenotype information will help correlate the clinical findings with the NGS results to confirm the diagnosis. It is also important to collect up-to-date phenotypes when reviewing the relevant variant lists: if several candidate variants are listed, clinicians should conduct targeted further assessments of the patient so that the causative variant can be chosen from the candidate list on the basis of the clinical findings. If a patient carries a VUS but does not yet show the distinct clinical features of the associated entity, clinicians should follow up with the patient in order to update the phenotype. A VUS may eventually be re-classified as pathogenic, as in the patients with VUS in CREBBP.

Given the significant heterogeneity and the rarity of each ID-related syndrome, a promising strategy for patients with a negative molecular diagnosis is to share the data internationally, thereby facilitating the identification of additional cases with similar clinical features and common genetic changes. In addition to the literature, databases such as ClinVar and LOVD are used to store phenotypic and genotypic data. HGVS nomenclature and the Human Phenotype Ontology (HPO) standardize genotypic and phenotypic data, respectively. Matching techniques based on the sharing of candidate phenotype and genotype data via web-based data exchange servers could greatly improve the efficiency of pathogenic variant discovery in rare disorders. Several online platforms that use genotype- and phenotype-driven matching algorithms to identify cases with common phenotypes and disrupted genes have been developed, providing a robust basis for rare disease gene discovery. These include phenotype-based prioritization approaches, such as Phenolyzer, GeneYenta, and PhenomeCentral MatchMaker Exchange (MME), and candidate gene-based prioritization approaches, such as GeneMatcher. Examples of novel discoveries made using matchmaking approaches demonstrate their effectiveness in rare disease gene discovery (Au et al., 2015; Jurgens et al., 2015; Loucks et al., 2015). With the emergence of these web-based approaches, collaboration and connections between clinicians and researchers will be enhanced. Through these approaches, WES applications could be enhanced for accurate diagnosis, and patients can be adequately informed about the diagnosis (Thevenon et al., 2016).
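To make the idea of genotype- and phenotype-driven matching concrete, the following Python sketch ranks shared cases by phenotype overlap among those that carry a variant in the same candidate gene. It is a deliberately simple illustration and does not reproduce the algorithm of any platform named above: the Jaccard similarity over HPO term sets, the `rank_matches` function, the 0.3 score threshold, and the example cases are all assumptions made here, and real services may use ontology-aware similarity measures instead.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of HPO term identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_matches(query, cases, min_score=0.3):
    """Return (case_id, score) pairs sharing a candidate gene with the query.

    min_score is a hypothetical cutoff on phenotype similarity.
    """
    matches = []
    for case_id, case in cases.items():
        if not query["genes"] & case["genes"]:
            continue  # require at least one shared candidate gene
        score = jaccard(query["hpo"], case["hpo"])
        if score >= min_score:
            matches.append((case_id, score))
    return sorted(matches, key=lambda m: m[1], reverse=True)


# HPO identifiers used for illustration:
# HP:0001249 intellectual disability, HP:0001252 hypotonia,
# HP:0001250 seizure, HP:0004322 short stature.
query = {"genes": {"UNC80"}, "hpo": {"HP:0001249", "HP:0001252", "HP:0001250"}}
shared_cases = {  # invented cases, not taken from any database
    "case_A": {"genes": {"UNC80"}, "hpo": {"HP:0001249", "HP:0001252"}},
    "case_B": {"genes": {"UNC80"}, "hpo": {"HP:0004322"}},
    "case_C": {"genes": {"MED12"}, "hpo": {"HP:0001249", "HP:0001252", "HP:0001250"}},
}
for case_id, score in rank_matches(query, shared_cases):
    print(case_id, round(score, 2))
```

Requiring both a shared gene and a minimum phenotype overlap reflects the reasoning in the text: a second unrelated patient with the same disrupted gene and a similar presentation is much stronger evidence than either observation alone.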

5.5 The challenges and suggestions for the re-evaluation of phenotypic and genotypic WES data

The re-evaluation options described above (Options 1 to 4, corresponding to Sections 5.1 to 5.4) raise challenges in clinical practice. (1) How often should reanalysis be performed? Frequent reanalysis could maximize the diagnostic yield of NGS, but may require significant extra time and manual labor. (2) Who should initiate the re-evaluation? Options 1 and 2, which mainly rely on updating the bioinformatics pipeline, could be initiated by the clinical lab. However, for Options 3 and 4, diagnostic labs, researchers and clinicians would need to work together. Significant human labor would be involved if all undiagnosed patients underwent reanalysis. (3) What is the cost of reanalysis, and who will pay for it? Reanalysis requires staff time, and the associated clinical follow-up and genetic counseling carry financial costs.

In general, Options 1 and 2 could be implemented for all undiagnosed exomes by updating the bioinformatics pipeline, which could be initiated by the clinical lab. If a lab generates 3,000 exomes per year, around 30% of the patients could be diagnosed during the first analysis, with the remaining 70%, approximately 2,100 patients, not diagnosed. If the re-evaluation was performed once a year, then 2,100 exomes per year would be re-evaluated. This approach may not require significant human labor if it can be automated. Human labor would be necessary for the 2- to 3-year interval re-evaluation, which could result in a 10% diagnostic yield, or approximately 100 cases each year that would need to be followed up individually by clinicians to further evaluate the causative variants in the context of the patient's clinical features. In addition, once a new diagnosis is achieved, updated clinical reports should be issued and updated genetic counseling should be provided for these 100 patients. In this step, automating the data reanalysis could lessen the expense and shift the cost-benefit calculation toward more frequent reanalysis (Wenger et al., 2017). The financial burden for this part would mainly comprise the data reanalysis and genetic counseling. We suggest that the cost of regular reanalysis be included in the cost of the first-round analysis, which is covered by insurance companies, with the cost of genetic counseling borne by the individual.
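The workload estimate in this paragraph can be reproduced with a small calculation. The Python sketch below uses the figures given in the text (3,000 exomes per year, 30% first-round yield, 10% reanalysis yield); the function name, the returned keys, and in particular the choice to spread one re-evaluation cycle's new diagnoses evenly over a 2- or 3-year interval are assumptions made here to arrive at a per-year figure of the same order as the "approximately 100 cases" quoted above.

```python
def reanalysis_workload(exomes_per_year=3000, first_round_yield_pct=30,
                        reanalysis_yield_pct=10, interval_years=2):
    """Estimate yearly follow-up load from reanalysis of undiagnosed exomes.

    Integer percentages keep the arithmetic exact. Dividing by
    interval_years is an illustrative assumption, not a stated method.
    """
    diagnosed = exomes_per_year * first_round_yield_pct // 100
    undiagnosed = exomes_per_year - diagnosed
    new_per_cycle = undiagnosed * reanalysis_yield_pct // 100
    return {
        "diagnosed_first_round": diagnosed,
        "undiagnosed": undiagnosed,
        "new_diagnoses_per_cycle": new_per_cycle,
        "new_diagnoses_per_year": new_per_cycle / interval_years,
    }


for years in (2, 3):
    estimate = reanalysis_workload(interval_years=years)
    print(f"{years}-year interval:", estimate)
```

Under these assumptions a single year's cohort yields between 70 and 105 new diagnoses per year of follow-up, which brackets the figure used in the text; the estimate scales linearly with the number of exomes a lab produces.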

In contrast, for Options 3–4 and data-sharing approaches, diagnostic labs, researchers and clinicians would constantly need to work together, which would require significant human labor. Clinicians should follow up with undiagnosed patients to collect updated phenotypes, which the lab needs in order to update the relevant variant lists. Additionally, for patients with VUS, literature reviews should be performed to search for new associations between the variants and phenotypes. Options 3–4 might be difficult to automate at present; therefore, from a practical point of view, the provider could request reanalysis for specific patient groups (those with a highly suspected genetic etiology, those who wish to have a healthy baby, and those who can be followed up), which is feasible for most clinical labs. Because of the amount of human labor involved, this option would be reasonable if the cost is borne by the individual.

The guidelines issued by EuroGentest and the European Society of Human Genetics state that “the laboratory is not expected to re-analyze old data systematically and report novel findings, not even when the core disease gene panel changes” (Matthijs et al., 2016). However, according to Wenger's experience and the present study, a 2- to 3-year interval of reanalysis could result in a 10-15% increased diagnostic yield; therefore, a 1- to 2-year interval for re-evaluation is recommended, particularly for the regular re-assessment of Options 1 and 2, which will provide additional diagnostic results without significant extra human labor. We anticipate that future guidelines will change to support re-evaluation and issue standard procedures.

In conclusion, the present study confirmed the use of TS/WES to facilitate the diagnosis of ID/DD patients with unexplained etiologies. Furthermore, the re-evaluation of the phenotype and genotype of the patient led to a 15% improvement of the diagnostic yield. Further studies are required to confirm the feasibility and define the standard procedure of reanalysis using TS/WES.

ACKNOWLEDGMENTS

The authors thank all the families for participation in the present study. This work was funded by grants from the National Natural Science Foundation of China (81400872 to YS; 30973216 to WJQ; 81401193 to BX), and Shanghai Municipal Commission of Health and Family Planning Foundation (201540054 to BX; 20154Y0153 to YS), Shanghai Jiao Tong University School of Medicine (2014XJ10044 to YS), Shanghai Health Bureau (20134005 to WJQ) and National Key Research and Development Program (2016YFC0905100).

CONFLICTS OF INTEREST

The authors have no conflict of interest to declare.
