The advances of genetics research on Hirschsprung's disease
Abstract
Hirschsprung's disease (HSCR) is a rare and complex congenital disorder characterized by the absence of enteric neurons in the lower digestive tract, with an incidence of 1/5 000. Affected infants usually suffer from severe constipation with megacolon and a distended abdomen, and face long-term complications even after surgery. In the last two decades, great effort has been made, and considerable progress achieved, in understanding the genetic and molecular mechanisms that underlie HSCR. However, only a small fraction of the genetic risk can be explained by the mutations identified in previously established genes. To search for novel genetic alterations, new study designs with advanced technologies, such as genome/exome-wide association studies (GWASs/EWASs) and next generation sequencing (NGS) of target genes or the whole genome/exome, have been applied to HSCR. In this review, we summarize current developments in genetics research on HSCR based on GWASs/EWASs and NGS, focusing on the newly discovered variants and genes and their potential roles in HSCR pathogenesis.
Introduction
Hirschsprung's disease (HSCR, or congenital intestinal aganglionosis) is a complex developmental disorder characterized by the absence of parasympathetic intrinsic ganglion cells in the submucosal and myenteric plexuses of the hindgut.1, 2 It is attributed to the failure of enteric neural crest cells to migrate, proliferate or differentiate in the bowel wall during embryogenesis, leading to aganglionosis in the lower gastrointestinal tract. The disease is classified by the extent of aganglionosis into short-segment HSCR (S-HSCR: 80% of cases), when the aganglionic segment is limited to the rectosigmoid colon; long-segment HSCR (L-HSCR: 15% of cases), when aganglionosis extends proximal to the sigmoid colon; and total colonic aganglionosis (TCA: 5% of cases), when the entire colon is aganglionic.2-4 Loss of propulsive motility in the aganglionic bowel results in chronic constipation, abdominal distension, growth failure and bilious vomiting,5, 6 with complications such as bowel perforation and enterocolitis. Even after surgical treatment that removes or bypasses the aganglionic bowel, about one-third of affected children still suffer from constipation, faecal incontinence or long-term enterocolitis.7-9
HSCR is a potentially fatal birth defect with an incidence of about 1/5 000 live births, which varies across ethnic groups, with the highest reported rate in Asians (2.8/10 000 live births).2, 10 There is a strong male bias, with a male-to-female ratio of about 4:1 overall,3, 5 and the ratio is much higher in S-HSCR [(4.2–4.4):1] than in L-HSCR [(1.2–1.9):1].11 HSCR is considered a sex-modified multifactorial disorder in which environmental factors (such as vitamin A deficiency)12 play only a minor role compared with genetic factors, with a relative risk of about 1/200.13 The genetics of HSCR is complex. Syndromic HSCR, such as Mowat-Wilson syndrome or Waardenburg-Shah syndrome (type 4), shows a Mendelian mode of inheritance, while isolated HSCR (>70% of cases) appears to follow non-Mendelian inheritance with low penetrance.9 For cases with L-HSCR or TCA, the inheritance mode is most likely due to a dominant gene with incomplete penetrance, while for cases with S-HSCR, the inheritance pattern is multifactorial or compatible with a recessive gene with low penetrance.11
Since 1994, positional cloning and candidate gene studies have identified a number of genes with mutations in HSCR patients, including RET, GDNF, GFRA1, NRTN, PHOX2B, NKX2.1, SOX10, EDNRB, EDN3, ECE-1, KIAA1279, ZFHX1B, NTRK3, L1CAM, TCF4, and HOXB5.10, 14-23 Most of them encode proteins that belong to three important inter-related signaling pathways: GDNF/RET receptor tyrosine kinase signaling, endothelin type B receptor signaling, and SOX10-mediated transcription. There is considerable evidence of interactions between genes in these different signaling pathways.2, 10, 24 RET is considered to be the most important gene involved in HSCR, and its sufficient expression is essential for the development of the enteric nervous system (ENS).23, 25, 26 However, coding or splice junction mutations in these genes account for only about 50% of familial cases and 20% of sporadic cases, and explain just 0.1% of the heritability.4 Hence, there must be additional genetic defects responsible for HSCR.
As effective strategies based on new technologies emerged, genetics researchers started to apply chip-based genome/exome-wide association studies (GWAS/EWAS) and next generation sequencing (NGS) of target genes or the whole genome/exome to search for novel genes and the corresponding variants or mutations associated with different diseases, including HSCR. In this review, we focus on the advances in understanding the genetic etiology of HSCR revealed by GWAS/EWAS and NGS.
Genome-wide association study and HSCR
To date, four well-designed GWASs, one meta-analysis and one EWAS of HSCR have been reported in different populations (Table 1). Published in 2009, the first GWAS of HSCR not only confirmed the role of RET in Chinese patients, but also identified a new susceptibility gene, neuregulin-1 (NRG1), which plays an important role in the survival and differentiation of neural crest cells by binding to and interacting with ErbB tyrosine kinase receptors.27 The involvement of RET and NRG1 in HSCR was also found in another GWAS in a Korean population.28 In Caucasians, a family-based GWAS further reported the susceptibility of individuals carrying variants of RET and NRG1, and identified a new risk locus containing the class 3 Semaphorin gene cluster (SEMA3A, SEMA3C, SEMA3D). Analysis in Ret wild-type and Ret-null mice showed specific expression of Sema3a, Sema3c, and Sema3d in the ENS, while knockdown of Sema3c or Sema3d in zebrafish embryos led to the loss of migratory ENS precursors.29 To aggregate the data of the three above-mentioned GWASs on HSCR, Tang et al30 conducted a trans-ethnic meta-analysis comprising a total of 507 cases and 1 191 controls. They not only confirmed the associations of RET, NRG1, SEMA3, and one locus previously well established in syndromic HSCR, 4p13 (PHOX2B), but also found one novel disease-susceptibility locus, 2p16.1 (VRK2/FANCL). VRK2 encodes a serine/threonine protein kinase, has been strongly implicated in central nervous system and neurodevelopmental disorders, and might interact with the receptor ErbB2, which is the co-receptor of NRG1. More recently, another HSCR GWAS of Caucasians was published. It confirmed RET and SEMA3 as being associated with HSCR in a Danish cohort, and additionally reported a novel low-frequency variant (rs144432435) of RET.31
Gene | Locus | tagSNP | Study Design | Population | Journal | Year |
---|---|---|---|---|---|---|
NRG1 | 8p12 | rs16879552/rs7835688 | GWAS + Replication | Discovery: Chinese (181 cases/346 controls) Replication: Chinese (190 cases/510 controls) | PNAS | 2009 |
RET | 10q11.21 | rs2742234 | ||||
RET | 10q11.21 | rs1864400 | GWAS | Korean (123 cases/432 controls) | PLoS One | 2014 |
NRG1 | 8p12 | rs16879552 | ||||
SLC6A20 | 3p21.31 | rs4299518/rs2159272 | ||||
RORA | 15q22.2 | rs1351544/rs8025324/rs9920560/rs7183955 | ||||
ABCC9 | 12p12.1 | rs704192/rs704191/rs4148669/rs704190 | ||||
STIM2 | 4p15.2 | rs11725593 | ||||
DEFB129 | 20p13 | rs6074578 | ||||
LOC100509398 | 3q26.2 | rs12639288 | ||||
RET | 10q11.21 | rs2506030/rs2435357 | GWAS + Replication | Discovery: Caucasian (220 trios) Replication: Caucasian (429 trios) | AJHG | 2015 |
NRG1 | 8p12 | rs4541858/rs7835688 | ||||
SEMA3A/SEMA3C/SEMA3D | 7q21.11 | rs12707682/rs11766001 | ||||
RET | 10q11.21 | rs9282834/rs2505998/rs2505998 | Meta-analysis of GWAS | European (212 cases/202 controls) / Chinese (173 cases/615 controls) / Korean (122 cases/374 controls) | HMG | 2016 |
SEMA3C/3D | 7q21.11 | rs80227144 | ||||
NRG1 | 8p12 | rs7005606 | ||||
PHOX2B | 4p13 | rs6826373 | ||||
VRK2/FANCL | 2p16.1 | rs4672229 | ||||
SSPO | 7q36.1 | rs10250401 | EWAS | Chinese (167 cases/900 controls) | Mol Neurobiol | 2017 |
EEF1D | 8q24.3 | rs10282929 | ||||
SLC34A3 | 9q34.3 | rs35699762 | ||||
ABO | 9q34.2 | rs1053878 | ||||
NOC4L | 12q24.33 | rs78871841 | ||||
CACNA1H | 16p13.3 | rs36117280 | ||||
TELO2 | 16p13.3 | exm1202536 | ||||
CARD14 | 17q25.3 | rs11652075 | ||||
COMT | 22q11.21 | rs6267 | ||||
ARVCF | 22q11.21 | rs80068543 | ||||
SEMA3 | 7q21.11 | rs62472985/rs117617821 | GWAS | Discovery: Danish (170 cases/4717 controls) Replication: European (416 cases/903 controls) | EJHG | 2018 |
MOB1AP1/DDX6P2 | 13q31.1 | rs12428625 | ||||
RET | 10q11.21 | rs17653445/rs2505994/rs4519046/rs144432435 | ||||
- GWAS, genome-wide association study; EWAS, exome-wide association study; PNAS, Proceedings of the National Academy of Sciences of the United States of America; AJHG, The American Journal of Human Genetics; HMG, Human Molecular Genetics; EJHG, European Journal of Human Genetics.
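To make the pooling step of a GWAS meta-analysis concrete, the following is a minimal sketch of a fixed-effect, inverse-variance meta-analysis of per-cohort log odds ratios. The cohort sizes are taken from Table 1; the effect sizes and standard errors are hypothetical placeholders, and the function name `fixed_effect_meta` is our own. This is a generic illustration and is not necessarily the statistical model used in the meta-analysis cited above.

```python
import math

# Cohort sizes (cases, controls) as listed for the meta-analysis in Table 1.
cohorts = {
    "European": (212, 202),
    "Chinese": (173, 615),
    "Korean": (122, 374),
}
total_cases = sum(cases for cases, _ in cohorts.values())
total_controls = sum(controls for _, controls in cohorts.values())


def fixed_effect_meta(log_ors, std_errs):
    """Pool per-cohort log odds ratios with inverse-variance weights.

    Returns (pooled log OR, pooled standard error, z score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, z, p_value


# HYPOTHETICAL per-cohort estimates for a single SNP (not from any study).
example_log_ors = [0.35, 0.42, 0.30]
example_std_errs = [0.14, 0.12, 0.16]

pooled, pooled_se, z, p_value = fixed_effect_meta(example_log_ors, example_std_errs)
print(f"cases={total_cases}, controls={total_controls}")
print(f"pooled OR={math.exp(pooled):.2f}, z={z:.2f}, p={p_value:.2e}")
```

Cohorts with smaller standard errors receive larger weights, which is why combining several modest cohorts can reveal loci that no single study detects on its own.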
Exome-wide association study and HSCR
Most of the susceptibility variants discovered by GWASs are common variants with minor allele frequency (MAF) > 0.05, conferring relatively small effect sizes with odds ratios (OR) from 1.1 to 1.5. These variants can explain only a small fraction of the genetic risk of the investigated diseases. Therefore, rare variants and loci that are not covered by the chips used in GWASs may have stronger effects and contribute to the missing heritability.32 Exome-chip platforms have been developed to capture low-frequency variants in protein-coding regions and have proven to be an effective complementary approach for genetic research on complex diseases. An exome-wide association study was applied to scan the exonic variants for HSCR.33 In this study, Tang et al identified ten variants in ten novel genes associated with HSCR at P < 10⁻⁴ in a Chinese population. Among these genes, catechol-O-methyltransferase (COMT) and armadillo repeat gene deleted in velocardiofacial syndrome (ARVCF), both of which carried missense variants, showed ectopic expression in HSCR colons. Specifically, the variant Ala72Ser in COMT decreased the proliferation activity of neural cells via the NOTCH signaling pathway, while the mutant ARVCF suppressed cell migration by downregulating RHOA and ROCK (Table 1).
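The MAF and OR quantities referred to above can be computed directly from allele counts. The following is a minimal sketch using a 2x2 allele-count table and a Woolf (log-based) 95% confidence interval; the function names and the example counts are hypothetical and do not come from any of the studies reviewed here.

```python
import math


def minor_allele_frequency(alt_count, ref_count):
    """Return the frequency of the less common allele."""
    freq = alt_count / (alt_count + ref_count)
    return min(freq, 1.0 - freq)


def allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (OR, lower, upper) with a Woolf 95% confidence interval.

    All four counts must be positive; no continuity correction is applied.
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    se_log_or = math.sqrt(
        1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - 1.96 * se_log_or)
    upper = math.exp(log_or + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# HYPOTHETICAL allele counts (two alleles per individual).
odds_ratio, lower, upper = allelic_odds_ratio(
    case_alt=120, case_ref=214, ctrl_alt=450, ctrl_ref=1350
)
control_maf = minor_allele_frequency(450, 1350)
print(f"control MAF={control_maf:.3f}")
print(f"OR={odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

For a rare variant the counts in the alternate-allele cells become small, the standard error grows, and the interval widens, which is one reason low-frequency variants require larger samples or dedicated exome-chip designs.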
Deep-targeted sequencing and HSCR
As NGS technologies emerged, some researchers started to apply deep targeted sequencing to candidate genes or loci that had been implicated in HSCR (Table 2). An early study in 2012 sequenced all 16 exons of the HSCR-associated gene NRG1 in 358 cases and 333 controls, and reported 13 different heterozygous variants.34 RET, the most well-established gene in HSCR, was also screened for somatic mutations through targeted exome sequencing and whole genome sequencing. Eight de novo mutations were found in 152 patients, of which six were pathogenic mosaic mutations.35 These findings were in line with the evidence that genes containing common disease-associated variants are likely to harbor functional rare variants in coding exons. Considering that aberrant hedgehog signaling could disrupt the differentiation of neural crest cells (NCCs) and might cause Hirschsprung's disease, Li et al36 performed targeted sequencing of GLI1, GLI2, GLI3, SUFU, and SOX10 in 20 HSCR patients. Four rare heterozygous missense variants in the coding sequences of GLI1, GLI2, and GLI3 were identified for the first time, and aberrant Gli activity was found to perturb the Sox10-Sufu-Gli regulatory loop, leading to attenuated differentiation of enteric NCCs and delayed gut colonization.
Gene | Locus | Mutation | Study Design | Population | Journal | Year |
---|---|---|---|---|---|---|
NRG1 | 8p12 | A28G / E134K / V266L / H347Y/ P356L / V486M /P24P / T169T / L483L / E239fsX10 / c.503-4insT | TES | Chinese (358 cases and 333 controls) | Hum Genet | 2012 |
GLI1 | 12q13.3 | R557C | TES | Chinese (20 cases and 20 controls) | Gastroenterology | 2015 |
GLI1 | 12q13.3 | P763S | ||||
GLI2 | 2q14.2 | G191R | ||||
GLI3 | 7p14.1 | H1200D | ||||
RET | 10q11.21 | c.254G > A / c.754G > T / c.789C > G / c.2308C > T / c.2333delT / c.2578C > T / c.2802-2A > G / c.229C > T / c.200insTCC | TES & RET single gene screening | Chinese (152 patients) | Genet Med | 2017 |
NRG3 | 10q23.1 | chr10:84118524 | WES | Chinese (2 affected familial patients) | Mol Neurobiol | 2013 |
TMPRSS11E | 4q13.2 | chr4:69342021 | ||||
SPRY1 | 4q28.1 | chr4:124323240 | ||||
OR8J3 | 11q12.1 | chr11:55905101 | ||||
PRSS1 | 7q34 | chr7:142460335 | ||||
LAMA3 | 18q11.2 | chr18:21453118 | ||||
RNF10 | 12q24.31 | chr12:121004700 | ||||
VARS2 | 6p21.33 | chr6:30884719 | ||||
KRT6A | 12q13.13 | chr12:52885316 | ||||
PLA2G4C | 19q13.33 | chr19:48558271 | ||||
JARID2 | 6p22.3 | chr6:15520402 | ||||
PRB4 | 12p13.2 | chr12:11461427 | ||||
BRIP1 | 17q23.2 | chr17:59878736 | ||||
GSTM4 | 1p13.3 | chr1:110200278 | ||||
NBPF16 | 1q21.2 | chr1:148591281 | ||||
NRG3 | 10q23.1 | chr10:84733588 | TES | Chinese (96 cases and 110 controls) | ||
NRG3 | 10q23.1 | chr10:84733624 | ||||
NRG3 | 10q23.1 | chr10:84118499 | ||||
RET | 10q11.21 | 3splicing9 + 1 / c.2511_2519delCCCTGGACC:p.S837fs / c.1818_1819insGGCAC:p.Y606fs / c.1761delG:p.G588fs / c.1858T > C:p.C620R / c.409T > G:p.C137G / c.1710C > A:p.C570X / c.526_528delGCA:p.R175del | WES | Chinese (5 trios) + Caucasian (19 trios) | Genome Biol | 2017 |
NCLN | 19p13.3 | c.496C > T:p.Q166Xb | ||||
NUP98 | 11p15.4 | c.5207A > G:p.N1736S | ||||
DENND3 | 8q24.3 | c.1921delT:p.K640fs | ||||
TBATA | 10q22.1 | c.157C > T:p.R53C | ||||
LRBA | 4q31.1 | rs140666848 | TES | Dutch (A multi-generational family: 5 patients and 2 functional constipation) | Gastroenterology | 2018 |
RET | 10q11.21 | c.1196C>T:p.P399L | WES | |||
NRP2 | 2q33.3 | rs114144673 | ||||
PGRMC2 | 4q28.2 | rs41298555 | ||||
OR1F1 | 16p13.3 | rs142486394 | ||||
CLUH | 17p13.3 | rs201361018 | ||||
PELP1 | 17p13.2 | rs199636910 | ||||
PELP1 | 17p13.2 | rs200062536 | ||||
IHH | 2q35 | c.151C>A:p.Q51K | ||||
GLI3 | 7p14.1 | rs121917716 | ||||
GDNF | 5p13.2 | c.676_681delGGATG:p.G226_C227del | ||||
CCT2 | 12q15 | g.69993654 G > A | WGS | Chinese (9 trios) | EJHG | 2018 |
VASH1 | 14q24.3 | g.77242233 A > G | ||||
CYP26A1 | 10q23.33 | g.7481 A > G | ||||
PKD1L2 | 16q23.2 | g.84039 G > A | ||||
TMEM175 | 4p16.3 | g.952275 C > T | ||||
CSMD3 | 8q23.3 | g.113841961 T > C | ||||
CCDC82 | 11q21 | g.96117858 A > T | ||||
NRG1 | 8p12 | g.667454 G > C, g.92222 G > T, g.146124 A > G | ||||
ERBB4 | 2q34 | g.835055_835059delAAACA | ||||
SEMA3A | 7q21.11 | g.210732delT | ||||
ZEB2 | 2q22.3 | g.145137510 C > T | ||||
DCC | 18q21.2 | g.651331 G > A |
- TES, targeted exome sequencing; WES, whole exome sequencing; WGS, whole genome sequencing; EJHG, European Journal of Human Genetics.
Whole exome/genome sequencing and HSCR
In recent years, technological development has made whole exome sequencing (WES) and whole genome sequencing (WGS) more practical for genetics research on human diseases.37-39 For HSCR, additional risk genes have been identified with both strategies (Table 2). In 2013, our group performed WES of two HSCR patients from a Han Chinese family, obtained a total of 15 novel nonsynonymous single nucleotide variants (SNVs) in 15 genes, and validated the involvement of NRG3 mutations in 96 additional sporadic cases and 110 healthy controls by targeted sequencing of all nine exons.40 Gui et al41 subsequently reported a WES study of 24 HSCR trios and identified 28 de novo mutations in 21 different genes. They further showed that the orthologues of four genes (DENND3, NCLN, NUP98, and TBATA) are indispensable for ENS development in zebrafish, and that these genes are also expressed in human and mouse gut and/or ENS progenitors. More recently, targeted exome sequencing of a previously identified linkage interval (4q31.3–4q32.3), coupled with WES, identified several variants in LRBA, RET, GDNF, IHH, and GLI3 in a multigenerational Dutch family with a history of HSCR. Further functional experiments showed that these variants disrupted the function of their encoded proteins, and that knockdown of ihh in zebrafish significantly reduced the number of enteric neurons in the gut.42 In addition, a WGS study43 was conducted on 9 trios in which the sporadic probands had L-HSCR or TCA and harbored no rare coding variants affecting the function of RET or other known HSCR risk genes. The authors identified de novo protein-altering variants in three genes (CCT2, VASH1, and CYP26A1) and de novo SNVs/indels in non-coding regions of NRG1, ERBB4, SEMA3A, ZEB2, and DCC. They further indicated that the shared genetic features of the patients were enriched in the extracellular matrix–receptor (ECM–receptor) pathway, which is involved in the migration of enteric neuron precursors.
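The trio designs described above rest on a simple idea: a variant is a candidate de novo mutation when the affected child carries an alternate allele that neither parent carries. The following is a minimal sketch of that genotype-level filter, assuming VCF-style genotype strings such as "0/1". The `TrioCall` type, the function names and the example calls are all hypothetical; the cited studies used their own pipelines, which are not reproduced here.

```python
from typing import NamedTuple


class TrioCall(NamedTuple):
    """Genotypes of one variant in a child and both parents (hypothetical type)."""
    variant: str
    child: str
    father: str
    mother: str


def alt_allele_count(genotype):
    """Count non-reference alleles in a genotype such as '0/1' or '1|1'."""
    alleles = genotype.replace("|", "/").split("/")
    return sum(1 for allele in alleles if allele not in ("0", "."))


def is_candidate_de_novo(call):
    """True if the child carries an alternate allele absent from both parents."""
    genotypes = (call.child, call.father, call.mother)
    # Skip sites where any family member has a missing genotype.
    if any("." in genotype for genotype in genotypes):
        return False
    return (
        alt_allele_count(call.child) > 0
        and alt_allele_count(call.father) == 0
        and alt_allele_count(call.mother) == 0
    )


# HYPOTHETICAL calls for illustration only.
calls = [
    TrioCall("variant_A", child="0/1", father="0/0", mother="0/0"),
    TrioCall("variant_B", child="0/1", father="0/1", mother="0/0"),
    TrioCall("variant_C", child="0/1", father="./.", mother="0/0"),
]
candidates = [call for call in calls if is_candidate_de_novo(call)]
print([call.variant for call in candidates])
```

In practice such a filter is only a first pass: candidate calls are usually also screened on read depth, genotype quality and population allele frequency, and then confirmed by an independent method, because sequencing errors in any of the three samples can mimic a de novo event.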
Conclusions and perspectives
Unravelling the genetics of polygenic diseases is a major challenge in the field of human genetics. As HSCR is a representative example of complex multigenic disorders with limited treatments and poor prognosis, much effort has been devoted to investigating the genetics and pathogenesis of the disease. With the application of GWAS/EWAS and NGS, a number of novel mutations and genes have been reported in recent years, as discussed in this review. However, they account for only a minority of the total genetic risk for HSCR. Additional pathogenic mutations, causal variants and contributing genes remain to be found through more comprehensive genetic studies with larger sample sizes. Moreover, the use of GWAS/EWAS, NGS or both, in combination with effective in silico statistical analysis and followed by systems biology approaches such as high-throughput functional assays and appropriate animal or human induced pluripotent stem cell (hiPSC) models, should yield major advances in our understanding of the genetic basis of HSCR. It may finally lead to precise prediction of HSCR risk and potentially to new therapies and improved outcomes.
ACKNOWLEDGEMENTS
We are grateful to all members of the Miao lab for helpful suggestions.
CONFLICT OF INTEREST
None of the authors declared any conflicts of interest.