Identification and characterization of novel genetic variants in the first Chinese family of mucopolysaccharidosis IIIC (Sanfilippo C syndrome)
Hongjun Zhao and Lijing Wang have contributed equally.
Abstract
Mucopolysaccharidosis type IIIC (MPS IIIC) is one of the inherited lysosomal storage disorders, which are caused by deficiencies in lysosomal hydrolases that degrade acidic mucopolysaccharides. The gene responsible for MPS IIIC is HGSNAT, which encodes an enzyme that catalyses the acetylation of the terminal glucosamine residues of heparan sulfate. So far, few studies have focused on the genetic landscape of MPS IIIC in China, where IIIA and IIIB are the major subtypes. In this study, we used whole-exome sequencing (WES) to identify novel compound heterozygous variants in the HGSNAT gene in a Chinese patient with typical MPS IIIC symptoms: c.743G>A; p.Gly248Glu and c.1030C>T; p.Arg344Cys. In silico analysis and experimental validation confirmed that both variants are pathogenic, as evidenced by the loss of HGSNAT activity and the failure of lysosomal localization. To the best of our knowledge, this is the first case of MPS IIIC in China to be confirmed by clinical, biochemical and molecular genetic findings. Our study thus expands the spectrum of MPS IIIC pathogenic variants, which is important for dissecting the pathogenesis of MPS IIIC and for its clinical diagnosis. Moreover, this study helps to depict the natural history of MPS IIIC in the Chinese population.
1 INTRODUCTION
Mucopolysaccharidosis (MPS) refers to a collection of lysosomal storage disorders (LSD) that arise from deficiencies in lysosomal hydrolases, resulting in the accumulation of acidic mucopolysaccharides (glycosaminoglycans).1 Mucopolysaccharides, including hyaluronic acid, chondroitin sulfate, dermatan sulfate, heparan sulfate and keratan sulfate, are the main components of connective tissues.2, 3 These polysaccharides are straight-chain heteropolysaccharides that can bind to a protein peptide chain and polymerize into larger molecules. Mucopolysaccharides are degraded in lysosomes by more than 10 known enzymes, and the lack of any one of them hinders the breakdown of the sugar chains. In LSD patients, excessive mucopolysaccharide accumulates in bone, cartilage and other tissues or organs, thus affecting their normal development.4
Based on the clinical manifestations and enzyme defects, MPS can be divided into seven major types, and each type is further divided into several subtypes. Of those, MPS III, also known as Sanfilippo syndrome, is an autosomal recessive metabolic genetic disorder characterized by the accumulation of heparan sulfate in the lysosomes.5 Four distinct subtypes of MPS III have been identified based on the genetic deficiencies of specific enzymes: SGSH (type A; OMIM 252900), NAGLU (type B; OMIM 252920), HGSNAT (type C; OMIM 252930) and GNS (type D; OMIM 252940).6 Among them, MPS IIIC (Sanfilippo C) is a multi-system MPS that begins in early childhood and is marked by progressive degeneration of the central nervous system, severe mental retardation and other neurological symptoms.7-9 Other clinical symptoms include skeletal and muscular issues (such as joint stiffness, scoliosis, hip dysplasia and contractures), hearing loss, respiratory and sinus infections, and heart problems.8, 9 Specifically, MPS IIIC is caused by mutations in the HGSNAT gene, which encodes acetyl-CoA:α-glucosaminide N-acetyltransferase (EC 2.3.1.78), a lysosomal transmembrane enzyme. This enzyme catalyses the acetylation of the terminal glucosamine residues of heparan sulfate before it is hydrolyzed by α-N-acetyl glucosaminidase (NAGase).10 The HGSNAT gene was identified as being responsible for MPS IIIC in 2006,11, 12 and since then, around 72 HGSNAT mutations have been reported (from HGMD).
MPS is a rare disease, accounting for less than 0.1% of all genetic diseases. The prevalence of the different types of MPS is related to race and geography,13 with MPS II found mainly in Asian populations and MPS I and MPS III in Europe. In particular, MPS IIIC and IIID are less prevalent in most populations, with estimates of MPS III prevalence ranging from 1 to 9 per 1 million individuals.14 So far, few studies have focused on the genetic landscape of MPS IIIC in China, where IIIA and IIIB are the major subtypes.15-17
In this study, we identified compound heterozygous HGSNAT variants in a Chinese patient with MPS IIIC: c.743G>A; p.Gly248Glu and c.1030C>T; p.Arg344Cys. Through in silico analysis and experimental validation, we confirmed that these variants are pathogenic and result in loss of HGSNAT enzyme function and failure of lysosomal localization. To the best of our knowledge, this is the first case of MPS IIIC in China to be confirmed by clinical, biochemical and molecular genetic findings. Our study thus expands the spectrum of pathogenic variants in MPS IIIC, which is important for clarifying the pathogenesis of MPS IIIC and for genetic diagnosis.
2 MATERIALS AND METHODS
2.1 Subjects
A Chinese family in which the proband had MPS IIIC was recruited, together with all available family members, at Xiangya Hospital of Central South University, Hunan, China. Blood samples were obtained from all family members using vacutainer tubes containing EDTA. The study was approved by the Institutional Review Board of Xiangya Hospital of Central South University (#2022020132) and adhered to the principles of the Declaration of Helsinki. Written informed consent was provided by all participants.
2.2 WES analysis
The phenol-chloroform extraction method was used to isolate genomic DNA from peripheral blood of the proband.18 Yikon Medical Laboratory Co., Ltd. (Shanghai, China) conducted WES on the proband (II:6) using the HiSeq 2000 platform (Illumina) and the SureSelect Human All Exon V6 kit (Agilent) for exome capture, according to the manufacturer's manual. The exome library was constructed from 350 ng of genomic DNA that was sheared into 150–200 bp fragments with a Covaris instrument (Covaris) for enrichment. The enriched libraries for the target regions were sequenced as 101 bp paired-end reads. The genomic regions harbouring candidate variants were then amplified by PCR and verified by Sanger sequencing in all family members.
2.3 In silico analysis
The variants were evaluated and annotated with a list of servers and databases including SIFT, Polyphen-2, MutationTaster, CADD, gnomAD, dbSNP, ClinVar, ChinaMAP and HUABIAO. Pathogenicity was further predicted with recent algorithms based on machine learning, deep learning or neural network models, such as EVE,19 gMVP (version 2021-02-28),20 PrimateAI (v0.2),21 AlphaMissense,22 MutFormer23 and MAVERICK.24 Predictions were also carried out with InterVar (version 2022-06-13),25 based on the ACMG/AMP 2015 guidelines. The topology of the transmembrane protein HGSNAT was predicted with DeepTMHMM (version 1.0.15)26 and TOPCONS 2.0,27 the latter of which combines several other topology-prediction tools. The conservation of the variant sites was visualized with WebLogo3.28 The structure of the HGSNAT protein was predicted by AlphaFold,29 and visualized and edited with PyMOL (Schrodinger, version 2.5.4). Protein stability was assessed with I-Mutant 2.0.30
2.4 Overexpression plasmids
To construct overexpression plasmids, the vector pUC57-HGSNAT::MYC::FLAG was synthesized (Tsingke Bio., Beijing, China). The fragment of HGSNAT::MYC::FLAG was amplified by PCR and ligated into the pCSC vector31 with AgeI/BsrGI double digestion. The obtained vector served as the backbone for site-directed mutagenesis to construct expression plasmids with genetic variants: c.743G>A; p.Gly248Glu, c.1030C>T; p.Arg344Cys and c.710C>A; p.Pro237Gln, respectively. All constructed plasmids were confirmed by Sanger sequencing before use.
2.5 Cell culture
293T cells (Procell) were routinely cultured in a humidified incubator at 37°C with 5% CO2. Dulbecco's Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum (FBS) and 1% each of penicillin, streptomycin and amphotericin B (Beyotime) was used as the cell culture medium.
2.6 Cellular HGSNAT/NAGase activity
293T cells were seeded into a 6-well plate and transfected the next day with 1 μg of each plasmid (pCSC-HGSNAT::MYC::FLAG) using polyethylenimine (PEI). Forty-eight hours later, cells were washed three times with PBS. Cells were then lysed with H2O and collected into EP tubes for sonication. The protein concentration was measured by the BCA method with a protein assay kit (Pierce), according to the manufacturer's protocol.
In a 96-well plate, 10 μL of protein homogenate was mixed with 5 μL of McIlvaine buffer (pH 5.5), 5 μL of 3 mM substrate (MU-βGlcNH2, Biosynth, #EM31025, or MU-βGlcNAc, Biosynth, #M5504) and 5 μL of 5 mM acetyl-CoA (Sigma, #A2056). The mixture was incubated at 37°C for 1 h, and the reaction was quenched with 225 μL of 0.4 M glycine buffer (pH 10.4). Fluorescence was measured immediately at 360 nm excitation and 450 nm emission with a multi-mode plate reader (Cytation 5; BioTek). The HGSNAT/NAGase specific activity was calculated as:
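$$
\text{Specific activity (nmol/h/mg)} = \frac{\text{Fluorescence}}{\text{Standard curve slope}} \times \frac{0.25}{0.01} \times \frac{1}{1\ \text{h}} \times \frac{1}{\text{homogenate (mg/mL)}}
$$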
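The calculation above can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis script: the function and parameter names are hypothetical. It assumes that 0.25 and 0.01 are the final well volume and the homogenate volume in mL (10 + 5 + 5 + 5 + 225 μL = 250 μL in total, of which 10 μL is homogenate). It also assumes that the standard-curve slope is expressed in fluorescence units per nmol/mL of product.

```python
def specific_activity(fluorescence, slope, protein_mg_per_ml,
                      total_volume_ml=0.25, homogenate_volume_ml=0.01,
                      hours=1.0):
    """Return specific activity in nmol/h/mg.

    Assumptions (not stated explicitly in the text):
    - slope is in fluorescence units per (nmol/mL) of product
    - 0.25 mL is the final well volume, 0.01 mL the homogenate volume
    """
    if slope <= 0 or protein_mg_per_ml <= 0 or hours <= 0:
        raise ValueError("slope, protein concentration and time must be positive")
    # Product concentration in the quenched well (nmol/mL)
    product_nmol_per_ml = fluorescence / slope
    # Total product in the well (nmol)
    product_nmol = product_nmol_per_ml * total_volume_ml
    # Protein in the well (mg) = homogenate volume x protein concentration
    protein_mg = homogenate_volume_ml * protein_mg_per_ml
    return product_nmol / hours / protein_mg


# Hypothetical reading: 1000 fluorescence units, slope 50, 2 mg/mL protein
activity = specific_activity(1000.0, 50.0, 2.0)
print(f"{activity:.1f} nmol/h/mg")
```

Grouping the terms as product (nmol) divided by time (h) and protein (mg) gives the same value as the formula, and makes the unit bookkeeping explicit.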
2.7 Western blot and immunostaining
For western blot, 293T cells were seeded in a 6-well plate and transfected the next day with 1 μg of each plasmid (pCSC-HGSNAT::MYC::FLAG) using PEI. Twelve hours later, the cells were harvested with 200 μL of RIPA buffer (Beyotime). Then, 4× loading buffer and 2-ME were added to each sample, which was heated for 8 min at 55°C. Twenty micrograms of protein from each sample was separated by 10% SDS-PAGE and transferred to a PVDF membrane (Millipore). The blots were blocked with 5% skimmed milk and then incubated with diluted primary antibodies, including MYC (Proteintech #60003-2-Ig, 1:2000) and GAPDH (Abclonal #AC033, 1:20,000), overnight at 4°C. The next day, the blots were washed three times with TBST at 15-min intervals and then incubated with diluted HRP-conjugated secondary antibodies for 1 h at room temperature. After washing with TBST, the blots were developed with ECL substrate and visualized using the ChemiDoc XRS imaging apparatus (Bio-Rad).
For immunostaining, 293T cells were seeded on coverslips in a 24-well plate and co-transfected with 250 ng of each plasmid (pCSC-HGSNAT::MYC::FLAG) and the reporter plasmid LAMP1::mCherry. After 48 h, cells were washed with PBS and fixed with 4% PFA. Cells were then incubated with BSA blocking buffer and subsequently with diluted primary antibody against FLAG (Abclonal #AE005, 1:100) overnight at 4°C. Cells were washed with PBST and incubated with an AlexaFluor488-conjugated secondary antibody (Invitrogen #A21202, 1:500) at room temperature for 1 h. Finally, cells were counter-stained with DAPI and mounted with anti-fade PVA for observation under a confocal microscope.
2.8 Statistics analysis
GraphPad Prism software (version 8.0.1) was used for statistical analysis and graphing. Differences between groups were assessed with Student's t-test at a 95% confidence level, and p < 0.05 was considered statistically significant.
3 RESULTS
3.1 Clinical description and molecular analysis
In our study, the proband, a 15-year-old Chinese girl, was admitted for evaluation of bilateral hip pain and mental retardation. She had first experienced pain and discomfort in both hips 5 years earlier and, in severe episodes, was unable to walk. X-ray examination showed flattening of the bilateral femoral heads, multiple cystic low-density changes beneath the bilateral hip surfaces and increased bone density of the bilateral sacroiliac joint surfaces (Figure 1A). The physiological curvature of the spine was preserved, but the vertebral sequence was slightly discontinuous; Schmorl's nodes had formed at the lower margins of the T10 and T11 vertebrae, and the corresponding vertebral bodies were slightly flattened (Figure 1B). In addition, the proband is currently in grade 6 of primary school, has poor grades and cannot read by herself, suggesting serious intellectual impairment (data not provided). Her parents were both phenotypically normal and non-consanguineous. She has an older sister, who is unaffected and has two unaffected children, as well as three siblings who are deceased (Figure 2A).


To investigate the possibility of an inherited cause, we collected a peripheral venous blood sample from the proband for whole-exome sequencing (WES). The analysis revealed novel compound heterozygous variants in the HGSNAT gene: c.1030C>T (chr8:43037305 (hg19); NM_152419.3) and c.743G>A (chr8:43025837 (hg19); NM_152419.3). We subsequently performed targeted Sanger sequencing of the HGSNAT gene in other family members, each of whom either carried a single heterozygous variant or was unaffected (Figure 2A,B). On this basis, the proband was diagnosed with MPS IIIC, which is rarely seen in the Chinese population.
Specifically, we confirmed that one variant (c.1030C>T), inherited from the father, leads to the protein change p.Arg344Cys, whereas the other variant (c.743G>A), inherited from the mother, causes the protein change p.Gly248Glu. The latter is novel and previously unreported, based on a thorough search of public databases and Chinese cohorts, including gnomAD, dbSNP, ClinVar, ChinaMAP and HUABIAO, among others (Figure 2B; Table 1).
Amino acids | SIFT score | SIFT prediction | Polyphen2 score | Polyphen2 prediction | MutationTaster score | MutationTaster prediction | CADD score | CADD prediction | gnomAD Exome | dbSNP | ClinVar |
---|---|---|---|---|---|---|---|---|---|---|---|
p.Gly248Glu | 0 | Damaging | 1 | Probably damaging | 1 | Disease causing | 25.4 | Damaging | - | - | - |
p.Arg344Cys | 0.003 | Damaging | 1 | Probably damaging | 1 | Disease causing automatic | 34 | Damaging | 1.218e-05 | rs121908285 | Accession: 1237; Phenotype: MPS IIIC; Clinical significance: Pathogenic |
p.Pro237Gln | 0.223 | Tolerable | 0.151 | Benign | 1 | Polymorphism | 8.451 | Tolerable | 2.992e-05 | rs727503962 | Accession: 167177; Phenotype: not specified; Clinical significance: Benign |
3.2 In silico analysis of variants
The HGSNAT variants result in amino acid substitutions at positions 344 and 248, respectively. Evolutionary conservation analysis with WebLogo showed that these positions are highly conserved in HGSNAT homologues across vertebrates (Figure 2C,D). Functional prediction of both variants with a list of bioinformatics algorithms, including SIFT, Polyphen-2, MutationTaster, CADD and others, favoured a deleterious effect (Table 1). We further predicted pathogenicity with recent algorithms based on machine learning, deep learning or neural network models, such as EVE, gMVP, PrimateAI, AlphaMissense, MutFormer and MAVERICK; the p.Gly248Glu and p.Arg344Cys variants were all scored as highly pathogenic (Figure S1). As a control, we also examined another reported HGSNAT variant (p.Pro237Gln), which is benign and corresponds to the single nucleotide polymorphism (SNP) rs727503962. Moreover, to standardize the clinical interpretation of the variants according to the ACMG/AMP 2015 guidelines, we carried out the prediction with InterVar (https://wintervar.wglab.org/). The p.Gly248Glu variant was classified as of uncertain significance (PM2-moderate, PM5-moderate and PP3-supporting), while the p.Arg344Cys variant was classified as likely pathogenic (PM1-moderate, PM2-moderate, PP3-supporting and PP5-supporting) (Table 2).
Criterion | p.Gly248Glu | p.Arg344Cys |
---|---|---|
InterVar | Uncertain significance | Likely pathogenic |
PVS1 | 0 | 0 |
PS1 | 0 | 0 |
PS1 grade | 1 | 1 |
PS2 | 0 | 0 |
PS2 grade | 1 | 1 |
PS3 | 0 | 0 |
PS3 grade | 1 | 1 |
PS4 | 0 | 0 |
PS4 grade | 1 | 1 |
PS5 | 0 | 0 |
PS5 grade | 1 | 1 |
PM1 | 0 | 1 |
PM1 grade | 2 | 2 |
PM2 | 1 | 1 |
PM2 grade | 2 | 2 |
PM3 | 0 | 0 |
PM3 grade | 2 | 2 |
PM4 | 0 | 0 |
PM4 grade | 2 | 2 |
PM5 | 1 | 0 |
PM5 grade | 2 | 2 |
PM6 | 0 | 0 |
PM6 grade | 2 | 2 |
PM7 | 0 | 0 |
PM7 grade | 2 | 2 |
PP1 | 0 | 0 |
PP1 grade | 3 | 3 |
PP2 | 0 | 0 |
PP2 grade | 3 | 3 |
PP3 | 1 | 1 |
PP3 grade | 3 | 3 |
PP4 | 0 | 0 |
PP4 grade | 3 | 3 |
PP5 | 0 | 1 |
PP5 grade | 3 | 3 |
PP6 | 0 | 0 |
PP6 grade | 3 | 3 |
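To illustrate how the criteria counts in the table lead to the two InterVar calls, the sketch below applies the ACMG/AMP 2015 combining rules for pathogenic evidence only. It is a simplified illustration rather than InterVar's implementation: the function name is hypothetical, and benign criteria and conflicting evidence are deliberately left out.

```python
def classify_pathogenic_evidence(pvs=0, ps=0, pm=0, pp=0):
    """Combine counts of met criteria (very strong, strong, moderate,
    supporting) following the ACMG/AMP 2015 rules for pathogenic evidence.
    Benign criteria are not handled in this simplified sketch."""
    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    if pathogenic:
        return "Pathogenic"
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    if likely_pathogenic:
        return "Likely pathogenic"
    return "Uncertain significance"


# p.Gly248Glu: PM2 + PM5 (2 moderate), PP3 (1 supporting)
print(classify_pathogenic_evidence(pm=2, pp=1))
# p.Arg344Cys: PM1 + PM2 (2 moderate), PP3 + PP5 (2 supporting)
print(classify_pathogenic_evidence(pm=2, pp=2))
```

With the counts from the table, two moderate plus one supporting criterion falls short of the "likely pathogenic" threshold, whereas two moderate plus two supporting criteria reach it, which matches the InterVar output reported above.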
3.3 Diminished HGSNAT activities caused by the variants
To further evaluate the potential deleterious impacts of the variants on HGSNAT, we generated plasmids expressing full-length wild-type (WT) and mutant HGSNAT (p.Arg344Cys and p.Gly248Glu), respectively. To facilitate further biochemical assays, the HGSNAT ORF was C-terminally fused with two small tags, MYC (~1.2 kDa) and FLAG (~1 kDa) (Figure 3A). Equal amounts of these plasmids were then transiently transfected into 293T cells, which produced HGSNAT::MYC::FLAG proteins within 12 h (Figure 3A). Forty-eight hours later, HGSNAT activity was found to be significantly increased in cells transfected with WT HGSNAT (average 342.903 nmol/h/mg), whereas both mutant HGSNATs resulted in diminished HGSNAT activities (average 22.2692 and 19.0039 nmol/h/mg, respectively) (Figure 3B). As a control, the NAGase specific activities were comparable among all groups (Figure 3C), consistent with the type IIIC specificity.
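As a quick check on the scale of the reduction, the reported group means can be expressed as a percentage of the WT mean. The snippet below is only an arithmetic illustration using the three averages quoted above. The function name is hypothetical, and no correction for endogenous background activity is applied because the text does not report one.

```python
def percent_of_wild_type(mutant_activity, wild_type_activity):
    """Express a mutant group mean as a percentage of the WT group mean."""
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * mutant_activity / wild_type_activity


WT_MEAN = 342.903                 # nmol/h/mg, as reported
MUTANT_MEANS = [22.2692, 19.0039]  # nmol/h/mg, in the order reported

for value in MUTANT_MEANS:
    # Roughly 6.5% and 5.5% of the WT mean, respectively
    print(f"{value} nmol/h/mg = {percent_of_wild_type(value, WT_MEAN):.1f}% of WT")
```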

To examine whether the mutant HGSNAT proteins were correctly localized to lysosomes, 293T cells were co-transfected with each HGSNAT construct and LAMP1::mCherry, a lysosome reporter, for 48 h and fixed for immunostaining. Confocal microscopy demonstrated that LAMP1::mCherry co-localized with the WT and p.Pro237Gln HGSNAT proteins, whereas the p.Arg344Cys and p.Gly248Glu HGSNAT proteins exhibited limited overlap with lysosomes (Figure 3D). The diminished enzyme activities, together with the failure of lysosomal localization, thus confirmed the pathogenic effects of the variants.
3.4 The underpinnings of detrimental effects caused by the variants
As human HGSNAT is a transmembrane protein that undergoes glycosylation, there is currently no crystal structure available for it, which poses additional difficulties in assessing the functional impact of its variants. In silico analysis by DeepTMHMM and TOPCONS predicted the topology of the HGSNAT enzyme, with 11 transmembrane helices in the lysosomal membrane (Figure 4A,B). As predicted by a list of web servers, the p.Gly248 position was placed either on the edge of, or on the lumenal side within, a transmembrane segment, while the p.Arg344 position was located in a domain exposed to the lysosomal lumen (Figure 4C). The p.Gly248Glu variant does not appear to disrupt the formation of the transmembrane helices as a whole (Figure 4C). However, considering that the variant replaces the hydrophobic Gly with the hydrophilic, negatively charged Glu, it is highly possible that it destabilizes the topology of the transmembrane segment (Figure 4D). Likewise, the p.Arg344Cys variant replaces the hydrophilic, positively charged Arg with the hydrophobic Cys, so the surrounding chemical bonds are likely to be remodelled (Figure 4D). This is further supported by the protein structure predicted by AlphaFold, which showed essential alterations of hydrogen bonds caused by the variants (Figure 4E). Additionally, we evaluated the effect of the variants on the stability of HGSNAT (UniProt Q68CP4) with I-Mutant, which showed that both variants may lead to a large decrease in stability (Table S1).
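The physicochemical argument above can be made concrete with a small sketch. It uses the Kyte-Doolittle hydropathy scale and nominal side-chain charges at neutral pH, both of which are introduced here for illustration and were not part of the original analysis. The function name is hypothetical. Note that on this particular scale Gly is close to neutral (-0.4) rather than strongly hydrophobic.

```python
# Kyte-Doolittle hydropathy values for the residues discussed in the text
KD_HYDROPATHY = {"Gly": -0.4, "Glu": -3.5, "Arg": -4.5, "Cys": 2.5}
# Nominal side-chain charge at neutral pH (simplifying assumption)
SIDE_CHAIN_CHARGE = {"Gly": 0, "Glu": -1, "Arg": 1, "Cys": 0}


def substitution_shift(ref, alt):
    """Return (change in hydropathy, change in side-chain charge)."""
    d_hydropathy = round(KD_HYDROPATHY[alt] - KD_HYDROPATHY[ref], 1)
    d_charge = SIDE_CHAIN_CHARGE[alt] - SIDE_CHAIN_CHARGE[ref]
    return d_hydropathy, d_charge


for ref, pos, alt in [("Gly", 248, "Glu"), ("Arg", 344, "Cys")]:
    d_h, d_q = substitution_shift(ref, alt)
    print(f"p.{ref}{pos}{alt}: hydropathy change {d_h:+.1f}, charge change {d_q:+d}")
```

Under these assumptions, p.Gly248Glu becomes more hydrophilic and gains a negative charge, while p.Arg344Cys becomes markedly more hydrophobic and loses a positive charge, in line with the qualitative description in the text.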

4 DISCUSSION
This study characterizes compound heterozygous HGSNAT variants that were identified in a Chinese patient exhibiting typical symptoms of MPS IIIC: c.743G>A; p.Gly248Glu and c.1030C>T; p.Arg344Cys. We verified both as pathogenic variants and dissected their possible underpinnings by in silico analysis and experimental validation. To our knowledge, this is the first confirmed case of MPS IIIC in China, which may help to outline the natural history of MPS IIIC in the Chinese population. Note that a recent study reported one case of MPS IIIC among 34 Chinese patients with MPS III; however, no clear causative variants were identified in that case.15 The majority of MPS cases in China remain types I, II, IIIA and IIIB.15-17
From another in silico perspective, as the variant c.743G>A is located on the boundary between exons 7 and 8, it is possible that it interferes with RNA splicing. However, since the p.Gly248Glu coding change was shown to diminish HGSNAT activity, a splicing defect is less likely to be the main mechanism. Alternatively, its effect on the transmembrane helix merits further experimental tests. To this end, several assays might be feasible.
The first assay involves the Escherichia coli inner membrane protein leader peptidase, which can be used to measure the apparent free energy (ΔGapp) of translocon-mediated integration of transmembrane helices into the membrane. This assay quantifies the proper integration of a transmembrane segment, for both WT and mutants.32, 33 Another approach to investigate the topology of HGSNAT is a Cys-accessibility assay, in which the membrane-impermeable reagent 4-acetamido-4′-maleimidylstilbene-2,2′-disulfonic acid (AMS) is used to react with Cys residues; only Cys residues that are exposed to the cytosol and not shielded by the membrane undergo this reaction.34, 35 The assay successfully validated the possible topology changes caused by the CFTR p.Gly91Arg and p.Gly85Glu variants; specifically, p.Gly85Glu misfolding results from transmembrane destabilization by Glu and loss of Gly.35 Interestingly, membrane topology can also be assessed by biophysical approaches, for example, atomistic simulations and nuclear magnetic resonance (NMR) measurements. Using those techniques, Xu et al. illustrated that a single-residue polymorphism in the transmembrane domain of FcγRIIB, known as FcγRIIB-T232, which is linked to systemic lupus erythematosus (SLE) across global populations, resulted in a significant increase in the bending angle of the transmembrane helix. This led to a more tilted orientation in the lipid membrane, thereby decreasing the lateral mobility and inhibitory capabilities of FcγRIIB.36, 37
While this study expands the spectrum of MPS IIIC pathogenic variants and facilitates clinical diagnosis, efforts have also been devoted to restoring HGSNAT activity. The multi-pass transmembrane nature of HGSNAT poses its own challenges compared with soluble secreted proteins. While molecular chaperones are one possible option,38, 39 in situ gene editing to reverse the mutation is also promising, owing to the rapid development of the CRISPR/Cas9 DNA editing system. Given that both variants result in a G>A change (equivalent to C>T), adenine base editors (ABEs), which convert A•T to G•C base pairs in genomic DNA, are probable candidates.40 Although our preliminary search for candidate short guide RNAs (sgRNAs) compatible with ABEs in HGSNAT has been limited by the lack of suitable editing windows (data not shown), more evolved and improved versions of ABEs are expected to become available in the near future.
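The notion of a "suitable editing window" can be sketched as a simple sequence search. The example below is a hypothetical illustration and not the authors' sgRNA design procedure. It assumes an SpCas9-based ABE with a 20-nt protospacer followed by an NGG PAM, and an editing window at protospacer positions 4 to 8 counted from the PAM-distal end. The caller must supply the strand on which the target base reads as A, and the test sequence is invented.

```python
def abe_protospacers(seq, target_index, window=(4, 8)):
    """Find 20-nt protospacers (5'->3', followed by an NGG PAM) that place
    the target A inside the assumed editing window.

    Returns a list of (protospacer_start, position_in_protospacer), where
    the position is 1-based from the PAM-distal end.
    """
    seq = seq.upper()
    if seq[target_index] != "A":
        raise ValueError("target base must be A on the supplied strand")
    hits = []
    first, last = window
    for position in range(first, last + 1):
        start = target_index - (position - 1)
        pam_start = start + 20
        # Skip candidates that run off either end of the sequence
        if start < 0 or pam_start + 3 > len(seq):
            continue
        # NGG PAM: any base followed by two G's
        if seq[pam_start + 1:pam_start + 3] == "GG":
            hits.append((start, position))
    return hits


# Invented 23-nt sequence: target A at index 4, PAM "TGG" at the 3' end
demo = "TTTTA" + "T" * 15 + "TGG"
print(abe_protospacers(demo, 4))
```

An empty result for a real locus would correspond to the situation described above, in which no PAM places the target base inside the window. Editors with relaxed PAM requirements or wider windows would change the `window` and PAM assumptions accordingly.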
AUTHOR CONTRIBUTIONS
Hongjun Zhao: Methodology (equal); resources (equal). Lijing Wang: Methodology (equal); resources (equal). Mengfei Zhang: Data curation (supporting). Huakun Wang: Data curation (supporting). Sizhe Zhang: Data curation (supporting). Junjiao Wu: Conceptualization (equal); supervision (equal); writing – original draft (equal); writing – review and editing (equal). Yu Tang: Conceptualization (equal); funding acquisition (lead); supervision (equal); writing – original draft (equal); writing – review and editing (equal).
ACKNOWLEDGEMENTS
We are grateful for the kind help of Dr. Alexey V. Pshezhetsky and Xuefang Pan from University of Montreal, Canada, Dr. Isaac Canals from Lund University, Sweden and Dr. Dominic Winter from University of Bonn, Germany. We thank the patients and their families who participated in the study and the members of the Tang laboratory for insightful discussions.
FUNDING INFORMATION
This study was funded by National Natural Sciences Foundation of China [No. 82271280 to YT and 82301433 to JJW], Hunan Provincial Natural Science Foundation of China [No. 2022JJ40824 to JJW], Scientific Research Project of Hunan Provincial Health Commission [No. B202303070054 to YT], Talents Startup Fund [No. 2209090550 to YT], Youth Science Fund [No. 2021Q04 to JJW] and Project Program of National Clinical Research Center for Geriatric Disorders [No. 2022LNJJ14 to HJZ] of Xiangya Hospital, Central South University, Changsha, China.
CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interests.
CONSENT FOR PUBLICATION
All participants have consented to publication.
Open Research
DATA AVAILABILITY STATEMENT
Data are available from the corresponding author upon request.