Volume 35, Issue 1 pp. 86-95
Research Article
The ETFDH c.158A>G Variation Disrupts the Balanced Interplay of ESE- and ESS-Binding Proteins thereby Causing Missplicing and Multiple Acyl-CoA Dehydrogenation Deficiency

Rikke K. J. Olsen

Corresponding Author

Research Unit for Molecular Medicine, Aarhus University Hospital and Department of Clinical Medicine, Aarhus University, Aarhus, Denmark

Correspondence to: Rikke K. J. Olsen, Research Unit for Molecular Medicine, Aarhus University Hospital and Department of Clinical Medicine, Brendstrupgårdsvej 100 (Science Center Skejby, Building G), Aarhus N DK-8200, Denmark. E-mail: [email protected]

Correspondence to: Brage S. Andresen, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, Odense M DK-5230, Denmark. E-mail: [email protected]

Sabrina Brøner

Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark

Rugivan Sabaratnam

Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark

Thomas K. Doktor

Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark

Henriette S. Andersen

Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark

Gitte H. Bruun

Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark

Birthe Gahrn

Research Unit for Molecular Medicine, Aarhus University Hospital and Department of Clinical Medicine, Aarhus University, Aarhus, Denmark

Vibeke Stenbroen

Research Unit for Molecular Medicine, Aarhus University Hospital and Department of Clinical Medicine, Aarhus University, Aarhus, Denmark

Simon E. Olpin

Department of Clinical Chemistry, The Children's Hospital, Sheffield, United Kingdom

Angus Dobbie

Department of Clinical Genetics, St James's University Hospital, Leeds, United Kingdom

Niels Gregersen

Research Unit for Molecular Medicine, Aarhus University Hospital and Department of Clinical Medicine, Aarhus University, Aarhus, Denmark

Brage S. Andresen

Corresponding Author

Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark

First published: 07 October 2013

Contract grant sponsors: The Danish Medical Research Council (271-07-342, 11-107174, 271-08-0120); The Novo Nordisk Foundation (15430).

Communicated by Peter K. Rogan

ABSTRACT

Multiple acyl-CoA dehydrogenation deficiency is a disorder of fatty acid and amino acid oxidation caused by defects of electron transfer flavoprotein (ETF) or its dehydrogenase (ETFDH). A clear relationship between genotype and phenotype makes genotyping of patients important not only diagnostically but also for prognosis and for assessment of treatment. In the present study, we show that a predicted benign ETFDH missense variation (c.158A>G/p.Lys53Arg) in exon 2 causes exon skipping and degradation of ETFDH protein in patient samples. Using splicing reporter minigenes and RNA pull-down of nuclear proteins, we show that the c.158A>G variation increases the strength of a preexisting exonic splicing silencer (ESS) motif UAGGGA. This ESS motif binds splice inhibitory hnRNP A1, hnRNP A2/B1, and hnRNP H proteins. Binding of these inhibitory proteins prevents binding of the positive splicing regulatory SRSF1 and SRSF5 proteins to nearby and overlapping exonic splicing enhancer elements, and this causes exon skipping. We further suggest that binding of hnRNP proteins to UAGGGA is increased by triggering synergistic hnRNP H binding to GGG triplets located upstream and downstream of the UAGGGA motif. A number of disease-causing exonic elements that induce exon skipping in other genes have an architecture similar to the one in ETFDH exon 2.

Introduction

Multiple acyl-CoA dehydrogenation deficiency (MADD) is a devastating, multisystemic, and often lethal disorder of fatty acid, amino acid, and choline metabolism. It is characterized by dysfunction of a number of intramitochondrial dehydrogenation reactions resulting in energy deficiency due to mutated electron transfer flavoprotein (ETF) or ETF dehydrogenase (ETFDH; MIM #231680) [Frerman and Goodman, 2001]. Patients are categorized into severe (MADD:S) or milder (MADD:M) forms depending on the nature of the defects in the genes (ETFA, ETFB, or ETFDH) that code for the two ETF subunits and for ETFDH [Goodman et al., 2002; Olsen et al., 2003]. Recently, a third form of MADD (RR:MADD) with ETFDH variations and responsiveness to riboflavin (vitamin B2) was recognized [Gempel et al., 2007; Gregersen et al., 1982; Olsen et al., 2007]. Riboflavin is the precursor of ETFDH's FAD cofactor, and riboflavin responsiveness results from the ability of FAD to act as a chemical chaperone that promotes the folding of certain misfolded ETFDH proteins and thereby mitigates or normalizes disease symptoms [Cornelius et al., 2012]. As such, patients are not expected to benefit from riboflavin treatment if they are homozygous or compound heterozygous for variations that severely affect ETFDH mRNA splicing, since such variations prevent production of functional proteins on which FAD can act.

Most known variations that affect mRNA splicing directly change or introduce new splice sites, and the pathological consequences of such variations can often be predicted solely on the basis of the genomic sequence and calculated splice site scores. However, it has become evident in recent years that a significant proportion of nucleotide variations located outside the splice sites can also cause aberrant mRNA splicing and human disease [Andresen and Krainer, 2009; Dobrowolski et al., 2010; Doktor et al., 2011; Heintz et al., 2012; Homolova et al., 2010; Nielsen et al., 2007]. The best known of such sequences are the exonic splicing enhancers (ESEs), which are short, degenerate sequence elements that direct the splicing machinery to the correct splice sites by serving as binding sites for splicing factors such as the serine/arginine-rich (SR) proteins [Manley and Krainer, 2010]. On the other hand, exonic splicing silencers (ESSs) suppress splicing and often bind proteins from the heterogeneous nuclear ribonucleoprotein (hnRNP) family [Dreyfuss et al., 2002], but the precise mechanisms by which splicing is inhibited have so far been characterized in only a limited number of genes [Andresen and Krainer, 2009]. Together with splice site strength, the balanced interplay between binding of splicing regulatory proteins to splicing enhancers and silencers in a given transcript controls its splicing efficiency and specificity [Andresen and Krainer, 2009]. The fact that exonic sequence changes can cause aberrant mRNA splicing, and thereby have an effect completely different from what can be predicted from the genetic code, has important implications for the clinical diagnosis of human genetic diseases in general. It is especially important for diseases such as MADD, where correct assessment of the pathogenic consequences of identified variations is of both diagnostic and prognostic value and directly informs decisions on treatment regimes.

In the present study, we used patient samples in combination with splicing reporter minigenes and RNA affinity purification of nuclear proteins to address the impact on ETFDH exon 2 splicing of a predicted benign missense variation (c.158A>G/p.Lys53Arg) identified in a deceased newborn with biochemical indications of severe MADD.

Materials and Methods

Patient and Control Specimens

Primary human dermal fibroblasts from the deceased newborn with suspected MADD and from two healthy individuals were cultivated in RPMI 1640 medium (Lonza, Copenhagen, DK) supplemented with 10% (v/v) fetal calf serum, 0.29 mg/ml glutamine, 100 U/ml penicillin, and 0.1 mg/ml streptomycin at 37°C in 5% (v/v) carbon dioxide (CO2). Cell pellets for DNA and RNA isolation were frozen at −80°C. DNA samples from healthy controls were obtained from an in-house bank of human DNA specimens. The studies are diagnostic and were performed according to regulations from the Danish Ethical Committee (#1-10-72-20-13).

Fatty Acid Oxidation Flux Studies

Fatty acid oxidation flux was measured in cultured fibroblasts as previously described [Manning et al., 1990; Olpin et al., 1999]. Flux results were expressed as a percentage of 3–5 simultaneous healthy controls.

Mutation Analysis

DNA was extracted from frozen fibroblast pellets using the Puregene Blood Core Kit A (GentraSystems, Minneapolis, MN). Sequence analyses of all 31 exons and their flanking introns that make up the human ETFA, ETFB, and ETFDH genes were performed essentially as described by Olsen et al. (2003) on an ABI 3100 Avant Genetic Analyzer (Applied Biosystems, Foster City, CA), and analyzed using Sequencher v3.1.1 (Gene Codes Corp., Ann Arbor, MI). All references to nucleotides or amino acids in the text are based on the cDNA sequence of human ETFDH (RefSeq NM_004453.2). The initiating ATG codon is numbered as bp 1_3, and the initiator methionine is numbered as amino acid 1. The ETFDH c.158A>G variation has been submitted to the locus-specific database LOVD (www.lovd.nl/ETFDH).

RNA Isolation and RT-PCR

Total RNA was isolated from frozen fibroblasts using TRIzol® Reagent (Invitrogen, Carlsbad, CA). Complementary DNA (cDNA) was synthesized from 1 μg RNA using the iScript™ cDNA Synthesis Kit (Bio-Rad, Hercules, CA). An ETFDH exon 1 forward primer (5′-CGCGAGCAGCGGACAGTCCTCCTGTT) and an ETFDH exon 3 reverse primer (5′-ACACGGATGTCCTTTTCATGTGCCACAGC) were used for PCR amplification and sequence analysis of cDNA. PCR fragments were separated on a 2% agarose gel and used directly for sequence analysis.

Protein Isolation and Western Blot Analysis

Protein was extracted from frozen fibroblast pellets in a lysis buffer containing 50 mM Tris–HCl, pH 7.8, 5 mM EDTA, pH 8.0, 1 mM DTT, 10 μg/ml aprotinin (Sigma–Aldrich, Seelze, Germany), 1 mg/ml trypsin inhibitor (Bie & Berntsen, Rodovre, Denmark), one tablet of protease inhibitors (Roche, Mannheim, Germany) in 10 ml, and 1% Triton X-100. Following centrifugation, 5 or 20 μg of the lysate was analyzed by SDS-PAGE on 12.5% Tris–HCl Criterion Gels (Bio-Rad). Blotting to a polyvinylidene difluoride (PVDF) membrane was performed using a Semi-Dry Transfer Cell (Bio-Rad), and the membrane was incubated overnight with a monoclonal human anti-ETFQO antibody (MitoSciences, Eugene, OR) and an antibody against voltage-dependent anion channel VDAC1/porin (Abcam, Cambridge, MA), followed by incubation with secondary goat anti-rabbit-HRP antibody (Dako, Glostrup, Denmark). Detection was done using the ECL Plus Western Blotting Detection System (GE Healthcare, Little Chalfont, UK), and the blots were scanned using Chemi-Doc (UVP, Upland, CA).

Splicing Reporter Minigene Analyses

Wild-type and variant double-stranded DNA oligonucleotides corresponding to c.149_167 of ETFDH exon 2 were inserted into the alternatively spliced second exon in the RHC-Glo splicing reporter minigene [Singh and Cooper, 2006]. The second exon in the RHC-Glo splicing reporter is immediately flanked upstream and downstream by the last and first 91 and 73 nucleotides of human β-globin intron 1, respectively. The distal upstream segment of intron 1 contains introns 1 and 3 of chicken skeletal troponin I (sTNI), and the distal downstream region of intron 2 contains the last 364 nucleotides of sTNI intron 3. The inclusion of the alternatively spliced second exon is critically dependent on the balance between ESEs and ESSs in the inserted sequence. The integrity of all constructs was confirmed by sequencing. Transfection studies and RT-PCR were performed as previously described [Heintz et al., 2012].

RNA Affinity Purification of Nuclear Proteins

The affinity purification of RNA-binding proteins was performed with 3′-biotin-coupled RNA oligonucleotides (DNA Technology, Risskov, Denmark) as previously described [Nielsen et al., 2007]. The sequences of the RNA oligonucleotides, comprising ETFDH position c.149_170, were: 5′-CCCGGGAUAAGGACAAGAGAUG-3′ (Wt), 5′-CCCGGGAUAGGGACAAGAGAUG-3′ (Mut), 5′-CCCGGGAUCAGGACAAGAGAUG-3′ (Wt-C), and 5′-CCCGGGAUCGGGACAAGAGAUG-3′ (Mut-C). For each purification, 100 pmol of RNA oligonucleotide was coupled to 100 μl of streptavidin-coupled magnetic beads (Invitrogen) and incubated with HeLa nuclear extract (Cilbiotech, Mons, Belgium). After washing, bound proteins were investigated by western blotting using a monoclonal mouse antibody toward SRSF1 (SF2/ASF) (AK96 from Zymed Laboratories [Invitrogen]) or polyclonal antibodies toward hnRNP A1, hnRNP A2/B1, hnRNP H, or SRSF5 (SRp40) (sc-10029, sc-10035, sc-10043, or sc-33418 from Santa Cruz Biotechnology, Santa Cruz, CA).
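The relationship between the four oligonucleotides and the cDNA numbering can be checked with a short script. This is only an illustrative sketch: the helper names (`segment`, `differing_positions`) are hypothetical, and the only inputs are the sequences and the c.149_170 span stated above.

```python
# Illustrative sketch; helper names are hypothetical, sequences are those listed above.
OLIGO_START = 149  # first cDNA position covered by each oligonucleotide (c.149)

OLIGOS = {
    "Wt":    "CCCGGGAUAAGGACAAGAGAUG",
    "Mut":   "CCCGGGAUAGGGACAAGAGAUG",
    "Wt-C":  "CCCGGGAUCAGGACAAGAGAUG",
    "Mut-C": "CCCGGGAUCGGGACAAGAGAUG",
}


def segment(oligo: str, start: int, end: int) -> str:
    """Return the bases spanning cDNA positions start..end (inclusive)."""
    return oligo[start - OLIGO_START : end - OLIGO_START + 1]


def differing_positions(a: str, b: str) -> list[int]:
    """Return the cDNA positions at which two equal-length oligos differ."""
    return [OLIGO_START + i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    wt = OLIGOS["Wt"]
    for name, seq in OLIGOS.items():
        print(name, segment(seq, 156, 161), differing_positions(wt, seq))
```

Run as a script, this prints the c.156_161 hexamer of each oligonucleotide and the positions at which it differs from the wild-type sequence (c.158 for Mut, c.157 for Wt-C, and both for Mut-C).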

Knockdown of SR Proteins

siRNA knockdown experiments were performed in HEK293 and HeLa cells. Approximately 150,000 cells were seeded in each well of a six-well plate and 30 pmol of SRSF1 siRNA oligonucleotides (L-018672; Dharmacon, Lafayette, CO), SRSF5 siRNA oligonucleotides (L-007279; Dharmacon), or a scrambled control (D-001810; Dharmacon) were transfected into the cells using RNAiMAX (Invitrogen). After 48 hr, cells were harvested. RNA isolation and RT-PCR analysis of ETFDH exon 2 splicing were performed as described above. Downregulation of SRSF1 and SRSF5 was confirmed by RT-qPCR.

In Silico Predictions

The effect of the ETFDH c.158A>G variation on ETFDH protein structure and function was assessed using the PolyPhen (http://genetics.bwh.harvard.edu/pph/) [Sunyaev et al., 2001] and SIFT (http://sift.jcvi.org) [Ng and Henikoff, 2003] programs. Mitochondrial localization was predicted using MitoProt II (http://ihg.gsf.de/ihg/mitoprot.html) [Claros and Vincens, 1996] and Predotar (http://urgi.versailles.inra.fr/predotar) [Small et al., 2004]. Putative splicing regulatory elements were predicted using the ESEfinder 3.0 program (http://rulai.cshl.edu/cgi-bin/tools/ESE3/esefinder.cgi?process=home) [Cartegni et al., 2003] and the ASSEDA server (http://splice.uwo.ca/) [Mucaki et al., 2013]. Splice site strengths were calculated using the MaxEntScan software (http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html) [Yeo and Burge, 2004]. Potential 3′ splice sites in a sequence covering position c.35_50 to c.175+47 of the human ETFDH gene were analyzed using the NNSplice 0.9 program (http://www.fruitfly.org/seq_tools/splice.html) [Reese et al., 1997], the NetGene2 program (http://www.cbs.dtu.dk/services/NetGene2/) [Brunak et al., 1991], and the Human Splicing Finder 2.4.1 (http://www.umd.be/HSF/) [Desmet et al., 2009].

Results

Identification of a Seemingly Benign ETFDH Missense Variation in a Deceased Neonate with Severe MADD

A newborn child, born to related parents, died on day 2 after birth. Tandem mass spectrometry analysis of acylcarnitines from a dried postmortem blood spot showed increases of short, medium, and long-chain acylcarnitine species, suggesting MADD. Fatty acid oxidation flux in patient fibroblasts supported severe MADD with oxidation of myristate, palmitate, and oleate of 3%, 5%, and 2% of that of control samples, respectively. Genotyping of the ETFA, ETFB, and ETFDH genes revealed a novel ETFDH c.158A>G variation, which was not identified in an in-house collection of 106 control DNAs. Also, the c.158A>G variation has not been reported in the SNP database [http://www.ncbi.nlm.nih.gov/snp/], or in the Exome Variant Server database [http://evs.gs.washington.edu/EVS/], suggesting that it is a rare variation. Both parents were heterozygous for c.158A>G. From the genetic code, the c.158A>G variation is predicted to result in the conservative replacement of lysine-53 with arginine. Lysine-53 is located in a less conserved α-helix on the surface of the ETFDH protein, far from any of the known catalytic residues. Accordingly, the pathological consequence of this variation on protein structure and function is predicted to be “benign” or “tolerated” using the algorithms PolyPhen [Sunyaev et al., 2001] and SIFT [Ng and Henikoff, 2003], respectively (data not shown).

The ETFDH c.158A>G Variation Causes Exon 2 Skipping Resulting in Deletion of the Mitochondrial-Targeting Sequence and Lack of ETFDH Protein in Patient Cells

The amino acid change resulting from the c.158A>G variation was predicted to be benign, which is in sharp contrast to the dramatically reduced fatty acid oxidation flux observed in the patient's fibroblasts and the lethal outcome. Therefore, we speculated that the c.158A>G variation could be disease-causing due to aberrant splicing. The c.158A>G variation is located in exon 2, which has a suboptimal 3′ splice site with mismatches to the U2AF35 binding motif at the exonic +1 and +2 positions and a weak polypyrimidine tract (Fig. 1A). As such, the intron 1/exon 2 junction of ETFDH is recognized as a 3′ splice site by only one of the three splice predictive programs (see Materials and Methods). We analyzed fibroblast ETFDH cDNA from the patient and two control samples, and observed complete skipping of exon 2 in cDNA from the patient. Both control samples expressed the expected wild-type product containing exon 2. Skipping of exon 2 results in an in-frame deletion (c.35_175del141nt) in the ETFDH mRNA, which therefore encodes a truncated ETFDH polypeptide (Fig. 1B and C). This truncated polypeptide (p.Tyr13_Gly59del47aa) lacks 21 of the 33 amino acids that make up the mitochondrial-targeting signal of ETFDH. Accordingly, MitoProt II [Claros and Vincens, 1996] and Predotar [Small et al., 2004] predict that the mutated ETFDH polypeptide will not be localized to the mitochondria. In order to reach a proteolytically stable and functional configuration, the ETFDH polypeptide needs to interact with mitochondrial chaperones and cofactors [Cornelius et al., 2012]. Western blot analysis of cellular protein extract revealed lack of ETFDH protein in fibroblasts from the patient, which suggests that the mutated ETFDH protein is degraded in the cytosol (Fig. 1D).
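The arithmetic behind the in-frame deletion can be verified in a few lines. The sketch below uses hypothetical helper names and one labeled assumption: that the 33-residue mitochondrial-targeting signal occupies residues 1 to 33 at the N-terminus, which the text does not state explicitly.

```python
# Illustrative sketch; helper names are hypothetical.
def span_length(first: int, last: int) -> int:
    """Number of positions in an inclusive span first..last."""
    return last - first + 1


def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of positions shared by two inclusive spans (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


if __name__ == "__main__":
    nt_deleted = span_length(35, 175)   # c.35_175del
    aa_deleted = span_length(13, 59)    # p.Tyr13_Gly59del
    in_frame = nt_deleted % 3 == 0
    # Assumption: the targeting signal spans residues 1-33.
    signal_lost = overlap(1, 33, 13, 59)
    print(nt_deleted, aa_deleted, in_frame, signal_lost)
```

Under that assumption, the numbers agree with the text: 141 nucleotides correspond to 47 codons, so the reading frame is preserved, and 21 of the 33 signal residues fall inside the deleted span.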

Figure 1.
The ETFDH c.158A>G variation is associated with exon 2 skipping in patient cells. A: Schematic representation of ETFDH exon 1–3 and location of the c.158A>G variation in exon 2, which has a weak 3′ splice site. B: Test gel of RT-PCR products amplified from fibroblast cDNA from the patient (P) and two control samples (C1 and C2). The inclusion or exclusion of exon 2 is indicated on the right and verified by sequence analysis of the RT-PCR products in C. D: Western blot analysis showing lack of ETFDH protein in patient cells. Five or 20 μg of protein extracts isolated from patient (P) or control (C1 and C2) fibroblasts were separated by SDS-PAGE and ETFDH-specific protein was visualized using an antibody against ETFDH and a VDAC antibody as a loading control. Purified human ETFDH antigen was loaded to verify the identity of the band as ETFDH.

The ETFDH c.158A>G Variation Simultaneously Creates Binding Motifs for hnRNP A1, hnRNP A2/B1, and hnRNP H That Inhibit Splicing

The ETFDH c.158A>G variation creates a TAGGGA sequence motif (c.156_161) with a perfect match to the high-affinity hnRNP A1 SELEX winner motif, UAGGG(A/U) [Burd and Dreyfuss, 1994] (Fig. 2A). According to a position-weight scoring matrix for hnRNP A1 binding [Cartegni et al., 2006], the ETFDH wild-type motif UAAGGA has a score of 3.87, which is increased to the maximum score of 6.61 by the c.158A>G variation. Binding of hnRNP A1 is known to repress splicing, and aberrant splicing caused by creation of its binding motif by disease-causing variations has been described in several genes [Doktor et al., 2011; Expert-Bezançon et al., 2004; Nielsen et al., 2007; Zhu et al., 2001]. Splicing regulatory proteins from the SR family or hnRNP families share overlapping and degenerate recognition sequences. Therefore, in silico splicing prediction algorithms cannot always reliably and accurately predict which proteins will bind and impact splicing of a certain gene segment. The UAGGGA sequence motif (c.156_161) created by the c.158A>G variation also matches the UAGRGA (R = A or G) and the DGGGD (D = A, U, or G) motifs, which are recognized by splicing inhibitory proteins from the hnRNP A2/B1 and hnRNP H/F families, respectively [Huelga et al., 2012; Hutchinson et al., 2002; Schaub et al., 2007]. In particular, the presence of two or more triplet GGG sequences seems to be important for hnRNP H binding [Dobrowolski et al., 2010; Schaub et al., 2007], and the c.158A>G variation results in the creation of two closely spaced GGG triplets corresponding to cDNA positions 152_154 and 158_160 (Fig. 2A), and an additional flanking GGG triplet is present only a few nucleotides further downstream (c.170_172) immediately flanking the 5′-splice site. The precise mechanisms by which hnRNP proteins can repress splicing are still not fully elucidated. One mechanism is by counteracting the binding of SR proteins to nearby or overlapping ESE elements and in turn preventing efficient recruitment of splicing factors to splice sites.
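The consensus matches described in this paragraph can be reproduced with simple pattern matching against the four RNA oligonucleotide sequences given in Materials and Methods. The sketch below is illustrative only: the function names are hypothetical, the regular expressions are literal transcriptions of the consensus motifs quoted in the text, and a consensus match is not a prediction of binding affinity.

```python
import re

# RNA oligonucleotides covering c.149_170 (from Materials and Methods).
OLIGOS = {
    "Wt":    "CCCGGGAUAAGGACAAGAGAUG",
    "Mut":   "CCCGGGAUAGGGACAAGAGAUG",
    "Wt-C":  "CCCGGGAUCAGGACAAGAGAUG",
    "Mut-C": "CCCGGGAUCGGGACAAGAGAUG",
}

# Consensus motifs quoted in the text, written as regular expressions.
MOTIFS = {
    "hnRNP A1 UAGGG(A/U)": re.compile(r"UAGGG[AU]"),
    "hnRNP A2/B1 UAGRGA":  re.compile(r"UAG[AG]GA"),     # R = A or G
    "hnRNP H/F DGGGD":     re.compile(r"[AUG]GGG[AUG]"),  # D = A, U, or G
}


def motif_hits(seq: str) -> dict[str, bool]:
    """Report which consensus motifs occur anywhere in the sequence."""
    return {name: bool(rx.search(seq)) for name, rx in MOTIFS.items()}


def ggg_triplets(seq: str) -> int:
    """Count GGG triplets; the lookahead also counts overlapping ones."""
    return len(re.findall(r"(?=GGG)", seq))


if __name__ == "__main__":
    for name, seq in OLIGOS.items():
        print(name, motif_hits(seq), ggg_triplets(seq))
```

With these patterns, only the c.158A>G sequence (Mut) matches all three consensus motifs, and both sequences carrying c.158A>G contain two GGG triplets within the oligonucleotide, compared with one in the wild-type sequence.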

Figure 2.
Splicing minigene assay to verify a putative ESS element. A: Schematic representation of the RHC-Glo splicing reporter minigene and the subcloned constructs. A pictogram of the hnRNP A1 score matrix [Cartegni et al., 2006] is shown above the putative ESS element and the corresponding calculated hnRNP A1 binding scores for each construct are shown on the right along with binding scores for the SR proteins SRSF1, SRSF6, and SRSF2 calculated using the ESEfinder 3.0 tool [Cartegni et al., 2003]. The threshold binding scores, for each splicing regulatory protein, are indicated in italics with their corresponding binding elements marked in similar colors in the minigene constructs on the left. The two SRSF1 ESE motifs CCCGGGA (c.149_155) and GACAAGA (c.160_166) with predicted binding scores of 2.8 and 2.5, respectively, do not change among the different constructs. The control construct represents the original gene fragment inserted in the RHC-Glo splicing reporter minigene [Singh and Cooper, 2006]. B: Test gel of RT-PCR minigene splicing products expressed in HEK293 cells. The inclusion or exclusion of exon 2 is indicated on the right. The gel pictures are representative results from two experiments.

We used the ESEfinder 3.0 tool [Cartegni et al., 2003] to analyze ETFDH exon 2 for putative ESE motifs located in the vicinity of the c.158A>G variation. The ESEfinder program predicts that an SRSF1 (SF2/ASF) ESE motif (CCCGGGA) is located 3 nt upstream of the c.158A>G variation and another SRSF1 motif (GACAAGA) is located 2 nt downstream. The downstream SRSF1 motif overlaps with the hnRNP A1, hnRNP A2/B1, and hnRNP H binding motifs created by the c.158A>G variation, but the affinity score of the SRSF1 motif is not abrogated by the variation (Fig. 2A). We therefore speculated that the ETFDH c.158A>G variation causes aberrant splicing by creating a high-affinity ESS-binding site for the above-mentioned splicing inhibitory hnRNP proteins, and that this prevents SRSF1 from promoting splicing of exon 2 by blocking its binding to the overlapping or nearby ESEs. Alternatively, SRSF1 binding to the two flanking ESE motifs may normally inhibit binding of hnRNP A1/hnRNP A2/B1 to the low-score hnRNP A1/hnRNP A2/B1 site in the wild-type sequence, but may not be sufficiently strong to antagonize hnRNP A1/hnRNP A2/B1 binding to the high-affinity motif created by the c.158A>G variation, which would allow hnRNP proteins to inhibit splicing.

To investigate the effects of these closely spaced hnRNP A1-, hnRNP A2/B1-, hnRNP H-, and SRSF1-binding sites on splicing, wild-type and mutant forms of the ETFDH sequence from position c.149_167 were inserted into the alternatively spliced exon 2 of the RHC-Glo splicing reporter minigene [Singh and Cooper, 2006], and exon 2 inclusion was assessed by RT-PCR after transfection of HEK293 cells (Fig. 2B). This showed that the wild-type sequence results in some exon skipping when inserted, indicating that it is not fully effective in promoting splicing of the reporter minigene (Wt in Fig. 2B). As primary fibroblasts from healthy individuals with a wild-type ETFDH genotype show 100% exon 2 inclusion (Fig. 1B), other sequences within ETFDH exon 2 may also be required for proper exon 2 recognition, and/or certain sequence contexts of the minigene or the splice machinery of transformed HEK293 cells may preclude effective exon 2 inclusion. Cells transfected with the ETFDH c.158A>G-mutated splicing reporter showed complete skipping of exon 2 (Mut in Fig. 2B). This demonstrates that this part of exon 2 holds ESE sequences capable of driving splicing in a nonnative gene context, and that the c.158A>G variation inhibits this, which is consistent with the creation of an ESS or the loss of an ESE. To further delineate the nature of this sequence element, we introduced an A>C change at the invariant position 2 in the consensus high-affinity hnRNP A1 motif UAGGGA/U. This mutation also disrupts the hnRNP A2/B1 motif, UAGRGA, and the hnRNP F/H family DGGGD motif. We and others have previously demonstrated that changing this conserved position abrogates the silencing effect of similar motifs (CAGGGG and CAGGGU) [Dobrowolski et al., 2010; Doktor et al., 2011; Expert-Bezançon et al., 2004; Nielsen et al., 2007]. The introduction of the c.157A>C substitution in the c.158A>G-mutated minigene (Mut-C) increased exon 2 inclusion significantly, although not to the same level as that observed when the construct contains the wild-type sequence (compare Mut-C and Wt in Fig. 2B). Splicing efficiency also increased relative to the wild-type construct when the ETFDH c.157A>C substitution was introduced into the wild-type ETFDH minigene (compare Wt-C and Wt in Fig. 2B). Because the c.157A>C substitution does not disrupt the GGG triplet motif, this may illustrate that both the low-score hnRNP A1 motif, UAAGGA, and in particular the UAGGGA motif created by the c.158A>G variation inhibit splicing, and that this can be alleviated by disruption of the invariant A. Next, we tested two additional constructs in which low-score hnRNP A1 motifs, with an hnRNP A1 score similar to that of the wild-type motif, were created by either a c.158A>T or a c.158A>C variation. These mutations both disrupt the hnRNP A2/B1 UAGRGA motif, the DGGGD motif for the hnRNP F/H family, as well as the GGG triplet motif. In addition, the c.158A>T variation is predicted to create a high-affinity binding motif (UAUGGA) for SRSF6 (SRp55), and the c.158A>C variation is predicted to create overlapping binding sites for SRSF6 (UACGGA), SRSF2 (SC35) (GGAUACG), and SRSF1 (GAUACGG, CGGACA) (Fig. 2A). These variants could therefore be expected to dramatically improve splicing by directly decreasing hnRNP A2/B1 and hnRNP H binding by disrupting their motifs and at the same time also indirectly decreasing hnRNP A1, hnRNP A2/B1, and hnRNP H binding by creating additional overlapping binding sites for competing positive splicing regulators, such as SRSF1, SRSF2, and SRSF6. In accordance with these predictions, exon 2 inclusion from the minigenes harboring the c.158A>T or the c.158A>C variation was higher than that of the ETFDH wild-type minigene (compare Wt with 158T and 158C in Fig. 2B). Taken together, these results suggest that proper splicing of ETFDH exon 2 depends on the balance in binding of the positive and negative splicing regulatory proteins around position c.158, and that splicing repression, introduced by the c.158A>G variation, may act through increased hnRNP A1, hnRNP A2/B1, and hnRNP H binding.

Binding of SRSF1 to the ETFDH c.158A>G Variant Sequence Is Blocked by hnRNP A1, hnRNP A2/B1, and hnRNP H Binding

We next aimed to directly test whether the motif created by the c.158A>G variation causes increased hnRNP A1, hnRNP A2/B1, or hnRNP H binding and decreases binding of SRSF1 to the overlapping and nearby consensus motifs. We designed wild-type and mutated RNA oligonucleotides comprising ETFDH position c.149_170 and used them in pull-down experiments with nuclear extracts followed by Western blot analysis with antibodies to hnRNP A1, hnRNP A2/B1, hnRNP H, SRSF1, and SRSF5 (Fig. 3A). As expected, the c.158A>G variant RNA (Mut in Fig. 3) showed increased binding of hnRNP A1 when compared with wild-type RNA, and hnRNP A1 binding was abolished when the ETFDH c.157A>C variation was introduced into the c.158A>G variant sequence to disrupt the high-affinity hnRNP A1 binding motif, changing UAGGGA to UCGGGA (Mut-C in Fig. 3). The hnRNP A2/B1 protein showed a binding pattern identical to that observed for hnRNP A1 with all ETFDH oligonucleotides, which is consistent with the nearly identical binding motifs of these proteins. Surprisingly, the introduction of the c.157A>C variation into the wild-type sequence resulted in increased binding of hnRNP A1 and hnRNP A2/B1 as compared with the wild-type sequence (Wt-C in Fig. 3).

Figure 3.
Binding of hnRNP and SR proteins to the mutated splicing regulatory ETFDH element. A: Schematic representation of the RNA oligonucleotides used, comprising ETFDH position c.149_170. The calculated hnRNP A1 and SRSF1 binding scores for the putative ESS and ESE elements are shown on the right. B: The RNA oligonucleotides were used in a pull-down experiment with HeLa nuclear extract followed by SDS-PAGE and Western blot analysis using antibodies against SRSF1, hnRNP A1, hnRNP H, hnRNP A2/B1, or SRSF5. Bl and NE indicate control lanes without RNA oligonucleotides or with nuclear extract alone, respectively. The blots shown are representative results from at least two pull-down experiments with the four different RNA oligonucleotides.

Both RNA oligonucleotides harboring the c.158A>G variation (Mut and Mut-C in Fig. 3) showed significantly increased binding of hnRNP H as compared with wild-type RNA, indicating that the AGGGA/CGGGA (Mut/Mut-C) motif alone or together with the upstream CGGGA (c.151_155) motif may function as an hnRNP H-binding silencer motif as shown for other genes [Chen et al., 1999; Dobrowolski et al., 2010; Pagani et al., 2003]. In contrast to hnRNP A1 and hnRNP A2/B1, the hnRNP H binding motif is not disrupted by the c.157A>C variation. Consistent with this, hnRNP H binding to the c.158A>G variant sequence was not abolished when the c.157A>C variation was present (Mut-C in Fig. 3).

Interestingly, a reciprocal binding pattern to that of the hnRNP proteins was observed when using antibodies to SRSF1. SRSF1 binding to wild-type RNA oligonucleotides was completely abolished by introduction of the c.158A>G variation. Binding of SRSF1 was partially reestablished when the c.157A>C variation was introduced into the c.158A>G variant sequence, and increased when it was introduced into the wild-type sequence. These results suggest that the ESS, created by the c.158A>G variation, binds hnRNP A1, hnRNP A2/B1, and hnRNP H and that this displaces SRSF1 binding. The intensity of SRSF1 binding to the four tested ETFDH sequences parallels the observed pattern for exon 2 inclusion from the minigene with these sequences inserted (Fig. 2), indicating a key role for SRSF1 binding in exon inclusion. It is of note that only the c.158A>G variant sequence showed increased binding of all three splicing inhibitory proteins, hnRNP A1, hnRNP A2/B1, and hnRNP H, and a complete loss of SRSF1 binding. This might indicate that the three hnRNP proteins work synergistically to exclude SRSF1 binding to the c.158A>G variant sequence and cause exon skipping.

To investigate the involvement of SRSF1 in ETFDH exon 2 inclusion, we performed siRNA-mediated knockdown of SRSF1 in HEK293 and HeLa cells. Surprisingly, no decrease in exon 2 inclusion could be observed (data not shown). This finding might reflect that other SR proteins can also bind to ETFDH exon 2 and compensate for the lack of SRSF1. An ACAAGAG (c.161_167) SRSF5 binding motif, which is present in all the tested ETFDH sequences, has a score of 2.43, which is just below the artificial threshold of 2.67 employed by the ESEfinder 3.0 program. This potential SRSF5 motif is evolutionarily conserved in species ranging from mice to human (Supp. Fig. S1), and in mouse and rat the sequence (ACAAAAG) of the conserved binding site has an SRSF5 score of 3.1, which is above the threshold. We therefore asked whether the ETFDH sequence could also bind SRSF5, and whether SRSF5 binding is also influenced by hnRNP binding in a manner similar to SRSF1. Indeed, we observed a similar binding pattern for SRSF5 as for SRSF1, although the changes in SRSF5 binding to the sequences were less pronounced (Fig. 3). The potential role of SRSF5 in ETFDH exon 2 inclusion was tested by siRNA-mediated knockdown of SRSF5 in HEK293 and HeLa cells, but there was no decrease in exon 2 inclusion (data not shown).

While this paper was under review, information theory-based position weight matrices for SR and hnRNP splicing factors were published [Mucaki et al., 2013] and made available for sequence analysis through the ASSEDA server (http://splice.uwo.ca/). In contrast to ESEfinder, the ASSEDA server measures binding site strengths on a common scale (in bits), which allows changes in binding sites of different splicing regulatory proteins to be compared directly. To compare the relative strengths of the splicing regulatory binding sites involved in c.158A>G-induced ETFDH exon 2 skipping, we analyzed wild-type and mutated ETFDH using the ASSEDA server (Supp. Table S1). According to the ASSEDA server, the c.158A>G variation abolishes an SRSF5 site (c.157_162: 3.0 → −3.3 bits) and creates an hnRNP A1 site (c.156_161: −11.2 → 5.9 bits) and an hnRNP H site (c.158_162: −1.0 → 0.6 bits). It also strengthens an upstream hnRNP H site (c.152_161: 3.8 → 4.0 bits). Furthermore, the variation creates two weak SRSF1 sites (c.154_160: −4.6 → 0.8 bits, c.158_162: −2.2 → 0.4 bits) and strengthens an SRSF2 site (c.153_160: 0.4 → 2.1 bits). The bit values of the SRSF1 and SRSF2 sites created/strengthened by the variation are, however, significantly lower than the bit value of the hnRNP H (4.0) site, which in turn is lower than the bit value of the hnRNP A1 (5.9) site. This means that hnRNP H has a weaker calculated binding affinity for the mutated ETFDH sequence than hnRNP A1, and that both have a higher calculated affinity for the mutated ETFDH sequence than SRSF1 and SRSF5. These results are consistent with our functional studies and suggest that splicing stimulatory SR proteins in general compete with splicing inhibitory hnRNP proteins in proper splicing of ETFDH exon 2, and that in particular the creation of a strong hnRNP A1 site by the c.158A>G variation is responsible for the abnormal ETFDH exon 2 splicing.
Unfortunately, we were unable to downregulate hnRNP H and hnRNP A1 proteins by siRNA-mediated knockdown in patient fibroblast cells carrying the ETFDH c.158A>G variation despite several attempts.
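As an illustration of the comparison on a common bit scale, the ASSEDA values quoted above can be tabulated and ranked. The following is only a minimal sketch: the numbers are those given in the text, whereas the `classify` helper and its rule that a site counts as present when its information content is above 0 bits are assumptions made for this example and are not part of the ASSEDA server.

```python
# Bit values (wild-type, c.158A>G) as quoted in the text for the ASSEDA analysis.
SITES = [
    ("SRSF5",    "c.157_162",   3.0, -3.3),
    ("hnRNP A1", "c.156_161", -11.2,  5.9),
    ("hnRNP H",  "c.158_162",  -1.0,  0.6),
    ("hnRNP H",  "c.152_161",   3.8,  4.0),
    ("SRSF1",    "c.154_160",  -4.6,  0.8),
    ("SRSF1",    "c.158_162",  -2.2,  0.4),
    ("SRSF2",    "c.153_160",   0.4,  2.1),
]


def classify(wt_bits, mut_bits):
    """Label the effect of the variation on one site.

    Assumption for this sketch: a site is 'present' when its bits are > 0.
    """
    if wt_bits <= 0 < mut_bits:
        return "created"
    if mut_bits <= 0 < wt_bits:
        return "abolished"
    if wt_bits > 0 and mut_bits > 0:
        if mut_bits > wt_bits:
            return "strengthened"
        return "weakened" if mut_bits < wt_bits else "unchanged"
    return "absent"


def summarize(sites):
    """Return one record per site, strongest mutant site first."""
    rows = [
        {
            "protein": protein,
            "position": position,
            "delta_bits": round(mut - wt, 1),
            "mutant_bits": mut,
            "effect": classify(wt, mut),
        }
        for protein, position, wt, mut in sites
    ]
    return sorted(rows, key=lambda row: row["mutant_bits"], reverse=True)


if __name__ == "__main__":
    for row in summarize(SITES):
        print(f'{row["protein"]:9} {row["position"]:10} '
              f'{row["delta_bits"]:+6.1f} bits  {row["effect"]}')
```

Sorting by the bit value in the mutated sequence places the created hnRNP A1 site first and the abolished SRSF5 site last, which is the ordering the paragraph above describes.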

Discussion

Correct splicing of genes in relation to tissue-specific needs, developmental stage, and metabolic requirements is fundamental for cellular health. Splicing regulatory elements composed of overlapping splicing enhancer and splicing silencer elements have evolved to accomplish the required complex regulation. Together with the strength of the splice sites, the balanced binding of splicing stimulatory and inhibitory proteins to the splicing regulatory elements controls splicing efficiency and specificity and thereby the amounts and function of the encoded proteins [Andresen and Krainer, 2009]. The ETFDH c.158A>G variation illustrates how finely this interplay is balanced, as this single-nucleotide substitution simultaneously affects binding of several splicing regulatory proteins in different directions. We have in the present study demonstrated that the ETFDH c.158A>G variation, which was predicted to be a benign missense variation, instead causes exon 2 skipping and ETFDH protein degradation. ETFDH exon 2 skipping was demonstrated in a native genomic context in patient cells and also in a splicing reporter minigene, demonstrating that the c.158A>G variant ETFDH sequence can induce exon skipping in a nonnative gene context as well. We show that the c.158A>G variation increases the affinity of existing binding sites for hnRNP A1 and hnRNP A2/B1 and creates a new hnRNP H-binding site. The increased binding of these splicing inhibitory proteins abolishes binding of the splicing stimulatory proteins SRSF1 and SRSF5, and most likely also other stimulatory SR proteins, to flanking and/or overlapping ESEs. Downregulation of individual SR proteins (SRSF1 or SRSF5) by siRNA in two different model cells did not affect ETFDH exon 2 splicing.
This most likely underscores that ETFDH exon 2 splicing depends on a complex set of relationships in which a number of stimulatory SR proteins compete with splicing inhibitory hnRNP proteins, so that depletion of a single splicing regulatory protein such as SRSF1 is accompanied by a compensatory gain in the RNA interaction of other proteins. This interpretation is in agreement with a recent genome-wide CLIP-seq analysis revealing that the loss of one SR protein (SRSF1 or SRSF2) remodels the in vivo binding profile of the other SR protein [Pandit et al., 2013].

According to the ASSEDA server, the abolishment of an SRSF5 site (3.0 → −3.3 bits) and creation of an hnRNP A1 site (−11.2 → 5.9 bits) at the site of the c.158A>G variation are the most important. Our experiments support that SRSF5 binding is important for proper ETFDH exon 2 splicing, as introduction of the c.158A>G variation into RNA oligonucleotides caused decreased binding of SRSF5 protein, whereas introduction of c.157A>C in mutant (−3.3 → 4.9 bits) or wild-type (3.3 → 4.9 bits) ETFDH RNA oligonucleotides caused increased binding of SRSF5 to RNA oligonucleotides and increased exon 2 inclusion in the minigene experiments. Even though the splice-predictive programs predict that the binding affinity of SRSF1 is not decreased by the c.158A>G variation, our experiments clearly show that blocking of SRSF1 binding by hnRNP protein binding to overlapping or nearby ESS elements is crucial for the observed aberrant splicing. Our RNA affinity experiments confirm that the c.158A>G variation causes increased binding of all three tested splicing inhibitory hnRNP proteins, and when we introduced different substitutions in the sequence, which were predicted to affect the individual binding motifs differently, binding of the individual hnRNPs was affected accordingly. Remarkably, only the c.158A>G variant was able to simultaneously cause increased binding of all three hnRNPs and complete loss of SRSF1 binding. The other variants, which either caused increased hnRNP A1 and hnRNP A2/B1 binding or increased hnRNP H binding, only partially reduced SRSF1 binding and exon inclusion. Since the pattern of reduction in SRSF1 binding was paralleled by a similar pattern in the efficiency of exon inclusion from the splicing reporter minigene, we propose that ETFDH exon 2 splicing is dependent on SRSF1, SRSF5, or other splicing stimulatory SR proteins with similar binding motifs, and that this can be antagonized by hnRNP A1, hnRNP A2/B1, and hnRNP H binding.
We suggest that the dramatic effect of the c.158A>G variation is caused by simultaneous creation of multiple overlapping binding motifs in a context with flanking GGG triplets, which causes synergistic binding of hnRNP A1, hnRNP A2/B1, and hnRNP H, completely blocking the access of positive splicing factors, such as SRSF1 and SRSF5 to flanking-binding sites. This disrupts cooperative assembly of the SR proteins along exon 2, which may be crucial for the recruitment of the U2AF complex to the rather weak 3′splice site (Fig. 4).

Model for c.158A>G-induced missplicing of ETFDH exon 2. A: The c.158A>G variation is located in ETFDH exon 2, which has a suboptimal 3′ splice site with mismatches to the U2AF35 binding motif at the exonic +1 and +2 positions and a weak polypyrimidine tract. Correct splicing of exon 2 depends on binding of positive splicing regulatory proteins such as SRSF1 and SRSF5 to ESE elements (green boxes) flanking the c.158A>G variation. It is suggested that these SRSF proteins may interact with SR proteins, allowing the cooperative assembly of SR proteins along exon 2 and recruitment of the U2AF complex to the 3′ splice site, eventually causing exon 2 recognition and inclusion. B: The c.158A>G variation increases the strength of a preexisting ESS motif UAGGGA (red box) that binds the hnRNP A1 and hnRNP A2/B1 splicing inhibitory proteins and also creates a binding site for hnRNP H. Binding of these hnRNP proteins to the UAGGGA motif is most likely increased by triggering synergistic binding to flanking silencer GGG triplets (red). This leads to a shift in the balance between bound SRSF and hnRNP splicing regulatory proteins, so that binding of SRSF proteins to the ESE elements flanking the c.158A>G variation is blocked, eventually preventing binding of the U2AF complex to the 3′ splice site and causing exon skipping.

In this respect, it is interesting to note that in three other genes, namely APC [Gonçalves et al., 2009], PAH [Dobrowolski et al., 2010], and CHRNA1 [Masuda et al., 2008] disease-causing variations that cause aberrant splicing create an identical architecture with overlapping splicing regulatory motifs as we observe in ETFDH, although the situation is a bit more complex in CHRNA1 since the GGG triplet is located in the intron and the CAGGGT motif overlaps the 3′ splice site (Fig. 5). We suggest that also in these cases the disease-causing variation apparently has a dual negative effect by simultaneously creating or increasing the score for the hnRNP A1 and hnRNP A2/B1 motif, CAGGGU or UAGGGA, and at the same time creating a DGGGD or triple GGG motif for hnRNP H. In addition, the simultaneous presence of a second flanking GGG triplet most probably contributes further by serving as a hnRNP H-binding site. There are several examples of published single-nucleotide variations that create this kind of closely spaced GGG triplets and cause aberrant splicing even without the simultaneous creation of hnRNP A1 or hnRNP A2/B1 motifs, indicating that isolated creation of GGG triplets also causes aberrant splicing [Cogan et al., 1997; Dobrowolski et al., 2010; Llewellyn et al., 1996; Matern et al., 2003; Santoro et al., 2007]. We have previously shown that disruption of the preexisting GGG triplet by a G>C substitution can correct splicing from the mutant sequence of PAH and HEXB [Dobrowolski et al., 2010], whereas McCarthy and Phillips (1998) have shown this for the GH1 gene GGG triplets. In Figure 5, we have aligned sequences exemplifying this mechanism, including examples from the HEXB gene [Santoro et al., 2007], the HMBS gene [Llewellyn et al., 1996], and the ACADSB gene [Matern et al., 2003].

Analysis of disease-causing splicing regulatory motifs in other genes having similar architecture as the one observed in ETFDH exon 2. Relevant sequences from the genes are listed with disease-causing variations marked in red. A pictogram of the hnRNP A1 score matrix [Cartegni et al., 2006] (top) and identified hnRNP A1 binding motifs in different genes are boxed in red. GGG triplets are boxed in blue, and solid green lines indicate SRSF1 binding motifs predicted by the ESEfinder 3.0 tool [Cartegni et al., 2003]. Splice site strengths of the exons containing the splicing regulatory elements were calculated using the MaxEntScan software [Yeo and Burge, 2004]. *, the variation is located in the first or the last exon. (GGG), the presence of an inhibitory GGG triplet in the 5′ splice site sequence. The genes listed are APC, adenomatosis polyposis coli [Gonçalves et al., 2009]; PAH, phenylalanine hydroxylase [Dobrowolski et al., 2010]; CHRNA1, cholinergic receptor, nicotinic, alpha 1 [Dobrowolski et al., 2010; Masuda et al., 2008]; HSD17B10, hydroxysteroid (17-beta) dehydrogenase 10 [Lenski et al., 2007]; ACADSB, short/branched chain acyl-CoA dehydrogenase [Dobrowolski et al., 2010; Matern et al., 2003]; HEXB, hexosaminidase B (beta polypeptide) [Dobrowolski et al., 2010; Santoro et al., 2007]; and HMBS, hydroxymethylbilane synthase [Llewellyn et al., 1996].

Based on the present data, we propose that the UAGGGA motif, and also the CAGGGU motif, binds hnRNP A1, hnRNP A2/B1, and hnRNP H, and that binding is stimulated further by the presence of hnRNP H-binding GGG triplets in close proximity and vice versa. Binding to the UAGGGA/CAGGGU motif and flanking GGG triplets is likely to be synergistic, since hnRNP H proteins can interact with each other, and interactions between hnRNP A1 and hnRNP H, as well as of hnRNP A1 with itself, have also been demonstrated [Fisette et al., 2010; Martinez-Contreras et al., 2006]. This combined motif may be an even stronger splicing repressor than the isolated creation of GGG triplets or the UAGGGA/CAGGGU motif alone, and it is likely that the combined motif is able to repress splicing of exons that are better defined than the exons skipped by either motif alone. In line with this hypothesis, the score of the weak 5′ splice site of HSD17B10 exon 5, where isolated creation of the CAGGGT motif by a c.574C>A variant causes skipping and disease [Lenski et al., 2007], is only 6.9 (MaxEnt), whereas the 5′ splice site of PAH exon 1, where the created CAGGGT motif is flanked by a GGG triplet, is stronger, with a score of 9.6 (MaxEnt) (Fig. 5). Similarly, the weak 5′ splice site of exon 10 from ACADSB and the weak 3′ splice site of exon 12 from HEXB, where exonic variations result in the creation of two GGG triplets and cause exon skipping without the simultaneous presence of a TAGGGA or CAGGGT motif, are weaker than those of ETFDH exon 2 and APC exon 14, which harbor the combined TAGGGA motif together with a flanking GGG triplet motif, and also weaker than the 5′ splice site of PAH exon 1 with the combined CAGGGT and flanking GGG triplet motif (Fig. 5).
Although skipping of exon 3 in HMBS is caused by a C>G variation, which results in the creation of two GGG triplets without the simultaneous presence of a TAGGGA or CAGGGT motif, the 5′ and 3′ splice sites of exon 3 from HMBS do not appear weak based on the MaxEnt scores. However, it may be important that the 5′ splice sites of HMBS and HEXB both themselves harbor GGG triplets, which overlap with the U1snRNP binding motif, and that this makes them weaker than indicated by the splice site score and particularly vulnerable to sequence variations that create exonic GGG triplets, perhaps through synergistic formation of an hnRNP H complex that interacts with all the GGG triplets and thereby masks the 5′ splice site. In support of this notion, it has previously been demonstrated that GGG triplets in 5′ splice sites compromise their strength by allowing hnRNP H to bind and compete with U1snRNP binding [Buratti et al., 2004]. Disruption of the GGG triplet in the 5′ splice site of NF1 exon 3 and TSHβ exon 2 disrupts hnRNP H binding and corrects splicing from disease-causing variants at the +5 position that cause exon skipping [Buratti et al., 2004].
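The motif architecture discussed above (a UAGGGA or CAGGGU motif together with nearby GGG triplets) can be located in a sequence with a simple pattern scan. The sketch below is illustrative only: the two test sequences are invented toy sequences, not the ETFDH sequence or any sequence from Figure 5, the `scan` function is a hypothetical helper, and exact string matching is a simplification that does not reproduce the score matrices used by ESEfinder or ASSEDA. D denotes A, G, or U (IUPAC code).

```python
import re

# Motifs named in the text; the regular expressions are exact-match simplifications.
MOTIFS = {
    "hnRNP A1/A2B1 (UAGGGA)": "UAGGGA",
    "hnRNP A1/A2B1 (CAGGGU)": "CAGGGU",
    "hnRNP H (DGGGD)": "[AGU]GGG[AGU]",
    "GGG triplet": "GGG",
}


def scan(sequence):
    """Return sorted (1-based position, motif name, matched text) tuples.

    A lookahead is used so that overlapping matches are all reported.
    DNA input is accepted and converted to RNA (T -> U).
    """
    rna = sequence.upper().replace("T", "U")
    hits = []
    for name, pattern in MOTIFS.items():
        for match in re.finditer(f"(?=({pattern}))", rna):
            hits.append((match.start() + 1, name, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    toy_wild_type = "CCGGGAUUAAGGACUU"  # invented toy sequence
    toy_variant = "CCGGGAUUAGGGACUU"    # same toy sequence with one A>G change
    for label, seq in (("wild-type", toy_wild_type), ("variant", toy_variant)):
        print(label)
        for position, name, text in scan(seq):
            print(f"  {position:3d}  {name:24}  {text}")
```

In the toy variant, a single A>G change simultaneously produces a UAGGGA match, a DGGGD match, and a second GGG triplet next to a preexisting one, which mirrors the kind of overlapping architecture proposed here.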

As illustrated in the present study, bioinformatics tools that allow prediction of the functional consequences of gene defects at the mRNA and protein level may be useful by providing hints to the molecular mechanisms that underlie a suspected disease-associated gene defect. The location of the ETFDH c.158A>G (p.Lys53Arg) variation in a predicted splicing regulatory element encouraged us to analyze and clarify its impact on pre-mRNA splicing, which revealed that the molecular consequences of the c.158A>G variation are much more severe than could be predicted from the genetic code. Unfortunately, although a variety of in silico-predictive splicing algorithms have been developed and made publicly accessible, our knowledge of ESE/ESS consensus motifs and our current understanding of the pre-mRNA splicing process are still insufficient to allow their direct use to assess if and how a new gene variant will impact mRNA splicing. This is illustrated in the present study, where the ESEfinder and ASSEDA server bioinformatics tools predicted that the c.158A>G variation does not abolish SRSF1 binding, whereas our functional studies showed that the ETFDH c.158A>G variation mainly acts by creating a strong ESS, which interrupts the function of overlapping and flanking SRSF1 and SRSF5 sites. Future in silico prediction programs will need to take this kind of complex mechanism into account by simultaneously considering splice site strength and the overall architecture of splicing regulatory elements in the exon affected by a sequence variation.

Acknowledgments

We thank the patient's family for providing patient material for the study. We are also thankful to Tom A. Cooper for providing the RHC-Glo splicing reporter vector. We thank Hanan Hassoun, Lone Sundahl, and Jane Serup Pedersen for expert technical assistance.

Disclosure statement: The authors have no conflict of interest to declare.
