Volume 32, Issue 4 pp. 467-475
Research Article

Pure intronic rearrangements leading to aberrant pseudoexon inclusion in dystrophinopathy: a new class of mutations?

Mouna Messaoud Khelifi

Université Montpellier 1, UFR médecine, Montpellier, France

INSERM, U827, Montpellier, France

These two authors contributed equally to this work.

Aliya Ishmukhametova

Université Montpellier 1, UFR médecine, Montpellier, France

INSERM, U827, Montpellier, France

These two authors contributed equally to this work.

Philippe Khau Van Kien

CHU Montpellier, Hôpital Arnaud de Villeneuve, Laboratoire de Génétique Moléculaire, Montpellier, France

Delphine Thorel

CHU Montpellier, Hôpital Arnaud de Villeneuve, Laboratoire de Génétique Moléculaire, Montpellier, France

Déborah Méchin

CHU Montpellier, Hôpital Arnaud de Villeneuve, Laboratoire de Génétique Moléculaire, Montpellier, France

Serge Perelman

LENVAL Foundation, Children's Hospital, Nice, France

Jean Pouget

Service de Neurologie et Maladies Neuromusculaires, Centre de Référence national pour les Maladies Neuromusculaires et la SLA, Hôpital la Timone, Marseille, France

Mireille Claustres

Université Montpellier 1, UFR médecine, Montpellier, France

INSERM, U827, Montpellier, France

CHU Montpellier, Hôpital Arnaud de Villeneuve, Laboratoire de Génétique Moléculaire, Montpellier, France

Sylvie Tuffery-Giraud

Corresponding Author

Université Montpellier 1, UFR médecine, Montpellier, France

INSERM, U827, Montpellier, France

Laboratoire de Génétique Moléculaire et INSERM U827, IURC, Institut Universitaire de Recherche Clinique, 641 Avenue du Doyen Giraud, 34093 Montpellier Cedex 5, France
First published: 08 February 2011

Communicated by Claude Ferec

Abstract

We report on two unprecedented cases of pseudoexon (PE) activation in the DMD gene resulting from pure intronic double-deletion events that possibly involve microhomology-mediated mechanisms. Array comparative genomic hybridization analysis and direct genomic sequencing allowed us to elucidate the causes of the pathological PE inclusion detected in the RNA of the patients. In the first case (Duchenne phenotype), we showed that the inserted 387-bp PE originated from an inverted ∼57 kb genomic region of intron 44 flanked by two deleted segments of ∼52 kb and ∼1 kb. In the second case (Becker phenotype), we identified in intron 56 two small deletions of 592 bp (del 1) and 29 bp (del 2) directly flanking a 166-bp PE located in very close proximity (134 bp) to exon 57. The key role of del 1 in PE activation was established by using splicing reporter minigenes. However, the analysis of mutant constructs failed to identify cis elements that regulate the inclusion of the PE and suggested that other splicing regulatory factors, such as RNA structure, may be involved. Our study introduces a new class of mutations in the DMD gene and emphasizes the potential role of underdetected intronic rearrangements in human diseases. Hum Mutat 32:1–9, 2011. © 2011 Wiley-Liss, Inc.

Introduction

The Duchenne Muscular Dystrophy gene (DMD; MIM# 300377) is the largest gene detected to date. It spans approximately 2.2 megabases of the X chromosome and encodes several transcripts alternatively generated from 79 exons and 7 promoters. The transcript variant Dp427m expressed in muscle lineages is nearly 14 kb long and is one of the longest [Muntoni et al., 2003]. Consequently, more than 99% of the gene sequence is composed of noncoding sequences. Mutations in the DMD gene cause the dystrophinopathies, a collective term for Duchenne Muscular Dystrophy (DMD; MIM# 310200), Becker Muscular Dystrophy (BMD; MIM# 300376), and the rare X-linked dilated cardiomyopathy (MIM# 302045). DMD is a severe and rapidly progressive neuromuscular disorder, with the onset of symptoms generally occurring between 3 and 5 years and early loss of ambulation between the ages of 9 and 10 years, whereas BMD is a clinically less severe form of the disease in which affected individuals remain ambulatory beyond the age of 16 years and a few may lead a normal or near-normal life [Emery, 2002]. DMD is caused by mutations that disrupt the reading frame, leading to a complete loss of functional dystrophin in muscle. In contrast, BMD is typically associated with in-frame mutations that allow production of either a reduced amount of normal dystrophin or an altered but partially functional dystrophin protein [Monaco et al., 1988].
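As an illustration of the reading-frame rule, the following minimal Python sketch classifies an inserted sequence by its length alone. The function name `reading_frame_effect` is hypothetical and chosen only for this example; the two lengths are the pseudoexon sizes reported in the abstract.

```python
def reading_frame_effect(inserted_length: int) -> str:
    """Classify an insertion into a coding transcript by its length alone.

    A length that is a multiple of 3 preserves the downstream reading frame;
    any other length shifts it by the remainder.
    """
    remainder = inserted_length % 3
    if remainder == 0:
        return "in-frame"
    return f"frameshift (+{remainder})"


# Pseudoexon lengths (in nucleotides) reported in the abstract of this article.
pseudoexon_lengths = {
    "patient 1 (intron 44)": 387,
    "patient 2 (intron 56)": 166,
}

for label, length in pseudoexon_lengths.items():
    print(f"{label}: {length} nt -> {reading_frame_effect(length)}")
```

Length is only a first approximation. The 387-nt insertion preserves the frame, yet it is reported in the Results to introduce a premature termination codon, and the 166-nt insertion shifts the frame in a patient who also produces normally spliced transcripts. Both observations are consistent with the phenotypes described for the two patients.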

The most common changes in the DMD gene consist of large genomic deletions or duplications of one or more exons, which account for mutations in 43–85% and 7–11% of all patients, respectively [Dent et al., 2005; Flanigan et al., 2009; Tuffery-Giraud et al., 2009]. Over the past few years, the development of new diagnostic techniques such as the Multiplex Amplifiable Probe Hybridization (MAPH) [White et al., 2002], or the Multiplex Ligation-dependent Probe Amplification (MLPA) technique [Lalic et al., 2005] covering all 79 exons, has allowed the detection of gene dosage imbalance for each of the 79 exons and thus accurate definition of the extent of genomic rearrangements. However, all these techniques focused on coding regions, leaving mutations located deep in the introns undetected. Recently, the development of high-density microarray-based comparative genomic hybridization (array CGH) has provided a powerful tool to explore the entire genomic region of the DMD gene for unrecognized large copy number variations (CNVs) as defined by rearrangements of more than 1 kb and even smaller insertions/deletions (Indels) as defined by a size <1 kb [Bovolenta et al., 2008; Hegde et al., 2008].

The ∼30% remaining mutations consist of small lesions, which are evenly distributed across the DMD gene. The implementation of a semiautomated direct sequencing methodology of all exons along with flanking intronic sequences, and promoters has enabled efficient detection of these small lesions [Flanigan et al., 2003]. Alternatively, RNA-based methods proved to be successful in detecting point mutations in the DMD gene [Deburgrave et al., 2007; Tuffery-Giraud et al., 2004] and diagnostically valuable in clinical practice to determine the outcome of splice-site mutations and/or to identify alternative splicing patterns that may account for exceptions to the reading frame rule [Kesari et al., 2008]. Moreover, the analysis of mRNA obtained from muscle biopsies made possible the recognition of a novel class of disease-causing mutations in introns that cause missplicing by inducing inclusion of intronic sequences as exons (pseudoexon [PE] inclusion) [Gurvich et al., 2008; Tuffery-Giraud et al., 2003]. As reported in other genes [Buratti et al., 2006], the vast majority of this type of mutations was found to strengthen preexisting weak cryptic splice sites or to create new splice sites.

In this study, we report a novel class of mutation for PE activation in the DMD gene. We show that this missplicing event can occur in the context of pure intronic rearrangements as illustrated by the double deletions detected in two unrelated patients. One was located deep in intron 44 and coupled with an inversion of a large genomic region while the second one consists of two small deletions directly flanking a PE in intron 56. We performed minigene assays to provide evidence of the pathogenic role of the identified deletions in intron 56 upon exonization of the intronic sequence, and to investigate whether the local context plays a role in splicing regulation of the PE in the wild-type and mutant context.

Materials and Methods

Patients

Genetic and laboratory testing was performed in the probands under conditions established by French law, and appropriate written informed consents were collected. Patient 1 was referred to us at 5 years old because of manifestation of DMD with very high serum creatine phosphokinase (CK) levels. He had been adopted, but had a compatible family history since his mother was reported to suffer from myalgia and to have high serum CK levels. There was no other family information. A muscle biopsy was performed, and immunofluorescence (IF) staining with dystrophin antibodies (Dys-1, -2, and -3) was negative. The patient was lost to follow-up until the last evaluation at the age of 19, when a poor motor evolution of the disease was noted. Wheelchair use was reported at 8 years of age. Echocardiography revealed the beginning of a dilated cardiomyopathy with left ventricular hypokinesia and reduced ejection fraction (EF) (<50%), and spirometry diagnosed a mild restrictive respiratory insufficiency with a 62% forced vital capacity (FVC).

Patient 2 was a 30-year-old man who was first examined at the age of 8 because of fatigability. He complained of neither muscle pain nor cramps. He showed enlarged calves and had an increased level of serum CK (7,500 IU/l, normal <200 IU/l). The family history was negative, and his development was normal. Due to consistently elevated serum CK levels, a muscle biopsy was performed at the age of 16. It revealed discrete dystrophic features, and immunohistochemical dystrophin analysis showed decreased and irregular sarcolemmal labeling with Dys-2 and Dys-1 antibodies. Western blotting showed a reduced amount of a normal-sized protein (about 25% of the control level). The patient showed no signs of muscle weakness during childhood and later in early adulthood, and was able to participate in intensive sport activities. Echocardiography was normal until the age of 21, then the left ventricular ejection fraction (LVEF) decreased below 50% (last LVEF of 45% at age 26) and the patient was treated with angiotensin-converting enzyme (ACE) inhibitors.

Mutation Analysis

Genomic DNA from the patients was screened for deletion and/or duplication using MLPA (Salsa MLPA kit P034/P035 DMD/Becker MRC-Holland; Amsterdam, The Netherlands). The dystrophin (or DMD) transcripts were analyzed as previously described [Tuffery-Giraud et al., 2004]. Briefly, total RNA was isolated from a frozen muscle biopsy and full-length cDNA was amplified as 10 separate and partially overlapping fragments using the Access Quick RT-PCR System (Promega, Charbonnières-les-Bains, France). The amplified products were subsequently analyzed by electrophoresis on a 1.5% agarose gel and by the protein truncation test (PTT). For patient 1, because of the absence of amplification of the cDNA fragment encompassing exon 43 to exon 51, additional primers were designed to amplify the region in three overlapping fragments. Fragments of normal size were obtained except for the region located between the junction of exons 43/44 and exon 46 (forward, 5′-CCGACAAGGGCGATTTGACA-3′ and reverse, 5′-CTTGACTTGCTCAAGCTTTTCTTTTAG-3′). Abnormal cDNA fragments were sequenced using the Big Dye Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems, Courtaboeuf, France). Following the detection of inserted intronic sequence in the transcripts, primers were designed to amplify the related genomic region (intron 44, forward, 5′-TGTATTGTCTGCTTTCATAC-3′ and reverse 5′-GTGCCTGTATGTTAATTGTGA-3′; intron 56, forward, 5′-TGGCTAAGGGAAATGTTGCT-3′ and reverse, 5′-CAGAAGTTCCTGCAGAGAAA-3′) by using the PCR Master Mix (Promega). The PCR products were then directly sequenced. Nucleotide numbering for mutation reflects cDNA numbering with +1 corresponding to the A of the ATG translation initiation codon of GenBank NM_004006.2 (www.hgvs.org/mutnomen). Nucleotide numbering for X chromosome position is given according to the GenBank NC_000023.9 and the Human Genome reference sequence of NCBI build 36/hg18 Mar.2006 (http://genome.ucsc.edu/).

Array Comparative Genomic Hybridization (Array CGH)

We used a custom-designed 12X135K NimbleGen microarray format (Roche NimbleGen, Madison, WI). It includes 22,750 probes spanning the entire 2.2-Mb DMD gene sequences on chromosome X: 31,032,794–33,277,530 and numerous internal controls on autosomal, X and Y chromosome loci (>32,000 probes). Average probe length is 60 bases (range: 45–70 bases) with isothermal melting temperature (Tm) of 42°C across the array. The average spacing between starts of overlapping probes is 10 bp (inner spacing of 10 bases) across the 79 exons with their 100-bp intronic borders, and 7 promoters. The average spacing between adjacent probes is 100 bp (outer spacing of 100 bp) in the introns and in the 140 kb upstream genomic region of the ATG start codon and the 14.5 kb downstream genomic region of the TAG stop codon of the Dp427m isoform. The experiments were carried out according to the manufacturer's recommended protocol (Roche NimbleGen). Briefly, 1 µg of patient and reference DNA samples were labeled with green (Cy3) and red (Cy5) cyanine fluorescent dyes, respectively. The microarray slides were hybridized for 72 hr at 42°C, then washed, dried, and scanned using Innoscan 700A (INOPSYS, Toulouse, France). Array-CGH data were extracted and analyzed using the NimbleScan version 2.5 software and SignalMap version 1.9 software. For determining each breakpoint sequence, oligonucleotide primer pairs were designed with the help of the Primer3Plus on-line tool (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi) using both proximal and distal 0.7-kb flanking regions determined by the CGH-array analyses (list of primers available in Supp. Table S1). PCR was done in patient 1 and patient 2 using the Qiagen LongRange PCR kit (Qiagen, Courtaboeuf, France) or the PCR Master Mix (Promega), respectively. Amplified junction fragments were sequenced using the Big Dye terminator version 1.1 Cycle Sequencing Kit.

In Silico Analysis of DMD Sequences

BLAST program (http://blast.ncbi.nlm.nih.gov/) was used to search for the origin of the inserted sequence detected in the mature dystrophin transcripts. Splice-site score predictions for the PEs were performed using the Human Splicing Finder (HSF) Web interface (version 2.4; http://www.umd.be/HSF/), which includes position weight matrices to calculate consensus values (CV) and an algorithm for the calculation of the MaxEnt scores [Desmet et al., 2009]. To investigate the sequence characteristics in the vicinity of the breakpoints (±100 bp), we searched for extended homologies by means of the BLAST program and interspersed repeat-element content with the BLAT and the RepeatMasker tools in the UCSC genome browser program (http://genome.ucsc.edu).

Minigene Constructs, Transfections, and RT-PCR

In patient 2, we carried out functional assays to evaluate the splicing mechanism of the PE identified in intron 56. Briefly, the pSPL3 exon trapping vector was used following the procedure described previously [Le Guédard-Méreuze et al., 2010]. For the PE–WT (wild-type) and PE–MT (mutant) constructs, fragments corresponding to the PE and flanking regions were amplified from control and patient genomic DNA, respectively, and inserted into the XhoI and NheI restriction sites of the pSPL3 vector. The mutant constructs PE–D2 and PE–ISE were generated by PCR-based mutagenesis (QuikChange Site-Directed Mutagenesis Kit, Stratagene, La Jolla, CA) from the PE–WT and PE–MT, respectively, while the constructs PE–D1, PE–D1-1, PE–D1-2, PE–D1-3, PE–D1-A, PE–D1-B, PE–AmpR, and PE–D50 were created using the overlap extension method [Lee et al., 2010] (list of primers and sequences of deleted fragments available in Supp. Table S2). Three independent transfection assays of the minigenes in HeLa cells were performed. RNA extraction and reverse transcription (RT)-PCR reactions were carried out as reported before [Le Guédard-Méreuze et al., 2010]. The products were resolved on a 1.5% agarose gel and splicing patterns were confirmed by sequencing. The proportion of PE-inclusion transcripts was measured using the Quantity One (v. 4.6.5) software (Bio-Rad, Marnes-La-Coquette, France).
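The quantification step can be sketched as follows, assuming the proportion of PE inclusion is computed as the PE+ band intensity divided by the sum of the PE+ and PE− band intensities. This formula is an assumption, since the text does not state how the densitometry software output was converted to percentages. The function name `percent_inclusion` and the band intensities below are hypothetical illustration values, not data from this study.

```python
def percent_inclusion(pe_plus: float, pe_minus: float) -> float:
    """Percentage of PE-inclusion signal among both splicing products.

    pe_plus and pe_minus are band intensities in arbitrary densitometry units.
    """
    total = pe_plus + pe_minus
    if total <= 0:
        raise ValueError("at least one band must have a positive intensity")
    return 100.0 * pe_plus / total


# Hypothetical intensities (PE+, PE-) for three independent transfections.
# These numbers are invented for the example and are NOT measured values.
replicates = [(460.0, 540.0), (500.0, 500.0), (420.0, 580.0)]

values = [percent_inclusion(plus, minus) for plus, minus in replicates]
mean_inclusion = sum(values) / len(values)

print("per-replicate % PE inclusion:", values)
print("mean % PE inclusion:", mean_inclusion)
```

Averaging the per-replicate percentages, rather than pooling raw intensities, keeps each transfection equally weighted regardless of gel loading.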

Results

Dystrophin transcript analysis was performed in two MLPA-negative patients to search for a small lesion in the DMD gene. In both cases, the inclusion of an intronic sequence in the mature transcripts was identified as the cause of the disease in the patients.

PE Characterization (Fig. 1)

Patient 1

All RT-PCR products were identical to control samples except a cDNA fragment encompassing exons 43–46, for which an RT-PCR product of higher molecular weight was detected. Sequencing of the cDNA fragment disclosed a 387-bp sequence inserted between exon 44 and exon 45, which introduced a premature termination codon into the mature mRNA. The origin of the inserted sequence was undetermined at the time of analysis in 1996. This case was reevaluated recently. Alignment of the inserted sequence against genome sequences indicated that the sequence derived from the DMD intron 44 (c.6438_6439ins6439–106,288_6439–106,674) and was in inverse orientation (the nucleotide sequence of the insertion is available in Supp. Table S1). Sequencing of a genomic fragment encompassing the PE in the patient failed to detect any nucleotide change in the adjacent genomic regions. We thus decided to use high-density oligonucleotide array CGH targeted to the entire DMD gene to be able to explore the whole 248-kb intron 44. Array CGH analyses identified two noncontiguous deletions of ∼52 kb and ∼1 kb within intron 44 (Supp. Fig. S1). We designed a series of primers to amplify the junction fragments (Supp. Table S1). Because amplifications with primers facing inward failed to give any products and the inserted sequence in the transcripts was reversed, we hypothesized that the entire region between the two intronic deletions might be inverted. The use of two forward and two reverse primers coupled together yielded amplification products, and sequencing identified the intronic breakpoints at chrX: 31,969,241 and chrX: 32,079,215 in intron 44, the entire 57,133-bp region situated between the two deletions of 51,889 bp (del 1) and 951 bp (del 2) being inverted (Fig. 1A).
This inversion put good splicing signals (acceptor splice site: HSF CV = 79.5%, MaxEnt score = 6.60; donor splice site: HSF CV = 94.3%, MaxEnt score = 11.01) in a favorable configuration around the 387-bp sequence so that it could be recognized as an exon during pre-mRNA splicing. A 4-bp insertion (ACAT) was present at the 5′ breakpoint and 2-bp of microhomology was identified at the 3′ breakpoint (Table 1). Moreover, bioinformatic analysis showed the presence of interspersed repetitive elements at the two breakpoints (Table 1).

Figure 1. Schematic representation of the rearrangements in the patients (not to scale). A: Organization of intron 44 in wild-type (WT) and patient 1 (Patient 1). Patient 1, upper line: scheme of the double deletion (del 1 and del 2), indicated by scissors signs, detected by the array CGH. The distance between the different elements is given in base pairs (bp); exons are represented by gray boxes. The position/orientation of the primers is indicated by arrows: 1F, 2F, forward primers and 1R, 2R, reverse primers for the deletions 1 and 2, respectively. Patient 1, lower line: scheme of the rearrangement after breakpoint definition showing the inversion, indicated by dashed arrows, of the genomic region of intron 44 between the two deletions, and localization of the pseudoexon (black box). The chromosomal position and sequence of the 387-bp PE are detailed in Supp. Table S1. The insertion (ins) of the ACAT motif in junction 1 is indicated by a triangle symbol. The AG and GT dinucleotides denote activated acceptor and donor splice sites, respectively (the consensus value [CV] for the acceptor and donor splice sites is given in percentage as calculated by the HSF program. The corresponding MaxEnt scores are 6.60 [acceptor splice site] and 11.01 [donor splice site]). Repeated elements found across the deletions are indicated. B: Organization of intron 56 in patient 2 (Patient 2). Patient 2, upper line: scheme of the double deletion (del 1 and del 2) in patient 2. The size of the deletions and the distance from the exons (gray boxes) to the PE (black box) are given in base pairs (bp). The chromosomal position and sequence of the 166-bp PE are detailed in Supp. Table S1. The AG and GT dinucleotides denote activated acceptor and donor splice sites, respectively (the consensus value [CV] for the acceptor and donor splice sites is given in percentage as calculated by the HSF program. The corresponding MaxEnt scores are 8.19 [acceptor splice site] and 9.72 [donor splice site]). Patient 2, lower line: sequence context of the deletions (in lowercase in the boxes), and the sequence motifs around the junctions (in uppercase), showing microhomologies in bold characters. Topoisomerase I sites are indicated by lightning signs.

Table 1. Characteristics of Mutations Detected in the Patients

| | Patient 1 | Patient 2 |
|---|---|---|
| Phenotype | DMD | Mild BMD |
| Multiplex PCR | Negative | Negative |
| MLPA | Negative | Negative |
| RT-PCR | NM_004006.2: r.6438_6439ins6439-106,288_6439-106,674 | NM_004006.2: r.8390_8391ins8391-300_8391-135 |
| Array CGH | Intron 44 deletion 1: NC_000023.9:g.(32,026,950–32,027,046)_(32,078,958–32,079,054)del | Intron 56: NC_000023.9:g.(31,425,001–31,425,017)_(31,425,801–31,425,911)del |
| | Intron 44 deletion 2: NC_000023.9:g.(31,969,151–31,969,231)_(31,969,935–31,970,377)del | Intron 2: NC_000023.9:g.(32,897,108–32,897,190)_(32,898,216–32,898,708)del |
| Junction sequence | Junction 1: ins ACAT | Junction 1: microhomology ATTAGT |
| | Junction 2: microhomology AA | Junction 2: microhomology CTTT |
| | NC_000023.9:g.[31969242_31970192del951;32027326_32079214del51889insACAT;31970193_32027325inv57133] | NC_000023.9:g.[31425055_31425083del29;31425308_31425899del592] |
| Breakpoint findings | Junction 1: LCR: repeat AT-rich/Simple repeat: (CATA)n, (AT)n | Unique sequence on both sides of the deletions |
| | Junction 2: LINE: L1PA4/DNA: Repeat Tigger 1 | Topoisomerase I consensus cleavage site: CTT |
| Possible molecular mechanism | Nonhomologous end joining (NHEJ) | Nonhomologous end joining (NHEJ) |
| | Microhomology-mediated replication-dependent recombination (MMRDR) | Microhomology-mediated replication-dependent recombination (MMRDR) |
  • Abbreviations: PCR, polymerase chain reaction; MLPA, multiplex ligation-dependent probe amplification; RT-PCR, Reverse Transcription-PCR; array CGH, array Comparative Genomic Hybridization; DMD, Duchenne muscular dystrophy phenotype; BMD, Becker muscular dystrophy phenotype.
  • a The chromosomal positions and the nucleotide sequence of the pseudoexons are available in Supp. Table S1. NM_004006.2 and NC_000023.9: accession numbers for DMD coding reference sequence and chromosome X reference sequence at NCBI Build 36.1 assembly (http://www.ncbi.nlm.nih.gov), respectively.
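The interval sizes quoted for both rearrangements can be cross-checked against the coordinates in Table 1 with simple arithmetic. The sketch below assumes that all coordinates are 1-based and inclusive, and that intronic offsets such as c.8391-300 count nucleotides upstream of exon 57. The helper `span` and the variable names are hypothetical and serve only this illustration.

```python
def span(start: int, end: int) -> int:
    """Length of a closed interval given in 1-based inclusive coordinates."""
    return end - start + 1


# Patient 1, intron 44 (NC_000023.9 coordinates from Table 1).
p1_del2 = span(31969242, 31970192)   # smaller deletion
p1_inv = span(31970193, 32027325)    # inverted segment between the deletions
p1_del1 = span(32027326, 32079214)   # larger deletion
p1_total = span(31969242, 32079214)  # whole rearranged region

# Patient 2, intron 56 (NC_000023.9 coordinates from Table 1).
p2_del2 = span(31425055, 31425083)
p2_del1 = span(31425308, 31425899)
p2_retained = span(31425084, 31425307)  # segment kept between the deletions

# Patient 2, intronic offsets upstream of exon 57 (N in c.8391-N).
p2_pe = span(135, 300)            # pseudoexon, c.8391-300 to c.8391-135
pe_to_exon57 = span(1, 134)       # intron kept between PE and exon 57
del2_offsets = span(73, 101)      # del 2, c.8391-101 to c.8391-73
del1_offsets = span(326, 917)     # del 1, c.8391-917 to c.8391-326
del1_to_pe = span(301, 325)       # retained between del 1 and the PE
pe_to_del2 = span(102, 134)       # retained between the PE and del 2

print("Patient 1:", p1_del2, p1_inv, p1_del1, "total", p1_total)
print("Patient 2 deletions:", p2_del1, p2_del2, "PE", p2_pe)
print("Patient 2 retained:", del1_to_pe, "+", p2_pe, "+", pe_to_del2, "=", p2_retained)
```

Under these assumptions the genomic and transcript-based descriptions agree with each other. In patient 2, del 1 (upstream in the transcript) has the higher genomic coordinates, which is consistent with DMD being transcribed from the reverse strand of the X chromosome.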

Patient 2

The dystrophin transcript analysis revealed the presence of two products for the cDNA region spanning exons 56 to 58, one corresponding to the control and one of higher molecular weight (Fig. 2B). Sequencing of the upper band identified an out-of-frame insertion of 166 nucleotides between exons 56 and 57 (sequence available in Supp. Table S1), derived from intron 56 and located only a short distance (134 bp) upstream from exon 57 (r.8390_8391ins8391–300_8391–135) (Fig. 1B). The 166-bp sequence displayed strong predicted splicing signals at either end (acceptor splice site: HSF CV = 91.28%, MaxEnt score = 8.19; donor splice site: HSF CV = 88.47%, MaxEnt score = 9.72), making it unlikely that a mutation reinforcing these sites would explain the PE activation. The first attempts to amplify the genomic region including the PE and 200 bp of flanking intronic sequences failed; we therefore redesigned a forward primer located about 1 kb upstream of the PE and used a reverse primer located within exon 57, which was known to be present. A smaller PCR product than expected was obtained for the patient, and its sequencing revealed the presence of two distinct intronic deletions, one of 592 bp (del 1) and a second one of 29 bp (del 2), on each side of the PE [c.8391−73_101del;8391−326_917del] (Fig. 1B). No other mutation was detected at the 5′ and 3′ splice sites, making it likely that the identified complex genomic rearrangement could be responsible for the out-of-frame insertion of the intronic sequence in the mature transcripts. Sequence inspection revealed six base pair (ATTAGT) and four base pair (CTTT) microhomologies at the junctions of del 1 and del 2, respectively (Fig. 1B), the latter corresponding to Topoisomerase I recognition sites [(G/C)(A/T)T] [Been et al., 1984]. An array CGH analysis confirmed the presence of the rearrangement in intron 56 but as a single deletion (because of the array resolution) (Supp. Fig. S1).
It did not reveal any other changes within the DMD gene except a previously described frequent CNV in intron 2 [Bovolenta et al., 2008] (Table 1). The mother of patient 2 was found to harbor a genomic rearrangement identical to her son's indicating that the noncontiguous two-part deletion probably occurred as a single concerted event. We verified the absence of the two deletions in intron 56 in a panel of more than 298 ethnically matched control chromosomes (95% confidence to detect a variant with an allele frequency of 1%) to rule out that the patient-associated complex rearrangements could be explained by benign unreported “CNVs/Indels.”
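The sample-size statement for the control panel can be reproduced with a simple binomial argument, sketched below under the assumption that control chromosomes are sampled independently. The text does not state which formula the authors used, so this is one standard way to arrive at a figure close to 298. The function names are hypothetical.

```python
import math


def detection_probability(n_chromosomes: int, allele_frequency: float) -> float:
    """Probability of seeing a variant at least once among n chromosomes."""
    return 1.0 - (1.0 - allele_frequency) ** n_chromosomes


def chromosomes_needed(allele_frequency: float, confidence: float) -> int:
    """Smallest n for which the detection probability reaches the confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - allele_frequency))


# Values taken from the text: 1% allele frequency, 95% confidence, 298 chromosomes.
print("P(detect) with 298 chromosomes:", detection_probability(298, 0.01))
print("chromosomes needed for 95%:", chromosomes_needed(0.01, 0.95))
```

With 298 chromosomes the detection probability sits just under 0.95, which fits the wording "more than 298" control chromosomes.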

Figure 2. Role of the two adjacent intronic deletions in activation of the pseudoexon in DMD intron 56. A: Schematic representation of the heterologous three-exon, two-intron pSPL3 splicing reporter minigenes used in splicing assays and the subcloning of DMD intron 56 fragments isolated from wild-type (PE–WT) or mutant (PE–MT) alleles. The pSPL3 constructs contain an SV40 promoter, globin coding sequences (E1 and E2), HIV-1 tat splice donor (SD, MaxEnt score: 9.07) and acceptor (SA, MaxEnt score: 7.15) sequences compatible with splice sites from unrelated genes [Buckler et al., 1991], with the DMD pseudoexon (PE) as the middle exon (black boxes). B: Reverse-transcription (RT)-PCR analysis of muscular dystrophin transcripts in patient 2 (P2) showing the presence of a larger-sized product for the cDNA region spanning from exons 56 to 58 in addition to the normal-sized product obtained from the normal control (C). Sequencing of the normal-sized product confirmed sequence normality. The identity of RNA products is shown on the right. C, D: RT-PCR analysis using vector-specific primers of transcripts derived from the indicated reporter minigenes following their expression in HeLa cells. C: Note that only the minigene carrying the del 1 (PE–D1) induces pseudoexon insertion (PE insertion, PE+) at the same level as the PE–MT construct carrying the two deletions. The construct containing only the del 2 (PE–D2) gave rise to a normal splicing pattern (PE exclusion, PE−). D: The replacement of the 592-bp sequence corresponding to the del 1 by a heterologous sequence derived from the bacterial gene for ampicillin resistance (AmpR) was unable to repress pseudoexon inclusion (100% PE inclusion, PE+). (*) An extra band was detected with this construct corresponding to the use of alternative splice sites located in the inserted AmpR sequence. The identity of RNA products was established by sequencing of each band. Numbers at the bottom of gels indicate the proportion (%) of PE inclusion (PE+) compared to normal transcript (PE−). The percentages were determined using the Quantity One (v. 4.6.5) software.

A Sequence Upstream of the Pseudoexon in Patient 2 is Important for Its Regulation

The sequence of the PE and its flanking regions was PCR-amplified from genomic DNA of patient 2 (PE–MT) and a control (PE–WT) and cloned in a pSPL3 exon trapping minigene (Fig. 2A). We analyzed the splicing pattern of the minigenes following transient transfection into HeLa cells. The PE–WT transcripts showed full PE exclusion (Fig. 2C). In contrast, the PE–MT minigene produced two bands corresponding to the PE inclusion and to a splicing event between the vector exons, a profile similar to that observed in the patient (Fig. 2B). To determine the role of each deletion individually on PE activation, we constructed minigenes carrying either one (PE–D1) or the other (PE–D2) deletion (Fig. 2A). Interestingly, we found that only the PE–D1 construct containing the upstream 592-bp deletion (del 1) was able to promote PE inclusion, whereas a normal splicing pattern was obtained with the PE–D2 construct containing the downstream 29-bp deletion (del 2) (Fig. 2C). To confirm further the crucial role of the 592-bp sequence in PE repression in the wild-type context and PE activation when deleted, we inserted a heterologous 592-bp sequence from the bacterial gene AmpR [Kuga et al., 2000] in place of the del 1 (Fig. 2A). Transcript analysis of the PE–AmpR minigene after transfection in HeLa cells revealed complete PE inclusion (Fig. 2D), arguing for a role of the upstream 592-bp sequence in PE repression in the normal context. We wondered whether this repression role was attributable to specific regulatory elements such as Intronic Splicing Silencers (ISS), whose function would be to negatively regulate the PE splicing. To test this hypothesis, we sequentially deleted the 592-bp sequence and generated a series of five different constructs (PE–D1-1, PE–D1-2, PE–D1-3, PE–D1-A, and PE–D1-B) harboring a combination of deletions to narrow the region containing the putative regulatory element (Supp. Fig. S2).
Surprisingly, none of the truncated versions of the minigene allowed exonization of the PE. We then concluded that the serial deletions we made had not removed important silencer elements and that the deletion of the whole 592-bp sequence (del 1) was required for PE activation.

Role of Splicing Regulatory Elements in the PE Activation

We next sought to explore whether the activation of the PE may result from the creation of a new Intronic Splicing Enhancer (ISE) motif at the breakpoint junction of the del 1 that would explain why its inclusion was dependent on the presence of the complete 592-bp deletion. An in silico analysis with the HSF software predicted that the del 1 creates two overlapping potential binding sites for the SC35 (score: 82.85) and SRp40 (score: 93.41) SR proteins (Fig. 3A). To assess their role in PE activation, we abolished the two ISEs by introducing a single point mutation. The resulting construct (PE–ISE) was transfected in HeLa cells, but the transcript analysis did not reveal any change in the splicing pattern compared to the PE–MT construct (Fig. 3B). We also investigated whether the del 1 could bring the PE closer to an activating element. Indeed, the HSF analysis indicated the presence of numerous ISEs and potential branch points upstream of the del 1. We thus deleted a sequence of 50 bp upstream of the del 1 (PE–D50) to eliminate all potential regulatory elements, but no major effect upon PE inclusion (46 vs. 54%) was observed in the minigene assay (Fig. 3C). Taken together, these results argued against a role of splicing regulatory elements in the exonization of the PE in the mutated context.

Figure 3. Role of splicing cis-acting elements in pseudoexon recognition. A: Schematic representation of the pSPL3 minigene constructs used to investigate the role of cis-acting splicing elements in the pseudoexon (black box) activation using the PE–MT construct carrying the del 1 and del 2 deletions (scissors symbols). Two newly created splicing enhancer sequences corresponding to SC35 and SRp40 binding motifs as predicted by the HSF program (http://www.umd.be/HSF/, scores are given in brackets) were abrogated by site-directed mutagenesis (PE–ISE construct). The role of flanking cis-acting elements in PE activation, in particular branch point sequences, was assessed by deleting a 50-nt sequence upstream of the del 1 (PE–D50 construct). B, C: PE–ISE and PE–D50 minigenes were used to transiently transfect HeLa cells. After RNA isolation, the splicing products were analyzed by RT-PCR using minigene-specific primers. No significant change in the level of PE inclusion (PE+) was observed compared to the PE–MT construct (as defined in Fig. 2A). The PE exclusion rate (PE−) was 100% for the PE–WT construct. Numbers at the bottom of gels indicate the proportion (%) of PE inclusion (PE+) compared to normal transcript (PE−). The percentages were determined using the Quantity One (v. 4.6.5) software.

Discussion

Although the DMD gene was one of the earliest genes to be identified, finding all the mutations affecting the gene has been challenging owing to its large size and complexity. The report of two unparalleled cases of pure intronic double-deletion leading to missplicing events substantiates this observation.

Over recent years, the development and implementation of new DNA-based diagnostic methods has facilitated and improved the detection of the wide variety of different mutations (deletions, duplications, triplications, small lesions, insertions of repetitive sequences, genomic inversions, complex alleles, etc.) that occur in the DMD gene. Despite these major technological advances, it is apparent that DNA-based strategies cannot pick up all mutations, and RNA analyses are still required, especially to recognize PE mutations. Moreover, the mutation remains unidentified in a few ascertained cases of dystrophinopathy suggesting the existence of as yet unknown mutational mechanisms [Flanigan et al., 2009].

PEs are intronic sequences that are approximately the same length as exons (50–200 bp) with apparently viable donor, acceptor, and branch splice sites but which are not normally spliced into the mature mRNA transcript. There is evidence that inclusion of many of these sequences, which are usually very abundant in the introns of most genes, is actively inhibited due to the presence of intrinsic defects in their composition, the presence of silencer elements, or the formation of inhibiting RNA secondary structures [Dhir and Buratti, 2010]. A distinct class of PE sequences derives from the exonization of Alu elements, the most abundant transposed elements in the human genome, which have accumulated mutations during the course of evolution and became recognized by the splicing machinery as exons. Transposed elements play a major role in shaping mammalian genomes and are involved in numerous genetic diseases [Sela et al., 2007; Vorechovsky, 2010]. About 30 cases of pathological PE inclusion in the DMD gene are reported in locus-specific databases (http://www.umd.be/DMD/ and http://www.dmd.nl), mainly resulting from single nucleotide substitutions that function by strengthening preexisting splice sites or by creating new ones, as observed in other genes [Bovolenta et al., 2008; Dhir and Buratti, 2010; Gurvich et al., 2008; Takeshima et al., 2010]. A few additional cases have been reported that involve the rearrangement of genomic regions. Among them, two reside entirely within an intron and do not extend to adjacent exons. One is an 11-kb deletion in intron 11, identified in an X-linked dilated cardiomyopathy patient, that leads to the exonization of a novel fusion Alu exon [Ferlini et al., 1998]; the other is a small 18-bp deletion within intron 37 that induces the incorporation of a 77-bp PE between exons 37 and 38 of the DMD gene [Bovolenta et al., 2008].

To our knowledge, these two cases, together with the two unprecedented double-deletion mutations reported here, are the only four experimentally proven examples of pure intronic rearrangements in the DMD gene that lead to PE activation. However, the mutations reported here differ from the previously reported ones as they involve double deletions. These two double deletions differ greatly in size and genomic configuration: one occurred only a short distance upstream of an exon and involves small deletions, close to each other, which may escape detection by array CGH (patient 2, intron 56), while the other occurred deep in an intron and involves distant large deletions coupled with the inversion of the intervening 58-kb genomic sequence (patient 1, intron 44). Four cases of genomic inversions flanked by deletion/duplication in the DMD gene are described in the literature [Bovolenta et al., 2008; Cagliani et al., 2004; Madden et al., 2009; Oshima et al., 2009], and it is worth noting that all of these events are present in the major deletion hot-spot around exons 43 to 53. All of them involve coding regions, leading to the skipping of the exons included in the aberrations in all cases, and to the creation of novel exons in two cases [Cagliani et al., 2004; Madden et al., 2009].

The analysis of the deletion junction sequences in patient 1 and patient 2 revealed the presence of specific signatures frequently associated with complex genomic rearrangements. In particular, we noticed the presence of microhomologies (2–6 bp) in three of the four deletion junctions and one with a 4-bp inserted sequence (patient 1), suggesting that microhomology-mediated processes may have contributed to these rearrangements. Different categories of mutational mechanism have been reported to give rise to genomic rearrangements [Chen et al., 2010]. They include nonhomologous end joining (NHEJ), the most prominent DNA repair mechanism, which is divided into two pathways, classical and nonclassical, the latter originally termed microhomology-mediated end joining (MMEJ). The presence of terminal microhomologies (typically 1–4 bp) facilitates classical NHEJ but is not absolutely necessary. By contrast, the NHEJ junctions of two incompatible ends of the same double-strand break (DSB) are often characterized by small (typically 1–4 bp) deletions and/or insertions, as seen in patient 1. Recently, replication-based mechanisms have been proposed to account for the multiple breakpoints involved in complex rearrangements. Different models assuming serial replication slippage have been described, in particular fork stalling and template switching (FoSTeS), microhomology-mediated break-induced replication (MMBIR) [reviewed in Chen et al., 2010; Zhang et al., 2009], and the more recently proposed synthesis-dependent MMEJ model (SD-MMEJ) [Yu and McVey, 2010]. FoSTeS, MMBIR, and SD-MMEJ are consistent with the features of complexity (deletion/inversion) and microhomology at the junctions reported here; whereas MMBIR is break-induced (i.e., generated by a collapsed replication fork), FoSTeS is initiated by replication fork stalling (i.e., no DSB is required). 
Our observations are in line with a recently proposed model, which postulates that mitotic events, rather than meiotic events, play an important role in the formation of rare pathogenic CNVs [Vissers et al., 2009]. Nevertheless, it remains a technical challenge to determine which exact mechanism is responsible for selected disease-associated rearrangements. Moreover, other genomic architectural features that may have contributed to the deletion formation were present at the deletion breakpoints: repetitive elements (LINE, DNA repeats) in patient 1 and topoisomerase I cleavage sites in patient 2, both known to promote illegitimate recombination [Zhu and Schiestl, 1996].
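
As an illustration of how junction microhomology of the kind discussed above can be scored, the following minimal sketch finds the bases shared between the two breakpoints of a simple deletion, which is equivalent to measuring how far the breakpoint can be shifted without changing the resulting allele. This is a common convention and not necessarily the procedure used in this study; the function name and the toy sequences are hypothetical.

```python
def junction_microhomology(ref: str, start: int, end: int) -> str:
    """Return the microhomology at a deletion junction ('' if the junction is blunt).

    ref        reference sequence spanning both breakpoints
    start, end 0-based, end-exclusive coordinates of the deleted segment,
               i.e. ref[start:end] is absent from the mutant allele
    """
    ref = ref.upper()
    if not 0 <= start < end <= len(ref):
        raise ValueError("invalid deletion coordinates")

    # Bases at the start of the deleted segment that are repeated
    # immediately downstream of the deletion.
    right = 0
    while end + right < len(ref) and ref[start + right] == ref[end + right]:
        right += 1

    # Bases at the end of the deleted segment that are repeated
    # immediately upstream of the deletion.
    left = 0
    while start - left > 0 and ref[start - left - 1] == ref[end - left - 1]:
        left += 1

    return ref[start - left:start + right]


# Toy example (hypothetical sequence): the deleted segment starts with "AG",
# and the retained downstream sequence also starts with "AG".
toy_ref = "TTGCA" + "AGCCTTGGAC" + "AGTTC"
print(junction_microhomology(toy_ref, 5, 15))
```

Junctions carrying inserted bases, such as the 4-bp insertion in patient 1, would need separate handling, because the inserted sequence interrupts the alignment between the two breakpoints.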

Whereas the genomic inversion could reasonably be considered as the cause of the PE activation in patient 1, the molecular mechanisms underlying the recognition of the novel exon in patient 2 were unclear. Strikingly, this PE, located only 134 bp upstream of exon 57, is ignored by the splicing machinery in normal conditions despite having splice-site strength scores higher than the calculated average scores for DMD exons (CV/MaxEnt scores of 91.28/8.19 vs. mean 3′ ss scores of 86.26/8.13, and 88.47/9.72 vs. mean 5′ ss scores of 86.99/8.25). In an attempt to clarify the mechanisms involved, we used splicing reporter minigenes to investigate whether the two intronic deletions identified in patient 2 have removed or created splicing regulatory elements (SREs). Our findings indicated that only the deletion of the upstream 592-bp sequence (del 1) is decisive for PE inclusion and that the presence of this specific 592-bp sequence is required for PE repression in the wild-type context. Indeed, its replacement by a heterologous sequence (AmpR) induced complete PE inclusion. Nevertheless, deletion mutants did not allow the identification of candidate silencer motifs within the 592-bp sequence. We also ruled out the hypothesis that the deletion has created a new splicing enhancer element at the deletion junction or brought a favorable branch point sequence into proximity of the PE 3′ splice site, provided that the potential branch-point sequences have been correctly predicted. The DMD intron 56 PE is one of the few examples where PE inclusion occurs without changing the splice sites directly. We could not identify here how the upstream deletion (del 1) exerts its pathogenic effect on PE splicing. Based on our minigene experiments, SREs seem unlikely to play a role in the activation event, even though we have to consider the possibility that the minigene constructs used were not appropriate to reveal this putative splicing element. 
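
The splice-site comparison quoted above is simple arithmetic, and the short sketch below only restates it. The four score pairs are copied from the text; the dictionary layout and function name are illustrative assumptions, and no scoring algorithm is reimplemented here.

```python
# CV and MaxEnt scores as quoted in the text for the intron 56 PE
# and for the average DMD exon.
pe_scores = {
    "3' ss": {"CV": 91.28, "MaxEnt": 8.19},
    "5' ss": {"CV": 88.47, "MaxEnt": 9.72},
}
dmd_mean_scores = {
    "3' ss": {"CV": 86.26, "MaxEnt": 8.13},
    "5' ss": {"CV": 86.99, "MaxEnt": 8.25},
}


def score_margins(pe: dict, mean: dict) -> dict:
    """Return PE score minus mean DMD exon score for each site and method."""
    return {
        site: {
            method: round(pe[site][method] - mean[site][method], 2)
            for method in pe[site]
        }
        for site in pe
    }


margins = score_margins(pe_scores, dmd_mean_scores)
for site, by_method in margins.items():
    for method, delta in by_method.items():
        print(f"{site} {method}: {delta:+.2f}")
```

All four margins are positive, which is the sense in which the PE splice sites score above the DMD average while the PE is still skipped in the wild-type context.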
Despite extensive efforts made to elucidate the splicing code [Barash et al., 2010], the factors that drive splicing decisions and allow differentiation of exons from long flanking introns are far from being understood. Pre-mRNA secondary structure is increasingly recognized as a general modifier of splicing events, and in particular, would play a role in helping the splicing machinery to distinguish between real exons and PE sequences [Buratti et al., 2007]. Conserved stem–loop regions within introns can regulate donor-site usage and splicing efficiency as reported for ATM and CFTR PEs [Buratti et al., 2007] or for tau exon 10 alternative splicing [Donahue et al., 2006]. Stem–loop variants that destabilize this structure result in increased splicing of tau exon 10 and contribute to neurodegenerative disorders [Liu and Gong, 2008]. In the DMD gene, the skipping of the dystrophin Kobe exon 19, which has a 52-bp intraexon deletion near the 5′ splice donor site has been attributed to the loss of a hairpin structure in the truncated exon, which prevents the splicing machinery from recognizing the splice sites [Matsuo et al., 1992]. Evaluating the influence of RNA secondary structure on the processing of individual exons is considered a difficult task. In silico predictions may help in determining the impact of mutations on RNA structure, but this approach is more challenging in the presence of complex rearrangements such as the deletions identified in patient 2 than for single base substitutions. MFold predictions [Zuker, 2003] with flanking intronic sequences of different sizes around the PE gave rise to a large number of different models for the DMD intron 56 PE. We did not observe major differences in the accessibility of the PE donor and acceptor splice sites between the wild-type and mutant context, but rather the analyses showed differences in the structure of the PE itself (data not shown). However, no reliable working model of PE recognition could be proposed. 
Besides RNA secondary structure, other factors are likely to be involved in splicing regulation. Recent data suggest that chromatin structure, in terms of nucleosome positioning and specific histone modifications, which have a well-established role in transcription, may also have a role in splicing [Schwartz and Ast, 2010].

Our study reiterates the importance of combining DNA- and RNA-based approaches to detect the full range of mutations in a gene, and in particular, in the huge DMD gene whose mutational spectrum is of unparalleled complexity. We demonstrate here that pure intronic rearrangements could represent a new class of disease-causing mutations by inducing missplicing events. They are nondetectable by the PCR-based methods commonly used for molecular diagnosis, and can escape detection by array CGH analysis, depending on their size. Furthermore, complex alleles are known to occur within the DMD gene even though at a low frequency. Our data raise the possibility that some affected individuals may carry undetected cryptic intronic rearrangements that have an impact on splicing. This hypothesis could explain some exceptions to the reading-frame rule. Most importantly, the screening of such rearrangements (by RNA studies and/or array CGH) should influence the inclusion criteria in the design of exon-skipping clinical trials.

Acknowledgements

The authors thank Sylvie Chambert and Céline Saquet for their excellent technical contribution, and Dr. Sabrina Sacconi and Dr. Véronique Humbertclaude for providing clinical information. We are also grateful to Christiane Branlant for useful advice about RNA secondary structure, and Sue Malcolm for English proofreading of the manuscript.
