Volume 38, Issue 2 pp. 180-192
Research Article

Whole-Genome Sequencing of Cytogenetically Balanced Chromosome Translocations Identifies Potentially Pathological Gene Disruptions and Highlights the Importance of Microhomology in the Mechanism of Formation

Daniel Nilsson

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Department of Clinical Genetics, Karolinska University Hospital, Stockholm, 171 76 Sweden

Science for Life Laboratory, Karolinska Institutet Science Park, Solna, 171 21 Sweden

These authors made equal contribution.

Maria Pettersson

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

These authors made equal contribution.

Peter Gustavsson

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Department of Clinical Genetics, Karolinska University Hospital, Stockholm, 171 76 Sweden

Alisa Förster

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Wolfgang Hofmeister

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Josephine Wincent

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Vasilios Zachariadis

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Britt-Marie Anderlid

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Department of Clinical Genetics, Karolinska University Hospital, Stockholm, 171 76 Sweden

Ann Nordgren

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Department of Clinical Genetics, Karolinska University Hospital, Stockholm, 171 76 Sweden

Outi Mäkitie

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Department of Clinical Genetics, Karolinska University Hospital, Stockholm, 171 76 Sweden

Children's Hospital, Helsinki University Central Hospital and University of Helsinki, Helsinki, 00290 Finland

Folkhälsan Institute of Genetics, Helsinki, 00290 Finland

Valtteri Wirta

SciLifeLab, School of Biotechnology, KTH Royal Institute of Technology, Stockholm 171 71, Sweden

Max Käller

SciLifeLab, School of Biotechnology, KTH Royal Institute of Technology, Stockholm 171 71, Sweden

Francesco Vezzi

SciLifeLab, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, 171 21 Sweden

James R Lupski

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030

Texas Children's Hospital, Houston, Texas, 77030

Magnus Nordenskjöld

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Department of Clinical Genetics, Karolinska University Hospital, Stockholm, 171 76 Sweden

Elisabeth Syk Lundberg

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Department of Clinical Genetics, Karolinska University Hospital, Stockholm, 171 76 Sweden

Claudia M. B. Carvalho

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030

Corresponding Author

Anna Lindstrand

Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 171 76 Sweden

Center for Molecular Medicine, Karolinska Institutet, Stockholm, 171 76 Sweden

Department of Clinical Genetics, Karolinska University Hospital, Stockholm, 171 76 Sweden

Correspondence to: Anna Lindstrand, Department of Molecular Medicine and Surgery, Karolinska Institutet, Clinical Genetics Unit, Karolinska Institutet and Karolinska University Hospital Solna, Stockholm S-171 76, Sweden. E-mail: [email protected]
First published: 16 November 2016

Contract grant sponsors: Swedish Research Council [2012-1526, 2013-2603]; The Swedish Society for Medical Research; the Marianne and Marcus Wallenberg Foundation [2014.0084]; Stockholm County Council; the Harald and Greta Jeanssons Foundation; the Ulf Lundahl memory fund through the Swedish Brain Foundation; the Nilsson Ehle donations and the Erik Rönnberg Foundation; US National Institutes of Health [HG006542] to the Baylor Hopkins Center for Mendelian Genomics.

Communicated by Hildegard Kehrer-Sawatzki

ABSTRACT

Most balanced translocations are thought to result mechanistically from nonhomologous end joining or, in rare cases of recurrent events, from nonallelic homologous recombination. Here, we use low-coverage mate pair whole-genome sequencing to fine map rearrangement breakpoint junctions in both phenotypically normal and affected translocation carriers. In total, 46 junctions from 22 carriers of balanced translocations were characterized. Genes were disrupted in 48% of the breakpoints; recessive genes in four normal carriers and known dominant intellectual disability genes in three affected carriers. Finally, seven candidate disease genes were disrupted in five carriers with neurocognitive disabilities (SVOPL, SUSD1, TOX, NCALD, SLC4A10) and one XX-male carrier with Tourette syndrome (LYPD6, GPC5). Breakpoint junction analyses revealed microhomology and small templated insertions in a substantive fraction of the analyzed translocations (17.4%; n = 4), an observation that was substantiated by reanalysis of 37 previously published translocation junctions. Microhomology associated with templated insertions is a characteristic seen in the breakpoint junctions of rearrangements mediated by error-prone replication-based repair mechanisms. Our data suggest that a mechanism involving template switching might contribute to the formation of at least 15% of the interchromosomal translocation events.

Introduction

Using banded chromosomes, the overall incidence of balanced structural rearrangements without apparent cytogenetically observed gain or loss has been estimated to be 0.212% in an unselected newborn population [Jacobs et al., 1992]. Such balanced chromosome aberrations (BCAs) may be subdivided into reciprocal translocations, Robertsonian translocations, pericentric inversions, and paracentric inversions. Specifically, reciprocal translocations are observed in 0.09% of newborns, the majority of which are inherited from either parent. Approximately 20% of translocations occur de novo, with an estimated mutational rate of 2.7 × 10⁻⁴ per gamete per generation [Jacobs et al., 1992].

Only some 6% of de novo reciprocal translocations are thought to have an associated disease phenotype, recognized at birth or at a 1-year follow-up physical examination [Warburton, 1991]. However, due to errors in meiotic recombination and malsegregation of the rearranged chromosomes, translocation carriers have a risk of recurrent abortions and having children with inherited unbalanced rearrangements. Importantly, translocations are of high interest in identifying the causes of new genetic disease [Higgins et al., 2008; Hofmeister et al., 2015]; this requires sequencing of translocation breakpoint junctions in patient DNA samples to identify genes potentially mediating the observed clinical phenotype. The advent of massively parallel sequencing and affordable whole-genome sequencing (WGS) allows rapid and cost-efficient detailed investigation of translocation events, but this technology remains rarely used in clinical cytogenetics practice. An important potential clinical application is the identification of de novo translocations in prenatal samples. The disruption of known disease genes or potential dysregulation through a position effect may provide a molecular diagnosis, allowing for better and more accurate clinical management of the translocation carrier and genetic counseling information for the family [Talkowski et al., 2012a]. Similarly, in symptomatic carriers, the identification of the specific disease-causing gene may enable personalized therapeutic strategies.

Breakpoint junctional analyses of WGS data obtained from translocation carriers can provide insights into the potential molecular mechanisms of chromosome break and repair that cause the aberrations; observed “mutational signatures” may allow inference as to potential mechanisms for formation. By studying the break-join events of the DNA molecule at the chromosomal junctions, the potential mechanism(s) underlying the rearrangement can often be inferred. For example, the presence of large segments of DNA homology flanking translocation junctions (usually >200 bp) suggests a nonallelic homologous recombination (NAHR) mechanism [Stankiewicz and Lupski, 2002], which mediates some interchromosomal recurrent balanced translocations [Giglio et al., 2002; Hastings et al., 2009b; Ou et al., 2011; Lupski, 2015]. In contrast, most previous publications of balanced translocations have shown a lack of both large homology and microhomology in the breakpoint junctions. This was proposed to indicate that canonical nonhomologous end joining (c-NHEJ) is the major mechanism underlying the formation of balanced translocations [Chiang et al., 2012]. c-NHEJ is a repair mechanism that joins double-stranded DNA breaks with a high degree of precision but occasionally deletes or inserts a few random nucleotides during DNA processing before ligation [Pannunzio et al., 2014]. Replication-based mechanisms (RBMs), such as fork stalling and template switching/microhomology-mediated break-induced replication (MMBIR) [Lee et al., 2007; Hastings et al., 2009a], can underlie the formation of many disease-causing nonrecurrent structural variants in humans [reviewed in Conrad et al., 2010; Stankiewicz and Lupski, 2010; Abyzov et al., 2015; Carvalho and Lupski, 2016]. Occasional intrachromosomal and/or interchromosomal template switches during RBM repair can occur, due to the involvement of a lower processivity polymerase, and result in complex rearrangements.
The mutational signature that may be observed in the breakpoint junctions of RBM-mediated rearrangements involves the presence of microhomology, small templated insertions at breakpoint junctions, and inversion of large genomic segments accompanied by copy-number gains (e.g., duplications and triplications) [Carvalho et al., 2011, 2013]. RBMs have been suggested to contribute to the formation of BCAs during mitosis based on the observation of complex rearrangements associated with seemingly balanced translocations, for example, copy-number neutral inversions and copy-number loss [Chiang et al., 2012; Hsiao et al., 2014]. Other examples of constitutional translocations formed by RBM are recurrent translocations involving palindromic AT-rich repeats [Kato et al., 2012] and one chromosomal translocation, t(14;17)(q32;q11.2), disrupting NF1 (MIM# 613113), which seems to have been generated by a mechanism involving fork stalling and a rereplication process [Hsiao et al., 2014].

Here, we used low-coverage mate pair WGS followed by capillary-sequencing confirmation to pinpoint the location of 46 breakpoints from 22 balanced translocation carriers in order to (1) ascertain gene variations of potential clinical utility for genetic counseling and facilitate gene discovery, and (2) assess the mutational signatures of translocation junctions and infer potential underlying mechanisms for formation. The cohort includes both clinically unaffected (n = 8) and affected individuals (n = 14), thus contrasting the makeup of benign and pathological events. Finally, we validated our “mutational signature” findings by reanalyzing the junctional sequences from 37 previously published translocations.

Materials and Methods

Subjects

The samples included in this study were originally referred for chromosome analysis either at the Clinical Genetics department clinical laboratory at the Karolinska University Hospital, Stockholm, or in one case (8480THO) at the Helsinki University Hospital. In the 22 individuals studied, the chromosome analysis had identified a balanced chromosomal aberration that could also be detected by WGS. Individual 872-05Ö harbored two separate events [Hofmeister et al., 2015], making the total number of translocations studied here 23. Ten individuals were referred for chromosome analysis due to amniocentesis, multiple miscarriages, or the birth of a child with an unbalanced karyotype; eight of these individuals were unaffected and two had mild neurocognitive deficits. Twelve individuals were referred for chromosome analysis due to neurocognitive deficits and/or malformations (Table 1). A custom array comparative genomic hybridization analysis had been done in six individuals (872-05Ö, 109-06Ö, 232-07F, 2-03E, 337-01D, 29-03E) [Lindstrand et al., 2010] and fluorescent in situ hybridization breakpoint mapping had been performed in 12 individuals (31-05E, 862-06Ö, 106-06Ö, 58-06Ö, 157-06Ö, 191-06Ö, 263-06Ö, 175-06Ö, 872-05Ö, 887-05Ö, 109-06Ö, and 232-07F) (data not shown). The local ethical boards in Stockholm, Sweden, and in Helsinki, Finland, approved the study. Karyotypes and phenotypic status are provided in Table 1.

Table 1. Karyotypes and Mode of Ascertainment of Included Cases
Case | Karyotype | Ascertainment | Inheritance | Phenotype summary
31-05E | 46,XY,t(14;22)(q24;q12) | Recurrent miscarriages | n.i. | Healthy
862-06Ö | 46,XX,t(16;22)(p11;q13.1) | Parent of a 46,XX,der(22)t(16;22)(p11;q13.1) miscarriage | Paternal | Healthy
106-06Ö | 46,XX,t(4;7)(q25;q22) | Recurrent miscarriages | n.i. | Healthy
58-06Ö | 46,XY,t(3;12)(q23;q21) | Recurrent miscarriages | n.i. | Healthy
157-06Ö | 46,XX,t(4;7)(q21;p15) | Recurrent miscarriages | Maternal | Healthy
191-06Ö | 46,XY,t(2;3)(p13;p25) | Recurrent miscarriages | Mother not carrier; father not tested | Healthy
263-06Ö | 46,XX,t(10;11)(q22;p15) | Recurrent miscarriages | Father not carrier; mother not tested | Healthy
175-06Ö | 46,XX,t(10;15)(q23;q15) | Parent of a 46,XY,der(10)t(10;15)(q23;q15) miscarriage | Paternal | Healthy
872-05Ö | 46,XX,t(1;8)(p22;q24); t(5;18)(p15;q11) | Amniocentesis because of previously stillborn child | Maternal | Proband: reading difficulties; mother: reading difficulties
887-05Ö | 46,XX,t(5;7)(q14;q34) | Amniocentesis because of advanced maternal age | Maternal | Proband: ADHD and reading difficulties; mother: reading difficulties
109-06Ö | 46,XY,t(2;6)(q34;q21)dn | Affected phenotype | De novo | Autism, epilepsy
232-07F | 46,XY,t(3;7)(p33;p12) | Affected phenotype | Paternal | Proband: mild ID, autistic traits; father: speech delay
851-06Ö | 46,X,t(X;17)(p22.1;p13) | Affected phenotype | Maternal | Proband: epilepsy, psychiatric illness; mother: epilepsy, psychiatric illness
8480THO | 46,XY,t(2;9)(q37.3;q32)dn | Affected phenotype | De novo | Autistic features, epilepsy
841-95D | 46,XY,t(2;21)(p13;p11.2)dn | Affected phenotype | De novo | Autism, ADHD
155-90D | 46,XX,t(2;13)(q24;q33) | Affected phenotype | Mother not carrier; father not tested | XX male, tics, Tourette syndrome
2644-07D | 46,XY,t(4;8)(q21;q13)dn | Affected phenotype | De novo | DD, autism, ADHD
2587-07D | 46,XY,t(1;2)(q42;q31)dn | Affected phenotype | De novo | Vertebral anomaly
2-03E | 46,XX,t(6;8)(q23;q24)dn | Affected phenotype | De novo | DD, epilepsy
337-01D | 46,XX,t(3;12)(q26.1;p11.2)dn | Affected phenotype | De novo | ID, autism, epilepsy
782-95D | 46,XY,t(10;12)(q24;q13)dn | Affected phenotype | De novo | DD
29-03E | 46,XY,t(9;16)(p21;q21)dn | Affected phenotype | De novo | Obesity, ID
ADHD, attention deficit hyperactivity disorder; ID, intellectual disability; DD, developmental delay.

Finally, we included 37 previously published seemingly balanced translocations in which breakpoint junctions had already been characterized and junction fragment sequences were available for reanalysis (Supp. Table S1). For phenotypic data of these individuals, we refer to the original publications [Chen et al., 2008; Higgins et al., 2008; Chiang et al., 2012; Talkowski et al., 2012a; Talkowski et al., 2012b; Schluth-Bolard et al., 2013; Hsiao et al., 2014].

Mate Pair WGS

To pinpoint the exact genomic positions of the chromosome breaks, we used low-coverage mate pair [Collins and Weissman, 1984] WGS. Libraries were prepared using Illumina's Nextera Mate Pair Sample Preparation Kit according to the manufacturer's instructions (Illumina Document #15035209 Rev. D, May 2013). The workflow uses 1 μg of high-quality DNA, with quality estimated from gel imaging and concentration measured using a Qubit HS fluorometer (ThermoFisher Scientific, Pittsburgh, PA). Gel purification was omitted to allow a wider spread of insert sizes and a less laborious laboratory protocol. In brief, the method uses simultaneous fragmentation to approximately 2 kb and ligation of the circularization adapter using 4 μl of the Nextera enzyme provided in the kit. After strand displacement to fill in a remaining gap and reaction clean-up, the insert sizes are controlled using a Bioanalyzer with the DNA 12000 kit (Agilent Technologies, Santa Clara, CA) and quantified using Qubit. Then, 600 ng of product is circularized by ligation and noncircularized material is degraded enzymatically. Next, the circles are fragmented to 300–1,000 bp using a Covaris S2 with T6 glass tubes (Covaris, Woburn, MA). The adapter-containing fragments are purified with magnetic beads using a biotin moiety on the adapter. The remaining DNA is then subjected to the Illumina library preparation procedure including end repair, A-tailing, barcoded adapter ligation, PCR, and finally magnetic bead clean-up using carboxylic acid-coated beads (Dynabeads MyOne CA; ThermoFisher Scientific) automated on a Bravo Workstation setup B (Agilent Technologies) [Borgstrom et al., 2011]. The final libraries were quality controlled using Qubit and Bioanalyzer, diluted to 10 nM and sequenced as two samples per lane on an Illumina 2500 sequencer (2×100 bp). A technical summary of the sequencing raw data is provided in Supp. Table S2 (size distribution mode 2 kb, coverage 4x).

The raw sequence reads were base called using CASAVA RTA 1.18 (http://support.illumina.com/sequencing/sequencing_software/casava.htm). Following Illumina guidelines for Mate Pair post processing, adapter sequences were removed using Trimmomatic v0.32 [Bolger et al., 2014]. Remaining pairs were aligned to the hg19 human reference genome sequence using the Burrows-Wheeler Alignment tool (BWA; MEM-algorithm, version 0.7.4-r385) [Li and Durbin, 2009], resulting in 4x mapping coverage. Discordant read mapping was processed using FindTranslocations (https://github.com/vezzi/TIDDIT/releases/tag/v0.9), publicly available open-source software developed in-house that implements a sliding window analogue of a previously published procedure [Talkowski et al., 2012a]. Briefly, chromosomes are divided into overlapping windows, which are scored for discordant read pairs, and the discordant reads are investigated for connection to a common receiving cluster of reads elsewhere. The algorithm proceeds linearly by considering only links to later, uninvestigated positions. Sufficient read mapping quality and either a deviating mapped insert size or mapping to different chromosomes are inclusion criteria. If the number of reciprocal read pairs is above a threshold, and the read coverage in the two cluster windows is not excessive, an event is called, and quality information such as the fraction of reads in each mapping orientation is stored. The program has been used previously for the detection of structural variants from WGS data, both balanced [Hofmeister et al., 2015] and unbalanced [Lieden et al., 2014], as well as leukemic aberrations [Nord et al., 2014]. A window size of 10 kb, a step of 1 kb, and a minimum of eight supporting read pairs were used. Calls were then annotated with frequency of occurrence in a local database containing calls from 62 samples analyzed with the same WGS protocol.
Split read analysis was not implemented in FindTranslocations version 0.9, and was carried out using BLAT [Kent, 2002] on reads showing soft clipping or supplementary alignments in the area pinpointed by read-pair analysis. CNVnator [Abyzov et al., 2011] was used to call CNVs. Custom scripts were used to visualize variations with Circos [Krzywinski et al., 2009].
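
The window-based read-pair scoring described above can be illustrated with a minimal Python sketch. This is not the FindTranslocations source code: the function name `call_links`, the tuple layout of the read pairs, the restriction to interchromosomal pairs, and the binning of mates into fixed windows on the receiving chromosome are all assumptions made for illustration. Only the 10 kb window, 1 kb step, and eight-pair threshold are taken from the text. Each pair is assumed to be listed once, with the earlier position first, which loosely mirrors the "links to later positions" rule.

```python
from bisect import bisect_left
from collections import Counter, defaultdict


def call_links(pairs, window=10_000, step=1_000, min_pairs=8):
    """Call candidate interchromosomal links from discordant read pairs.

    pairs: iterable of (chrom_a, pos_a, chrom_b, pos_b) tuples (hypothetical
    layout). Mapping-quality and coverage filters are omitted in this sketch.
    """
    by_chrom = defaultdict(list)
    for chrom_a, pos_a, chrom_b, pos_b in pairs:
        if chrom_a != chrom_b:  # keep interchromosomal pairs only
            by_chrom[chrom_a].append((pos_a, chrom_b, pos_b))

    best = {}  # (chrom_a, chrom_b, target_bin) -> (support, window_start)
    for chrom_a, items in by_chrom.items():
        items.sort()
        positions = [pos for pos, _, _ in items]
        first = positions[0] // step * step
        # Slide overlapping windows along the source chromosome.
        for start in range(first, positions[-1] + 1, step):
            lo = bisect_left(positions, start)
            hi = bisect_left(positions, start + window)
            # Count mates landing in the same receiving bin elsewhere.
            targets = Counter((cb, pb // window) for _, cb, pb in items[lo:hi])
            for (chrom_b, tbin), support in targets.items():
                if support >= min_pairs:
                    key = (chrom_a, chrom_b, tbin)
                    # Overlapping windows see the same cluster; keep the best.
                    if key not in best or support > best[key][0]:
                        best[key] = (support, start)

    return [
        {"chrom_a": ca, "window_start": start, "chrom_b": cb,
         "target_start": tbin * window, "support": support}
        for (ca, cb, tbin), (support, start) in sorted(best.items())
    ]


# Hypothetical input: nine pairs linking chr16 to chr22, plus three noise pairs.
example_pairs = [("chr16", 23_411_000 + 30 * i, "chr22", 44_818_000 + 20 * i)
                 for i in range(9)]
example_pairs += [("chr1", 5_000 + i, "chr5", 90_000 + i) for i in range(3)]
calls = call_links(example_pairs)
```

Keeping only the best-supported window per receiving bin is a design shortcut to avoid reporting the same cluster once per overlapping window; a production caller would also apply the mapping-quality, coverage, and orientation criteria mentioned in the text.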

Breakpoint Junction PCR

Primers flanking the junctions were designed approximately 1–2 kb away from the estimated breakpoint area. For Sanger sequencing, new primers were designed approximately 300–500 bp away from the estimated break. In some cases, the genomic environment required primers to be designed further away from or closer to the break. Primer sequences are shown in Supp. Table S3. Breakpoint PCR was performed by standard methods using Phusion High-Fidelity DNA Polymerase (ThermoFisher Scientific), and the products were subjected to electrophoresis on a 1.5% agarose gel. To ensure specificity, a control sample of pooled genomic DNA from Promega (Madison, WI) was run together with the patient samples. Specific products of expected size, not present in control samples, were Sanger sequenced. Sequences were aligned using BLAT (UCSC Genome Browser) [Kent, 2002] and visualized in CodonCode Aligner (CodonCode Corp., Dedham, MA).

Nomenclature

Chromosome aberrations have previously been described according to ISCN, and sequencing results according to the HGVS nomenclature guidelines. The current introduction of large-scale sequencing into the determination of chromosomal aberrations requires an updated nomenclature. Previous suggestions include a BLAST-result centric description [Ordulu et al., 2014]. Currently, the nomenclature suggestion under discussion is SVD-WG004 (http://www.hgvs.org/mutnomen/comments004.html), a hybrid between ISCN and HGVS nomenclatures. For molecular karyotypes, we use a previous version of this scheme as suggested by Peter Taschner (http://www.hgvs.org/mutnomen/SVtrans_HGVS2013_PT.pdf).

Results

Low-Coverage WGS Detects Balanced Translocations at Nucleotide-Level Resolution

To ascertain clinically relevant gene disruptions, detect small genomic imbalances below the level of resolution of the clinical cytogenetics techniques used, and glean inferences from breakpoint junctional events to elucidate the potential underlying mechanisms of rearrangement formation, we investigated known translocation carriers with low-coverage mate pair WGS. In total, 46 junctions from 22 balanced translocation carriers were analyzed, including both individuals who were deemed phenotypically normal (n = 8) and those with a clinical pathological phenotype (n = 14) (Table 1).

All the cytogenetically defined balanced translocations presented here were detected by WGS (Table 2). Using the in-house software FindTranslocations, we detected an average of 130 structural variants per sample. These may include common polymorphic variations, personal structural differences between the patient genome and the reference genome assembly, and potential artifacts of the experimental or computational methods. Filtering these variants to remove events recurring in the local database resulted in an average of 3.5 unique rare variant events per individual. In one-third of the study subjects, only the previously karyotyped novel junctions were identified. The remaining samples showed one or more putative additional unique structural variant event calls. Generally, the additional unique event calls were of lower quality, but provided signal that passed the computational filters. The vast majority are additional rare variant interchromosomal events, leaving on average 0.5 intrachromosomal events (for example, large inversions, deletions, or duplications) per genome analyzed. Since the individuals included in this study had previously undergone rigorous karyotype analysis, the additional unique events observed in the WGS data were considered potential artifacts of the experimental methods and were not considered further.
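
The database filtering step can be sketched as follows. This is an illustrative outline under stated assumptions, not the pipeline used in the study: the event tuple layout, the 10 kb matching tolerance, and the function names are hypothetical, and both breakpoints of an event are assumed to be stored in a consistent chromosome order.

```python
def is_same_event(a, b, tol=10_000):
    """Treat two events as the same if both ends fall within `tol` bp.

    Events are (chrom_a, pos_a, chrom_b, pos_b) tuples (hypothetical layout);
    the tolerance is an assumed value, not one reported in the text.
    """
    return (a[0] == b[0] and a[2] == b[2]
            and abs(a[1] - b[1]) <= tol and abs(a[3] - b[3]) <= tol)


def annotate_frequency(calls, database, tol=10_000):
    """Return (call, fraction of database samples carrying a matching event)."""
    annotated = []
    for call in calls:
        carriers = sum(
            any(is_same_event(call, other, tol) for other in events)
            for events in database.values()
        )
        freq = carriers / len(database) if database else 0.0
        annotated.append((call, freq))
    return annotated


def keep_unique(calls, database, tol=10_000):
    """Keep only calls that match no event in any database sample."""
    return [call for call, freq in annotate_frequency(calls, database, tol)
            if freq == 0.0]


# Hypothetical database of two samples and two calls from a new sample.
database = {
    "sample_1": [("chr1", 1_000, "chr5", 5_000)],
    "sample_2": [("chr1", 1_500, "chr5", 5_200), ("chr2", 10, "chr3", 20)],
}
calls = [("chr1", 1_200, "chr5", 5_100), ("chr16", 23_411_495, "chr22", 44_818_191)]
unique_calls = keep_unique(calls, database)
```

Here the first call matches events in both database samples and is dropped as recurrent, while the second call has no match and is retained.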

Table 2. Molecular Karyotypes and Gene Disruptions in Our Cohort
Case Molecular karyotype (hg19) Disrupted gene (Breakpoint A) Disrupted gene (Breakpoint B)
31-05E t(14;22)(q24;q12)(chr14:g.77200096_?::chr22:g.?_33670453;chr22:g.33670154_?::chr14:g.?_77202063)   LARGE1 (AR)
862-06Ö t(16;22)(p11;q13.1)(ochr22:g.?_44818191::chr16:g.?_23411495;chr22:g.44818097_?::ochr16:g.23411178_?) COG7 (AR)  
106-06Ö t(4;7)(q25;q31.1)(chr4:g.109487500::chr7:g.108207030; chr7:g.108207034::GTTG::chr4:109487495) THAP5
58-06Ö t(3;12)(q21.2;q22.4)(chr3:g.140432520_?::chr12:g.?_78361967; chr12:g.78361576_?::chr3:g.?_140433048)   NAV3
157-06Ö t(4;7)(q13.2;p14.3)(chr4:g.69061163::ochr7:g.30238755; chr4:g.69061167::chr7:g.30238757) TMPRSS11BNL FTLP10  
191-06Ö t(2;3)(p13;p25.1)(chr2:g.73661733::chr3:g.15920865; chr3:g.15920804_?::chr2:?_73661842) ALMS1 (AR)  
263-06Ö t(10;11)(chr10:g.81408289::T::ochr11:g.7293897;ochr10:g.81408292::chr11:g.7293897)   SYT9
175-06Ö t(10;15)(q23.1;q13.1)(chr10:g.85563120::T::chr15:g.28330698; chr15:g.28330697::chr10::g.85563132)   OCA2 (AR)
872-05Ö t(1;8)(p31.3;q24.23)(ochr8:g.137813394::chr1:g.78931227;chr8:g.137813393::ochr1:g.78931219) CTNND2 (AD) ZSCAN30
t(5;18)(p15.2;q12.2)(ochr18:g.32849314::chr5:g.11291110; chr18:g.32849312::ochr5:g.11291110)
887-05Ö t(5;7)(q14.1;q34)(chr5:g.78475862::chr7:g.138295848; chr7:g.138295847::chr5:g.78475863)   SVOPL
109-06Ö t(2;6)(q34;q21)(chr2:g.212140397::chr6:g.?_107800308; chr6:g.107800226::chr2:g.212140479)    
232-07F t(3;7)(p22.1;p11.2)(chr3:g.35670981::AAAAG::chr7:g.54888748; chr7:g.54888752::chr3:g.35670986)    
851-06Ö t(X;17)(p22.1;p13)(chr17:g.11479837::chrX:g.30607945; chrX:g.30607932::TATACCTTTATA::chr17:g.11479842)    
8480THO t(2;9)(chr2:g.241890512::chr9:g.114876902; chr9:g.114876465_?::chr2:g.?_241890544)   SUSD1
841-95D t(2;21)(p13.4; q21.1)(chr21:g.?_17525142::chr2:g.?_72983886; chr21:g.17524360_?::chr2:72983776_?) EXOC6B (AD) LINC00478
155-90D t(2;13)(chr2:g.150242475::chr13:g.93150076; chr13:g.93150075::chr2:g.150242493) LYPD6 GPC5
2644-07D t(4;8)(q21.23;q12.1)(chr4:g.85939427::chr8:g.59822562; chr8:g.59822520::chr4:g.85939429)   TOX
2587-07D t(1;2)(q42;q31)(chr1:g.210823159::chr2:g.162498672; chr2:g.162498662::chr1:g.210823170) HHAT SLC4A10
2-03E t(6;8)(q23;q24)(chr6:g.93938325::chr8:g.103098296; chr8:g.103098297::chr6:g.93938333)
del(1p31)(chr1:g.69814692_81274032del) NCALD
337-01D t(3;12)(q26.1;p11.2)(chr3:g.175570204::chr12:g.13863014; chr3:g.?_175576921::chr12:g.?_13863426)   GRIN2B (AD)
782-95D t(10;12)(q24;q13)(chr10:g.113094447_?::chr12:g.?_49092809; chr12:g.49091842_?::chr10:g.?_113094574)   CCNT1
29-03E t(9;16)(p21;q21)(chr16:g.63446103::TTGGC::chr9:g.26190816; chr16:g.63445881::CATC::chr9:g.26190776)    
  • AR, autosomal recessive; AD, autosomal dominant; bold text indicates split-read resolution; underlined text indicates known disease-causing gene.

Of the 46 analyzed junctions, 32 (70%) breakpoints were delineated by WGS split-read analysis. For the remaining 14 junctions, discordant read pairs mapped the breakpoints to within 2 kb. The detailed findings from chromosomal aberration to base pair resolution are shown in Figure 1 for one individual (862-06Ö). The exact findings from the mate pair sequencing of all the individuals are presented as molecular karyotypes in Table 2.
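
As an illustration of how soft-clipped reads hint at a junction position, the sketch below derives candidate breakpoint coordinates from an alignment start position and a CIGAR string. It is not the BLAT-based procedure used in this study; the function name and the minimum clip length of 10 bases are assumptions, and the coordinates in the example are hypothetical. Positions follow the SAM convention of a 1-based leftmost mapped base.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def soft_clip_breakpoints(pos, cigar, min_clip=10):
    """Return candidate breakpoints implied by soft clipping.

    pos: 1-based leftmost mapped reference position of the read.
    Returns a list of (side, reference_position) tuples, where the position is
    the last aligned base next to the clipped part of the read.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    core = [(n, op) for n, op in ops if op != "H"]  # hard clips carry no bases
    ref_len = sum(n for n, op in core if op in REF_CONSUMING)

    candidates = []
    if core and core[0][1] == "S" and core[0][0] >= min_clip:
        candidates.append(("left", pos))
    if len(core) > 1 and core[-1][1] == "S" and core[-1][0] >= min_clip:
        candidates.append(("right", pos + ref_len - 1))
    return candidates


right_clipped = soft_clip_breakpoints(1000, "60M40S")
left_clipped = soft_clip_breakpoints(1000, "40S60M")
```

Several reads clipped at the same reference position, with clipped sequence that aligns to the partner chromosome, would then support a junction at single-nucleotide resolution.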

Figure 1. Molecular cytogenetic and genomic findings in subject 862-06Ö. A: G-banded chromosomes showing the balanced translocation between chromosome 16 and chromosome 22 present in individual 862-06Ö (the aberrant chromosome is shown on the left). B: FISH analysis with BAC clone RP11-350D02 (red) localized at chr16p11 (Ensembl, GRCh37). The signal is split between the two derivative chromosomes (der16 and der22). C: Circos plot illustrating the WGS results in individual 862-06Ö. Fusion events between chromosomes 16 and 22, as predicted by FindTranslocations from read pair mapping data, are illustrated as gray lines. On chromosome 16, COG7 is disrupted by the breakpoint. The chromosomes are karyogram painted, chr22 in gray and chr16 in blue, with the centromeres shown in shaded dark red. Copy-number changes according to CNVnator are shown in the central ring, with a light red bar corresponding to low coverage and light green to high. As can be seen, short-read sequence mapping does not cover the centromeric region or the heterochromatic 22p-arm. All copy-number changes were evaluated as benign normal variation for this patient. D: Sanger sequencing traces showing the chromosomal junctions at the nucleotide level with der16 on top and der22 on the bottom. A two-nucleotide microhomology (TT) that may have originated from either parental chromosome is present in the der16 breakpoint junction (black box) and a clean break is present on der22 (black vertical line).

Gene Disruptions Are Present in Both Phenotypically Normal and Clinically Affected Individuals

Genes were disrupted at the breakpoints to a similar extent in unaffected and affected individuals: 44% (7/16) and 47% (14/30) of breakpoints, respectively (P = 1.0, Fisher's exact test). However, differences in the inheritance pattern of the disrupted genes in the two cohorts were observed: in the unaffected cohort, 50% of disrupted gene loci were known disease genes, all of which are associated with disease traits showing a recessive inheritance pattern (i.e., LARGE1 [MIM# 603590], COG7 [MIM# 606978], ALMS1 [MIM# 606844], OCA2 [MIM# 611409]). In contrast, only three of the genes disrupted in the affected cohort were known disease-causing genes, all linked to dominant neurodevelopmental disorders concordant with the phenotype observed (i.e., CTNND2 [MIM# 604275], EXOC6B [MIM# 607880], GRIN2B [MIM# 138252]) (Table 2). A systematic evaluation of all genes disrupted or in the vicinity (<250 kb) of the chromosomal breakpoints is provided in Supp. Table S4.
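
The reported comparison can be reproduced from the counts quoted above (7 of 16 and 14 of 30 breakpoints with a disrupted gene). The following sketch implements a two-sided Fisher's exact test in plain Python; the function name is ours, and the two-sided rule used here (summing all tables that are no more probable than the observed one) is a common convention that we assume matches the test applied in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Rows: the two cohorts; columns: breakpoints with / without a disrupted gene.
p_value = fisher_exact_two_sided(7, 16 - 7, 14, 30 - 14)
```

With these counts the observed table is the most probable one given the margins, so `p_value` equals 1.0 within floating-point tolerance, in line with the P = 1.0 quoted in the text.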

A Clinically Significant Copy-Number Variant is Present in One Affected Individual

To identify possible gene dose abnormalities that were not detected by the cytogenetic analysis, we used the CNVnator software [Abyzov et al., 2011] to analyze the mate pair whole-genome data for deletions and duplications ≥2 kb. One clinically significant copy-number variant (CNV) was detected in individual 2-03E: an 11.4-Mb heterozygous deletion in chromosome 1p31 affecting 37 protein coding genes, previously described (de novo; patient 5 in Lindstrand et al., 2010). The deletion was also apparent from discordant read-pair analysis, and had split-read information delineating it to single-nucleotide resolution.

Mutational Signatures Underlying Mechanisms of Rearrangement Formation

To confirm the WGS results, all junctions were defined at the nucleotide level by breakpoint PCR and Sanger sequencing (Supp. Figs. S1 and S2). This enabled delineation of mutational signatures at the translocation junctions; this information could then be used to infer the potential underlying mechanisms for rearrangement formation.

None of the reported junctions were mediated by low-copy repeats or repetitive elements such as LINEs or SINEs located in distinct chromosome translocation substrates, nor was there any evidence for palindromic AT-rich repeats at the breakpoints. Two individuals had one out of two breakpoints mapped to repetitive elements, L1M4 in 191-06Ö/chr2 and L1Mb4 in 263-06Ö/chr10. In one case, 887-05Ö, the translocation involved two Alu elements (Fig. 2), but as the derivative breakpoint junctions are characterized by very short homologies (3 nt of microhomology each), no fusion Alu was generated. Interestingly, in the same individual (887-05Ö), an intrachromosomal 1,579-bp deletion occurred 121 nucleotides upstream of the translocation break site on the derivative chromosome 5. The deletion involves two Alu elements that generate a fusion Alu (AluSx3-AluY) directly upstream of the translocation junction [Boone et al., 2011; Boone et al., 2014; Gu et al., 2015; Mayle et al., 2015]. Remarkably, the same AluY involved in the fusion Alu created by the deletion is also involved in the translocation junction, and connects to an AluSz6 on chromosome 7, but no Alu fusion is generated by the translocation event as discussed above (Fig. 2). This observation, an AluAlu rearrangement 121 nt upstream of the translocation, is consistent with the downstream translocation as an intrachromosomal template switching event (Table 3; Fig. 2). Interchromosomal template switches between nonhomologous chromosomes have been demonstrated previously in both yeast [Smith et al., 2007] and humans [Carvalho et al., 2015], although in those cases they were not mediated by Alu repeats.

Evidence for template switching during translocation formation in individual 887-05Ö. A: A schematic overview of chr5q14.1 and 7q34 region and the structural events in individual 887-05Ö. The derivative 5 (der5) in blue and derivative 7 (der7) in green are aligned to chromosome 5 (top) and chromosome 7 (bottom). Alu elements are shown as gray boxes. The 1,579-nt upstream deletion on der5 is shown as a dashed blue line. Both deletion breakpoints as well as the translocation breakpoints are located in Alus. B: Sequence alignment of der5 to the corresponding regions on chromosome 5 and chromosome 7. The derivative chromosome sequences as well as the corresponding parental chromosome sequence are labeled in blue. The deletion is shown in lower case bold letters. Microhomology is highlighted in purple with the most plausible parental chromosome indicated by bold text. A 22-nt microhomology is present in the first slippage event (TS 1) between the proximal and distal end of the der5 upstream deletion. In the second event, chromosome 5 to chromosome 7 (TS 2), a 3-bp microhomology is present. C: Der7 is illustrated in green, otherwise as in (B). A three-nucleotide microhomology is present in the junction.
Table 3. Breakpoint Junction Characteristics
Case | Microhomology | Genomic deletion | Genomic duplication | Insertion and SNVs | Additional features
--- | --- | --- | --- | --- | ---
31-05E | der14: TT | chr14: 4 nt | chr14: 0 nt | der14: 0 |
 | der22: 0 | chr22: 1 nt | chr22: 0 nt | der22: RIns GACG |
862-06Ö | der16: TTA | chr16: 4 nt | chr16: 0 nt | der16: TIns TTATAC |
 | der22: TT | chr22: 5 nt | chr22: 0 nt | der22: 0 |
106-06Ö | der4: T | chr4: 0 nt | chr4: 6 nt | der4: Ins or SNV T |
 | der7: 0 | chr7: 0 nt | chr7: 5 nt | der7: RIns GTT |
58-06Ö | der3: TG | chr3: 5 nt | chr3: 0 nt | der3: 0 |
 | der12: AGT | chr12: 0 nt | chr12: 0 nt | der12: 0 |
157-06Ö | der4: TC | chr4: 4 nt | chr4: 0 nt | der4: 0 |
 | der7: AGT | chr7: 0 nt | chr7: 0 nt | der7: 0 |
191-06Ö | der2: GT | chr2: 0 nt | chr2: 3 nt | der2: 0 |
 | der3: 0 | chr3: 0 nt | chr3: 0 nt | der3: SNV C |
263-06Ö | der10: 0 | chr10: 2 nt | chr10: 0 nt | der10: Ins or SNV T |
 | der11: 0 | chr11: 0 nt | chr11: 0 nt | der11: 0 |
175-06Ö | der10: GCTGT | chr10: 11 nt | chr10: 0 nt | der10: Ins or SNV T |
 | der15: GGCTGT | chr15: 0 nt | chr15: 0 nt | der15: 0 |
872-05Ö | der1: 0 | chr1: 7 nt | chr1: 0 nt | der1: 0 |
 | der8: G | chr8: 0 nt | chr8: 0 nt | der8: 0 |
872-05Ö | der5: G | chr5: 0 nt | chr5: 0 nt | der5: 0 |
 | der18: G | chr18: 0 nt | chr18: 0 nt | der18: SNV T |
887-05Ö | der5: GGC | chr5: 0 nt | chr5: 0 nt | der5: 0 | chr5 upstream deletion (AluSx3-AluY)
 | der7: GGC | chr7: 0 nt | chr7: 0 nt | der7: 0 |
109-06Ö | der2: CTA | chr2: 10 nt | chr2: 0 nt | der2: 0 | chr2: palCTAG
 | der6: 0 | chr6: 3 nt | chr6: 0 nt | der6: 0 |
232-07F | der3: 0 | chr3: 0 nt | chr3: 10 nt | der3: 0 |
 | der7: 0 | chr7: 3 nt | chr7: 0 nt | der7: del C + ins G |
851-06Ö | derX: TGGGG | chrX: 9 nt | chrX: 0 nt | derX: 0 |
 | der17: 0 | chr17: 7 nt | chr17: 0 nt | der17: TIns TATACC TTTATA |
8480THO | der2: CA | chr2: 0 nt | chr2: 0 nt | der2: 0 | chr2: (TCCA)n
 | der9: CA | chr9: 1 nt | chr9: 0 nt | der9: 0 |
841-95D | der2: TA | chr2: 3 nt | chr2: 0 nt | der2: 0 |
 | der21: 0 | chr21: 0 nt | chr21: 5 nt | der21: RIns AAAAA |
155-90D | der2: GTATG | chr2: 22 nt | chr2: 0 nt | der2: 0 |
 | der13: T | chr13: 0 nt | chr13: 0 nt | der13: 0 |
2644-07D | der4: 0 | chr4: 1 nt | chr4: 0 nt | der4: 0 |
 | der8: 0 | chr8: 0 nt | chr8: 1 nt (G) | der8: 0 |
2587-07D | der1: TATA | chr1: 13 nt | chr1: 0 nt | der1: 0 |
 | der2: 0 | chr2: 5 nt | chr2: 0 nt | der2: 0 |
2-03E | der6: TAAA | chr6: 3 nt | chr6: 0 nt | der6: A>C (mosaic) | chr6: palTTTTA/TAAAA
 | der8: ATG | chr8: 0 nt | chr8: 0 nt | der8: 0 | chr8: palTTTAAA/TTTAAA
337-01D | der3: A | chr3: 6,594 nt | chr3: 5 nt | der3: 0 |
 | der12: CTTTT | chr12: 13 nt | chr12: 0 nt | der12: RIns A, TIns TTTTAAAATGT |
782-95D | der10: CTGA | chr10: 0 nt | chr10: 0 nt | der10: 0 |
 | der12: A | chr12: 4 nt | chr12: 0 nt | der12: 0 |
29-03E | der9: CTT, ATA | chr9: 29 nt | chr9: 0 nt | der9: 0 |
 | der16: ATA | chr16: 222 nt | chr16: 0 nt | der16: TIns TTGGC |
  • chr, chromosome; SNV, single-nucleotide variant; der, derivative chromosome; nt, nucleotides; jct, junction; RIns, random insertion; TIns, templated insertion; pal, palindrome.

We next aligned the junction fragments to both parental chromosomes and quantified differences with respect to the haploid reference genome, including the presence of microhomology at the junction, deletions or duplications of a few base pairs, insertions, and single-nucleotide variants (SNVs) not present in dbSNP. The detailed translocation breakpoint features observed in our cohort are shown in Table 3.
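To make the junction classification concrete, the sketch below labels a junction as blunt, as carrying microhomology, or as carrying an insertion, given three pre-aligned sequences of equal length. It is an illustrative toy model under stated assumptions, not the procedure used in the study: the function name `classify_junction` and the example sequences are hypothetical, and deletions or duplications relative to the reference are not modeled because they require genomic coordinates.

```python
def classify_junction(der, ref_left, ref_right):
    """Classify a junction from three equal-length, pre-aligned sequences.

    der       -- derivative (junction) sequence
    ref_left  -- parental sequence aligned to the left part of der
    ref_right -- parental sequence aligned to the right part of der
    """
    n = len(der)
    if not (len(ref_left) == len(ref_right) == n):
        raise ValueError("sequences must be pre-aligned to the same length")

    # Longest prefix of der that matches the left parental chromosome
    p = 0
    while p < n and der[p] == ref_left[p]:
        p += 1
    # Longest suffix of der that matches the right parental chromosome
    s = 0
    while s < n and der[n - 1 - s] == ref_right[n - 1 - s]:
        s += 1

    if p + s > n:   # the two matches overlap: bases shared by both parents
        return "microhomology", der[n - s:p]
    if p + s < n:   # bases at the junction that match neither parent
        return "insertion", der[p:n - s]
    return "blunt", ""

# Hypothetical toy sequences (not taken from the study)
print(classify_junction("ATCGTTAGGCAGATCT", "ATCGTTAGGCTTACCA", "CAGTCATGGCAGATCT"))
print(classify_junction("ATCGTTAGTTAGATCT", "ATCGTTACCGTTACCA", "CAGTCATGGCAGATCT"))
```

In the first toy junction the bases GGC match both parental sequences and are therefore reported as microhomology; in the second, GTT matches neither and is reported as an insertion.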

Microhomology, that is, nucleotide sequence found in both substrate sequences that is reduced to one copy at the breakpoint junction, ranging from 2 to 6 nt was observed in 25 out of 46 sequence junctions (54%) involving 17 of the 23 translocation events studied (74%) (Table 4). Deletions, mostly of a few base pairs (1–6,594 nt; median 5 nt), were observed in 26 junctions from 19 patients (Tables 3 and 4). Nucleotide insertions varying from 1 to 12 nt were identified at 11 translocation junctions from 10 individuals (24% of junctions). Among those, at least four insertions likely originate from nearby genomic segments (<200 nt), resulting in an overall fraction of templated insertions in the studied translocation carriers of 17.4% (Fig. 3; Tables 3 and 4). Duplications of 1–10 nt were present in six individuals (Tables 3 and 4), all in conjunction with additional insertions and/or SNVs.
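The summary figures in this paragraph can be recomputed from Table 3. The sketch below is a hedged cross-check rather than the authors' code: the values are transcribed by hand from Table 3, the variable names are illustrative, and the two microhomology stretches listed for der9 of 29-03E are counted as a single 3-nt entry.

```python
from statistics import median

# (case, microhomology length in nt at each of the two junctions), from Table 3.
# 872-05Ö is listed twice in Table 3, once per translocation.
microhomology = [
    ("31-05E", (2, 0)), ("862-06Ö", (3, 2)), ("106-06Ö", (1, 0)),
    ("58-06Ö", (2, 3)), ("157-06Ö", (2, 3)), ("191-06Ö", (2, 0)),
    ("263-06Ö", (0, 0)), ("175-06Ö", (5, 6)), ("872-05Ö", (0, 1)),
    ("872-05Ö", (1, 1)), ("887-05Ö", (3, 3)), ("109-06Ö", (3, 0)),
    ("232-07F", (0, 0)), ("851-06Ö", (5, 0)), ("8480THO", (2, 2)),
    ("841-95D", (2, 0)), ("155-90D", (5, 1)), ("2644-07D", (0, 0)),
    ("2587-07D", (4, 0)), ("2-03E", (4, 3)), ("337-01D", (1, 5)),
    ("782-95D", (4, 1)), ("29-03E", (3, 3)),
]

# Non-zero genomic deletion sizes (nt), in Table 3 row order
deletions = [4, 1, 4, 5, 5, 4, 2, 11, 7, 10, 3, 3, 9, 7, 1, 3, 22, 1,
             13, 5, 3, 6594, 13, 4, 29, 222]

junction_lengths = [n for _, pair in microhomology for n in pair]
n_junctions = len(junction_lengths)
n_mh_junctions = sum(n >= 2 for n in junction_lengths)          # >=2 nt
n_mh_events = sum(max(pair) >= 2 for _, pair in microhomology)  # per translocation

print(f"microhomology >=2 nt: {n_mh_junctions}/{n_junctions} junctions "
      f"({100 * n_mh_junctions / n_junctions:.0f}%), "
      f"{n_mh_events}/{len(microhomology)} translocations")
print(f"deletions: {len(deletions)} junctions, median {median(deletions):g} nt")
```

Under these transcription assumptions the tallies agree with the text: 25 of 46 junctions and 17 of 23 translocations with microhomology of at least 2 nt, and 26 deletions with a median of 5 nt.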

Table 4. Breakpoint Characteristics of Reported and Reanalyzed Reciprocal Chromosome Translocations
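Level | Feature | All translocations, n (%) | Our cohort, n (%) | Reanalysis cohort, n (%)
--- | --- | --- | --- | ---
By translocation | Total number | 60 (100%) | 23 (100%) | 37 (100%)
By translocation | Balanced (cytogenetically) | 60 (100%) | 23 (100%) | 37 (100%)
By breakpoint | Total number | 120 (100%) | 46 (100%) | 74 (100%)
By breakpoint | Balanced | 44 (37%) | 14 (30%) | 30 (41%)
By breakpoint | Microhomology, total | 70 (58%) | 32 (70%) | 38 (51%)
By breakpoint | Microhomology, <2 nt | 24 (20%) | 7 (15%) | 17 (23%)
By breakpoint | Microhomology, 2–20 nt | 46 (38%) | 25 (54%) | 21 (28%)
By breakpoint | Insertions, total | 34 (28%) | 11 (24%) | 23 (31%)
By breakpoint | Insertions, <2 nt | 5 (4%) | 4 (9%) | 1 (1%)
By breakpoint | Insertions, 2–20 nt | 26 (22%) | 7 (15%) | 19 (26%)
By breakpoint | Insertions, >20 nt | 3 (3%) | 0 (0%) | 3 (4%)
By breakpoint | Templated insertions | 19 (16%) | 4 (9%) | 15 (20%)
By breakpoint | Base deletions, total | 67 (56%) | 26 (57%) | 41 (55%)
By breakpoint | Base deletions, <2 nt | 8 (7%) | 3 (7%) | 5 (7%)
By breakpoint | Base deletions, 2–20 nt | 42 (35%) | 19 (41%) | 23 (31%)
By breakpoint | Base deletions, >20 nt | 8 (7%) | 3 (7%) | 5 (7%)
By breakpoint | Base deletions, >1,000 nt | 9 (8%) | 1 (2%) | 8 (11%)
By breakpoint | Base duplications, total | 9 (8%) | 7 (15%) | 2 (3%)
By breakpoint | Base duplications, <2 nt | 1 (1%) | 1 (2%) | 0 (0%)
By breakpoint | Base duplications, 2–20 nt | 8 (7%) | 6 (13%) | 2 (3%)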
  • nt, nucleotide.
Four additional translocation carriers with evidence for templated insertions originating from nearby genomic segments. Sequence alignments of the derivative chromosome sequences to the corresponding regions on the parental chromosomes. Deletions are shown in lower case bold letters. Microhomology is highlighted in purple with the most plausible parental chromosome indicated by bold text. Insertions and SNVs are shown in pink. A: Sequence alignments from case 862-06Ö. Derivative chromosome 22 (der22) is shown on top and derivative chromosome 16 (der16) on the bottom. The derivative chromosome sequences as well as the corresponding parental chromosome (chr16 and chr22) sequences are labeled in blue for der22 and in green for der16. Short deletions are present on both parental chromosomes, 4 nt on chr16 and 5 nt on chr22. On chr16, palindromic sequences are present on each side of the breakpoint. A six nucleotide (nt) insertion is present in the junction of der22 (TTATAC), likely due to template slippage (TS) copying from the palindrome sequence using a 3-nt (TTA) microhomology. A 2-nt microhomology is present in der16. B: Sequence alignment from case 851-06Ö. Derivative chromosome 17 (der17) is shown on top and derivative chromosome X (derX) on the bottom. The derivative chromosome sequences as well as the corresponding parental chromosome (chr17 and chrX) sequences are labeled in blue for der17 and in green for derX. Short deletions are present on both parental chromosomes, 9 nt on chr17 and 7 nt on chrX. On der17, a 5-nt microhomology is present. A 12-nt insertion is present on derX that may originate from template switching to two different places on chr17 (- strand). In the TS1, a potential CACCT microhomology is present but for TS2 no microhomology could be observed. C: Sequence alignment from case 337-01D. Derivative chromosome 3 (der3) is shown on top and derivative chromosome 12 (der12) on the bottom. 
The derivative chromosome sequences as well as the corresponding parental chromosome (chr3 and chr12) sequences are labeled in blue for der3 and in green for der12. Deletions are present on both parental chromosomes, 6,594 nt on chr3 and 13 nt on chr12. On der3, a 1-nt microhomology is present. A 10-nt insertion is present on der12 that may have arisen through error-prone backward slippage using a 4-nt (TTTT) microhomology. D: Sequence alignment from case 29-03E. Derivative chromosome 9 (der9) is shown on top and derivative chromosome 16 (der16) on the bottom. The derivative chromosome sequences as well as the corresponding parental chromosome (chr9 and chr16) sequences are labeled in blue for der9 and in green for der16. On der9, microhomology of 3 nt is present on both sides of a 27-nt resection/deletion. On der16, the junction presents with a 222-nt deletion and the 5-bp insertion (TTGGC) originates from inside the deletion.

Novel SNVs not present in dbSNP, ExAC, or 1000 Genomes were identified in six individuals (Table 3). In all cases, parental DNA was not available for origin studies or to determine whether they occurred de novo concomitantly with the translocation event. In one individual (2-03E), a mosaic A>C SNV was detected adjacent to the der6 junction (Supp. Figs. S1 and S2). The variant was confirmed with multiple rounds of PCR using different primers. The PCR product was then cloned and subsequent Sanger sequencing showed that 16/29 (55%) of clones carried the C allele.

Finally, to further investigate potential mutational signatures in common with those observed in our study subjects, we reanalyzed 37 previously published seemingly balanced translocations with available junction fragment sequences [Chen et al., 2008; Higgins et al., 2008; Chiang et al., 2012; Talkowski et al., 2012a, 2012b; Schluth-Bolard et al., 2013; Hsiao et al., 2014]. Inversions, complex translocations, and chromothripsis events were excluded from reanalysis. Two additional papers presented breakpoint positions for balanced translocations [Dong et al., 2014; Suzuki et al., 2014]; however, they did not report enough sequence information for reanalysis. In the reanalysis cohort, microhomology (2–6 nt) is present in 21 of the 74 chromosomal junctions (28%) and in 16 of the 37 translocation events (43%). Furthermore, templated insertions are observed in 15 junctions and 13 translocations (Figs. 2 and 3; Tables 3 and 4; Supp. Fig. S3; Supp. Table S1). By combining the data from our cases and the reanalysis cohort, junction sequences from 60 translocations were analyzed for breakpoint features. In aggregate, the combined data show that blunt ends are present in 36.7% of junctions (44/120) but only 17% of translocation events present with blunt ends on both chromosomal derivatives. Microhomology (≥2 nt) was observed in 38.3% (46/120) of all the junctions. Finally, insertions are present in 34 junctions, and in 19 instances the inserted sequences have originated from local sequences, making the overall incidence of templated insertions in 120 reciprocal translocation breakpoint junctions ∼15.8%. The overall breakpoint characteristics are summarized in Table 4.
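The pooled percentages follow directly from the per-cohort junction counts in Table 4. The short sketch below only redoes that arithmetic; the dictionary keys are illustrative labels, not terms from the study.

```python
# Per-cohort junction counts taken from Table 4
cohorts = {
    "our cohort": {
        "junctions": 46, "blunt": 14,
        "microhomology_2_20nt": 25, "templated_insertions": 4,
    },
    "reanalysis cohort": {
        "junctions": 74, "blunt": 30,
        "microhomology_2_20nt": 21, "templated_insertions": 15,
    },
}

# Sum each feature across the two cohorts
pooled = {
    key: sum(counts[key] for counts in cohorts.values())
    for key in cohorts["our cohort"]
}

# Express each pooled feature as a percentage of all pooled junctions
percent = {
    key: 100 * value / pooled["junctions"]
    for key, value in pooled.items() if key != "junctions"
}

for key, value in percent.items():
    print(f"{key}: {pooled[key]}/{pooled['junctions']} = {value:.1f}%")
```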

Discussion

Gene Disruptions Are Identified in Both Unaffected and Affected Carriers

We used low-coverage mate pair WGS to characterize the breakpoints of 23 translocations identified in a cohort of 22 carriers, both phenotypically normal (n = 8) and abnormal individuals (n = 14). In the phenotypically normal cohort, four genes in which biallelic pathological variants convey a disease trait with a recessive inheritance pattern were disrupted, making the BCA carriers also recessive carriers of the disease linked to the specific genes (31-05E, LARGE1; 862-06Ö, COG7; 191-06Ö, ALMS1; 175-06Ö, OCA2; Table 2; Supp. Table S4). This is in contrast to the phenotypically affected cohort, where we identified the disruption of three known disease genes in which the disease traits are associated with an autosomal-dominant inheritance pattern (described in detail in the next section). Disruption of recessive alleles has to be taken into consideration concerning reproduction and risk for disease in the offspring, as structural variation can contribute to recessive disease carrier states [Boone et al., 2013].

In the translocation carriers with a neurodevelopmental phenotype, disrupted genes or genes in the vicinity of the breakpoint were manually curated and classified from A to G according to a predefined set of criteria on their likelihood of being causal for the phenotype (Supp. Table S4). In addition to the previously reported CTNND2 disruption segregating with reading difficulties [Hofmeister et al., 2015], two disruptions of known disease genes were scored as likely causal (Class A; Supp. Table S4). First, in individual 337-01D, a de novo translocation disrupts the glutamate receptor subunit GRIN2B, known to cause epileptic encephalopathy (MIM# 616139) [Lemke et al., 2014] and autosomal-dominant intellectual disability (ID) (MIM# 613970) [Endele et al., 2010], concordant with the moderate ID, autism, and epilepsy phenotype in our patient (Table 1; Supp. Table S4). Second, in individual 841–95D, who has high-functioning autism, ADHD, and hypertelorism, EXOC6B is disrupted on chromosome 2p13.2. Disruption of EXOC6B by a de novo balanced translocation was previously described in a patient with autistic traits, developmental delay, ID, epilepsy, aggressive behavior, and various minor dysmorphic features [Fruhmesser et al., 2013]. In addition, a heterozygous de novo mosaic deletion of exons 2–20 in EXOC6B was reported in a child with developmental delay, speech delay, and minor dysmorphic features [Evers et al., 2014]. Overall, these studies provide evidence suggesting that the disruption of EXOC6B is causing the clinical phenotype in case 841–95D (Table 1; Supp. Table S4).

We further explored potential regulatory effects caused by the translocations. Even though position effects have been reported as far as 1.3 Mb from the chromosomal breakpoints [Velagaleti et al., 2005], in order to provide evidence for long-range effects either the genotype–phenotype correlations need to be very specific or molecular evidence, such as a reduction in RNA and/or protein levels, should be shown. Since several hundred disease genes have been described in neurodevelopmental disorders [Gilissen et al., 2014] and the relevant tissue was unavailable for functional studies, we chose to focus on genes in the immediate breakpoint regions (250 kb upstream/downstream). In one affected individual (232-07F), no gene disruption was observed on either derivative, but the vicinity search showed that the chromosome 3 breakpoint was located 9.7 kb upstream of ARPP21 (MIM# 605488), a candidate ID gene [Marangi et al., 2013], possibly disrupting a gene regulatory region or altering the genomic environment (Table 1; Supp. Table S4).

Finally, several potential candidate genes were disrupted in the affected cohort involved in various cellular functions and pathways such as calcium-ion binding (SUSD1; 8480THO) [Clark et al., 2003], synaptic signaling (SVOPL [MIM# 611700]; 887-05Ö) [Jacobsson et al., 2007], Wnt/β-catenin signaling (LYPD6 [MIM# 613359]; 155–90D) [Ozhan et al., 2013], hedgehog signaling (GPC5 [MIM# 602446]; 155–90D) [Witt et al., 2013], and mammalian corticogenesis (TOX [MIM# 606863]; 2644-07D) [Artegiani et al., 2015]. None of these genes have been previously reported in human neurobehavioral syndromes, and although identification of more individuals with mutations in these genes will be required to clearly determine causality, they present good candidates for further functional studies to investigate their role in neurodevelopmental disorders (Table 1; Supp. Table S4).

Evidence for Template Switching Suggests That Replicative Mechanism May Underlie a Portion of Balanced Translocations

The detailed analysis of breakpoint junction sequences from 60 reciprocal translocations shows mutational signatures consistent with an underlying mechanism involving template switching in approximately 16% of the junctions (Tables 3 and 4; Supp. Table S1).

Template switching, highlighted in breakpoint junctions by microhomology and templated insertions, is a hallmark feature of replicative repair. In humans, replicative errors (e.g., RBMs) give rise to complex genomic rearrangements, including gross copy-number gains, losses, and inversions [Lee et al., 2007; Sakofsky et al., 2015; Carvalho and Lupski, 2016]. In contrast to NHEJ, which uses microhomology to facilitate ligation, RBMs use microhomology to prime and assist resumption and productive synthesis of a stalled/collapsed replication fork. RBMs seem mostly to result in the formation of intrachromosomal structural variants, but interchromosomal events resulting in complex copy-number gain have also been reported [Carvalho et al., 2015; Gu et al., 2016].

Previous publications that have characterized nonrecurrent balanced constitutional reciprocal translocations at the breakpoint level [Chen et al., 2008; Higgins et al., 2008; Chiang et al., 2012; Talkowski et al., 2012a, 2012b; Schluth-Bolard et al., 2013; Dong et al., 2014; Hsiao et al., 2014; Suzuki et al., 2014] have suggested c-NHEJ as the predominant mechanism underlying translocations in humans [Chiang et al., 2012]. In the largest previous study, Chiang et al. (2012) state that most BCAs show little or no microhomology at the junctions. However, the detailed analysis of 81 breakpoints from simple translocations and chromosomal inversions showed both microhomology (1–20 nt, 31.3%) and insertions (20.5%) in a high fraction of junctions [Chiang et al., 2012]. In the same paper, the authors also observed a surprisingly high fraction of BCAs with three or more breakpoints, many of which are accompanied by strand inversions. In the latter cases, the authors suggested that RBMs may have been involved in the formation of these complex interchromosomal rearrangements [Chiang et al., 2012]; such observations are indeed most consistent with iterative template switching during replicative repair [Carvalho and Lupski, 2016].

Recently, it was shown in mouse embryonic fibroblasts that Pol theta is responsible for repair of DNA breaks in a context where c-NHEJ is defective (Ku70−/−) and that templated insertions from heterologous chromosomes are inserted in some of the junctions during this repair process. However, the authors also showed that Pol theta does not contribute to balanced translocations, at least in their model system, and that an alternative mechanism should be in place [Wyatt et al., 2016]. Additional support for the notion that mechanisms other than NHEJ and NAHR underlie the formation of some balanced translocations is provided by studies on human cells, in which ∼6% of translocation junctions generated in vitro show templated insertions varying from 20 bp to several hundred bp [Ghezraoui et al., 2014]. Finally, chromosomal translocations have also been well studied in leukemia, where the same somatic events arise in multiple patients, are indicative of prognosis, and guide medical management. The recurrent translocation, t(9;22), seen in the formation of the Philadelphia chromosome, illustrates that replicative mechanisms may underlie balanced reciprocal translocations [Czuchlewski et al., 2011].

In the translocations analyzed here, 37% of translocation junctions are precisely joined (blunt ends), consistent with c-NHEJ underlying a portion of the breakpoint junctions of balanced translocations. Nonetheless, 38% of the junctions we studied have 2–6 nt microhomology, some of them associated with templated insertions. Such templated insertions at the breakpoint junctions of structural variants may result from single or multiple iterative template switches during DNA repair processes that involve DNA replication [Lee et al., 2007; Hastings et al., 2009a; Deem et al., 2011; Sakofsky et al., 2015; Carvalho and Lupski, 2016]. Another possibility that could explain some of the observed junction signatures is alternative nonhomologous end-joining (alt-NHEJ). Alt-NHEJ is hypothesized to take over when components of c-NHEJ are absent, producing rearrangements with longer microhomology at the translocation junctions [Ghezraoui et al., 2014].

In our original cohort, template switches seem to occur in at least five carriers (22.7%; 862-06Ö, 887-05Ö, 851-06Ö, 337-01D, 29-03E; Figs. 2 and 3). All events, except the complex Alu-mediated event in case 887-05Ö, can be interpreted as short-range backwards template switching resulting in the insertion of short segments within the same fork (<200 nt). In our cohort, low-copy repeats and repetitive elements do not seem to mediate BCAs. Nonetheless, a deletion mediated by Alu elements generating a fusion AluAlu was observed nearby the translocation breakpoint of individual 887-05Ö. Events mediated by Alu generate AluAlu fusions [Boone et al., 2014] and can also lead to the formation of intrachromosomal complex structural variants (e.g., inverted triplications interspersed by duplications or DUP-TRP/INV-DUP structures) [Gu et al., 2015]. The limited similarity provided by genome-wide extensive presence of Alu elements in the human genome is hypothesized to provide enough homology to allow template switches to occur during RBMs [Gu et al., 2015], a contention that is supported by recent yeast experiments that model human Alu-mediated deletions [Mayle et al., 2015]. Therefore, the presence of such a deletion nearby the translocation breakpoint supports a role for RBM in the formation of this translocation. Our hypothesis is that this intrachromosomal event was followed by an interchromosomal event but since we do not have access to parental DNA we cannot prove that the AluAlu deletion was formed concomitantly with the translocation.

End Processing of the Original Chromosome Breaks May Give Rise to Short Duplications and Deletions

The junction analysis also confirmed previous observations that many seemingly balanced translocations are not balanced at the nucleotide level [Baptista et al., 2008; Higgins et al., 2008]. We observed not only deletions of a few base pairs but also short duplications (1–10 nt). To better understand this observation, we reexamined the junction fragment sequences. The derivative translocation breakpoint junctions represent the rearrangement end-products and may enable us to infer some properties of the reactions by which the original double-stranded breaks (DSBs) that gave rise to the translocation were resolved. For instance, upon double- or single-strand break (SSB), both 5′ and 3′ ends may be processed. In the MMBIR repair mechanism, one-ended double-stranded DNA breaks (oeDSBs) can result from a collapsed fork after DNA replication proceeds through a nicked molecule. The 5′ end undergoes resection to expose the 3′ end, which then anneals to a single-stranded DNA that shares microhomology and resumes replication [Hastings et al., 2009a]. Resection of the 5′ end is generally inhibited in NHEJ [Pannunzio et al., 2014], but alt-NHEJ has been observed to present longer deletions than NHEJ [Ghezraoui et al., 2014]. If only one of the ends is processed, there should be no loss of genetic information; therefore, the derivative products will present no copy-number variation near the ligated junction (Fig. 4A). Nonetheless, if both 5′ and 3′ ends are processed, then deletions are expected (Fig. 4A).

Schematic of double-stranded break (DSB) and single-stranded break (SSB) prior to the formation of balanced translocations. A: The formation of two DSBs in distinct heterologous chromosomes is illustrated at the top in blue and green, respectively. (1) Left: DSB formation followed by 5′ resection. If 5′ end resection occurs without 3′ end processing, the translocation junctions (jcts) will be copy neutral (right). (2) If 5′ resection in addition to 3′ end processing occurs at the breakpoints, there will be formation of short deletions at the translocation jcts. B: SSB or nick formation on top as in (A). If two nearby SSBs generated on opposite strands are converted into a DSB, short duplications may be present at the translocation jcts. Small letters (a, b, c, d) indicate breakpoint segments. Gap filling is indicated by dashed lines with respective breakpoint segments indicated by primed letters (a’, b’, c’, d’).

In our original cohort, there was an overall lack of extensive processing of the ends, as indicated by the short intrachromosomal distance between the endpoints in the broken chromosomes. The distance between the breakpoints located on the same chromosome ranges from 0 to 6,594 nt, but the majority of chromosome ends (n = 42) are less than 20 nt apart (median 2 nt; Tables 3 and 4). In all the analyzed translocations, copy-number neutral junctions are observed in 44 out of 120 DSB ends (37%), indicating that those ends underwent either no or single end processing (Tables 3 and 4; Supp. Table S1). Interestingly, most of the breaks, that is, 67 out of 120 (56%), are accompanied by a deletion that varies from 1 to 3,600,000 nt (median 1 nt). Although the occurrence of short deletions at the junctions is consistent with alt-NHEJ [Ghezraoui et al., 2014], larger end processing is not consistent with this mechanism. Supporting this observation, the two carriers in our cohort (29-03E and 337-01D) with larger processed ends (222 nt and 6,594 nt) also have templated insertions at the junctions, consistent with our hypothesis that RBMs underlie formation of those translocations. However, since we lack inheritance data for both individuals, we cannot know for certain that the observed deletions originated at the same time as the translocations.

Unexpectedly, nine out of 120 breaks (7.5%) present with gain of genetic material, that is, a short duplication varying from 1 to 14 nt. One possible explanation for such gains is by two nearby SSBs or nicks that were generated in opposite strands, which can be converted into DSBs (Fig. 4B). After processing and ligation of those overlapping, overhanging short segments to heterologous derivative chromosomes, duplication of the segments flanking the junctions can be observed (Fig. 4B; Tables 3 and 4; Supp. Table S1).
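The staggered-nick explanation can be illustrated with a small toy model. This is a simplified sketch under explicit assumptions, not a description of the study's methods: the function name `resolve_staggered_break` and the example sequence are hypothetical, 5′ overhangs are assumed to be completely filled in, and the branch that trims 3′ overhangs into a deletion is an added assumption included only for contrast.

```python
def resolve_staggered_break(seq, top_nick, bottom_nick):
    """Return (left_end, right_end, outcome) for two nicks on opposite strands.

    seq is the top strand written 5'->3'; each nick position is the index of
    the first base to the right of the nick.
    """
    lo, hi = sorted((top_nick, bottom_nick))
    if top_nick < bottom_nick:
        # 5' overhangs: the recessed 3' ends can be extended (gap filling),
        # so both broken ends keep a copy of seq[lo:hi]
        return seq[:hi], seq[lo:], f"duplication of {hi - lo} nt"
    if top_nick > bottom_nick:
        # 3' overhangs cannot be filled in; assume here that they are trimmed
        return seq[:lo], seq[hi:], f"deletion of {hi - lo} nt"
    return seq[:lo], seq[lo:], "copy neutral"

# Hypothetical 12-nt segment with nicks 5 nt apart on opposite strands
left, right, outcome = resolve_staggered_break("AAACCCGGGTTT", 4, 9)
print(left, right, outcome)
```

In this example both broken ends retain the 5-nt segment between the nicks, so once each end is joined to a heterologous chromosome the segment is present on both derivatives, which corresponds to the short gains reported at some junctions.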

In aggregate, breakpoint characterization of 60 balanced translocations reveals that (1) both deletions and duplications of a few base pairs are frequently present in the chromosome junctions, and (2) mutational signatures indicate an underlying replicative mechanism involving template switching in 16% of the junctions. These findings can help explain the observation that ∼37% of apparently balanced translocations actually present with imbalances and cryptic rearrangements at or near the translocation junctions [Higgins et al., 2008]. Cryptic genomic imbalances in apparently balanced chromosomal aberrations have been associated with affected carriers and are less frequently observed in individuals without an associated clinical phenotype [Baptista et al., 2008]. The contribution of such variants in symptomatic carriers needs to be further assessed. Finally, the possibility that some translocations may have a mitotic origin influences genetic counseling. A de novo translocation showing mutational signatures indicative of a mitotic origin could have arisen either as a mitotic event in one parent, in which case that parent will be a low-level mosaic with a higher recurrence risk, or in an early mitotic division after fertilization, with no recurrence risk for the parents.

In conclusion, our studies highlight the importance of breakpoint resolution in the clinical molecular interpretation of chromosomal translocations. First, we show that the disruption of disease-causing genes directly provides a molecular diagnosis to a subset of affected carriers. However, this is important also in the healthy carriers, as we identify significant gene disruptions, important to their own health and reproductive genetics. The disruption of a gene by a balanced chromosome break is a type of disease-causing mutation that goes undetected by all the current methods used in genetic diagnostics except large-scale sequencing. With 1/500 individuals being a carrier, this most likely represents a highly underappreciated cause of disease, especially taking into consideration that the resolution of a chromosome-banding assay is above 5–10 Mb (well illustrated by the cryptic 11 Mb deletion in individual 2–03E). We therefore propose that diagnostic WGS will be clinically important for characterizing balanced structural chromosomal variants in the investigation of diverse patient populations from monogenic disorders and infertility to neurocognitive diseases.

Acknowledgments

We are grateful to the patients and their families for their cooperation and enthusiasm during this study. We also gratefully acknowledge the use of computer infrastructure resources at UPPMAX, projects b2011162 and b2014152.

    Conflict of Interest

    J.R.L. has stock ownership in 23andMe, is a paid consultant for Regeneron Pharmaceuticals, has stock options in Lasergen, Inc., is a member of the Scientific Advisory Board of Baylor Genetics, and is a coinventor on multiple United States and European patents related to molecular diagnostics for inherited neuropathies, eye diseases, and bacterial genomic fingerprinting. The Department of Molecular and Human Genetics at Baylor College of Medicine derives revenue from the chromosomal microarray analysis (CMA) and clinical exome sequencing offered in the Baylor Genetics (BMGL: http://www.bmgl.com/BMGL/Default.aspx).
