6 Heterochromatin Positioning and Nuclear Architecture
Abstract
Heitz (1928) first described the two states of chromatin known as euchromatin and heterochromatin. Heterochromatin is often described as the gene-poor part of the genome associated with a silent and condensed state of chromatin inaccessible to transcription factors. However, this simple view has been challenged many times as heterochromatin is indeed transcribed and differences between the two states of chromatin depend on many other criteria. Data collected from the model species Arabidopsis thaliana indicate that heterochromatin relies on the repetitiveness of specific DNA sequences such as satellites, transposable elements and ribosomal DNA (rDNA) but also on epigenetic marks specifically associated with these repeated arrays. Recent studies on small RNA pathways have highlighted the central role of the RNA-directed DNA methylation pathway in heterochromatin specification indicating that heterochromatin is indeed an epigenetic state required for many other genome functions including chromosome segregation, gene regulation or maintenance of genome stability. Heterochromatin is a specific feature of eukaryotes and preferential localization at the nuclear periphery and to the nucleolus is observed in most organisms. This spatial organization of heterochromatin is maintained through the cell cycle although DNA is replicated, chromatin is condensed into chromosomes, and the nuclear envelope is disrupted and reformed. Whether spatial positioning participates in heterochromatin function and how plant cells are able to establish and maintain such a genome organization across the cell cycle is still largely unknown in Arabidopsis. Mechanisms leading to the appropriate positioning of heterochromatin within the nucleus will be discussed in the light of data coming from other species. 
A better understanding of heterochromatin organization and functions will be an important field of investigation for plants such as maize and wheat with considerably larger genome sizes than Arabidopsis.
6.1 Heterochromatin Structure
In order to better understand how heterochromatin is assembled and what functions are assumed by it, it is first necessary to describe some structural aspects of heterochromatin. Emil Heitz (1928) historically identified heterochromatin as the nuclear material that remains highly condensed within the interphase nucleus. He named these regions ‘heterochromatin’ to distinguish them from the regions showing variable staining and condensation, which he called ‘euchromatin’. The boundaries of heterochromatin as specified by such cytological analyses are not very well defined and may vary in diverse tissues or with different analytical techniques. Subsequently, heterochromatin was subdivided into two classes, constitutive and facultative heterochromatin, depending on its persistent presence throughout the cell cycle (Brown, 1966). Formation of facultative heterochromatin at specific genomic loci is fundamentally important in defining cellular properties such as differentiation, stress response and reaction to developmental, physiological, or environmental stimuli.
At the same time, renaturation kinetics of DNA, also known as Cot analysis, gave striking results when applied to genomic DNA from eukaryotes (Britten and Kohne, 1968). The eukaryotic genome comprises three distinct fractions distinguished by their rates of renaturation, because repeated sequences renature faster than single-copy DNA sequences (Figure 6.1a). The repeated nature of DNA turns out to be a very useful criterion for better defining heterochromatin, as most heterochromatic sequences fall into the highly repeated (HR) fraction whereas euchromatic sequences are included in the single or low (SL) fraction. Since these early cytological and kinetic characterizations, euchromatin and heterochromatin have been distinguished according to many other criteria such as their chromosome distribution, DNA sequence composition, protein binding or epigenetic features, as will be detailed below.
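The shape of a Cot curve can be illustrated with a minimal sketch, assuming ideal second-order renaturation kinetics in which the fraction of DNA remaining single-stranded is 1 / (1 + k·C0t). The three genome fractions and their rate constants below are hypothetical values chosen only to make the three kinetic classes visible; they are not measurements from any species.

```python
def single_stranded_fraction(cot, k):
    """Ideal second-order renaturation: C/C0 = 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * cot)


def cot_curve(cot, components):
    """Fraction of DNA still single-stranded for a mixture of kinetic classes.

    components: list of (genome_fraction, k) pairs whose fractions sum to 1.
    """
    return sum(f * single_stranded_fraction(cot, k) for f, k in components)


# Hypothetical three-component genome (illustrative values only).
GENOME = [
    (0.15, 1e2),   # highly repeated (HR) fraction, renatures fastest
    (0.25, 1e0),   # intermediate fraction
    (0.60, 1e-3),  # single or low (SL) fraction, renatures slowest
]

if __name__ == "__main__":
    for exponent in range(-4, 6):
        cot = 10.0 ** exponent
        remaining = cot_curve(cot, GENOME)
        print(f"Cot = 1e{exponent:+d}  single-stranded fraction = {remaining:.3f}")
```

Each class reaches half-renaturation at Cot = 1/k, so a larger rate constant (more copies per genome) shifts that class to the left of the curve, which is why repeated sequences separate from single-copy sequences.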

6.1.1 Heterochromatic Sequences
In small genome species such as Arabidopsis (150Mbp), heterochromatin represents 10–15% of the whole genome (Arabidopsis Genome Initiative, 2000). The Arabidopsis genome is organized in five linear chromosomes with dedicated domains made of heterochromatin (Figures 6.1b and 6.1c). The predominant fraction of heterochromatin is found at 45S ribosomal DNA (45S rDNA) loci (that is, the genes encoding the 45S ribosomal RNA precursor molecule) known as the Nucleolus Organizing Regions (NOR), which are at the top of the short arms of chromosomes two and four, near the telomeres. The 45S rRNA genes are organized in large tandem arrays of about 3.5–4.0Mbp and thus constitute the larger part of Arabidopsis heterochromatin (Copenhaver and Pikaard, 1996). Another heterochromatic fraction is derived from centromeres, chromosome regions dedicated to sister chromatid cohesion and to normal chromosome segregation during mitosis and meiosis. In Arabidopsis, centromeres consist of long stretches of about 0.4–1.4Mbp including short tandem repetitive ‘satellite’ DNA, called 180bp repeats, and 106B repeats, the latter also known as the Athila retrotransposon (Hosouchi et al., 2002). Flanking regions of centromeres, called pericentric regions, include other types of repetitive sequences such as transposable elements (Wicker et al., 2007) and 5S ribosomal DNA (5S rDNA) (Tutois et al., 1999). Heterochromatin is also found in telomeres, which protect the chromosome ends from deletion upon DNA replication. Telomeres in most plants consist of tandem arrays of simple TTTAGGG repeats (Zellinger and Riha, 2007). Finally, knobs, also called interstitial heterochromatin, first observed by Barbara McClintock in maize but present in several other plants including Arabidopsis, are heterochromatic loci containing transposable elements located on chromosomal arms (Lippman et al., 2004).
On the whole, in Arabidopsis, heterochromatin is formed at repetitive sequences clustered mainly at centromere, pericentric regions and NOR (Figure 6.1c).
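Because the telomeric repeat mentioned above is a simple tandem array, it lends itself to a short illustration of how tandem repeats can be located in a sequence. The following is a minimal sketch only: the function name and the toy sequence are hypothetical, only the forward strand is scanned, and degenerate or interrupted repeats (common in real satellite arrays) are not handled.

```python
import re

TELOMERE_UNIT = "TTTAGGG"  # repeat unit named in the text


def longest_tandem_run(sequence, unit=TELOMERE_UNIT):
    """Return (start, copies) for the longest uninterrupted tandem run of `unit`.

    Returns (-1, 0) when the unit does not occur at all.
    """
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    best_start, best_copies = -1, 0
    for match in pattern.finditer(sequence.upper()):
        copies = (match.end() - match.start()) // len(unit)
        if copies > best_copies:
            best_start, best_copies = match.start(), copies
    return best_start, best_copies


if __name__ == "__main__":
    toy = "ACGA" + "TTTAGGG" * 3 + "CC" + "TTTAGGG"  # toy sequence, not real data
    print(longest_tandem_run(toy))
```

The same function can be pointed at any other exact repeat unit by passing a different `unit`, although real centromeric satellites would require tolerance for sequence divergence between copies.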
Transposable elements, rDNA and satellites are common components of heterochromatin in other plant species. They can vary in terms of sequence but their properties remain the same. Indeed, most plant centromeres include satellite DNA elements and a survey of these sequences can be found in Wang et al. (2009). Heterochromatin content and heterochromatin organization can be very different among eukaryotes and this is especially true for plants, where a three-log range of variation in genome size is observed. Over the past decades, although polyploidy and gene duplication have been described as important mechanisms occurring in plants, it has turned out that an increase in heterochromatic sequences was the largest contributor to genome size expansion (Gaut and Ross-Ibarra, 2008). Ma and collaborators carried out a comparative analysis of plant centromere organization including Arabidopsis, Oryza sativa and Zea mays, with genome sizes of 150Mb, 370Mb and 2700Mb, respectively (Ma et al., 2007). Centromere sizes among these species do not explain the variation in genome sizes. Instead, transposition of mobile DNA on chromosome arms was shown to be the main explanation of genome size expansion (Gaut and Ross-Ibarra, 2008). Thus, in many genomes other than Arabidopsis, heterochromatic sequences are not only concentrated at centromeres and rDNA but can also be found in large numbers on chromosome arms. This is now well described in cereals such as maize and wheat, in which 70–80% of the genome is made of transposable elements (Choulet et al., 2010). The use of Arabidopsis as a plant model system to study heterochromatin should therefore be carefully considered, since it has a very compact genome in which heterochromatin is mostly centric and pericentric, unlike most plant species.
6.1.2 Epigenetic Marks
Heterochromatin forms at repeated sequences including rDNA, transposable elements and tandem repeats. These repeated sequences are tightly associated with epigenetic marks which define the heterochromatin state. Main epigenetic modifications such as DNA methylation, histone modifications, non-coding RNA and histone variants are among the major players in the propagation and modulation of the heterochromatic status and will be described in the following section.
6.1.2.1 DNA Methylation
In plants, DNA methylation occurs at cytosine residues in three different contexts, CG, CHG and CHH, where H is either A, C or T. DNA methylation depends on a large number of proteins, with central roles for METHYLTRANSFERASE1 (MET1) for CG; SU(VAR)3-9 HOMOLOG4/KRYPTONITE (SUVH4/KYP), a member of the Su(var)3-9 class of histone methyltransferases, and CHROMOMETHYLASE3 (CMT3) for CHG and CHH; and DOMAINS REARRANGED METHYLTRANSFERASE2 (DRM2) for CHH. The functions of the three DNA methylases overlap to some extent, as DRM2 is a de novo DNA methylase for CG, CHG and CHH whereas MET1 and CMT3 maintain methylation at CG and CHG, respectively (reviewed in Feng et al., 2010). Many other factors are involved in this process, including DECREASE IN DNA METHYLATION1 (DDM1), which is required for both DNA methylation and histone tail modification at H3K9 (see below). It is worth noting that met1 and ddm1 mutants were identified in screens for loss of centromeric-repeat methylation (Vongs et al., 1993; Kankel et al., 2003). Further investigations of these two key players on transposable elements such as the retrotransposon ÉVADÉ (EVD), and transposons from the CACTA family (named CACTA because of the occurrence of CACTA sequences at the edges of the elements) or from the MUtator-Like Element (MULE) class, which are usually silent, clearly show that they can be reactivated in met1 and ddm1 mutants (Mirouze et al., 2009; Tsukahara et al., 2009). On a genome-wide level, 24% of CG, 6.7% of CHG and 1.7% of CHH are methylated. Methylation in all three contexts is enriched in heterochromatic sequences, although CG methylation is also markedly increased in gene coding sequences (Cokus et al., 2008). Within the gene coding sequences, up to 30% of CG are methylated while less than 1% of the CHG and CHH are methylated (Widman et al., 2009).
Conversely, heterochromatic sequences are enriched in CHH as a consequence of the deamination of methylated cytosine to thymine, one of the most frequent point mutations within eukaryotic genomes (Widman et al., 2009). The overall consequence is that heterochromatic sequences are usually AT-rich sequences which are heavily methylated at CHH (and to a lesser extent at CHG) (Figures 6.1b and 6.2a).
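The three sequence contexts defined in this section (CG, CHG and CHH, where H is A, C or T) can be made concrete with a small sketch that classifies every cytosine of a sequence. This is an illustration under simplifying assumptions: the function name is hypothetical, only the forward strand is read, and cytosines that lie too close to the 3' end or are followed by ambiguous bases are left unclassified.

```python
def cytosine_contexts(sequence):
    """Classify each forward-strand cytosine as CG, CHG or CHH (H = A, C or T).

    Returns a list of (position, context) tuples with 0-based positions.
    Cytosines whose context cannot be determined are skipped.
    """
    seq = sequence.upper()
    h_bases = {"A", "C", "T"}
    calls = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        nxt2 = seq[i + 2] if i + 2 < len(seq) else ""
        if nxt == "G":
            calls.append((i, "CG"))
        elif nxt in h_bases and nxt2 == "G":
            calls.append((i, "CHG"))
        elif nxt in h_bases and nxt2 in h_bases:
            calls.append((i, "CHH"))
    return calls


if __name__ == "__main__":
    for position, context in cytosine_contexts("CGCAGCTA"):
        print(position, context)
```

A genome-wide tally of methylated versus total cytosines per context, as in the percentages quoted above, would additionally require per-cytosine methylation calls and a pass over the reverse strand.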

6.1.2.2 Histone Code
Core histones H2A, H2B, H3 and H4 form the nucleosome core, around which 147bp of DNA are wrapped. H1 histones are linker histones involved in the formation of more condensed chromatin fibres, limiting the access of regulatory proteins to nucleosomal components (Happel and Doenecke, 2009). Histones have important implications in gene regulation through modifications of the N-terminal histone tails by methylation, acetylation, phosphorylation or ubiquitination; these modifications are referred to as the histone code (reviewed in Grant-Downton and Dickinson, 2005).
6.1.2.3 Histone-Repressive Marks
Heterochromatin-specific histone methylation marks are mainly mono- and dimethyl H3K9 (H3K9me1 and me2) and mono- and dimethyl H3K27 (H3K27me1 and me2) (Figure 6.2a). Specific patterns of histone marks at heterochromatic loci are believed to promote a closed chromatin configuration not favourable for gene expression. In practice, two major histone marks are usually followed as heterochromatic marks. These are H3K9me2, implemented by KYP, and H3K27me1, established by the histone methyltransferases ARABIDOPSIS TRITHORAX-RELATED PROTEIN5 and 6 (ATXR5, ATXR6). H3K9me2 and H3K27me1 are usually found in constitutive heterochromatin, but H3K9me2 is dependent upon DNA methylation while H3K27me1 is not, probably reflecting two independent mechanisms occurring at heterochromatic loci (Liu et al., 2010).
6.1.2.4 Histone-Activating Marks
On the other hand, H3K36me, H3K4me2 and H4K16 acetylation are poorly represented in heterochromatin and mostly found in euchromatic gene-rich regions (Figure 6.2a). H3K36me and H3K4me2 are established respectively by the histone methyltransferases ABSENT SMALL OR HOMEOTIC DISCS1 (ASH1) and ARABIDOPSIS TRITHORAX1 (ATX1), acting mostly at euchromatic loci (reviewed in Liu et al., 2010). Interestingly, histone methyltransferases are unable to methylate target lysine residues that are acetylated, and therefore histone deacetylases (HDAC) are required to allow methylation at those regions (Noma et al., 2001). To date, 16 Arabidopsis HDACs have been identified and among them HDA6, a homolog of RPD3 in yeast, has a central role in heterochromatin (Pandey et al., 2002). The hda6 mutation causes enrichment for euchromatic epigenetic marks such as H3K4 methylation, leading to a decondensed state of chromatin, visible at the cytological level at rDNA loci (Probst et al., 2004). Furthermore, up-regulated loci in hda6 overlap with those in met1, and the hda6 mutation causes the complete loss of DNA methylation at some HDA6 target loci. These results suggest that HDA6 and MET1 DNA targets overlap and indicate that HDA6 is required together with MET1 to establish and/or maintain heterochromatin (To et al., 2011).
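The repressive and activating marks described in the two sections above can be gathered into a small lookup structure, which is convenient when annotating chromatin profiling data. The sketch below is only a summary of the marks and enzymes named in the text; the class and function names are hypothetical, and the table makes no claim to be a complete catalogue of Arabidopsis histone marks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoneMark:
    name: str
    chromatin: str        # "heterochromatin" or "euchromatin"
    writers: tuple = ()   # enzymes named in the text; empty if none is named


# Marks and enzymes as listed in the text (ATXR5/ATXR6 are the
# ARABIDOPSIS TRITHORAX-RELATED PROTEIN5 and 6 methyltransferases).
MARKS = (
    HistoneMark("H3K9me1", "heterochromatin"),
    HistoneMark("H3K9me2", "heterochromatin", ("SUVH4/KYP",)),
    HistoneMark("H3K27me1", "heterochromatin", ("ATXR5", "ATXR6")),
    HistoneMark("H3K27me2", "heterochromatin"),
    HistoneMark("H3K36me", "euchromatin", ("ASH1",)),
    HistoneMark("H3K4me2", "euchromatin", ("ATX1",)),
    HistoneMark("H4K16ac", "euchromatin"),
)


def marks_for(chromatin):
    """Return the names of all marks associated with the given chromatin state."""
    return [mark.name for mark in MARKS if mark.chromatin == chromatin]


if __name__ == "__main__":
    print("heterochromatin:", marks_for("heterochromatin"))
    print("euchromatin:", marks_for("euchromatin"))
```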
6.1.2.5 Histone Variants
Apart from core histone tail modifications, H2A, H2B and H3 histone variants also modulate gene expression (Figure 6.2b). Studies of H3 variants have recently started in plants. Histone H3 variants include H3.1, H3.3 and CENP-A, a centromeric H3 variant, and are encoded by 15 HISTONE THREE RELATED (HTR) genes (Ingouff and Berger, 2009). These variants could contribute to the epigenetic memory of chromatin states in somatic cells. Centromeres are epigenetically specified by incorporation of CENH3 (CENP-A in humans; CENH3 is encoded by the HTR12 gene in Arabidopsis), replacing conventional H3 in centromeric nucleosomes (Henikoff and Dalal, 2005; Ravi and Chan, 2010). Histone variants H2A and H2B were reviewed in Wyrick and Parra (2009) but, to date, very little is known in plants. Deposition of the H2A variant H2A.Z and DNA methylation seem to be mutually exclusive both in actively transcribed genes and in methylated transposons. H2A.Z marks actively transcribed genes and is found at promoter sequences (Zilberman et al., 2008). This may be of particular interest as the yeast H2A histone variant HTZ1 (homologous to plant H2A.Z) might tether active promoters to NUP2, a component of the Nuclear Pore Complex (NPC). Recently, it has also been proposed that nucleoplasmic NPC components such as NUP98 and NUP50 can activate expression of internally localized genes, whereas DNA interaction with NUP88 leads to gene silencing (see also Meier this issue; reviewed in Arib and Akhtar, 2011). Future investigations of plant NPC functions will therefore be of particular interest for gene regulation and heterochromatin.
6.1.2.6 Non-Coding RNA
The structural features of heterochromatin led to a simple model where the gene-poor heterochromatin is the condensed state of chromatin and is poorly transcribed while euchromatin is less compacted, allowing active transcription of the gene-rich regions. The epigenetic era drastically modified this oversimplified view. Indeed, heterochromatin is transcribed by plant-specific Polymerase II variant enzymes known as Polymerase IV and Polymerase V (Lahmy et al., 2010) and produces large non-coding RNAs (ncRNA) processed through the RNA-directed DNA methylation pathway (RdDM) (Figure 6.2c). This involves ARGONAUTE (AGO) and DICER-LIKE (DCL) families of silencing factors and yields short RNA molecules, 24 nucleotides (24nts) in length, called short-interfering RNA (siRNA), homologous to heterochromatic sequences. The RdDM pathway is a complex network including many proteins and a more detailed description can be found elsewhere (Law and Jacobsen, 2010). SiRNAs are responsible for repression of repeated sequence expression both at the transcriptional and post-transcriptional levels. From an evolutionary point of view, RdDM is believed to be a defence mechanism that evolved to protect eukaryotic genomes against mutagenesis by endogenous transposable elements and against exogenous viruses (see discussion in Cavalier-Smith, 2010). As a consequence of the production of 24nts RNA molecules, chromatin is modified both at the DNA level through DNA methylation and at the histone level through post-translational modifications; among them acetylation and methylation are the most studied (see Sections 6.1.2.3 and 6.1.2.4 above; Liu et al., 2010).
6.1.3 Non-Histone Protein Binding
In the early years of the use of Arabidopsis as a plant model species, attempts were made to apply data collected from Drosophila and yeast. The best example related to heterochromatin was the search for Suppressors and Enhancers of variegation (the so-called Su(var) and E(var) genes) involved in Position Effect Variegation (PEV) in Drosophila. In PEV, bringing a gene close to heterochromatin leads to its stochastic repression (variegation). In the course of the Su(var) and E(var) screens, it was demonstrated that PEV was induced by non-histone proteins specific to heterochromatin, which have the property of spreading from heterochromatic to euchromatic loci, inducing a silent state of chromatin. Among them, SU(VAR)3-9, a histone methyltransferase establishing methylation at H3K9, and HETEROCHROMATIN PROTEIN 1 (HP1), a highly conserved protein found in yeast, human and plants, were discovered. Drosophila HP1 interacts with H3K9me and is a main actor in heterochromatin assembly through interaction with other heterochromatic factors. Once HP1 binds to a given locus, it can induce the spread of heterochromatin into adjacent regions unless a ‘boundary element’ (see Section 6.2.3.1) blocks heterochromatin propagation (reviewed in Grewal and Jia, 2007). Plant homologues of SU(VAR) proteins were indeed identified and, as an example, SUVH4/KYP (see Section 6.1.2.1) is the functional homologue of SU(VAR)3-9 (Fischer et al., 2006). The Arabidopsis HP1, called LIKE HETEROCHROMATIN PROTEIN1 (LHP1), was also discovered but, very unexpectedly, binds euchromatic loci and recognizes H3K27me3, a function assumed in Drosophila by another class of proteins called the Polycomb group proteins. LHP1 should therefore not be considered the functional homologue of Drosophila HP1 (Gaudin et al., 2001; Turck et al., 2007).
Another striking specificity of plant heterochromatin came from the characterization of a duplicated segment of the long arm of chromosome four translocated onto the short arm of chromosome four. This new segment is called hk4S (Lippman et al., 2004). Sequence comparison of the two duplicated segments indicated that hk4S includes the same number of genes (33 genes) as the original chromosomal segment but was invaded by 34 retrotransposons and 40 transposons. As expected for heterochromatic sequences, transposable elements at hk4S have high DNA methylation and increased H3K9 methylation while genes are almost unaffected and fully expressed. This was very unexpected and very different from Drosophila, in which genes would have been repressed by the vicinity of heterochromatic sequences. In plants, spreading of heterochromatin does not seem to occur. Instead, using the ddm1 and met1 mutants, Lippman et al. (2004) proposed that heterochromatic epigenetic marks are set up by siRNA instead of SU(VAR) in Arabidopsis, in a more accurate and sequence-specific manner than the SU(VAR)-mediated process in Drosophila.
6.1.4 Heterochromatin is an Epigenetic State
An in-depth review of heterochromatin can be found in Grewal and Jia (2007) but, in summary, four main epigenetic features are associated with a repressed chromatin state: DNA methylation (in all three sequence contexts), 24nts siRNA (RdDM), histone methylation at H3K9me2 and H3K27me1, and histone hypoacetylation at H4K16. The exact functions of histone variants are as yet largely unknown, except for the centromeric H3 variant that will be discussed below. To date there is no clear understanding of the various steps leading to heterochromatin assembly, and this is further complicated by the fact that many processes are interconnected. As an example, DNA methylation, histone modifications and RdDM are tightly linked processes. Indeed, if the met1 mutant context is maintained over several generations, RdDM becomes misdirected. The data suggest that CG methylation directs non-CG and H3K9 methylation and is essential for long-term inheritance of epigenetic information (Mathieu et al., 2007).
6.2 Heterochromatin Organization
In the previous section, the main properties of heterochromatin were described in terms of sequences and chromatin features. These structural features clearly highlight the fact that heterochromatin distribution is not random but rather clustered at centromeres, pericentromeres, telomeres and knobs, although repeated sequences can also be found within euchromatic chromosome arms, especially when the genome size increases, as in maize or wheat (Figure 6.1a). However, this linear distribution of heterochromatin along the chromosome does not account for its real organization within the nucleus. Strikingly, the very first investigations of chromosome organization within interphase nuclei in Arabidopsis indicated that chromosomes are not arranged in a linear way but rather are distributed within the nucleus with a special location of heterochromatin near the nuclear periphery and around the nucleolus (Fransz et al., 2002). These localizations turned out to reflect specific functional aspects required for proper genome expression. Since then, many nuclear sub-compartments have been described, and this nuclear organization of chromatin and sub-compartments is collectively referred to as the ‘Nuclear Architecture’ (Lanctôt et al., 2007). This section will review our knowledge regarding nuclear architecture, focusing on heterochromatin, and will give possible mechanisms to explain the positioning of heterochromatin within the nucleus.
6.2.1 Heterochromatin and Nuclear Architecture
6.2.1.1 Chromosome Territories in Arabidopsis
Early studies of chromosome spatial organization came from the comparative analysis of active (Xa) and inactive (Xi) X chromosomes in humans (Eils et al., 1996). The results suggested that Xa territories exhibited a larger surface than Xi territories, which should be in a more condensed heterochromatic domain. According to this model, chromosomes would then occupy discrete nuclear domains known as Chromosome Territories (CTs). CTs are an evolutionarily conserved structural feature of chromosome organization among plants and animals and are believed to have functional significance for appropriate genome expression. Chromosome painting was performed on the five Arabidopsis chromosomes using 669 Bacterial Artificial Chromosomes (Figure 6.3a). Chromosome Territories were then defined in Arabidopsis interphase nuclei in which each CT corresponds to about 25Mbp of DNA, a number very similar to the physical chromosome size ranging from 17 to 25Mbp. Chromosome Territories occur in differentiated and actively dividing cells such as meristematic cells, in various tissues including roots and leaves and in cells of different ploidy levels (Pecinka et al., 2004). Positioning of CTs is random, meaning that there is no specific pattern of arrangement with respect to the Nuclear Envelope (NE) or between various CTs, except for NOR, which are clustered at the nucleolus (Berr and Schubert, 2007; Schubert and Shaw, 2011). The nucleolus was found to be localized in most cases at or near the centre of the nucleus. The nucleolus also represents a significant volume, excluding the remaining chromatin to a more peripheral position within the nucleus, and thus influences the organization of CTs. In addition, CTs are constrained by nuclear shape and nuclear size (De Nooijer et al., 2009).
6.2.1.2 Chromocentres and the Rosette-Loop Model of Chromatin Organization
Each interphase chromosome of Arabidopsis consists of a single heterochromatic domain, or chromocentre (CC), from which euchromatic loops of about 0.2–2 Mbp emanate (Van Driel and Fransz, 2004). This has been called the ‘Rosette-loop model’ of heterochromatin (Figure 6.3b). One possible mechanism of loop assembly may rely on pericentric heterochromatin, which includes ancient transposable elements used as a nucleation site to recruit homologous sequences scattered through the chromosome arms (Fransz and De Jong, 2011). The small genome size of Arabidopsis provides a unique model species where most of heterochromatin is organized in chromocentres that can be easily followed by light microscopy as dense chromatin foci (Figure 6.3c). Usually, six to ten CCs can be scored in somatic cells because two or more CCs can fuse together (Fransz et al., 2002; Fang and Spector, 2005; Berr and Schubert, 2007). Chromocentres contain epigenetic markers for silent chromatin, such as DNA methylation and repressive histone marks. In contrast, euchromatic loops are enriched in gene-coding regions and contain acetylated histones as well as H3K4me (Fransz et al., 2002). Together, a given CC and associated euchromatic loops form a chromosome territory.

Euchromatin loops can extend from the heterochromatin domains and form either a single rosette structure or multiple rosettes per chromosome, depending on the genome size. In Arabidopsis, centromeres are randomly distributed in peripheral positions and near the nucleolus (Fransz et al., 2002). Using the centromeric H3 variant CENH3 fused to a fluorescent protein, Fang and Spector (2005) clearly showed that centromeres, as part of the CC, localize predominantly at the nuclear periphery. Further information on CC assembly was gained by studying heterochromatin in protoplasts. Indeed, CCs are virtually absent from protoplasts except for heterochromatic foci formed at 45S rDNA. Reduction of pericentric heterochromatin is not accompanied by a global loss of histone H3K9me2, indicating that the H3K9me2 histone mark is not sufficient on its own to induce compaction of heterochromatin. When heterochromatin was followed during a longer period of protoplast culture, reformation of CCs could be observed. The repetitiveness of the sequences explains the kinetics of CC reformation, starting with 45S rRNA genes (sub-telomeric NOR), followed by the 180 bp repeats (centromeric repeats) and the 5S rDNA genes (pericentric repeats) (Tessadori et al., 2007). Spatial positioning of Arabidopsis centromeres was similar in several diploid cell types including guard cells, small leaf epidermal pavement cells, root meristematic cells and sepal and petal epidermal pavement cells (Fang and Spector, 2005). However, CC conformation changes have been documented, for instance in endosperm, where an Endosperm-Specific Interspersed (ESI) heterochromatin has been described (Baroux et al., 2007). In this tissue, the Rosette model still stands but with a more relaxed conformation. Another situation in which CCs are altered is the use of epigenetic modifier mutations such as ddm1 and met1. In both mutants, hypomethylation is observed as well as a reduction of CC size (Soppe et al., 2002).
Chromocentres can still be detected in the null met1-3 mutant, where both H3K9me2 and CG methylation are lacking, indicating that those epigenetic marks are not essential for initial heterochromatin assembly (Tariq et al., 2003). Using one specific LacO transgenic line called EL702C, which contains three copies of the LacO arrays in two separate loci located on chromosome three, Pecinka et al. (2005) found that LacO tandem repeats are often co-localized with chromocentres. This co-localization has also been observed for other tandem arrays such as those of transgenic multicopies of the HYGROMYCIN PHOSPHOTRANSFERASE (HPT) locus (Probst et al., 2003). In EL702C, LacO arrays are marked with H3K9me2 but a mutation at SUVH4/KYP does not alter their co-localization with the chromocentre (Jovtchev et al., 2011).
Altogether, these studies indicate that H3K9me2 is not responsible for CC formation. Instead, the repeated nature of heterochromatic sequences, histone variants or siRNA could play an important role in heterochromatin assembly while histone marks and DNA methylation would help to maintain and reinforce heterochromatin compaction. Interestingly, Cavalier-Smith (2010) proposed that heterochromatin first evolved from the characteristic folding of repeated sequences at centromeres. This process would have been subsequently reinforced by the emergence of a specific centromeric histone. siRNA would then be viewed as a mediator of the recognition between heterochromatic sequences and heterochromatin-establishing proteins.
6.2.1.3 Chromatin Organization in Large Genome Species
Chromocentres are not specific to Arabidopsis, although in this species they are easily visualized. Of 67 plant species studied, most display chromocentres in various numbers (Ceccarelli et al., 1998). Organisms with medium-sized genomes (500–3000Mbp), such as tomato, contain many heterochromatin domains along their chromosomes. In interphase nuclei, chromosomes form decondensed loops in a rosette-like structure around one or a few chromocentres along the chromosome. In mouse, chromocentres are more internally positioned than in Arabidopsis (Fedorova and Zink, 2008). In organisms with very large genomes (more than 3000Mbp), such as barley (5000Mbp) and wheat (16000Mbp), in which a large proportion of the genome consists of transposable elements, chromosomes are more condensed. Therefore, the chromocentre model does not really apply in these species (or at least there is a high number of very small chromocentres) and a high number of chromatin loops occur over the complete length of the chromosome (reviewed in Van Driel and Fransz, 2004). In some species or tissues, chromosomes are in ‘Rabl’ conformation – i.e. telomeres at one side of the nucleus and centromeres at the other side (Rabl, 1885). The Rabl configuration seems to keep the anaphase configuration of chromosomes in a ‘frozen’ state until the next cell cycle (Figure 6.3d). This organization of the chromosomes clearly supports the existence of links between centromeres and telomeres and the NE. These links will be discussed in the following section. In wheat and its close relatives, the interphase chromosomes adopt a highly regular Rabl configuration with centromeres and telomeres located at opposite poles of the nuclei (Heslop-Harrison et al., 1990). Peter Shaw's group investigated whether heterochromatic marks such as methylation and acetylation could affect the Rabl organization of chromosomes.
For that purpose they used 5-azacytidine, which reduces DNA methylation, and Trichostatin A, an inhibitor of histone deacetylase (HDAC), and in both cases the Rabl configuration was not altered, indicating that the links between centromeres, telomeres and the NE are rather strong (Santos et al., 2002). It is quite difficult to draw a clear picture of why a given species adopts the Rabl conformation, as great variability is observed between species and among tissues or developmental stages. A correlation may exist between chromosome length and organization, where species with large chromosomes adopt the Rabl conformation. Exceptions to this are, however, yeast and Drosophila, which have small chromosomes in the Rabl conformation, whereas Arabidopsis, which also has small chromosomes, does not adopt it.
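The genome size classes used in this section can be written down as a simple decision rule. The sketch below is illustrative only: the 500Mbp and 3000Mbp boundaries are those quoted in the text, treating everything below 500Mbp as ‘small’ is an assumption made here, and the function name is hypothetical. As the exceptions discussed above show, genome size alone does not predict chromatin organization.

```python
def genome_size_class(size_mbp):
    """Assign a genome to the size classes used in the text.

    Boundaries (500 and 3000 Mbp) follow the text; labelling genomes
    below 500 Mbp as "small" is an assumption of this sketch.
    """
    if size_mbp <= 0:
        raise ValueError("genome size must be positive")
    if size_mbp < 500:
        return "small"
    if size_mbp <= 3000:
        return "medium"
    return "very large"


if __name__ == "__main__":
    # Genome sizes in Mbp as quoted in the text.
    examples = {"Arabidopsis": 150, "barley": 5000, "wheat": 16000}
    for species, size in examples.items():
        print(f"{species}: {size} Mbp -> {genome_size_class(size)}")
```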
6.2.2 Recruitment of Heterochromatin at the Nuclear Periphery
Using electron microscopy, heterochromatin has been defined as a dense structure concentrated close to the NE in most eukaryotic nuclei including plants and animals, as illustrated in Figure 6.3e. In Arabidopsis, chromocentres clearly co-localize with the NE whereas some are found in the vicinity of the nucleolus (Figure 6.3b). Although preferential localization to the nucleolus has been ascribed to the presence of the NOR at the tips of chromosomes two and four (reviewed in Fransz and De Jong, 2011), little is known about the preferential localization of the remaining chromocentres at the NE. Links between heterochromatin and the NE should therefore exist, and the NE has a special place within this review as it is believed to participate in chromatin anchorage in plants. A more exhaustive description of NE composition and structure is given in Chapter The Nuclear Envelope – Structure and Protein Interactions and the NPC is described in Chapter The Plant Nuclear Pore Complex – The Nucleocytoplasmic Barrier and Beyond. This section will focus on chromatin-NE interactions. Over the past decades, many interactions involving the NE and DNA/chromatin have been described, especially in animals, but with similarities in plants. These data may provide insights into possible mechanisms for maintaining heterochromatin close to the NE in plants.
6.2.2.1 The Central Role of Lamins in Animals
From a cytological point of view, heterochromatin is in the immediate vicinity of the internal face of the NE (the Inner Nuclear Membrane, INM) and in contact with the lamina, a thin meshwork made of lamins A, B and C (Figure 6.3f). Lamins belong to the large coiled-coil family of proteins and form a scaffold, which connects the various NPCs and lends mechanical strength to the nucleus. In humans, the premature aging disease Hutchinson–Gilford progeria syndrome (HGPS) is caused by a mutant lamin A (LAΔ50). In cells expressing LAΔ50, nuclei are abnormally shaped and display a loss of heterochromatin. In humans, H3K27me3 identifies facultative heterochromatin, whereas H3K9me3 and H4K20me3 mark constitutive heterochromatin. Consistent with the loss of heterochromatin, constitutive and facultative heterochromatic marks are altered in LAΔ50 (Shumaker et al., 2006). In vitro, lamin B has some affinity for H2A but not for the other core histones (see Mattout et al., 2007 and references therein), and interactions between lamins and chromatin may involve other protein factors (see below).
6.2.2.2 The Inner Nuclear Membrane and Heterochromatin
At least 80 unique proteins have been found localized to the NE in animal cells (Tzur et al., 2006), but few of them have been well characterized. These include KLARSICHT/ANC-1/SYNE HOMOLOGY (KASH), LAMIN B RECEPTOR (LBR), LEM proteins (LAMINA-ASSOCIATED POLYPEPTIDE-EMERIN-MAN1), SAD-1/UNC-84 (SUN) and NESPRIN (see Chapter The Nuclear Envelope – Structure and Protein Interactions in this volume). LBR is an integral NE protein interacting both with lamins and heterochromatin. Its complex function and structure have recently been reviewed in Olins et al. (2010). In vitro it binds directly to DNA within the linker region between nucleosomes and interacts with a large number of partners including HP1 and Methyl-CpG-Binding Protein 2 (MeCP2) (Olins et al., 2010). HP1 has a central role in the assembly of heterochromatin in other organisms such as yeast and Drosophila, where it recognizes histone H3K9 methylation, while MeCP2 recognizes methylated cytosine at the DNA level. Strikingly, LBR loss of function leads to HP1α mislocalization. Whether this is a direct or indirect effect remains to be determined. As previously stated (see Section 2.1.3), chromosome organization in mouse chromocentre-like structures recalls the rosette-loop model adopted by the Arabidopsis chromosomes. Indeed, the LBR knock-out mouse displays a reduced number of chromocentre-like structures, which are more diffuse, suggesting a more relaxed state of heterochromatin (Cohen et al., 2008).
Another interesting NE component was shown to directly interact with centromeric heterochromatin in fission yeast Schizosaccharomyces pombe. In this species, centromeres are connected to the primary microtubule organizing centre (MTOC; or spindle pole body (SPB), equivalent to the animal centrosome) through interactions between a SUN domain protein called SPINDLE POLE BODY-ASSOCIATED PROTEIN1 (SAD-1) and two KASH domain proteins KARYOGAMY MEIOTIC SEGREGATION PROTEIN1 and 2 (KMS1 and KMS2). Connections are stabilized by the INM protein ISOMALTASE 1 (IMA1) (reviewed in Mekhail and Moazed, 2010). SUN and KASH proteins are part of the LInker of Nucleoskeleton and Cytoskeleton (LINC) complex connecting the cytoskeleton and the nucleoskeleton (see Chapter The Nuclear Envelope – Structure and Protein Interactions; see also Figure 6.3f in the current chapter). Recently, the first KASH protein has been discovered in plants and has been shown to interact with SUN domain proteins (Zhou et al., 2012). Further investigations will be needed to establish if SUN domain proteins or other LINC-like complexes are used in Arabidopsis to link the NE and the chromocentres, offering one possible mechanism of chromocentre positioning in Arabidopsis.
6.2.2.3 Heterochromatin Positioning in Plants
In plants, no functional lamina homologue has been described so far, although heterochromatin still localizes peripherally. Electron microscopy performed on tobacco BY-2 cells identified a filamentous structure closely attached to the INM and linked to NPCs. The organization and dimensions of these filaments are very similar to those observed for the lamina in Xenopus laevis oocytes used as a control in the experiments (Fiserova et al., 2009). These data suggest the existence of a lamina-like structure in plants but are not definitive proof of its existence. As lamins belong to the coiled-coil family, several researchers then investigated this large gene family in plants. In tomato, the Meier group identified MATRIX ATTACHMENT REGION (MAR) BINDING FILAMENT-LIKE PROTEIN1 (MFP1, Gindullis and Meier, 1999a), MFP1 ASSOCIATED FACTOR1 (MAF1, Gindullis et al., 1999b), a serine/threonine-rich protein interacting with MFP1, and the FILAMENT-LIKE PLANT proteins (FPP family, Gindullis et al., 2002), which also interact with MAF1. MAF1 was subsequently shown to be the first WPP-domain protein conserved in Arabidopsis. However, WPP-domain proteins were recently shown to be part of the plant NPC and not directly connected with heterochromatin (Meier et al., 2010). A similar approach was developed in carrot, in which a coiled-coil protein called NUCLEAR MATRIX CONSTITUENT PROTEIN1 (NMCP1) was identified as a plant-specific insoluble nuclear protein localized at the nuclear periphery. Using NMCP1 as a starting point for a reverse genetics screen, Dittmer et al. (2007) identified a new family of proteins in Arabidopsis called LITTLE NUCLEI 1-4 (LINC1-4), also known as CROWDED NUCLEI 1-4 (CRWN1-4). Disruption of either the LINC1 or LINC2 gene induces a dwarf phenotype and causes a reduction in nuclear size associated with an increased DNA compaction.
Moreover, combining the linc1 and linc2 mutations reduced the number of chromocentres about two-fold (8.6 compared to 4.6 chromocentres per nucleus in wild type and double mutant respectively). This suggests that barriers against association between distinct chromocentres are lowered, resulting in the fusion of some of the chromocentres. Altogether, the linc1 linc2 double mutant displays a reduced nuclear size, an increased DNA compaction and a reduced number of chromocentres. One possible explanation could be that the nuclear shape alters chromatin organization or vice versa. Although the function and structure of the LINC proteins may recall those of animal lamins, further work is needed to determine whether they influence NE structure or modify links between chromatin and the NE.
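The chromocentre counts reported for the linc1 linc2 double mutant can be checked with simple arithmetic. The Python sketch below is only an illustration: the function name is hypothetical, and the two means are the values given in the text (8.6 chromocentres per nucleus in wild type, 4.6 in the double mutant).

```python
def fold_reduction(wild_type_mean: float, mutant_mean: float) -> float:
    """Return how many times lower the mutant mean is than the wild-type mean."""
    if mutant_mean <= 0:
        raise ValueError("mutant_mean must be positive")
    return wild_type_mean / mutant_mean


# Mean chromocentre numbers per nucleus, as given in the text.
WILD_TYPE = 8.6
LINC1_LINC2 = 4.6

fold = fold_reduction(WILD_TYPE, LINC1_LINC2)
percent_lost = 100 * (WILD_TYPE - LINC1_LINC2) / WILD_TYPE

print(f"fold reduction: {fold:.2f}")
print(f"chromocentres lost: {percent_lost:.0f}%")
```

The ratio comes out slightly below two, so "two-fold" is a rounded figure: a little under half of the chromocentres are no longer counted as separate entities in the double mutant.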
6.2.3 Higher Order of Chromatin Organization
Spatial positioning of heterochromatin in rosette-loops at the chromocentre suggests that heterochromatin can bring distant loci closer together. For Arabidopsis, one possible mechanism has already been proposed in Section 2.1.2, in which transposable elements on chromosome arms are recruited to the chromocentre (Fransz and De Jong, 2011). In other species, dedicated sequences and proteins with similar properties, able to fold chromatin into loops, have been identified. In Drosophila, boundary element-binding proteins such as CCCTC-BINDING FACTOR (CTCF), SUPPRESSOR OF HAIRY-WING (Su(Hw)), ZESTE WHITE 5 (ZW5), BOUNDARY ELEMENT ASSOCIATED FACTOR (BEAF) or CENTROSOMAL PROTEIN 190 (CP190) are involved in chromatin organization (Bushey et al., 2008) and some of them have been implicated in chromatin loop formation (Blanton et al., 2003). Some of these proteins were hypothesized to interact with the nuclear lamina, recruiting the boundary elements in a way very similar to that described for heterochromatic sequences of the rosette-loop model (Gerasimova et al., 2000). Boundary elements and their binding partners are thus among the possible mechanisms responsible for forming chromatin loops and defining chromosome territories in eukaryotes.
6.2.3.1 Boundary Elements
6.2.3.1.1 Drosophila and Animal CTCF
In order to recover an enriched fraction of NE-associated chromatin, lamin was used in human and Drosophila cell cultures as a fusion with Dam methylase from Escherichia coli, the so-called ‘DamID’ method (Van Steensel et al., 2001). The lamin moiety tethers the fusion protein to the NE, where Dam methylates adenines in the DNA sequences lying close to it, so that NE-associated sequences can be recovered through their adenine methylation. This led to the discovery of Lamina-Associated Domains (LADs) of about 0.5 to 1 Mb each (Pickersgill et al., 2006; Guelen et al., 2008). In Drosophila Kc cells, these domains contain essentially silent genes characterized by a low level of H3K4 methylation and low H4K16 acetylation (see Figure 6.2a). Careful analyses revealed that CTCF-binding sites are at the edges of LADs (Pickersgill et al., 2006). CTCF is a zinc finger protein binding CCCTC motifs; it is well known as both an insulator (enhancer-blocking property) and a boundary element (blocking repressive effects mediated by heterochromatin) (reviewed in Gaszner and Felsenfeld, 2006). To date no CTCF homologues have been identified in plants.
6.2.3.1.2 Transcription Factor IIIC
How do organisms that lack CTCF homologues accomplish the same goal? In yeast, the transcription factor IIIC (TFIIIC) could have a similar function. TFIIIC is a factor associated with RNA Polymerase III (Pol III) and has been linked to insulator activity. TFIIIC binds B-Box elements found in Pol III promoters such as those of tDNA, rDNA, U6 snDNA or SINE transposable elements, all transcribed at the nucleolus. TFIIIC recognizes the 274 tRNA genes distributed throughout the 16 chromosomes and could contribute significantly to organizing the yeast genome by tethering clusters of tRNA genes to the nucleolus. This can be viewed as an alternative mechanism to establish or maintain loops, which bring loci from distinct chromosomal positions closer together. As an example, in fission yeast, the transition from CENP-A centric heterochromatin to the outer repeats coincides with the presence of two to four tRNA genes, which may define the boundaries of centromeres (reviewed in White and Allshire, 2008).
Genome-wide studies identified a second type of TFIIIC binding site not associated with RNA Pol III binding. While RNA Pol III-associated TFIIIC loci are recruited at the nucleolus, such sites are most often anchored to the NE and defined as ‘Chromosome Organizing Clamp’ (COC) (Noma et al., 2006). The recent discovery of hundreds (possibly even thousands) of COC sites in the human genome points towards an important, conserved function for these sites in organizing eukaryotic genomes (Moqtaderi et al., 2010; Oler et al., 2010).
6.2.3.2 Condensin and Cohesin
Condensins and cohesins, two chromosome scaffold proteins, are members of the Structural Maintenance of Chromosomes (SMC) family of proteins. They act respectively on chromosome compaction prior to and during mitosis and cohesion between the sister chromatids upon replication during the S phase until mitosis (reviewed in Wood et al., 2010). Arabidopsis cohesins (SMC1 and 3) and condensins (SMC2 and 4) have been identified and mutants in SMC1, 2 and 3 display a similar ‘Titan’ (TTN) phenotype: giant endosperm nuclei and arrested embryos with a few small cells (Liu et al., 2002). One of the functional roles of TTN is to provide normal microtubule function during seed development, which may explain the enlargement of the nuclei in this tissue. Strikingly, DMS3, another Arabidopsis SMC-like protein, was shown to be involved in the RdDM connecting proteins with nuclear scaffold function and epigenetic regulation (Kanno et al., 2008).
In animals, SMC complexes are bound to chromatin all through the cell cycle and recent investigations have highlighted implications of this binding in some aspects of genome organization. First, cohesin was shown to co-localize with CTCF binding sites. It is even hypothesized that the boundary function first described for CTCF may be achieved by cohesin. Cohesin may be recruited at CTCF binding sites and subsequently bring together unlinked loci to form DNA loops. The current hypothesis would then be that CTCF defines binding sites for cohesin, which in turn induces DNA topology responsible for insulator and/or boundary effects (Wendt et al., 2008). Second, genome-wide studies identified condensin-binding sites at tRNA loci. A mutation in the condensin sub-units induces a loss of preferential localization of tRNA loci to the nucleolus. As for cohesin, it is suggested that condensin facilitates tRNA clustering to the nucleolus by participating in long range interactions between distant chromosome sites (D'Ambrosio et al., 2008).
6.2.3.3 Matrix Attachment Regions
Matrix Attachment Regions (MARs) are A-T rich repeated DNA sequences found in animals and plants. At least 12 proteins were described as binding at MARs in various organisms, including lamins, histone H1, topoisomerase II, as well as plant-specific factors already discussed such as MFP1 and MAF1, which are components of the NE (Wang et al., 2010). In Mus musculus, another MAR binding factor called SPECIAL AT-RICH BINDING PROTEIN1 (SATB1) is responsible for the repression of numerous genes mediated by deacetylation at histone H3K9, meaning that MARs can also be defined at the epigenetic level (Cai et al., 2003). MARs are attached to the nuclear scaffold and are believed to organize metaphase chromosomes into rosette-like structures forming loops. It is worth noting that transposable elements can be recovered, to a certain extent, as MARs (Tikhonov et al., 2001), in accordance with their possible function in the rosette-loop model in Arabidopsis. The attachment sites of the loops are hypersensitive to DNaseI treatment and co-localize with Topoisomerase II binding sites. Using these two criteria, MARs were identified in plants, in which they form loops of various sizes ranging from 25 kb in Arabidopsis to 45 kb in maize (Paul and Ferl, 1998). In silico analyses predicted smaller loops of around 6–7 kb, as 21 705 MAR sequences were detected across the Arabidopsis genome by Rudd et al. (2004). Among them, 10% lie within genes, where they may influence gene expression and confer tissue, organ, and developmental specificity of gene expression in Arabidopsis (Tetko et al., 2006). Finally, in vertebrates, the β-globin insulator element that binds CTCF can also be recovered as a MAR (Yusufzai and Felsenfeld, 2004). The authors clearly ruled out the possibility that all the sequences isolated as MARs are also insulator elements (there can be as many as 30 000 to 80 000 MARs in the human genome). Alternatively, CTCF may have some as yet unknown function in MARs.
The exact significance of MARs is still elusive and their actual relationship with the rosette-loop model in Arabidopsis remains to be defined.
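The in silico loop size of about 6–7 kb follows from dividing the genome length by the number of predicted MARs. The Python sketch below illustrates this arithmetic only. The function name is hypothetical, and the genome sizes are assumptions that are not given in the text (125 Mb and 135 Mb are used here as illustrative values for Arabidopsis); only the MAR count comes from the text.

```python
def mean_spacing_kb(genome_size_bp: float, n_sites: int) -> float:
    """Mean distance between sites spread evenly over a genome, in kilobases."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return genome_size_bp / n_sites / 1000


# Number of predicted MAR sequences (Rudd et al., 2004, as cited in the text).
N_MARS = 21705

# ASSUMPTION: these genome sizes are not stated in the text; they are
# illustrative values for the Arabidopsis genome and can be changed.
ASSUMED_GENOME_SIZES_BP = {"125 Mb": 125e6, "135 Mb": 135e6}

spacings = {
    label: mean_spacing_kb(size, N_MARS)
    for label, size in ASSUMED_GENOME_SIZES_BP.items()
}

for label, kb in spacings.items():
    print(f"{label}: about {kb:.1f} kb between predicted MARs")
```

Under these assumed genome sizes the mean spacing is close to 6 kb, the same order of magnitude as the predicted loops and several times smaller than the 25 kb loops measured experimentally. The calculation treats MARs as evenly spaced, which is a simplification.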
6.2.3.4 Future Prospects in Plants
In summary, the interactions that allow heterochromatin to form chromatin loops may rely on putative structural elements such as transposable elements, boundary elements or MARs, on protein complexes such as the SMC or TFIIIC complexes, or on direct interaction with the NE or with a lamina-like structure (Figure 6.3f). In plants, no boundary or insulating elements similar to CTCF or other Drosophila insulators have been identified. Boundary elements may not be needed and alternatively, as proposed by Lippman et al. (2004), an siRNA-dependent mechanism may be sufficient to fulfil the function of a boundary element by defining well delimited chromatin domains.
6.3 Functional Significance of Heterochromatin Positioning
The structure of heterochromatin led us to discover that it is composed of repeated sequences and constitutes a specific chromatin state distinct from euchromatin. The study of its organization has revealed that heterochromatin is grouped in specific regions of the chromosome and that this organization may participate in the positioning of heterochromatin at the periphery of the nucleus and euchromatin at a more internal part of the nucleus. If this positioning is a biological reality, it is probably relevant for some functional properties of the cell. In the following section, three main roles in cell functions will be reviewed: chromosome segregation, transcriptional regulation and genome protection against instability. Whether these functions are dependent upon the peripheral positioning of heterochromatin will be discussed.
6.3.1 Centric Heterochromatin Directs Chromosome Segregation
Clearly, centromere positioning at the nuclear periphery is an essential process to achieve chromosome segregation. Thus mechanisms dragging chromosomes to each side of the cell can yield clues about one of the possible mechanisms for positioning chromatin at the nuclear periphery. At the sequence level, centromere identity relies on particular chromatin specificities made of dedicated repeated sequences (satellite DNA in plants) and involving small non-coding RNA and specific histone variants such as CenH3/CENP-A and H3.1.
During mitosis and meiosis, centromeric heterochromatin is attached to spindle microtubules through the kinetochore, a proteinaceous structure connecting the centromere and the spindle. CENP-A together with CENP-C are required to build up a fully functioning kinetochore and thus have a central role in the centromeric function in chromosome segregation (Verdaasdonk and Bloom, 2011). In Arabidopsis, the 180bp satellite is believed to define a key centromeric feature of all centromeres. Indeed 180bp arrays bind the centromeric histone variant CenH3 but in extended chromatin fibres studied by FISH, only 10–12% of the repeats bind CenH3. These were compacted chromatin regions forming knobs and were hypothesized to be sites of kinetochore formation (Shibata and Murata, 2004). What makes some of the 180bp competent to bind CenH3 and to form kinetochores remains to be elucidated. CenH3 mutants have been recently obtained and this will provide the opportunity to investigate the functions of the centromeric histone variant in chromocentre positioning (Lermontova et al., 2011).
Once kinetochores are formed, they are connected to spindle microtubules and allow chromosome segregation. It is worth noting that, in plants, spindle attachment does not rely on specialized organelles (centrosomes in animals). Instead, plants are believed to follow a specific mechanism of spindle formation known as ‘spindle self-organization’ relying on the Ran pathway (reviewed in Zhang and Dawe, 2011). Surprisingly, in this model, histone H1 acts as a microtubule organizing factor at the nuclear periphery, and indeed histone H1 was shown in tobacco BY-2 cells to be localized at the nuclear periphery (Nakayama et al., 2008). The histone H1 identified in this experiment was shown to be similar to one of the two main histone variants from tobacco BY-2 cells. The authors observed that histone H1 and DNA do not co-localize, suggesting that the recognized histone H1 variant may, in this case, have a function distinct from the classical linker histone association.
In Arabidopsis, H1 variants were characterized and shown to be encoded by three genes. H1-1 and H1-2 share extensive homology with each other whereas H1-3 is a more divergent variant related to a drought-inducible class of genes (Wierzbicki and Jerzmanowski, 2005). RNAi plants with a ∼90% decrease in the overall level of histone H1 display developmental phenotypes, with stochastic alterations of DNA methylation at various loci including the 180bp repeats, 5S rDNA, transposable elements and FLOWERING WAGENINGEN (FWA). It is currently difficult to reconcile the two reported functions of histone H1, namely that of a linker histone involved in chromatin compaction (reviewed in Jerzmanowski, 2007) and its microtubule organizing function (Nakayama et al., 2008). This may be solved in the future by carefully looking at each H1 variant to address its specific functions and localization within the nucleus.
In most plant species, centromeres are close to the NE whether or not they display the Rabl configuration. This close proximity to the NE favours rapid interaction with microtubules during NE breakdown. Further investigations are needed to define clearly the function of histone variants and ncRNA or unknown factors in centromere positioning. This may be of importance for deciphering one of the possible mechanisms to achieve heterochromatin positioning.
6.3.2 Spatial Positioning of Heterochromatin Affects Transcriptional Activity
As previously discussed, chromosomes are spatially organized in interphase nuclei and heterochromatin tends to be situated next to the NE and the nucleolus (see Figure 6.3). As heterochromatin represents a repressive state of chromatin, it has long been suggested that transcription may be repressed at the NE and this has been well described for telomeric silencing in yeast (Akhtar and Gasser, 2007). Yeast and mammals were largely used to investigate the effect of heterochromatin on gene expression but also the effect of gene tethering to the nuclear periphery or to the nucleolus (Towbin et al., 2009; Sáez-Vásquez and Gadal, 2010). However, this turned out to be a difficult task, as the effect of perinuclear positioning on transcription is highly dependent upon the chosen locus, how it is recruited to the nuclear periphery and how it is positioned relative to various NE components such as the NPC. What about plants? One example came from the study of the FWA gene (Soppe et al., 2000). In its wild-type configuration, FWA is methylated at the DNA level whereas the mutants fwa-1 and fwa-2 display a complete loss of DNA methylation associated with a late-flowering phenotype. FWA contains in its promoter region some sequence repeats that were mapped as the key determinant of this epigenetic regulation. It is interesting to note that FWA is among the 33 genes of the hk4S translocation (see Section 1.3). As described above, most genes included within hk4S are not altered by the spreading of repressive marks from the nearby transposable elements, as only very slight differences can be recorded in their DNA methylation profiles. However, using the ddm1 mutant background, Lippman et al. (2004) identified a derepression of FWA at the transcriptional level, indicating that genes can be modulated by DNA repeats such as transposable elements but only when these repeats are located close to their transcription units.
Is positioning of FWA responsible for this derepression? The position (inside or outside a given chromosome territory) of the FWA gene was then determined when FWA was in the active (ddm1 background) and in the silent state (wild type background). Unfortunately, no significant difference in the nuclear position of the FWA locus could be identified (Pecinka et al., 2004). Of course this is only one example and, as stated before, the effect of spatial positioning may not apply to every gene or may be dependent upon chromosomal position or sequence environment.
In an attempt to get a more general tool to investigate the function of spatial positioning in gene activity, Eric Lam's group engineered a set of 277 transgenic lines containing a lac operator (LacO) and expressing Luciferase under the CaMV 35S promoter (Rosin et al., 2008). Each locus has been characterized for its chromosomal position and its level of gene expression through the LUC activity. LacO arrays can then be targeted using fluorescent protein fused to the Lac repressor (LacI) in order to track chromosome positioning (a given LacO transgene) in living plants. A similar approach has also been developed by the Matzke group using the tet operator (Matzke et al., 2005). Using ddm1, met1 and 5-azacytidine, they showed that changing the epigenetic status can, for some transgenes, decrease the Luciferase activity and modify the nuclear position of the LacO transgene within the nucleus (Rosin et al., 2008). However, this is not a general rule and the effect differs depending on the chromosomal position of the transgene array (reviewed in Lam et al., 2009).
Gene regulation by spatial positioning is a new field of investigation in plants. Tools to study transgene tethering to sub-nuclear localizations are becoming available and future investigations will be needed to establish whether the nuclear periphery has a repressive or an activating effect on gene expression.
6.3.3 Heterochromatin Positioning Protects against Genome Instability
In most organisms, pericentric heterochromatin contains inactive transposable elements that are used to silence the homologous euchromatic copies. Transposable elements are then mainly in a repressed state, controlled by the RdDM pathway in plants. One clear consequence of this genomic organization is that pericentric heterochromatic repeats protect the genome against transposable element mobilization, which would otherwise lead to genome mutagenesis. The repressed state can be released in some epigenetic mutants such as ddm1 or met1, in which transposable element mobilization can then be observed (Mirouze et al., 2009; Tsukahara et al., 2009). In ddm1, there is a clear correlation between heterochromatin decondensation illustrated by a reduced number of chromocentres and derepression of heterochromatic sequences.
One striking feature of heterochromatin not yet investigated in plants is that of rDNA, as highlighted by recent yeast studies. This is an essential gene encoding ribosomal RNA but it also belongs to heterochromatic sequences due to its tandem repeat organization (see Section 1 and Figure 6.1). rDNA is found in all organisms from prokaryotes to eukaryotes, where it is organized in tandem arrays, meaning that a gene amplification system maintains rDNA cluster(s) in a tandemly repeated organization. Each species displays a given number of copies: seven in Escherichia coli, 150 in Saccharomyces cerevisiae, 350 in Homo sapiens, at least 1000 in Arabidopsis and up to 12 000 in Zea mays. rDNA arrays are an evolutionarily conserved means of producing a large amount of rRNA without increasing the transcription rate of a given gene but, surprisingly, not all the copies within an array are transcribed. Why are untranscribed units kept within the rDNA array? Unexpectedly, deletion of untranscribed rDNA copies induces an increased sensitivity of DNA to mutagenic agents. Silent extra rDNA copies were shown to facilitate condensin association and sister-chromatid cohesion, thereby facilitating recombination repair between sister-chromatids. Losing the extra copies thus reduces the capacity for DNA repair using the sister-chromatid. In Arabidopsis, pairing between sister-chromatids occurs at transgene repeats such as LacO or HPT. Homologous pairing is correlated with DNA hypermethylation at the array (Watanabe et al., 2005). Yeast genetics thus revealed one more role for condensin, which not only functions in DNA compaction but is also required for the attachment of the rDNA arrays of sister-chromatids to each other, with important implications for genome stability. This remains to be studied in plants.
6.4 Perspectives
The definition of heterochromatin relies on various criteria. In this review, three main criteria were used: structure, organization and biological functions (Table 6.1). Despite this, it is difficult to establish a general definition of heterochromatin that would suit all situations encountered. Some scientists will use the repeated nature of DNA sequences, its position on the chromosome, a combination of epigenetic marks or a sub-nuclear localization of chromatin. However, none of them are sufficient by themselves to define the heterochromatin state of chromatin. As an example, repeated gene families such as the NUCLEOTIDE-BINDING SITE–LEUCINE-RICH REPEAT (NBS-LRR) disease resistance genes (150 genes in Arabidopsis, 400 in rice – McHale et al., 2006) are organized in clusters but cannot be considered as heterochromatic loci; methylation at H3K9 can also be found at euchromatic loci and cannot be strictly assigned to heterochromatin; and recruitment at the nuclear periphery does not always lead to repression as actively transcribed genes may be recruited to NPCs. When considering well established heterochromatic sequences such as the pericentric 5S rDNA repeats at chromosome five, some repeats are clearly expressed whereas others remain repressed. Active and repressed 5S genes are associated respectively with active and repressive epigenetic marks although all these genes are within the same repeated array (Mathieu et al., 2003). Finally, it remains to be elucidated how to define heterochromatin in species with large genomes, such as wheat, in which a large number of transposable elements are scattered along the chromosome arms. It is therefore important to direct future research towards better defining heterochromatin. For instance, in Drosophila, sub-classes in relation to the various criteria set out above have been established (Filion et al., 2010).
Criteria | Heterochromatin Features | Heterochromatin Characteristics | Short Description | References |
---|---|---|---|---|
Structure | DNA sequences | Gene poor | Heterochromatin is made of repeated sequences | Cokus et al. (2008) |
Structure | DNA sequences | Contains satellites | 180bp repeats in Arabidopsis | Hosouchi et al. (2002) |
Structure | DNA sequences | Contains transposable elements | Mostly pericentric in Arabidopsis | Arabidopsis Genome Initiative (2000) |
Structure | DNA sequences | May form knobs | hk4S knob in Arabidopsis | Lippman et al. (2004) |
Structure | DNA sequences | Includes 5S and 45S rDNA | Subtelomeric 45S and pericentric 5S rDNA | Copenhaver and Pikaard (1996); Tutois et al. (1999) |
Structure | Epigenetic marks | High DNA methylation | CG, CHG and CHH methylation at cytosine residues | Cokus et al. (2008) |
Structure | Epigenetic marks | Contains repressive histone marks | Enriched H3K9me2, H3K27me1 | Liu et al. (2010) |
Structure | Epigenetic marks | Depleted in activating histone marks | Decreased H3K4me2, H4K16Ac | Probst et al. (2004) |
Structure | Epigenetic marks | Contains specific histone variants | H3.1, CenH3 centromeric variant (CENP-A homologue) | Ingouff and Berger (2009) |
Structure | Epigenetic marks | Is targeted by non-coding RNA | Heterochromatin targeting by RNA-Directed DNA Methylation (RdDM) | Law and Jacobsen (2010) |
Structure | Non-histone protein binding | Binds specific Su(var) proteins | Arabidopsis LHP1 has functions distinct from animal HP1 | Gaudin et al. (2001) |
Organization | Chromosome distribution | Located at centromere, telomere, pericentromeres, knob | See DNA sequences | Arabidopsis Genome Initiative (2000) |
Organization | Nuclear architecture | Involved in Chromosome Territories | Arabidopsis chromosomes are organized in CTs | Pecinka et al. (2004) |
Organization | Nuclear architecture | Forms chromocentre vs Rabl conformations | Arabidopsis chromosomes form chromocentres | Fransz and De Jong (2011) |
Organization | Nuclear positioning at NE | Enriched at nuclear periphery | Repressive context for transcription | Fang and Spector (2005); Andrey et al. (2010) |
Organization | Nuclear positioning at NE | Under-represented at NPC | Activating context for transcription in yeast | Arib and Akhtar (2011) |
Organization | Higher order of chromatin | May contain boundary elements, MARs | To be investigated in Arabidopsis | – |
Functions | Chromosome segregation | Forms centromere | Satellite, histone variants and epigenetic state are involved | Zhang and Dawe (2011) |
Functions | Transcription regulation | Is a silenced state of chromatin | Epigenetic repressive marks, chromatin compaction | Lam et al. (2009) |
Functions | Protection against genome instability | Repressed transposition and recombination | Transposable elements, rDNA (in yeast) | Mirouze et al. (2009); Watanabe et al. (2005) |
While heterochromatin components are well documented, the chronological events leading to its assembly are still largely unknown. Several studies suggest that histone methylation at H3K9 is not needed to build heterochromatin (Tariq et al., 2003; Tessadori et al., 2007; Jovtchev et al., 2011). Cavalier-Smith (2010) proposed that centric heterochromatin evolved first, the repeated nature being the primary determinant of heterochromatin, subsequently stabilized by the centromeric histone variant CENP-A. According to this hypothesis, plasmid partitioning in prokaryotes involves three simple components: a repeated DNA sequence (centromeric repeats in eukaryotes), a centromere binding protein (histone variants in eukaryotes), and an associated motor protein (microtubules in eukaryotes) to separate the segregating plasmids (Wilson and Dawson, 2011). In plants, it will be very relevant to study the role of the various histone variants to determine whether they are needed in the initial steps of heterochromatin assembly.
Nuclear architecture is a recent field of investigation in plants and is hypothesized to participate in the regulation of genome expression. Heterochromatin may be an important component of this mechanism by anchoring chromatin loops, a model that neatly explains the formation of chromocentres in Arabidopsis (Fransz and De Jong, 2011). Chromocentre positioning falls into two main classes: some chromocentres cluster at the nucleolus while others lie in close proximity to the NE. Nucleolar association can be explained by the fact that 45S rDNA repeats have to be actively transcribed within the nucleolus and, as a more general rule, genes transcribed by RNA Pol III and Pol I may be recruited to the nucleolus (Sáez-Vásquez and Gadal, 2010). However, the reason for the preferential association of chromocentres with the nuclear periphery remains elusive. It may be a remnant of chromosome segregation or relate to other functions discussed in Section 3. Furthermore, the mechanisms responsible for chromocentre positioning remain to be discovered in plants. Collectively, authors suggest that the nucleolus, because of its central position, excludes chromocentres from an internal localization, except for those bearing a NOR. Chromosome territories were shown to be randomly distributed but to form distinct areas within the nucleus, suggesting mutual exclusion between different territories (Pecinka et al., 2004; Berr and Schubert, 2007; De Nooijer et al., 2009; Andrey et al., 2010). Data from other species suggest specific interactions between NE components and heterochromatin, providing an alternative hypothesis that could explain heterochromatin positioning. Many issues are still unresolved: lamins and LBR have not yet been described in plants, and KASH proteins have only recently been discovered. The possible involvement of SUN and LINC proteins in NE-heterochromatin interactions also remains to be established.
In most species, heterochromatin is the main form of chromatin. Its recognized functional significance keeps growing as our understanding of cellular processes improves. Research into its assembly, maintenance and positioning will give us a better view of the processes involved in the regulation of genome expression.
Acknowledgements
The authors thank Aline Probst and David Evans for helpful discussions and comments on the manuscript.