Volume 48, Issue 3 pp. 219-228

Secondary structure and phylogenetic analysis of the internal transcribed spacers 1 and 2 of bush crickets (Orthoptera: Tettigoniidae: Barbitistini)

Sekundärstruktur und phylogenetische Analyse der Internal Transcribed Spacer 1 und 2 von Laubheuschrecken (Orthoptera: Tettigoniidae: Barbitistini)

Berit Ullrich

Zoological Research Museum Alexander Koenig, Bonn

Department of Evolutionary Biology, Bielefeld University, Bielefeld

Klaus Reinhold

Department of Evolutionary Biology, Bielefeld University, Bielefeld

Oliver Niehuis

University of Osnabrück, Behavioral Biology, Osnabrück

Bernhard Misof

University of Hamburg, Biozentrum Grindel und Zoologisches Museum, Hamburg, Germany

First published: 04 July 2010
Citations: 29
Corresponding author: Berit Ullrich
Contributing authors: Klaus Reinhold; Oliver Niehuis; Bernhard Misof

Unpublished for the purposes of zoological nomenclature (Art. 8.2. ICZN).

Abstract

We inferred secondary structure models of the internal transcribed spacers (ITS) 1 and 2 of bush crickets using a combined comparative and thermodynamic approach. The inferred secondary structure models were used to account for interdependency of interacting nucleotides in a phylogenetic analysis of the bush cricket genus Poecilimon. Our analysis indicates that the two previously reported conformational structures (i.e., hairpin and ring) of ITS2 are likely to fold in bush crickets as well and that both predicted structures are similar to those proposed for other eukaryotes. Comparing predicted ITS1 secondary structure models proved to be difficult because of substantial variation in their nucleotide sequence length. Our study revealed that the phylogenetic signal of ITS1 and ITS2 is largely congruent with that preserved in the mitochondrial genes 16S rRNA, tRNA-Val and 12S rRNA. The phylogenetic signal in both the nuclear and the mitochondrial genome questions the monophyly of the genus Poecilimon: species of the genera Poecilimonella, Parapoecilimon, Polysarcus and Phonochorion consistently cluster within Poecilimon.

Zusammenfassung (German summary, translated)

Secondary structure models of the internal transcribed spacers (ITS) 1 and 2 of bush crickets were derived by combining a comparative and a thermodynamic approach. These models were then used in a phylogenetic analysis of the bush cricket genus Poecilimon to account for the interdependency of interacting nucleotides. Our analysis indicates that the two conformations of ITS2 known from other eukaryotes (hairpin and ring structure) are also adopted in bush crickets, and that their specific form resembles that of other eukaryotes. Comparing the secondary structure models of ITS1 proved difficult because of the considerable sequence length variation of ITS1 within the Eukaryota. Our study shows that the phylogenetic signal of ITS1 and ITS2 is largely congruent with that of the mitochondrial genes 16S rRNA, tRNA-Val and 12S rRNA. The phylogenetic signal detected in both the nuclear and the mitochondrial genome calls the monophyly of the genus Poecilimon into question: species of the genera Poecilimonella, Parapoecilimon, Polysarcus and Phonochorion consistently group among those of the genus Poecilimon.

Introduction

Ribosomal internal transcribed spacers (ITS) are frequently used for phylogenetic inference and their suitability to answer phylogenetic questions has been studied in several taxonomic groups (e.g., Schlötterer et al. 1994; Hung et al. 1999, 2004; Weekers et al. 2001; Goertzen et al. 2003; Young and Coleman 2004; Wei et al. 2006; Aguilar and Sánchez 2007; Beiggi and Piercey-Normore 2007; Biffin et al. 2007; Rosselló et al. 2007). Almost all previous investigations treated the nucleotides of the ITS molecule as independent characters. It is known, however, that the ITS molecules fold into a complex structure, which is stabilized by intra-molecular hydrogen bonds (e.g., van Nues et al. 1995; Lalev and Nazar 1998; Joseph et al. 1999; Côté et al. 2002). As only certain nucleotide pairs can form thermodynamically stable hydrogen bonds, the interacting nucleotides tend to co-vary. This has potentially far-reaching consequences for the accuracy of phylogenetic estimates (Galtier 2004).

The secondary structure of the ITS molecules appears to be pivotal for the proper processing of mature rRNA (Yeh et al. 1990; van Nues et al. 1995; van Beekvelt et al. 2001). In yeasts, ITS2 folds into two different conformational structures, which both seem necessary for an accurate and efficient processing of the 5.8S rRNA and 28S rRNA (Yeh and Lee 1990; Joseph et al. 1999; Côté et al. 2002). Little is known about the function and the secondary structure of ITS1, but the molecule seems to play a role in the maturation of the 18S rRNA (Yeh et al. 1990; van Nues et al. 1994; Lalev and Nazar 1998).

The importance of the secondary structure for the overall function of ITS1 and ITS2 has consequences for the phylogenetic analysis of ITS sequence data, as co-variation of interacting nucleotides of the ITS molecules can lead to inflated support values and biased phylogenetic estimates (Tillier and Collins 1995; Galtier 2004; Kjer 2004). Substitution models, commonly referred to as doublet or RNA substitution models, have been developed to analyse data sets with co-varying nucleotide sites (e.g., Schöniger and von Haeseler 1994; Higgs 2000; Savill et al. 2001). Because of their complexity and computational demands, these models have been primarily applied in a Bayesian framework (e.g., Jow et al. 2002; Hudelot et al. 2003; Kjer 2004; Niehuis et al. 2006a, 2007). However, interdependency of paired nucleotides can easily be accounted for under the maximum parsimony optimality criterion as well, by treating each pair of interacting nucleotides as a single character. Programs that recode a data matrix accordingly are already available (e.g., 4to20, Smith et al. 2004; RNArecode, Fleck et al. 2008).

In the present study, we infer bush cricket-specific secondary structure models of ITS1 and ITS2. We demonstrate how the obtained secondary structure information can be used to account for an interdependency of paired nucleotides under the maximum parsimony optimality criterion. We apply ITS1 and ITS2 sequence data to infer phylogenetic relationships in the bush cricket tribe Barbitistini and assess the monophyly of the genus Poecilimon. We finally compare the phylogenetic signal of the ITS sequence data with that of the mitochondrial gene cluster 16S rRNA, tRNA-Val and 12S rRNA.

Materials and Methods

Taxon sampling

We analysed 152 ethanol-preserved specimens of bush crickets (Tettigoniidae: Barbitistini) representing a total of 12 nominal genera (Table S1). Our taxon sampling includes 90 (plus 2 yet to be described) of the approximately 140 currently recognized species in the genus Poecilimon (Eades et al. 2007). We further analysed Phaneroptera falcata and Scudderia furcata (Phaneropterini) as well as Tylopsis liliifolia (Tylopsini) for outgroup comparison. Voucher specimens are deposited in the collection of the Zoological Research Museum Alexander Koenig in Bonn, Germany.

Molecular procedures

Total genomic DNA was extracted from muscle tissue or spermatophores using the DNeasy Tissue kit (Qiagen, Gaithersburg, MD, USA). Complete sequences of the internal transcribed spacers (ITS) 1 and 2 were obtained via a single polymerase chain reaction (PCR) using the primers 18S–28S and 28S–18S (Weekers et al. 2001; Table 2). In species where this PCR failed, we used primers in the highly conserved 5.8S rRNA gene to amplify two smaller fragments that together cover the same gene cluster. Specifically, we amplified ITS1 using the forward primer 18S–28S (Weekers et al. 2001) and one of the four bush cricket-specific reverse primers ITS-R1, ITS-R2, ITS-R3 and ITS-R4 (Table S2). ITS2 was amplified using the forward primer ITS2-28S (Weekers et al. 2001; Table 2) or one of the bush cricket-specific forward primers ITS-F1, ITS-F2, ITS-F3, ITS-F4 (Table S2) and the reverse primer 28S–18S (Weekers et al. 2001).

We further studied a section of the mitochondrial genome comprising the large ribosomal (16S) RNA, tRNA-Val and an approximately 630-bp-long section of the small ribosomal (12S) RNA. Complete sequences of this region were obtained by means of PCRs amplifying four overlapping fragments. The first fragment, encompassing the 5′ end of the 16S rRNA, was amplified using the primers Leu-F1 and 16S-R1 (Table S2). The second fragment of the 16S rRNA was obtained using the primers LR-J-New (Misof et al. 2001; Table 2) and LR-N-13398 (Xiong and Kocher 1991; Table 2). For the third fragment, containing the 3′ end of the large subunit, we used different combinations of the bush cricket-specific primers 16S-F1, 16S-F2, 16S-F3 and 12S-R1, 12S-R2 and 12S-R3 (Table S2). In a few instances, the third PCR did not yield enough product. In these cases, we subsequently amplified two smaller overlapping fragments. The segment near the 5′ end of the 16S rRNA was amplified using the reverse primers 16S-R2 or 16S-R3 (Table S2) with one of the above-mentioned forward primers. The section near the 3′ end of the 16S rRNA was amplified with the bush cricket-specific forward primers 16S-F4 and 16S-F5 (Table S2) and the above-mentioned reverse primers. The 12S rRNA was amplified using the reverse primer 12S-R4 with one of the forward primers 16S-F6, 16S-F7, 12Sf1a (Niehuis et al. 2006b; Table 2) or SR-J-14233 (Simon et al. 1994; Table 2).

All PCRs were performed according to the protocol given by Niehuis et al. (2006b), using a GeneAmp PCR System 2700, 2720 or 9600 (Applied Biosystems, Foster City, CA, USA) or a TGradient (Biometra, Göttingen, Germany). The temperature profile for amplifying the ITS region started with a 5-min denaturation step at 94°C. It was followed by 25 cycles of 1 min at 95°C, 1.5 min at 52°C and 2 min at 72°C. The profile ended with a final extension step of 10 min at 72°C. The mitochondrial genome fragments were amplified with the touchdown temperature profile given by Niehuis et al. (2006b).
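As a small worked example, the ITS temperature profile described above can be written down as data and its total programmed hold time added up. This is only a sketch: the variable and function names are hypothetical, and ramp times between temperature steps, which depend on the thermocycler, are ignored.

```python
# ITS thermal profile as stated in the text; each step is (temperature in °C, minutes).
# Ramp times between steps are NOT included (assumption of this sketch).
INITIAL_DENATURATION = (94.0, 5.0)
CYCLE_STEPS = [(95.0, 1.0), (52.0, 1.5), (72.0, 2.0)]
N_CYCLES = 25
FINAL_EXTENSION = (72.0, 10.0)


def total_hold_minutes():
    """Sum of all programmed hold times of the profile, in minutes."""
    per_cycle = sum(minutes for _, minutes in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]


print(total_hold_minutes())  # 127.5
```

The hold times alone thus add up to a little over two hours per run.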

PCR products were purified with the NucleoSpin Extract kit (Macherey-Nagel, Düren, Germany). Amplified products were sequenced in both directions using the same primers as in the PCR reactions. Cycle sequencing reactions were carried out using the BigDye ReadyMix (Applied Biosystems, Foster City, CA, USA) and following the manufacturer’s recommendations. The cycle sequencing products were finally purified with a standard ethanol precipitation protocol and separated on an ABI PRISM 377 sequencer (Applied Biosystems). Complementary strands and overlapping fragments were assembled into continuous arrays using bioedit 7.0.5.3 (Hall 1999). All sequences have been submitted to EMBL (Table S1).

Sequence and secondary structure analyses

All sequences were pre-aligned using clustalx 1.8 (Thompson et al. 1997) and subsequently checked visually for obviously misaligned positions in bioedit 7.0.5.3 (Hall 1999). As the secondary structures of the mitochondrial genes 16S rRNA, tRNA-Val and 12S rRNA are well-characterized based on crystallographic studies (Ban et al. 2000; Schluenzen et al. 2000; Yusupov et al. 2001) and comparative sequence analyses (e.g., Hickson et al. 1996; Buckley et al. 2000; Page 2000; Page et al. 2002; Misof and Fleck 2003; Gillespie et al. 2006; Niehuis et al. 2006a,b), we first manually aligned the bush cricket 12S and 16S rDNA sequences to published sequences and secondary structure models of these genes in the honeybee (Gillespie et al. 2006) and identified conserved structural motifs. The hypothesized nucleotide interactions were then checked for validity with the mutual information examiner M(x, y) (Gutell et al. 1992) in the program bioedit. Using the obtained structure skeletons as a priori estimates for the secondary structures of the 16S and 12S rRNA in bush crickets, we subsequently analysed the corresponding sequence alignments with the software rnasalsa (Stocsits et al. 2009) to identify additional possible nucleotide interactions. rnasalsa searches for potential nucleotide interactions in aligned sequences and takes both thermodynamic considerations and compensatory / consistent substitutions into account. For the tRNA-Val molecule, we adapted the recently proposed secondary structure for burnet moths (Niehuis et al. 2006a) as skeleton for the analysis in rnasalsa.
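The idea behind checking hypothesized base pairs with mutual information can be illustrated with a minimal sketch. It assumes the standard definition M(x, y) = Σ f(xy) log2[f(xy) / (f(x) f(y))] over two alignment columns and is not the bioedit implementation; the function name and the handling of gaps (sequences with a non-ACGU symbol in either column are skipped) are assumptions of this example.

```python
from collections import Counter
from math import log2

BASES = "ACGU"


def mutual_information(col_x, col_y):
    """Mutual information (bits) between two alignment columns.

    col_x and col_y are equal-length strings, one symbol per sequence.
    Sequences with a gap or ambiguity code in either column are ignored.
    """
    pairs = [(a, b) for a, b in zip(col_x, col_y) if a in BASES and b in BASES]
    n = len(pairs)
    if n == 0:
        return 0.0
    fx = Counter(a for a, _ in pairs)
    fy = Counter(b for _, b in pairs)
    fxy = Counter(pairs)
    # (c/n) / ((fx/n) * (fy/n)) simplifies to c * n / (fx * fy)
    return sum((c / n) * log2(c * n / (fx[a] * fy[b])) for (a, b), c in fxy.items())


# Two perfectly co-varying columns reach the maximum of 2 bits,
# whereas two invariant columns carry no covariation signal.
print(mutual_information("GGCCAAUU", "CCGGUUAA"))
print(mutual_information("GGGG", "CCCC"))
```

Note that a fully conserved base pair scores zero, so a low value alone does not argue against a proposed interaction; it only means that the alignment contains no covariation evidence for it.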

To date, no study has reported X-ray crystallographic analyses on ITS1 and ITS2 and there is only a single comparative analysis (sensu Gutell et al. 2002) of the secondary structure of the two molecules (i.e., Goertzen et al. 2003). To avoid relying on secondary structure models that are based on thermodynamic considerations only, we inferred the secondary structure of ITS1 and ITS2 in bush crickets ab initio by using the software pfold (Knudsen and Hein 2003). pfold uses the KH-99 algorithm (Knudsen and Hein 1999), which integrates an evolutionary model of RNA sequences and a probabilistic model of secondary structures. The consensus structure predicted by pfold was used as the input constraint for the subsequent secondary structure analysis in rnasalsa. The acceptance level for the input structure was set to 100%. Thus, only those base pairs in the input secondary structure model that are regarded as thermodynamically stable in all analysed taxa were considered in the secondary structure constraint. We further calculated structure logos for the proposed structure models to display the nucleotide frequencies and their variation as well as the information content of each proposed helix (Schneider and Stephens 1990; Gorodkin et al. 1997). Each structure logo was calculated based on the individual base composition in the analysed data set. Secondary structure models were drawn with the software xrna (Weiser and Noller, University of California, Santa Cruz, available at http://rna.ucsc.edu/rnacenter/xrna/xrna.html).
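To illustrate what the height of a logo position expresses, the following sketch computes the information content of a single alignment column as relative entropy against a background base composition. It is a simplified illustration and not the logo software used here: the small-sample correction and the additional base pair term of structure logos are omitted, and the function name and the uniform default background are assumptions.

```python
from collections import Counter
from math import log2


def column_information(column, background=None):
    """Relative entropy (bits) of one alignment column against background frequencies.

    column: string with one symbol per sequence; non-ACGU symbols are ignored.
    background: dict mapping A, C, G, U to frequencies; uniform if omitted.
    """
    bases = [b for b in column if b in "ACGU"]
    n = len(bases)
    if n == 0:
        return 0.0
    if background is None:
        background = {b: 0.25 for b in "ACGU"}
    counts = Counter(bases)
    return sum((c / n) * log2((c / n) / background[b]) for b, c in counts.items())


print(column_information("AAAA"))  # 2.0 bits: fully conserved, uniform background
print(column_information("ACGU"))  # 0.0 bits: no preference for any base
```

Passing the observed base composition of the data set as `background` corresponds to the choice described above of calculating each logo from the individual base composition.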

Phylogenetic analyses

Ambiguously aligned nucleotide positions were excluded from the phylogenetic analyses. The nucleotide composition of the nuclear and of the mitochondrial data set was tested separately for homogeneity across taxa with the chi-square test implemented in paup* 4.0b10 (Swofford 2003), considering only parsimony informative sites. To account for an interdependency of paired nucleotides in r/tRNA and ITS coding sequences, the nuclear and mitochondrial data sets were transformed with the Perl script RNArecode (Fleck et al. 2008). Each pair of interacting nucleotides in r/tRNA and ITS molecules was recoded according to the transformation matrix shown in Table S3 and thereby treated as a single character.
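The principle of recoding paired nucleotides can be sketched as follows. This is not the RNArecode script, and the symbol assignment is hypothetical (the actual transformation matrix is the one in Table S3, which is not reproduced here): each of the 16 ordered nucleotide pairs is mapped to one arbitrary symbol, the 5′ partner of each pair carries the combined character, and the 3′ partner is dropped.

```python
from itertools import product

# Hypothetical coding: one hexadecimal digit per ordered nucleotide pair (AA, AC, ..., UU).
PAIR_SYMBOLS = {a + b: s for (a, b), s in zip(product("ACGU", repeat=2), "0123456789abcdef")}


def recode(sequence, pairs):
    """Recode one aligned sequence so that each base pair becomes a single character.

    sequence: aligned sequence string (T is read as U).
    pairs: list of (i, j) zero-based alignment positions that form a base pair.
    Returns a list of character states: unpaired sites are kept unchanged, each pair
    yields one state at the position of its 5' partner, and pairs containing a gap
    or ambiguity code are scored as '?'.
    """
    sequence = sequence.upper().replace("T", "U")
    partner = {}
    for i, j in pairs:
        partner[i] = j
        partner[j] = i
    recoded = []
    for pos, base in enumerate(sequence):
        if pos not in partner:
            recoded.append(base)
        elif pos < partner[pos]:
            doublet = base + sequence[partner[pos]]
            recoded.append(PAIR_SYMBOLS.get(doublet, "?"))
        # the 3' partner of a pair is skipped: it is already part of the doublet
    return recoded


# A 7-nt hairpin with two base pairs shrinks from 7 to 5 characters.
print(recode("GGAAACC", [(0, 6), (1, 5)]))  # ['9', '9', 'A', 'A', 'A']
```

Every base pair therefore reduces the number of characters in the matrix by one, while a compensatory double substitution is counted as a single change of state.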

Phylogenetic analyses were carried out with paup*, applying the maximum parsimony (MP) optimality criterion. We performed a heuristic tree search (100 search replicates with a time limit of 1000 s, random addition of sequences, TBR branch swapping). Bootstrap support values were inferred from 1000 replicates (each with 10 heuristic search replicates, random addition of sequences, TBR branch swapping). The mitochondrial and nuclear data sets were analysed separately, and differences in the obtained consensus topologies were statistically assessed with the Kishino–Hasegawa, the Templeton (Wilcoxon signed-ranks) and the winning-sites (sign) tests as implemented in paup*.
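Of the three tests, the winning-sites test is the simplest to illustrate. The sketch below assumes that per-character step counts on the two competing topologies are already available and applies a two-sided exact sign test; it is meant to convey the logic only and does not reproduce the paup* implementation. The function name and input format are hypothetical.

```python
from math import comb


def winning_sites_test(steps_tree_a, steps_tree_b):
    """Two-sided exact sign test on per-character parsimony step counts.

    steps_tree_a, steps_tree_b: equal-length lists with the number of steps
    each character requires on tree A and on tree B.
    Returns (characters favouring A, characters favouring B, p-value).
    Characters requiring the same number of steps on both trees are ignored.
    """
    wins_a = sum(a < b for a, b in zip(steps_tree_a, steps_tree_b))
    wins_b = sum(b < a for a, b in zip(steps_tree_a, steps_tree_b))
    n = wins_a + wins_b
    if n == 0:
        return wins_a, wins_b, 1.0
    k = min(wins_a, wins_b)
    # probability of k or fewer "wins" for one tree if both trees are equally good
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)


# Four characters favour tree A and one favours tree B: not significant.
print(winning_sites_test([1, 1, 1, 1, 2], [2, 2, 2, 2, 1]))  # (4, 1, 0.375)
```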

Results

Characteristics of the data sets

The nuclear sequence alignment (i.e., ITS1 and ITS2; available from the authors upon request) consisted of 155 sequences and 794 sites. About 0.5% of the 115 630 nucleotides were missing and treated as missing information in the phylogenetic analysis. The mitochondrial sequence alignment (i.e., 16S rRNA, tRNA-Val and 12S rRNA; available from the authors upon request) included 155 sequences and 2125 sites. About 1.4% of the 329 375 nucleotides were missing and treated as such in the phylogenetic analysis. No significant inhomogeneity of base frequencies was found among sequences of the nuclear and the mitochondrial data (χ2, p = 1.0).
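For readers unfamiliar with the base composition test, the following sketch shows how a chi-square statistic for homogeneity across taxa can be computed from a taxon-by-base count table. It is a generic contingency-table calculation under the assumption that this is what the test amounts to, not a reproduction of the paup* routine; the function name and the toy counts are hypothetical. Converting the statistic into a p-value additionally requires the chi-square distribution, which is left out to keep the example free of external libraries.

```python
def chi_square_homogeneity(counts):
    """Chi-square statistic for homogeneity of base composition across taxa.

    counts: list of rows, one per taxon, each holding the counts of A, C, G and T.
    Returns (statistic, degrees of freedom).
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(counts, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(counts) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom


# Toy example: two taxa with identical composition give a statistic of zero.
print(chi_square_homogeneity([[10, 20, 30, 40], [10, 20, 30, 40]]))  # (0.0, 3)
```

With 155 sequences and four bases, such a table has (155 − 1) × (4 − 1) = 462 degrees of freedom.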

Secondary structure predictions

A total of 111 bp, folding into 15 helices, were predicted for the 627-bp-long internal transcribed spacer 1 (Fig. 1). Sixty (54%) of the 111 assumed nucleotide interactions are supported by compensatory and/or consistent substitutions. Forty-one of them are supported by one type of compensatory/consistent substitution only; the remaining 19 bp are supported by at least two different types of substitutions. All proposed base pair interactions are predicted to fold in at least 96% of the 155 investigated sequences.

Fig. 1. Predicted secondary structure of the internal transcribed spacer 1 (ITS1) in Poecilimon chopardi Ramme 1933 (AM888875). Watson–Crick base pairs are indicated by dashes, non-canonical guanine-uracil pairs are represented by a solid dot, all other non-canonical interactions by a hollow circle. The provided structure logos display the consensus structure, frequency of nucleotides (height of the nucleotide symbol proportional to its frequency), and information content of individual helices

We inferred two ITS2 structures, corresponding to the two conformational structures that had previously been proposed in yeasts (Yeh and Lee 1990; Joseph et al. 1999; Côté et al. 2002). One of them, the so-called ring structure (Joseph et al. 1999), was obtained when applying the program pfold (Knudsen and Hein 1999) (Fig. 2a). The second structure, commonly referred to as the hairpin structure (Yeh and Lee 1990), was inferred when using the program rnasalsa (Fig. 2b). Both are well supported in our data set by compensatory and/or consistent substitutions and share putatively homologous helices (i.e., helix 1 ≙ I, 2a ≙ III, 2b ≙ IV, 3 ≙ VI; Fig. 2). The predicted ITS2 ring structure consists of 61 bp in four helices. Each base pair can fold in at least 96% of all analysed sequences. Twenty-eight (46%) of the predicted 61 bp are supported by compensatory/consistent substitutions; nine of them are supported by at least two different types of substitutions. The predicted bush cricket ITS2 hairpin structure consists of 73 bp in 10 helices. Each base pair can fold in at least 97% of the 155 investigated sequences. Thirty-two (44%) of the 73 predicted base pairs are supported by compensatory/consistent substitutions, but only three of them are supported by more than one type.

Fig. 2. (a) Structure prediction for the internal transcribed spacer 2 (ITS2: ring structure conformation) as predicted by the program pfold (Knudsen and Hein 1999). (b) Secondary structure of the internal transcribed spacer 2 (ITS2: hairpin structure conformation) as predicted by the program rnasalsa (Stocsits et al. 2009). Watson–Crick interactions are displayed by a dash, non-canonical G-U pairs by a solid dot, and all other non-canonical pairs by a hollow circle. Structure logos show the consensus structure, nucleotide frequencies and information contents of the helices. Both structure graphs are drawn with the sequence of Poecilimon chopardi Ramme 1933 (AM888875)

The inferred bush cricket secondary structure models of the 12S rRNA, tRNA-Val and 16S rRNA are largely consistent with previously proposed models (e.g., Misof and Fleck 2003; Gillespie et al. 2006; Niehuis et al. 2006a,b) and no additional helices were proposed (see Appendix 1 and 2).

Phylogenetic reconstructions

The nuclear data set initially included 794 characters. We removed 48 of them because we considered their alignment ambiguous, leaving 746 characters. Recoding the ITS1 and ITS2 sequence data to account for the 368 paired nucleotides (184 predicted base pairs) in the secondary structure of ITS1 and the hairpin conformation of ITS2 resulted in a data matrix with 562 characters; 284 (51%) of them were parsimony informative. Recoding the ITS data set to account for the 344 paired nucleotides (172 predicted base pairs) in the secondary structure of ITS1 and the ring conformation of ITS2 resulted in a data matrix with 574 characters, of which 288 (50%) were parsimony informative. The mitochondrial data set initially included 2125 characters, of which we considered 93 ambiguously aligned and removed them. After recoding the data matrix to account for interacting nucleotides, the alignment consisted of 1592 characters; 813 (51%) of them were parsimony informative.
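The character counts reported for the recoded nuclear matrices follow from simple bookkeeping, shown here as a short check. It assumes that every predicted base pair lies within the retained alignment positions and that each pair merges two columns into one character; the helper name is hypothetical. The pair counts are those given in the secondary structure results (111 bp for ITS1, 73 bp for the ITS2 hairpin and 61 bp for the ITS2 ring conformation).

```python
def recoded_characters(total_sites, excluded_sites, base_pairs):
    """Number of characters after exclusion of ambiguous sites and pair recoding.

    Each base pair merges two alignment columns into one character,
    so the matrix shrinks by one column per pair.
    """
    return total_sites - excluded_sites - base_pairs


ITS1_PAIRS, HAIRPIN_PAIRS, RING_PAIRS = 111, 73, 61

print(recoded_characters(794, 48, ITS1_PAIRS + HAIRPIN_PAIRS))  # 562
print(recoded_characters(794, 48, ITS1_PAIRS + RING_PAIRS))     # 574
```

Both values agree with the matrix sizes reported in the text; the figures of 368 and 344 correspond to twice the number of base pairs, that is, to the number of paired nucleotides (2 × 184 and 2 × 172).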

Phylogenetic analysis of the nuclear data set, accounting for the predicted base pair interactions of the ITS2 hairpin conformation, yielded 510 326 equally parsimonious trees (1934 steps). A corresponding analysis that accounted for nucleotide interactions of the ITS2 ring conformation resulted in 560 685 trees (1931 steps). Strict consensus trees from the two analyses were nearly identical except for the position of the outgroup genera Andreiniimon and Isophya (Fig. 3a). Phylogenetic analysis of the mitochondrial data set resulted in 3193 equally parsimonious trees (8808 steps; Fig. 3b). The consensus topologies from the phylogenetic analyses of the nuclear and the mitochondrial data were largely congruent for recent splits. These splits were also mostly supported with bootstrap support values larger than 80%. However, all three statistical tests applied to assess the compatibility of the mitochondrial and the nuclear data (i.e., Kishino–Hasegawa, Templeton and winning-sites) indicated significant differences (p < 0.0001) in the phylogenetic signal. While the observed genealogical incompatibilities concern primarily deeper splits, all three data sets confirmed a monophyletic origin of the bush cricket tribe Barbitistini with high statistical support (nuclear data set, hairpin structure: 96%; nuclear data set, ring structure: 98%; mitochondrial data set: 100%). However, none of the three data sets supported the monophyly of the genus Poecilimon: Parapoecilimon, Phonochorion, Poecilimonella and Polysarcus consistently clustered within the genus.

Fig. 3. Phylogeny of bush crickets. (a) Strict consensus topology based on the phylogenetic analysis of the ITS1 and ITS2 sequence data. Node labels indicate bootstrap support over 50% based on 1000 replicates. Support values in black were inferred when accounting for the predicted nucleotide interactions of the hairpin conformational structure, support values in gray when accounting for the nucleotide interactions of the ring structure. The conformational structure of ITS2 had no influence on the topology except for the position of Andreiniimon and Isophya, which were inferred as sister taxa (clades shown in gray) when accounting for the nucleotide interactions of the ring structure. (b) Strict consensus topology based on the phylogenetic analysis of mitochondrial (16S rRNA, tRNA-Val and 12S rRNA) sequence data. Node labels indicate bootstrap support over 50% based on 1000 replicates. Letters refer to species groups mentioned in the text; outgroup taxa that cluster inside Poecilimon are marked with an asterisk

Discussion

We inferred the secondary structures of the internal transcribed spacers (ITS) 1 and 2 in bush crickets of the tribe Barbitistini. Co-variation of paired nucleotides was accounted for in a maximum parsimony (MP) analysis by recoding the data matrix and treating paired nucleotides as a single character. The inferred phylogenetic estimates were compared with those obtained from analysing the mitochondrial genes 16S rRNA, tRNA-Val and 12S rRNA and considering the secondary structure of the corresponding molecules.

Secondary structure of ITS1 and ITS2

Secondary structure models of the internal transcribed spacers 1 and 2 have been mainly inferred by searching for structures of minimum free energy (e.g., Cunninham et al. 2000; Armbruster 2001; Gontcharov and Melkonian 2005; Coleman 2007). Minimization of the free energy is computed for a specific, though selectable, temperature. As a study by Armbruster (2001) showed, even slight alterations of the temperature can have a strong influence on the structure prediction. Doshi et al. (2004) and Layton and Bundschuh (2005) concluded that thermodynamic considerations alone are insufficient to reliably infer the secondary structures of RNA molecules. Higgs (2000) pointed out that thermodynamic methods are accurate for relatively small molecules like tRNAs, but perform poorly when applied to longer sequences. Given the unreliability of thermodynamic approaches to accurately infer the secondary structure of larger molecules, we studied the ITS and ribosomal RNA sequences of bush crickets by taking both thermodynamic and comparative considerations into account.

Our current knowledge of the ITS1 secondary structure is based primarily on thermodynamic considerations of yeast sequence data (Thweatt and Lee 1990; Yeh et al. 1990; van Nues et al. 1994; Lalev and Nazar 1998). Almost nothing is known about the ITS1 molecule structure in insects. In the most comprehensive analysis of ITS1 sequences published so far (Armbruster 2001), only the sequence of one insect, that of the fruit fly Drosophila simulans, is included. The secondary structure that Armbruster (2001) proposed for the ITS1 molecule in D. simulans differs significantly from the structure that we inferred for bush crickets. A comparison of the bush cricket ITS1 structures with that of D. simulans and those of other organisms such as yeast is problematic, however, because of the substantial variation in spacer sequence length [e.g., 627 bp in Poecilimon chopardi, 690 bp in D. simulans (Armbruster 2001), 361 bp in Saccharomyces cerevisiae (Thweatt and Lee 1990)]. It is therefore difficult to assess to what extent different organisms share common motifs in their ITS1 molecule.

Previous studies on the secondary structure of ITS2 provided evidence for two different conformational structures of the molecule (i.e., ring and hairpin model; Yeh and Lee 1990; Joseph et al. 1999; Côté et al. 2002). Our analyses suggest that these two conformational structures are also present in bush crickets. The ring structure was predicted by the program pfold (Knudsen and Hein 2003). According to Schultz et al. (2005) and Coleman (2007), the ITS2 ring structure of eukaryotes consists of three to four helices arranged along a central loop and two recurring motifs. Our ITS2 ring structure model for bush crickets consists of four helices and contains at least one of the two motifs: the pyrimidine-pyrimidine bulge at the distal part of helix 2. We did not find the (Y)GG(Y) motif that Coleman (2007) and Schultz et al. (2005) propose in their models for the 5′ part of helix 3. A similar motif is, however, present in our bush cricket model at the 5′ end of helix 4.

Phylogenetic analyses

The possible occurrence of two conformational structures of the ITS2 molecule in one taxon, as we have found in bush crickets and as has previously been reported in yeast (Côté et al. 2002), poses a problem for the phylogenetic analysis of ITS2 sequence data: how to account simultaneously for the base pair interactions of two structures? If the proposed nucleotide interactions are not in conflict with each other, recoding the data matrix considering both structures would be one option. However, most of the nucleotide interactions of the two ITS2 conformational structures are in conflict with each other. Fortunately, the phylogenetic signal of the data sets recoded for the hairpin and for the ring structure was largely congruent in the present study. However, this problem might become relevant in other data sets. Researchers should take these considerations into account in future analyses of ITS2 sequence data.
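Whether two structure models can be combined in a single recoding can be checked mechanically. The sketch below uses one possible definition of a conflict, namely that a nucleotide pairs with different partners in the two structures; this definition, the function name and the toy coordinates are assumptions for illustration and are not taken from the analysis reported here.

```python
def conflicting_pairs(pairs_a, pairs_b):
    """Return the base pairs of structure A that conflict with structure B.

    pairs_a, pairs_b: lists of (i, j) alignment positions forming base pairs.
    A pair (i, j) conflicts if i or j is paired in B with a different partner.
    Pairs that are identical in both structures, or whose nucleotides are
    unpaired in B, are compatible.
    """
    partner_b = {}
    for i, j in pairs_b:
        partner_b[i] = j
        partner_b[j] = i
    conflicts = []
    for i, j in pairs_a:
        if partner_b.get(i, j) != j or partner_b.get(j, i) != i:
            conflicts.append((i, j))
    return conflicts


# Toy structures: the first pair is shared, the second conflicts, the third is compatible.
structure_a = [(0, 10), (1, 9), (20, 30)]
structure_b = [(0, 10), (1, 8), (21, 29)]
print(conflicting_pairs(structure_a, structure_b))  # [(1, 9)]
```

Only if this list is empty can both sets of interactions be recoded in the same matrix without assigning a nucleotide to two different doublet characters.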

The phylogenetic signal included in the nuclear and mitochondrial data sets resulted in largely compatible species clusters, although certain sister taxa relationships and many deeper splits were ambiguously resolved. Examining the bootstrap support values indicated that the phylogenetic inferences based on the nuclear data were generally better supported statistically, and many deeper splits showed high support values (i.e., >94%) as well. While the sequence data from the nuclear and the mitochondrial genome support the hypothesis of a monophyletic tribe Barbitistini, they are incompatible with a monophyly of the genus Poecilimon: the genera Polysarcus, Phonochorion, Poecilimonella and Parapoecilimon are consistently inferred as sister taxa to specific Poecilimon species (groups). The specific sister group relationships of Polysarcus, Phonochorion, Poecilimonella and Parapoecilimon within Poecilimon remain unclear, however.

The phylogenetic signal of the nuclear and mitochondrial sequence data corroborated several previously proposed species groups in the genus Poecilimon. Early morphological studies by Bey-Bienko (1954) and Ramme (1933, 1939) had indicated that Poecilimon thoracicus, P. macedonicus, P. brunneri, P. ukrainicus, P. elegans and P. zwicki are closely related to each other. The same species also clustered in our phylogenetic analyses with bootstrap support values of 76% (nuclear data set, hairpin structure), 77% (nuclear data set, ring structure) and 90% (mitochondrial data set), suggesting that the P. thoracicus group is monophyletic (Fig. 3, Node A). Heller and Sevgili (2005) further hypothesized that P. sanctipauli, P. lodosi and P. pulcher are a monophyletic species assemblage, which they named P. sanctipauli group (Node B). This hypothesis is substantiated by our phylogenetic analyses with bootstrap support of 99% (nuclear data set, hairpin and ring structure) and 74% (mitochondrial data set). The molecular data finally corroborated morphological hints, which suggested that the P. ampliatus species group (Heller and Lehmann 2004), of which we studied P. ampliatus, P. amissus, P. ebneri, P. intermedius, P. klisuriensis and P. marmaraensis marmaraensis, is likely polyphyletic unless additional taxa are included. The molecular data strongly suggested that Poecilimon birandi, P. davisi, P. doga, P. excisus, P. haydari, P. ledereri, P. luschani, P. orbelicus, P. tuncayi and Poecilimonella armeniaca are part of the ampliatus group (Node C, bootstrap support values >90% in the nuclear and mitochondrial data analyses).

The structure prediction method we propose presents a promising approach to reconstruct secondary structures of non-coding genes in taxa that have not been studied so far. The consideration of taxon-specific secondary structure models helps to improve the inference of phylogenetic relationships and should provide more realistic values of tree robustness. The incorporation of secondary structure information into maximum parsimony analyses can easily be achieved with available software/scripts like 4to20 (Smith et al. 2004) and RNArecode (Fleck et al. 2008) and thus presents a computationally fast approach to consider structure information in large data sets.

Acknowledgements

We are thankful to K.-G. Heller who provided valuable tissue material, determined most of the specimens and contributed his taxonomic knowledge to this study. We thank K. Meusemann for valuable help in the lab and C. Etzbauer for technical assistance. R. Overson provided valuable comments on linguistic issues. Special thanks go to O. W. Snörre for invaluable support. We are further grateful to B. Knudsen, who provided us with an offline version of his program pfold and also granted extended access to the web-based version of his program. For providing specimens or tissue samples our thanks go to A. Benediktow, E. Blümm, H. Braun, D. Chobanov, B. Çiplak, F. Chládek, Y. Durmus, M. Heller, M. Holdried, M. Kalashian, O. Korsunovskaya, A. Lehmann, G. Lehmann, J. McCartney, U. Pörschmann, K. Rohrseitz, H. Sevgili, K. Strauss, A. Stumpner, M. Volleth, D. von Helversen † and R. D. Zhantiev. This project was financed by the Department of Evolutionary Biology, Bielefeld University and by the Zoological Research Museum Alexander Koenig, Bonn, Germany.
