Volume 33, Issue 1 pp. 93-108

Hennig's semaphoront concept and the use of ontogenetic stages in phylogenetic reconstruction

Prashant P. Sharma

Corresponding Author

Department of Zoology, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, WI, USA

Ronald M. Clouse

Division of Invertebrate Zoology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, USA

Ward C. Wheeler

Department of Zoology, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, WI, USA

First published: 11 March 2016

Abstract

A new practice in systematics, “semaphoront” coding, treats developmental stages as terminals, and it derives from Hennig's concept of the same name. Semaphoront coding has been implemented recently by Lamsdell and Selden (BMC Evol. Biol., 2013, 13:98) and Wolfe and Hegna (Cladistics, 2014, 30:366) in an effort to understand the relationships of fossil taxa of unknown developmental stage. We submit that this approach is antithetical to cladistic practice and constitutes a gross misunderstanding of Hennig's original idea. Here we review the concept of the semaphoront and clarify the role of the semaphoront in phylogenetic systematics. We contend that treating ontogenetic stages as terminals both violates tenets of phylogenetic systematics and oversimplifies the complexity of developmental processes. We advocate Hennig's alternative of including data from as many semaphoronts as possible, but implemented using the superior total evidence framework. Finally, we contend that the application of semaphoront coding to any palaeontological question requires invoking multiple, unjustified assumptions, and ultimately will not yield a possible phylogenetic solution. A total evidence approach can grapple with the placement of fossil developmental stages, if only imperfectly.

Introduction

In morphology-based systematics, sexually immature developmental stages may bear numerous characters pertinent to phylogenetic reconstruction. Coding structures that are unique to certain ontogenetic stages (i.e. that have no homologue in the remainder of the developmental series) has the potential to enrich data matrices and improve phylogenetic resolution. Valuable examples are provided by entomology, wherein larval stages constitute sources of informative characters unique to those stages (e.g. Hennig, 1966; Takizawa, 1976; Beutel, 1993; Marvaldi, 1997; Fleck et al., 2006, 2008; Lawrence et al., 2011), in some cases according more closely with molecular phylogenies (i.e. trees inferred from independent data classes) than morphological characters based exclusively on the adult stage (Fleck et al., 2008). In other cases, characters from nonadult stages have been shown to be generally less informative than adult-based characters (e.g. Reinert et al., 2004; Beutel et al., 2011; Meier and Lim, 2009; reviewed by Blanke et al., 2015), suggesting that the informativeness of given stages may be taxon-specific.

Multiple approaches historically have been implemented to integrate morphological characters across developmental series, such as using only larval characters, using only adult characters, combining data partitions or reconciling morphological trees from different discrete stages (e.g. Grandjean, 1954, 1957; Welbourn, 1991; Zhang, 1995; Judd, 1998; Ng and Clark, 2000; Steyer, 2000; Pugener et al., 2003; Fleck et al., 2008). The philosophical and methodological concerns for these approaches are analogous to analyses of molecular sequence data (or analyses of different data classes), wherein discourse over whether and how data partitions should be combined in phylogenetic analysis has a long history in the literature (Kluge, 1989; Bull et al., 1993; Edwards et al., 2007; Salichos and Rokas, 2013).

A recent practice in systematics implemented by Lamsdell and Selden (2013) and Wolfe and Hegna (2014) incorporates a different approach, treating various ontogenetic stages (“semaphoronts”, sensu Hennig, 1966) as terminals within the same phylogenetic analysis with adult stages. This method (hereafter, “semaphoront coding”) has the effect of generating tree topologies with putative species represented more than once in the phylogeny.

The two studies reached opposite conclusions about the performance of this approach. Lamsdell and Selden (2013) applied semaphoront coding to the phylogeny of the extinct arthropod order Eurypterida, with the aim of examining the effect of accidentally treating juveniles as adult stages in a phylogenetic analysis. They found that the simulated treatment of juveniles as operational taxonomic units (OTUs) resulted in destabilized tree topologies. For this reason, Lamsdell and Selden advocated that juvenile specimens should be excluded from phylogenetic analyses that include only adult characters when the goal is to resolve species relationships (Lamsdell and Selden, 2013).

Wolfe and Hegna (2014) separately implemented semaphoront coding to infer the phylogeny of pancrustaceans (with both extinct and extant species within the ingroup). The stated aim of their work was to determine the placement of Orsten (an Upper Cambrian Lagerstätte) fossil taxa that bore resemblances to the larvae of extant pancrustaceans. They attempted various combinations of terminals and data partitions, including (i) all ontogenetic stages in a single phylogeny, (ii) only comparable (very early or very late) stages, (iii) all stages with constraints for species monophyly, (iv) all stages with a tree constrained to accord with a molecular phylogeny, and (v) all ontogenetic stages with minimal missing data. Notably, none of these approaches constituted total evidence, which would instead (in principle) use all available data from various ontogenetic stages to construct a single row of data per terminal in a morphological character matrix. This applies as well to the monophyly constraint analyses; species were depicted as single terminals, but all semaphoronts were included as separate terminals in the original analysis. Wolfe and Hegna (2014) concluded that semaphoront coding was a promising approach that could be more generally applied to other groups of organisms.

The challenges faced by these palaeontologists are considerable, and their efforts both creative and intensive. However, the advocacy of Wolfe and Hegna (2014) was compromised by misunderstanding of the notion of a semaphoront, its place in phylogenetics and the diversity of developmental processes. The aim of this critique is to clarify some of the misconceptions surrounding the semaphoront concept, to explore the pitfalls of semaphoront coding using intuitive examples and to reinforce the effectiveness of the total evidence approach in estimating the phylogenetic affinity of fossil taxa.

Hennig's semaphoront concept

Willi Hennig dedicated several pages to the great diversity of ontogenetic processes in nature, including the treatment of metamorphism (indirect development) and cyclomorphism (alternation of generations or seasonal dimorphism) in a phylogenetic context (Hennig, 1966). These processes bore heavily on the notion of the “individual” as the fundamental unit of systematics, because individual organisms undergo changes in morphology as a function of time (ontogeny), and in various lineages, as a function of generation or season. Hennig therefore considered a specific period of the individual organism's life history—the semaphoront—to constitute the fundamental unit of systematics.

[O]ne and the same individual assumes a different place in most systems at different times of its life…it follows that we should not regard the organism or the individual (not to speak of the species) as the ultimate element of the biological system. Rather, it should be the organism or the individual at a particular point of time, or even better, during a certain, theoretically infinitely small, period of its life. We will call this element of all biological systematics, for the sake of brevity, the character-bearing semaphoront. (Hennig, 1966, p.6)

The relationships between semaphoronts, between the individuals that bore them and between the species harbouring those individuals, were depicted in Hennig's renowned illustration (Hennig, 1966), redrawn here (Fig. 1a). Semaphoronts were connected by a series of ontogenetic relationships to form individuals; individuals by tokogenetic (i.e. population genetic or genealogical) relationships to form species; and species by phylogenetic relationships represented as cladograms. Of the three, Hennig observed that only the phylogenetic relationships constituted a “hierarchic system” in the mathematical sense (Fig. 1b), wherein: relationships between the parts of the system are unidirectional (single-headed arrows); each element xn can receive only a single relationship (arrowhead), but can donate one or more relationships to other elements (arrow tails); and there is exactly one element in the system that donates relationships, but receives none (x0). Tokogenetic (genealogical) relationships do not constitute hierarchies in sexually reproducing organisms. Significantly, ontogenetic relationships do not constitute a hierarchy either.
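
The three conditions of a hierarchic system listed above can be checked mechanically. The following is a minimal sketch in Python, not part of Hennig's or the authors' treatment; the function name `is_hierarchy`, the element labels and the example relationships are hypothetical, and the reachability test is an added assumption that excludes detached cycles.

```python
from collections import defaultdict, deque


def is_hierarchy(elements, relations):
    """Check the three stated conditions on directed (donor, receiver) pairs."""
    elements = set(elements)
    incoming = defaultdict(int)
    receivers = defaultdict(list)
    # Condition 1: relationships are unidirectional, so each is an ordered pair.
    for donor, receiver in relations:
        incoming[receiver] += 1
        receivers[donor].append(receiver)

    # Condition 2: each element receives at most one relationship.
    if any(incoming[e] > 1 for e in elements):
        return False

    # Condition 3: exactly one element donates relationships but receives none.
    roots = [e for e in elements if incoming[e] == 0]
    if len(roots) != 1 or not receivers[roots[0]]:
        return False

    # Added assumption: every element must be reachable from that root,
    # which rules out detached cycles that would pass the checks above.
    seen = {roots[0]}
    queue = deque(roots)
    while queue:
        for nxt in receivers[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == elements


# Hypothetical phylogenetic relationships: x0 is the single root.
phylogeny = [("x0", "x1"), ("x0", "x2"), ("x1", "x3"), ("x1", "x4")]
print(is_hierarchy(["x0", "x1", "x2", "x3", "x4"], phylogeny))  # True

# Hypothetical tokogenetic relationships: the offspring receives two arrows.
tokogeny = [("mother", "offspring"), ("father", "offspring")]
print(is_hierarchy(["mother", "father", "offspring"], tokogeny))  # False
```

The tokogenetic example fails on the second condition alone, because an offspring of sexually reproducing parents receives a relationship from each parent.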

Fig. 1.
(a) Modification of fig. 6 of Hennig (1966), indicating the differences between ontogenetic, tokogenetic and phylogenetic relationships. Note that ontogenetic relationships are not observable for fossil taxa. (b) Visual representation of a “hierarchy” sensu Woodger (1952). Note that only phylogenetic relationships are hierarchic and unknowable; ontogenetic and “tokogenetic” (genealogical) relationships are nonhierarchic and knowable/observable, in principle.

In order to answer the question of whether the hierarchic system is rightfully used in biological systematics we must investigate whether semaphoronts can be substituted for x0, x1, … in [Fig. 1b]. Obviously they cannot. It is true that the individual semaphoronts are connected to semaphoront complexes (which we call individuals) by relations that we call ontogenetic relations. But the structure of these ontogenetic relations does not correspond to the conditions of a hierarchic system. (Hennig, 1966, p.18)

In addition, ontogenetic relationships can only be established for extant taxa. How then are morphological data from different semaphoronts (developmental stages) to be used for phylogenetic reconstruction? For the specialized case of insect metamorphosis, Hennig offered a practical solution: coding the stages of metamorphosis separately, and reconciling the resulting datasets.

The various stages of metamorphosis may be treated as if they were independent organisms, a separate phylogenetic system erected for each, and then an attempt made to bring these systems into congruence… (Hennig, 1966, p.122)

Hennig depicted different ontogenetic stages (semaphoronts) as the faces of a cube (the individual), with morphological data from each face of the cube capable of informing a phylogenetic analysis (redrawn in Figs 2a,b). Thus, under the cladistic method, the monophyly of groups could be supported independently by characters specific to individual stages. More importantly, the phylogenetic congruence among different developmental stages that Hennig advocated could be attained.

Fig. 2.
(a) Modification of fig. 33 of Hennig (1966), where semaphoronts are conceptualized as different facets of a single individual. (b) Modification of fig. 35 of Hennig (1966), showing the traditional method of coding developmental data in a cladistic framework. Each of the visible faces of a cube corresponds to different semaphoronts/stages, as in (a). Colours of faces represent hypothetical character states occurring in particular stages. (c) Visual translation of Hennig's approach to integrating developmental data. Each semaphoront/stage contributes a set of characters to a total evidence matrix, with the expectation of congruent phylogenetic signal from each semaphoront's submatrix. Letters A–E indicate species.

It must be recognized in principle that the requirement of complete congruence between larval and imaginal systems is theoretically justifiable only in phylogenetic systematics…The degree of morphological difference or similarity among several species may indeed differ greatly in the various stages of metamorphosis. From this we can conclude that it is impossible to bring into complete congruence larval and imaginal systems that attempt to express the degree of morphological similarity (typological or form relations) of these different stages of metamorphosis. Since phylogenetic systematics evaluates morphological differences and correspondences in a different way (not according to their magnitude), with proper evaluation it must always be possible to bring larval and imaginal systems into congruence. (Hennig, 1966, p.122)

Hennig thus provided a compelling solution for the problem of indirect development (a developmental series with saltational changes in morphology between groups of semaphoronts; these discrete morphologies are divided into “metamorphic stages”). As conceptualized here (Fig. 2c), every ontogenetic stage that can be homologized across taxa provides a source of phylogenetic characters, and uniting the character matrix drawn from each semaphoront (a face of the cube) results in a total evidence matrix composed of data drawn from the full range of the developmental series. Hennig originally depicted multiple trees drawn from stage-specific (e.g. larval tree, pupal tree) partitions and suggested taxonomic congruence as the means to reconcile these partitions (“ontotrees”, sensu Steyer, 2000). However, the relatively small number of inferable ontotrees limits the power of morphological datasets to reconstruct species relationships, in comparison to a character congruence (total evidence) approach that we favour herein.
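
The assembly of a total evidence matrix from stage-specific submatrices, as conceptualized in Fig. 2c, can be sketched as follows. This is an illustrative Python sketch only: the species labels, character states and the function `total_evidence_matrix` are hypothetical, and scoring an unobserved stage as missing data ("?") is an assumption of the sketch rather than a procedure prescribed in the text.

```python
def total_evidence_matrix(stage_matrices, missing="?"):
    """Concatenate per-stage submatrices into one row of data per species."""
    species = sorted({sp for matrix in stage_matrices.values() for sp in matrix})
    rows = {sp: [] for sp in species}
    for stage, matrix in stage_matrices.items():
        # All rows within a stage submatrix are assumed to have equal length.
        width = len(next(iter(matrix.values())))
        for sp in species:
            # A species lacking data for this stage is padded with missing cells.
            rows[sp].extend(matrix.get(sp, [missing] * width))
    return rows


# Hypothetical data: species C is known only from its larval stage.
stage_matrices = {
    "larva": {"A": [0, 0], "B": [1, 0], "C": [1, 1]},
    "adult": {"A": [0, 0, 1], "B": [0, 1, 1]},
}
for sp, row in total_evidence_matrix(stage_matrices).items():
    print(sp, row)
# A [0, 0, 0, 0, 1]
# B [1, 0, 0, 1, 1]
# C [1, 1, '?', '?', '?']
```

Each species remains a single terminal, and each stage contributes a block of columns rather than an additional row.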

In comparison to semaphoront coding, total evidence implementation circumvents the undesirable possibility of encountering characters “diagnostic” of stages of metamorphosis (what Hennig referred to as “typological or form relations”, which would be scored as uninformative within each stage's submatrix under total evidence), rather than of species relationships. A simpler version of this solution could also be implemented for direct development (a developmental series lacking saltational changes in morphology between stages; juveniles or “nymphs” in such a continuous series are distinguished from one another mostly by size and other continuous characters), namely, by selecting the adult (sexually mature) semaphoront to represent the individual.

The implementation of semaphoront coding

Semaphoront coding, as implemented by both Lamsdell and Selden (2013) and Wolfe and Hegna (2014), treats all definable “stages” as terminals in a phylogenetic analysis. The premise of this approach is grounded in palaeontology and the problem of determining the phylogenetic affinities of fossil taxa that may not represent adult stages. In the case of extant taxa, the use of semaphoront coding and the de facto separation of different life stages into putative “species” have no justification; the ontogenetic relationships of semaphoronts can be observed directly as a developmental series, and are thus (in principle) knowable by developmental biologists. If molecular sequence data are available, the need to observe a complete developmental series to determine species boundaries for phylogenetic purposes is further mitigated; it is expected that molecular sequence data obtained for ontogenetic stages of a single individual will be invariable or nearly invariable. For this reason, sequence data can be used to infer the taxonomy of larvae when observing the complete developmental series is not feasible (Webb et al., 2006; Heimeier et al., 2010). Thus, any justification for treating life stages as putative species is exclusive to fossil taxa.

A salient feature of the tree topologies resulting from semaphoront coding is that these entities are not phylogenetic trees. Because ontogenetic relationships occur within an individual (Fig. 1a), the relationships of multiple semaphoronts from multiple species cannot be represented using a branching diagram of hierarchic relationships (Fig. 1b). A phylogenetic (cladistic) framework does not represent these relationships accurately, and thus the very premise of semaphoront coding as a means of testing the phylogenetic position of larval fossils (Wolfe and Hegna, 2014) is fundamentally flawed.

An intuitive example is provided by the ontogeny of our own species. The single-celled zygote constitutes an early developmental stage, or semaphoront, of Homo sapiens. If the zygote were coded into a morphological matrix of eukaryotes, together with a range of embryonic and postembryonic stages, we might expect the zygote to be placed among other single-celled eukaryotes, various embryonic stages to cluster with other vertebrate embryos and lineages such as cephalochordates, and the adult human to cluster with primates. However, these placements of H. sapiens are not in any way phylogenetic, because the species appears in multiple locations, due to each stage suffering from various deficiencies in information. The zygote is especially lacking in morphological characters and thus has the most erroneous placement in the tree, and as a whole the “tree-shaped object” (sensu Wheeler and Pickett, 2007) produced by semaphoront coding cannot answer the phylogenetic question, “Which species is most closely related to H. sapiens?”

The most recent implementations of semaphoront coding had different aims, results and conclusions. The aim of Lamsdell and Selden (2013) in their study of the entirely extinct Eurypterida was to demonstrate the effect of accidentally including juveniles as new species, in an exploratory, simulated procedure, and they subsequently concluded that treating (inferred) juveniles as species was destabilizing to phylogenetic reconstruction. In their case, ontogenetic relationships are simply not observable, and their trees could in principle be phylogenetic—we cannot demonstrate that eurypterid juveniles are not paedomorphic exemplars of different species. Wolfe and Hegna (2014) cannot be similarly forgiven, for they implemented semaphoront coding not only for extinct terminals (which might hypothetically represent different species), but also for extant taxa. In every tree presented by Wolfe and Hegna (2014) where the monophyly of species was not enforced and multiple semaphoronts were coded for extant species, various extant species were not recovered as monophyletic. Given that data for extant species were drawn from observable and established developmental series, we know with complete epistemological certainty that the “phylogenetic relationships” presented in most of the tree topologies of Wolfe and Hegna (2014) are false.

An alternative approach, and at least one with some possibility of recovering a phylogenetic tree, would have been to represent each extant taxon as one terminal replete with data from a range of ontogenetic stages (i.e. as in Fig. 2b,c), and subsequently add fossil semaphoronts without a priori designation of species. However, as discussed below, even this approach is fraught with epistemological and procedural hurdles.

Contingency of tree topology on mode of development

One of the problems with the framework of semaphoront coding sensu Wolfe and Hegna (2014) is that it presumes that this approach will produce a repeatable result across taxa. However, a fundamental aspect of this approach overlooked by Wolfe and Hegna (2014) is that the tree topology which will result from semaphoront coding is a function of the developmental mode that occurs in the taxon of interest.

In the case of idealized direct developers, where every stage in the developmental series is morphologically identical to every other stage except in size (Fig. 3b), we might expect that the resulting tree topology will always yield the same true tree, assuming there is an abundance of informative morphological characters that reflect phylogenetic history. In this case, the coding of multiple life stages will be redundant, as each stage will yield the same set of morphological data as any other stage within the same species (Fig. 3c). This is analogous to sequencing multiple stages of each species; as no sequence differences will occur across ontogenetic stages of a single species, clades corresponding to species are expected (barring such phenomena as incomplete lineage sorting, hybridization and systematic error), with zero-length branches for each stage within a species. In cases of ideal (or near ideal) direct development, placement of fossils is therefore a relatively straightforward practice, regardless of ontogenetic stage.

Fig. 3.
Tree topologies produced by semaphoront coding are contingent upon mode of development. (a) A hypothetical true tree. (b) Idealized direct development, where each semaphoront is indistinguishable from all others, except in size. In this extreme case, there is complete overlap in the set of morphological characters occurring in all semaphoronts. (c) Tree topology expected under semaphoront coding, where size does not contribute to the morphological matrix. The coding of different instars is redundant and all species are expected to be monophyletic, given ample and informative morphological characters. (d) Idealized indirect development, where zero characters are shared between metamorphic stages. (e) Tree topology expected under semaphoront coding. Depending on how semaphoronts are coded, clades corresponding to stages may result from abundance of characters that define stages, rather than species relationships. Dashed lines in tree topology correspond to “phylogenetic” relationships between stages, which are not meaningful.

By contrast, in idealized indirect development (metamorphic development, sensu Hennig, 1966), stages do not share homologizable characters (Fig. 3d). In this case, the expected tree topology will be markedly different. Depending on how this type of development is coded (discussed below), the resulting tree topology may yield clades corresponding to stages (Fig. 3e). This is because semaphoront coding allows for the inclusion of characters that are “diagnostic” of stages of metamorphosis, and not of species relationships (i.e. synapomorphies). Given that metamorphosis is defined by the incidence of disparate (and often, incomparable) morphologies during development, semaphoront coding will not yield species monophyly for indirectly developing taxa except under special cases (see below).

Although the scenarios presented here (Fig. 3) may seem unrealistically simple, they reflect the two opposite ends of the spectrum of character distribution, with characters either shared fully across stages (Fig. 3b) or no characters shared across stages (Fig. 3d). In real biological systems, the degree to which homologizable characters overlap across stage boundaries will lie somewhere between these two extremes, which is all the more problematic. The contingency of semaphoront coding's predictions on developmental mode derails any effort to read phylogenetic relationships into semaphoront trees, particularly when the ingroup taxa include both direct and indirect developers (e.g. Pancrustacea; Wolfe and Hegna, 2014). A hypothetical recovery of a nonmonophyletic species or species group could be attributable to actual (phylogenetic) nonmonophyly, or alternatively, these species may simply be indirect developers, whose metamorphic stages are separated across “clades” of stages. The inverse is also true; a cluster of terminals under semaphoront coding may represent phylogenetic proximity or indicate that they are direct developers. Semaphoront coding certainly cannot overcome the conflation of phylogenetic signal and developmental mode in taxa that have both directly and indirectly developing species, because the resulting trees will integrate mutually exclusive topological predictions (Fig. 3c,e).

Wolfe and Hegna (2014) seem to have noticed this effect in the wake of their analyses, observing that extant species with few qualitative changes in the course of development (e.g. the cephalocarid Hutchinsoniella macracantha) consistently were recovered as monophyletic, whereas extant species with saltational changes in morphology from one developmental stage to the next were rarely monophyletic (e.g. the copepod Labidocera aestiva). They interpreted this to mean that, “the results show that the nature of the ontogeny (direct versus indirect development) exerts a more powerful control on species monophyly than the number of semaphoronts included in the analysis. Direct developers (with all semaphoronts forming a clade) therefore have more easily interpretable phylogenetic signal, as their sister group is related by phylogeny rather than shared ontogeny” (Wolfe and Hegna, 2014). For multiple reasons, this conclusion is not substantiated. As explained above, ontogenetic relationships and species relationships are not causally related, and only the latter can be represented by a tree diagram (Fig. 1a). Both cephalocarids and copepods undergo indirect development via a nauplius (or nauplius-like) larva that subsequently adds somites in the posterior body region (Ferrari et al., 2011); the recovery of Hutchinsoniella macracantha semaphoronts in a clade under semaphoront coding does not make this species a direct developer.

Under a total evidence approach (in which each species is represented by a single terminal; Fig. 2), the degree of character overlap across stages is not problematic; indirect developers can be accommodated by the implementation of sections of the character matrix that are dedicated to a particular metamorphic stage (Fig. 2c). Direct developers are similarly accommodated by coding a single stage (typically the adult), to avoid the problem of redundant taxa. Thus, the problem of mode of development is unique to semaphoront coding.

Metamorphosis: inapplicable cells, character conflict and nodal support

Developmental metamorphosis (indirect development with markedly dissimilar stages) poses additional problems for semaphoront coding. When they implemented semaphoront coding for their respective taxa, both Lamsdell and Selden (2013) and Wolfe and Hegna (2014) were testing (with different expectations) the implicit hypothesis that this method of coding would yield the monophyly of species and meaningful interspecies relationships. Wolfe and Hegna (2014) specifically termed a tree composed of clades of stages (e.g. Fig. 3e) an artificial or “Haeckelian” result. As explained above, the nonmonophyly of species is an expected outcome for taxa with indirect development, but only if a matrix of such taxa is coded a certain way. This is because in cases of metamorphic development, there are three different types of characters that semaphoront coding will encounter (Fig. 4).

Fig. 4.
Top left: Scenario depicted in Fig. 2a, with an indirectly developing species with three defined semaphoronts and the same phylogenetic data in all three submatrices. (a) A total evidence matrix incorporating semaphoront data will recover the same tree topology as for this hypothetical case. (b) Under semaphoront coding, one class of characters (Type 1) will only be applicable to a given stage, but not applicable to any other stage. On their own, these data will yield a complete polytomy, even if the data within each submatrix perfectly reflect phylogenetic signal. (c) Under semaphoront coding, a second class of characters (Type 2) will correspond to morphological differences between ontogenetic stages, not synapomorphies. These data will yield clusters corresponding to stages, not species. (d) Under semaphoront coding, a third class of characters (Type 3) will reflect both phylogenetic signal (species relationships) and species monophyly. Only these characters will yield the monophyly of species and the true relationships between species.

One set of characters that we term Type 1 will be applicable only to specific stages and inapplicable for all other stages (Fig. 4b). Empirical examples of Type 1 characters include genital characters in various indirectly developing organisms, as reproductive organs frequently occur only in sexually mature stages. While there may be some comparability of developmental sequences in the course of ontogeny, especially through the use of modern genetic approaches, reconstructing those sequences imposes an additional inferential and evidentiary burden (e.g. Jeffery et al., 2005; Kerney et al., 2011; Jirikowski et al., 2015). But as for those characters that can be coded only for a single stage, they are incapable of resolving species relationships on their own, even in the ideal circumstance that all stages are reflective of species relationships and completely homoplasy-free (Fig. 4b). A morphological matrix composed exclusively of Type 1 characters will thus yield a complete polytomy due to the absence of cells that can be scored across the entire set of semaphoronts. The overabundance of inapplicable cells renders these characters ineffective for resolving phylogenetic relationships.
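
The effect of inapplicable cells in a matrix of Type 1 characters can be illustrated with a short Python sketch. The data, the "-" symbol for inapplicable cells and the helper names `semaphoront_rows` and `comparable_characters` are hypothetical choices made for illustration; the sketch only counts the characters that two terminals can be compared on and does not perform a tree search.

```python
INAPPLICABLE = "-"


def semaphoront_rows(stage_matrices):
    """One row per (species, stage); other stages' characters are inapplicable."""
    widths = {stage: len(next(iter(matrix.values())))
              for stage, matrix in stage_matrices.items()}
    rows = {}
    for stage, matrix in stage_matrices.items():
        for species, states in matrix.items():
            row = []
            for other in stage_matrices:
                # Type 1 characters are scored only in their own stage's block.
                row.extend(states if other == stage
                           else [INAPPLICABLE] * widths[other])
            rows[(species, stage)] = row
    return rows


def comparable_characters(row_a, row_b):
    """Count characters that are applicable in both terminals."""
    return sum(a != INAPPLICABLE and b != INAPPLICABLE
               for a, b in zip(row_a, row_b))


# Hypothetical Type 1 data for two species and two metamorphic stages.
rows = semaphoront_rows({
    "larva": {"A": [0, 1], "B": [1, 1]},
    "adult": {"A": [1, 0, 0], "B": [1, 1, 0]},
})
print(rows[("A", "larva")])  # [0, 1, '-', '-', '-']
print(comparable_characters(rows[("A", "larva")], rows[("B", "larva")]))  # 2
print(comparable_characters(rows[("A", "larva")], rows[("A", "adult")]))  # 0
```

In this toy matrix the larva and adult of species A share no comparable character, so nothing in the data can unite them as one species.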

A second set of characters (Type 2) will have stage-specific distributions, as metamorphic stages (e.g. larva, pupa) are recognized by developmental biologists on the basis of characters that reflect developmental sequence, rather than species relationships (Fig. 4c). It was precisely these characters that Hennig (1966) termed “undesirable”, as their incidence reflected only morphological similarity, not synapomorphy. Empirical examples of Type 2 characters include Keilin's organ in dipteran larvae, the prototroch of various lophotrochozoan larvae, or larval eyes of Sipuncula. These characters will support clades corresponding to stages, rather than the monophyly of species.

Finally, a third set of homologizable characters (Type 3) will be applicable to all metamorphic stages and reflect phylogenetic signal. To do this, a given character must be scored the same way for all stages of a given species (red characters in Fig. 4d) and also for all species relationships (purple characters in Fig. 4d). Only Type 3 characters will support natural monophyletic groups (including species), given the inclusion of semaphoronts (Fig. 4d). Empirical examples of Type 3 morphological characters are rare and tend to reflect higher-level relationships, such as segmentation in arthropods (observable across all postembryonic stages). Note that, for simplicity, we are not considering embryonic stages of arthropods prior to segmentation as semaphoronts in this example; if all life stages were considered from zygote to adult, we cannot envision any empirical characters that would be distributed thus in arthropods. In addition, molecular sequence data constitute an excellent source of Type 3 characters, but are naturally inapplicable to the question of fossil placement.

Therefore, in order for semaphoront coding to recover species monophyly (and/or true phylogenetic relationships) in indirect developers, there would have to be a preponderance of Type 3 characters (applicable to all semaphoronts of a species and reflecting phylogenetic signal across all stages) for phylogenetic signal to outweigh the character conflict created by Type 2 characters (Fig. 4). While we can cite individual examples of each character type, we see no plausible reason why, nor can we cite any empirical case where, Type 3 characters would grossly outnumber Type 2 characters in groups with metamorphic development.
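
The conflict between Type 2 and Type 3 characters can be made concrete with a small parsimony calculation. The sketch below is a hypothetical toy case, not an analysis from any of the cited studies: it assumes two species (A, B) with two stages each, binary characters, and Fitch optimization on fixed trees written as nested tuples. The names `fitch`, `tree_length` and `build_matrix` are illustrative.

```python
def fitch(tree, states):
    """Return (state set, steps) for one character on a nested-tuple tree."""
    if isinstance(tree, str):  # a terminal, e.g. "A_larva"
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # Disjoint state sets imply one additional change at this node.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Sum Fitch steps over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(fitch(tree, {t: row[i] for t, row in matrix.items()})[1]
               for i in range(n_chars))


def build_matrix(n_type2, n_type3):
    """Type 2 characters track stage; Type 3 characters track species."""
    matrix = {}
    for species in "AB":
        for stage in ("larva", "adult"):
            type2 = [0 if stage == "larva" else 1] * n_type2
            type3 = [0 if species == "A" else 1] * n_type3
            matrix[f"{species}_{stage}"] = type2 + type3
    return matrix


species_tree = (("A_larva", "A_adult"), ("B_larva", "B_adult"))
stage_tree = (("A_larva", "B_larva"), ("A_adult", "B_adult"))

# Three Type 2 characters against one Type 3 character.
m = build_matrix(n_type2=3, n_type3=1)
print(tree_length(species_tree, m), tree_length(stage_tree, m))  # 7 5
```

In this toy case the tree of stage "clades" is shorter whenever Type 2 characters outnumber Type 3 characters; reversing the counts with `build_matrix(1, 3)` reverses the two lengths.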

We note that the incidence of the three character types has no adverse effect on a total evidence approach. This is because total evidence matrices can treat metamorphic stages as subsets of the complete matrix, and thus both Type 1 and Type 3 characters are fully scored for all taxa across the matrix. Type 2 characters would not be coded at all in total evidence matrices, nor have an effect on the result if they were, because these characters would be scored as invariable in a total evidence matrix (e.g. “larva with Keilin's organ” in cyclorrhaphan flies; “larva with cilia” in Sipuncula).
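
A toy illustration of this point follows, as a hedged sketch with invented scorings. It assumes a hypothetical stage-specific character (presence of a larval organ) in three hypothetical species, scores adults as lacking the organ under semaphoront coding, and uses the standard criterion that a character is parsimony-informative when at least two states each occur in at least two terminals.

```python
from collections import Counter


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two terminals.

    Missing ("?") and inapplicable ("-") scorings are ignored.
    """
    counts = Counter(state for state in column if state not in ("?", "-"))
    return sum(1 for n in counts.values() if n >= 2) >= 2


# Semaphoront coding (assumption): every stage is a separate terminal,
# so the larval organ is present in larvae and absent in adults.
semaphoront_coding = {
    "A_larva": 1, "A_adult": 0,
    "B_larva": 1, "B_adult": 0,
    "C_larva": 1, "C_adult": 0,
}

# Total evidence (assumption): one terminal per species, and the character
# is phrased as "larva with organ", which every species shares.
total_evidence = {"A": 1, "B": 1, "C": 1}

print(is_parsimony_informative(semaphoront_coding.values()))
print(is_parsimony_informative(total_evidence.values()))
```

In this sketch the same observation is variable, and therefore able to group larvae together, only when stages are treated as terminals; scored per species it is invariant and cannot influence the topology.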

A final aspect of semaphoront coding not addressed by Wolfe and Hegna (2014) is the issue of nodal support. By definition, nodal support in semaphoront coding will always be lower for any given node than in a total evidence approach applied to the same dataset (with the sole exception of the idealized scenario of direct development depicted in Fig. 3a, where development consists only of isometric growth). Blanke et al. (2015) showed that coding larval and adult characters separately within single species terminals (a traditional total evidence approach) resulted in higher nodal support in comparison to semaphoront coding for a test case of three ingroup species of Odonata. This result is the intuitive consequence of combining evidence and inferring relationships with more characters.

Delimiting stages adds inferential burden in phylogenetic analysis

All of the preceding discourse of this critique has employed hypothetical examples that are simplistic, in large part to convey the conceptual limitations of semaphoront coding. Empirically, the diversity of developmental processes in nature further complicates the implementation of semaphoront coding.

One of the fundamental problems for semaphoront coding is the need to establish stages that will be treated as terminals. Taking the example of the arthropod literature, developmental biologists delimit embryonic stages using either landmark morphogenetic events or the number of hours after egg-laying (given some invariant incubation temperature). Stages are thus arbitrary conventions that facilitate discourse, not actual entities that have unambiguous boundaries. During postembryonic development, entomologists typically use discrete morphologies in the course of development to delimit stages (such as larval or pupal stages), but this definition is at best subjective as well in the context of semaphoront coding. Larval instars (distinguished by moult events; Fig. 3d) in particular are not static entities differentiated simply by size, but rather morphologically dynamic entities that could equally well constitute semaphoronts. Inversely, in hemianamorphic arthropod lineages (e.g. lithobiid centipedes), moults continue regularly after sexual maturation, after which no discernible morphological changes occur in the adult; the terminal semaphoront is therefore not defined by a corresponding terminal moult.

In the course of evolution, various lineages have also acquired new and different metamorphic stages that are difficult to compare (Fig. 5a). The literature surrounding the homology of the holometabolous insect pupal stage with respect to hemimetabolan stages is a good example of the evidentiary burden required to homologize characters across different numbers of stages in a developmental series (Berlese, 1913; Hinton, 1948, 1955; Truman and Riddiford, 1999). Thus, semaphoront coding encounters the problem of determining how many stages to delimit for any given indirectly developing species, and how to contend with disparate numbers of stages across species. In the specific test case of Wolfe and Hegna (2014), certain pancrustaceans represent indirectly developing lineages that are opportune targets for semaphoront coding (e.g. the decapod nauplius, zoea and megalopa; the insect larva and pupa), but the markedly different morphologies of these stages proved too complex a problem to address analytically. Wolfe and Hegna (2014) thus excluded both hexapods and indirectly developing decapods from their analysis of Pancrustacea, but exclusion of ingroup taxa is not a satisfactory workaround to testing phylogenetic placement of either larvae or fossils.

Additional hurdles for integrating developmental data under semaphoront coding. (a) Defining ontogenetic relationships between semaphoronts can be a complex procedure for indirectly developing taxa, if some lineages acquire novel ontogenetic stages and others lose them. (b) Defining stages across directly developing taxa is epistemologically difficult, due to the continuity of the developmental series and the incidence of heterochronic development between species (either corresponding to the whole organism or a specific anatomical structure). Here progress of direct development is visualized as colours ranging from yellow (beginning of development) to red (end of development), where the length of the bar indicates absolute time. (c) Paedomorphosis and developmental plasticity incur additional problems for semaphoront coding. In cases of paedomorphosis, a presumed terminal semaphoront does not occur with reference to the developmental series of closely related species. In many lineages, plasticity of development or complex life cycles result in a nonlinear developmental process (where ontogenetic relationships form reticulations).

Direct development poses additional challenges. For indirect developers, metamorphic stages may be justified as the operational unit of semaphoront coding. But typical direct development does not offer similar boundaries to delimit stages. In Fig. 5b, direct development is conceptualized as a progression of colour from yellow to red as a function of time. Some species may experience changes in the timing of a developmental process (affecting either the whole organism or a part of it) relative to other species (heterochrony), visualized as different rates of change in colour across the developmental series. Time is also not an effective arbiter of staging, as different species may have markedly different developmental durations. Thus, in either mode of development, there is no natural and nonarbitrary way for the investigator to decide how many semaphoronts should be coded in a semaphoront coding analysis. For extant pancrustacean taxa, Wolfe and Hegna (2014) coded from three to 19 semaphoronts (operationally delimited by moults) per species, but the procedure for choosing and delimiting the number of semaphoronts beyond moulting taxa was not provided, nor was it clarified how the number of semaphoronts coded would affect the behaviour of a species’ terminals.

Additional considerations not addressed by Wolfe and Hegna (2014) in a systematic way are the phenomena of paedomorphosis and developmental plasticity (Fig. 5c). Paedomorphic adults are recorded in various indirect developers, and are known to confound morphological analyses. As an empirical example, the incidence of paedomorphic species confounds morphological phylogenies of extant salamanders, even when the complete developmental histories of the ingroup taxa were known and encoded in a total evidence matrix (Wiens et al., 2005). Wiens et al. (2005) reported the clustering of paedomorphic species (where the terminal stages resemble larvae of nonpaedomorphic species) due to the presence of shared larval traits in morphological datasets, whereas molecular data supported multiple (additional) gains of paedomorphosis, as did the exclusion of paedomorphic traits from the morphological data partition. Wiens et al. (2005) previously discussed the limitations of possible strategies to surmount paedomorphosis, such as artificially coding adult stages of paedomorphic species as missing data, or excluding paedomorphic characters (which in turn requires their identification, an additional inferential burden).

Plasticity in developmental mode is another concern for semaphoront coding. Implicit within the logic of Wolfe and Hegna (2014) is the idea that a developmental series is a linear entity—that ontogenetic relationships are hierarchical and immutable within species (Fig. 1b). In reality, numerous organismal lineages do not experience a linear progression of semaphoronts, but rather, undergo alternative life histories as a function of age, habitat, resource availability or other environmental inputs. Examples of such plasticity in life history include Cycliophora, which have multiple larval types and both sexual and asexual life cycles (Funch and Kristensen, 1995; Baker et al., 2007); facultative paedomorphosis in amphibians (Whiteman et al., 1996); alternation of generations in such taxa as plants or cnidarians (“cyclomorphism”; Hennig, 1966); and the reversible tun stage of Tardigrada during episodes of anhydrobiosis (Welnicz et al., 2011). Life histories that are abbreviated, cyclical, reversible or environmentally cued defy the logic of semaphoront coding because they are not hierarchically organized. For such taxa, the problem of defining semaphoronts aside, interpreting a tree of relationships resulting from semaphoront coding is not a meaningful exercise.

The problems of homologizing stages across diverse developmental modes apply in equal measure to the total evidence approach as well, with one key exception. Total evidence is informed by ontogenetic relationships in extant species, and can thus justifiably align character states from particular semaphoronts to construct the total evidence data matrix. The alignment of the semaphoronts can be based on landmark developmental phenomena, such as occurrence of sexual maturity (e.g. hemi- and holometabolous insects), segment number (e.g. sea spiders; hemianamorphic centipedes), first terrestrial stage (e.g. amphibians), or key morphological or genetic features (e.g. diploid vs. haploid phase of pteridophytes). While exceptional life histories (e.g. paedomorphosis; Wiens et al., 2005) will confound some aspects of this alignment in total evidence, in practice investigators can achieve at least partial alignment of semaphoronts using the external criterion of developmental data (ontogenetic relationships determined by direct observation). Empirical examples of justifiable character-rich semaphoronts traditionally favoured in arthropod systematics for total evidence analyses include the adult males of spiders and the worker castes of hymenopterans. By contrast, in semaphoront coding, the set of known ontogenetic relationships is discarded a priori.

Total evidence can accommodate both fossils and developmental data

Wolfe and Hegna (2014) were intent on assessing the phylogenetic placement of Upper Cambrian pancrustacean fossils, and implemented semaphoront coding as a means of simultaneously treating fossil phylogenetic placement and fossil ontogeny. However, fossil phylogenetic placement and fossil development are two very different goals and each presents its own epistemological challenges. As Hennig (1966) aptly summarized:

[O]nly dead individuals (more accurately, semaphoronts) are available to paleontology, and in the most favorable cases only pieces of the total fabric of characters of these semaphoronts. For this reason paleontology is actually never in a position to determine whether corresponding or different semaphoronts are members of the same or of different reproductive communities, a possibility that always exists at least in principle for neozoology. (Hennig, 1966, p.63)

In phylogenetic impasses (soft polytomies, ancient radiations, character conflict), phylogeneticists frequently vocalize the need for more data to redress a particular problem. However, in the special case of fossils, infusion of data from more semaphoronts may not surmount uncertainty.

In paleontology the incompleteness of the fossil record dictates the use of purely morphological species concepts as a basis for a purely formal classification. So long as only a few semaphoronts are known it would create no difficulties if, for example, the species A, B, D1 or H1, and H2-I are distinguished. But if more and more individuals from the sequence of generations in these populations became known, a purely morphological distinction of the species would eventually become impossible. ‘As the fossil record becomes more completely known, the problems are not therefore likely to be resolved, as is generally supposed, but will become more acute’ (Clark, 1956). (Hennig, 1966, p.63)

This particular limitation affects the total evidence approach as well. For the simple case of indirectly developing species, semaphoronts are again depicted as facets of an individual organism in Fig. 6a. In neontology, ontogenetic relationships are observable and depicted as assembled individuals (cubes in Fig. 6a), whereas in palaeontology, neither ontogenetic relationships (which square corresponds to which face of the cube) nor species boundaries (which squares can be assembled into individual cubes) are known. Palaeontological approaches must therefore account for two different levels of inference, and therefore, of uncertainty. This was precisely the basis for Wolfe and Hegna's exhortation for semaphoront coding; by obviating the need to infer stage designations and species boundaries, Wolfe and Hegna (2014) aimed to overcome the linking of suppositions in the phylogenetic placement of fossils.

Comparative treatment of fossil data by total evidence and semaphoront coding. Cube faces correspond to individual semaphoronts (top, larva; facing left, pupa; facing right, adult). Colours of cube faces correspond to unique suites of character states. (a) In neontological systems, ontogenetic relationships between individuals are known, whereas in fossil taxa, these relationships cannot be observed. Palaeontological reconstructions invoke inferences/assumptions both for the life history stage of a fossil semaphoront and boundaries between species. (b) Under a total evidence approach, a larval fossil can be coded as such by scoring it only for characters pertaining to its life history stage. This approach requires that the fossil semaphoront is reliably identified as a given life history stage using stage-specific characters. (c) If coded thus in a total evidence matrix, the phylogenetic placement of fossil taxa can be assessed. If the data partition for the fossil is particularly informative across all taxa, high phylogenetic confidence is anticipated. (d) In this case, with unordered multistate coding, semaphoront coding will return a topology reflecting a basal polytomy and some clades corresponding to stages with shared characters (square colours).

While we grant that the premise of their concern was legitimate, we submit that total evidence matrices can to some degree place larval fossils, if only imperfectly. If the investigator were able to infer the stage to which a particular fossil belongs (using all Type 2 characters available), one could include the fossil in a total evidence matrix only for that subset of characters (Type 1 and/or Type 3) corresponding to the fossil's inferred stage (Fig. 6b). This would overcome the problem of coding a fossil larval stage for adult characters (an extension of the paedomorphosis problem encountered by Wiens et al., 2005). Under this implementation, fossil larvae could hypothetically be placed reliably in a tree, provided that their morphology and preservation (i) reveal the stage of the fossil (i.e. retain Type 2 characters) and (ii) retain the synapomorphies uniquely preserved in that stage (i.e. Type 1 and/or Type 3 characters) (Fig. 6c). Our proposed general workflow is depicted in Fig. 7, wherein Type 2 characters are first used to align ontogenies; total evidence matrices are constructed using Type 1 characters, and fossil terminals bearing Type 2 characters are coded only for those subsets of the matrix bearing the matching Type 2 character state.
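
The coding step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the character names, the stage partitions, the extant scorings and the helper `code_fossil` are all hypothetical, and the fossil's stage is taken as already inferred from Type 2 characters.

```python
# Hypothetical character list: (stage partition, character name).
CHARACTERS = [
    ("larva", "larval_char_1"),
    ("larva", "larval_char_2"),
    ("adult", "adult_char_1"),
    ("adult", "adult_char_2"),
]

# Hypothetical extant species, fully scored across both partitions
# because their ontogenetic relationships are known by observation.
EXTANT = {
    "species_A": [0, 1, 0, 0],
    "species_B": [1, 1, 0, 1],
    "species_C": [1, 0, 1, 1],
}


def code_fossil(observed, inferred_stage, characters=CHARACTERS):
    """Score a fossil only for the partition matching its inferred stage.

    observed       : dict mapping character name -> observed state
    inferred_stage : stage assigned to the fossil from Type 2 characters
    Characters from other stages, or not preserved, are scored "?".
    """
    row = []
    for stage, name in characters:
        if stage == inferred_stage and name in observed:
            row.append(observed[name])
        else:
            row.append("?")
    return row


# A fossil inferred to be a larva, preserving only one larval character.
fossil_row = code_fossil({"larval_char_1": 1}, inferred_stage="larva")
matrix = dict(EXTANT, fossil_X=fossil_row)

for terminal, row in matrix.items():
    print(terminal, row)
```

The fossil thus enters the matrix as a single terminal with missing data outside its own stage partition, so a larval fossil is never scored for adult characters.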

A proposed workflow for inferring the placement of fossil semaphoronts under a total evidence framework. (a) Semaphoronts are arranged in ontogenetic order for all extant taxa of interest. Missing semaphoronts are shown here to result from differences in ontogeny, but may also result from insufficiency of Type 2 characters of certain stages in the ontogeny of some species. (b) Ontogenies are aligned using Type 2 characters to identify points of correspondence in developmental series. Stages with alignable semaphoronts for all ingroup taxa are opportune for coding developmental data in morphological data matrices. Note that some stages may not be alignable for some taxa and thus may not be sources of dispositive phylogenetic characters (e.g. stage S6 can only be coded for three taxa). (c) Fossil taxa are added to the alignment of developmental series based on distinct Type 2 characters that identify ontogenetic stage. Every fossil taxon can only appear once in the series, given the absence of observable ontogenetic relationships. (d) A total evidence matrix is constructed using Type 1 characters from all ontogenetic stages. Black cells indicate regions of the matrix that can be coded as binary, multistate or quantitative characters. White cells indicate regions of the matrix that cannot be coded for fossil taxa. Grey cells indicate regions of the matrix that are coded as inapplicable (due to known ontogenetic variability in extant species). (e) Phylogenetic placement of fossils of interest is inferred using subsets of the matrix for which those fossils’ characters are coded, or a total evidence tree, if the character subsets are sufficiently informative (an extension of Fig. 6c).

Empirically, we note that even this procedure will have limited success for certain groups, because morphological characters from different stages are not uniformly informative. To quote examples from Hennig:

The imagos [adults] of [Prosthetosominae termite] larvae are unknown. It can be assumed … [that] the relationship between imago and larva has not been recognized because the imaginal characters are so different from the larval characters of the group to which the imagos belong that the genetic relationship cannot be inferred from morphological characters alone. We may also point out that a parasitic snail larva has been described…of which it can be said with certainty only that it belongs among the cyclorrhapids, a group of at least 15,000 species distributed among several families… (Hennig, 1966, pp.34-35)

For many taxa, where derived characters are acquired towards the completion of a developmental series (reviewed by Abzhanov, 2013), the inclusion of data from early developmental stages may do little to aid phylogenetic reconstruction. However, our concern here is with the framework for phylogenetic analysis, not with the specifics of individual taxa. If analyses of extant taxa alone are taken as proof-of-concept for the successful integration of developmental data in phylogenetic reconstruction, then total evidence has been shown to work well for multiple groups (e.g. Zhang, 1995; Pugener et al., 2003; O'Leary et al., 2013), as measured by the degree of congruence between larval and adult trees, and the complementary distribution of synapomorphies on a combined total evidence tree. The total evidence framework is thus limited here only by the diagnostic power of Type 2 characters for establishing boundaries between stages, and by the informativeness of Type 1 and Type 3 characters for inferring synapomorphies; each character type's frequency will vary from one taxon to the next. Conceptually, the total evidence framework is operable for extant taxa, and this proof-of-concept justifies the extension of this framework to fossil taxa as well.

In contrast, semaphoront coding has never been shown to work in a proof-of-concept approach for extant taxa alone. We submit that semaphoront coding cannot yield species monophyly or species relationships except in highly unusual morphological matrices (Fig. 4d), or at best will contribute only redundant information to a dataset (Fig. 3c). In the absence of proof-of-concept implementation of semaphoront coding in extant taxa exclusively, there is no justification for credibly applying this framework to fossil taxa.

In a simple hypothetical example, semaphoront coding would not be able to resolve phylogenetic relationships (synapomorphies) for the dataset illustrated in Fig. 6c. Semaphoront coding, if implemented using unordered multistate characters where colours reflect states, would yield for this case a basal polytomy, with individual clades corresponding to stages and nonmonophyletic species (Fig. 6d). By contrast, if sufficient Type 2 characters are available to infer fossil stage, phylogenetic placement of fossils is feasible under total evidence (Figs 6c, 7). If Type 2 characters are unavailable or insufficiently informative to infer fossil stage, then we submit that neither total evidence nor semaphoront coding can resolve the fossil's placement.

Conclusion

Phylogenetic analysis cannot redress the multiple sources of uncertainty implicit in Wolfe and Hegna's treatment of semaphoronts. Hypothetically, semaphoront coding could be used to establish the ontogenetic stage of a fossil (but not its phylogenetic placement), but only if the mode of development were uniform across all species in the analysis (Fig. 3c,e). For fossil taxa, the mode of development is unknowable, and any assumption about it is nonfalsifiable. For extant taxa, the mode of development can be established using observational data, obviating the need for semaphoront coding altogether.

Alternatively, phylogenetic analysis with traditional coding could be used to place a fossil in a tree, if that fossil's ontogenetic stage were known (Fig. 6c). But semaphoront coding cannot establish both unknown variables simultaneously (ontogenetic stage and phylogenetic placement), due to the conflation of developmental process (Type 2 characters) with historical relationships (synapomorphies; Type 1 or Type 3 characters). Attempts by Wolfe and Hegna (2014) to enforce a particular expected outcome, such as constraining all semaphoronts of species to be monophyletic (fig. 6 of Wolfe and Hegna, 2014), do not lend any credence to this approach—this circular operation only incurs the arbitrary assumptions that these authors had hoped to avoid (a priori designation of species boundaries). In practice it engenders nonevolutionary scenarios (i.e. the repeated evolution of stages within every species constrained to be monophyletic) and results in inflated parsimony scores that inhibit comparison with topologies from total evidence or other approaches. It is therefore not clear to us whether the aim of Wolfe and Hegna's work was to test the placement of semaphoronts (making the species monophyly constraint approach contradictory to their purpose) or to test the relationships of species (making the coding of extant semaphoronts as separate terminals an indefensible exercise); it is also unclear why the authors would consider summarizing inferred species relationships from both analysis types together (table 2 of Wolfe and Hegna, 2014).

We reiterate that the trees produced by semaphoront coding are not phylogenetic trees, a fact curiously observed by Wolfe and Hegna (2014) at multiple points in their study, but whose logical implications were ignored thereafter. The relationships depicted by Wolfe and Hegna (2014) conflate morphological similarity with synapomorphy. Morphological similarities between life stages of various organisms can engender superficial comparisons and nonevolutionary conclusions, such as a putative relationship between a human zygote and unicellular eukaryotes.

We thus see no sense in reading species limits or phylogenetic relationships in semaphoront trees.

Acknowledgements

We are indebted to A. Blanke, R.J. Garwood, R.E. Harbach, J.C. Lamsdell, I.J. Kitching and A. Riesgo, whose multifaceted perspectives and helpful informal peer reviews greatly refined the ideas and argumentation presented in this work. A series of debates between J.W. Wolfe and PPS inspired this expository critique. Critiques from the Associate Editor and three anonymous referees on a previous draft were instrumental to refining our argumentation. This material is based on work supported by the National Science Foundation Postdoctoral Research Fellowship in Biology under grant no. DBI-1202751 to PPS.
