Volume 27, Issue 20 pp. 4090-4107
ORIGINAL ARTICLE

Taxon cycle predictions supported by model-based inference in Indo-Pacific trap-jaw ants (Hymenoptera: Formicidae: Odontomachus)

Pável Matos-Maraví (corresponding author)
Institute of Entomology, Biology Centre CAS, Ceske Budejovice, Czech Republic
Department of Zoology, Faculty of Science, University of South Bohemia, Ceske Budejovice, Czech Republic
Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
Gothenburg Global Biodiversity Centre, Göteborg, Sweden

Correspondence: Pável Matos-Maraví, Institute of Entomology, Biology Centre CAS, Ceske Budejovice, Czech Republic.
Email: [email protected]

Nicholas J. Matzke
Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, Australian Capital Territory, Australia
School of Biological Sciences, The University of Auckland, Auckland, New Zealand

Fredrick J. Larabee
Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, District of Columbia
Department of Entomology and Department of Animal Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois

Ronald M. Clouse
Division of Invertebrate Zoology, American Museum of Natural History, New York City, New York

Ward C. Wheeler
Division of Invertebrate Zoology, American Museum of Natural History, New York City, New York

Daniela Magdalena Sorger
Department of Applied Ecology, North Carolina State University, Raleigh, North Carolina
W.M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, North Carolina
Research & Collections, North Carolina Museum of Natural Sciences, Raleigh, North Carolina

Andrew V. Suarez
Department of Entomology and Department of Animal Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois

Milan Janda
Institute of Entomology, Biology Centre CAS, Ceske Budejovice, Czech Republic
Laboratorio Nacional de Análisis y Síntesis Ecológica, ENES, UNAM, Morelia, Mexico

First published: 14 August 2018

Abstract

Nonequilibrium dynamics and non-neutral processes, such as trait-dependent dispersal, are often missing from quantitative island biogeography models despite their potential explanatory value. One of the most influential nonequilibrium models is the taxon cycle, but it has been difficult to test its validity as a general biogeographical framework. Here, we test predictions of the taxon cycle model using six expected phylogenetic patterns and a time-calibrated phylogeny of Indo-Pacific Odontomachus (Hymenoptera: Formicidae: Ponerinae), one of the ant genera that E.O. Wilson used when first proposing the hypothesis. We used model-based inference and a newly developed trait-dependent dispersal model to jointly estimate ancestral biogeography, ecology (habitat preferences for forest interiors, vs. “marginal” habitats, such as savannahs, shorelines, disturbed areas) and the linkage between ecology and dispersal rates. We found strong evidence that habitat shifts from forest interior to open and disturbed habitats increased macroevolutionary dispersal rate. In addition, lineages occupying open and disturbed habitats can give rise to both island endemics re-occupying only forest interiors and taxa that re-expand geographical ranges. The phylogenetic predictions outlined in this study can be used in future work to evaluate the relative weights of neutral (e.g., geographical distance and area) and non-neutral (e.g., trait-dependent dispersal) processes in historical biogeography and community ecology.

1 INTRODUCTION

Islands are natural laboratories for ecologists and evolutionary biologists. Their discrete nature can facilitate the reconstruction of evolutionary and biogeographical histories of resident flora and fauna, especially compared to continental systems. Despite major advances in island biogeography following the publication of its most influential model more than a half-century ago (MacArthur & Wilson, 1967), there remain many open questions that require multidisciplinary efforts to address (Patiño et al., 2017). For example, the reconciliation of ecology and evolutionary history can improve our understanding of the impact of ecological or biological traits on island biogeographical patterns (Valente, Phillimore, & Etienne, 2015; Whittaker, Fernández-Palacios, Matthews, Borregaard, & Triantis, 2017).

Non-neutral processes, such as trait-modulated dispersal and biotic innovations, often remain neglected in island biogeography models. Perhaps the most influential nonequilibrium model that explicitly incorporates non-neutral processes into an evolutionary context is the taxon cycle (Ricklefs & Bermingham, 2002; Ricklefs & Cox, 1972; Wilson, 1959a, 1961). The taxon cycle proposes that dispersal and ecological adaptation interact with competition, range contraction and extinction to explain observed species distributions. The historical narrative of the taxon cycle describes the diversity dynamics of island biota in a sequence of stages (Figure 1). In Stage I, species expand their geographical ranges across archipelagos aided by ecological release and an adaptive shift to occupy “marginal” habitats. Wilson's (1959a, 1961) definition of “marginal” habitat includes littoral zones, open habitats and disturbed environments; for simplicity and to avoid any misinterpretation, we use open habitat to refer to “marginal” habitat throughout the study. In Stage II, extinction and population differentiation leave patchy species distributions across archipelagos. In Stage III, single-island endemics evolve the potential to either further radiate within islands or re-initiate the taxon cycle by entering into Stage I. Empirical studies typically seek patterns left by hypothetical taxon cycles on island communities, but these have usually relied on qualitative interpretations of distributional data, including altitudinal and habitat occurrences (Cook, Pringle, & Hughes, 2008; Dexter, 2010; Economo & Sarnat, 2012; Economo et al., 2015; Jønsson et al., 2014; Ricklefs & Bermingham, 2008). Revisiting the taxon cycle is timely with the addition of new probabilistic models for historical and island biogeography that allow traits or geographical range to influence dispersal or diversification rates (Goldberg, Lancaster, & Ree, 2011; Klaus & Matzke, 2018; Sukumaran, Economo, & Knowles, 2016; Sukumaran & Knowles, 2018).

Figure 1. Macroevolutionary and biogeographical patterns expected under the taxon cycle model: Stage I, anagenetic (population level) range expansion aided by a shift in habitat preference to open or disturbed environments; Stage II, differentiation among populations, leaving a patchy distribution due to anagenetic range contraction and within-area “subset” cladogenesis, wherein one of the two daughter lineages inherits only a smaller area within a widespread parental species; Stage III, single-island endemics evolve mainly through the cladogenetic “subset” process and re-entering Stage I, linked again to recent ecological broadening

In this study, we test phylogenetic predictions of the taxon cycle using model-based inference (Burnham & Anderson, 2002) with recently developed computational models. We apply these models to a comprehensive time-calibrated phylogeny of the ant genus Odontomachus, one of the taxa E.O. Wilson used in his original articulation of the taxon cycle model (Wilson, 1959a, 1961). Odontomachus occurs in tropical and subtropical biomes worldwide and is particularly diverse in the insular landscape of the Indo-Pacific region where 31 out of 68 described species in the genus occur. Based on extant distributions and taxonomic affinities, it was believed that most Indo-Pacific ant lineages originated in tropical SE Asia and Australia (the “source regions” sensu Wilson, 1959a). However, New Guinea may also be a primary source of Odontomachus (Wilson, 1959a, 1961) and other ants in the Indo-Pacific region (Lucky & Sarnat, 2010). Recent biogeographical analyses suggest that continental SE Asian Odontomachus did not spread to New Guinea (instead a single dispersal event from the New World has been suggested), and Australian species form a derived monophyletic subgroup nested within New Guinean lineages (Larabee et al., 2016). Expanding taxa in Stage I might further differentiate into Stage II and Stage III mostly on the largest Indo-Pacific archipelagos such as Fiji, the Philippines or Borneo, due to the greater amount of habitat and elevational gradients with increasing island area size (Economo & Sarnat, 2012; Wilson, 1959a, 1961).

Computational models can readily be applied to some aspects of the taxon cycle. Theoretically, the most desirable test of the taxon cycle would use a complete computational model that explicitly integrates all ecological and evolutionary processes proposed in Wilson's verbal model of the taxon cycle, including speciation, extinction, dispersal, competition, biotic innovations and interactions among these processes. However, such a computational model does not yet exist. In particular, the accurate inference of rates of lineage extinction, using only phylogenies of living taxa, is likely to be difficult even in simple models, let alone complex models integrating many processes. Nevertheless, available computational models can estimate the rates of the processes that are most accessible with phylogenies of living taxa, namely transition rates for an ecological trait such as habitat preference and biogeographical dispersal rates. These models also provide estimates of ancestral geographical ranges and ecology, as well as the approximate timing of when changes occurred. While these inferences do not cover all aspects of the taxon cycle model, they enable testing of key aspects of it. In addition, a key prediction of the taxon cycle involves the relationship between ecological shifts and range expansion. Here, we combine an ecological trait (habitat association) and biogeography into a single model using a trait-dependent dispersal model, where ancestral ecology and historical biogeography are jointly inferred on the phylogeny of Indo-Pacific Odontomachus, and where trait state can influence dispersal rate (Klaus & Matzke, 2018). In summary, using computational inference models of trait evolution, geographical range evolution or both simultaneously, we test six predictions stemming from the taxon cycle model (Table 1):

Table 1. Six phylogenetic predictions expected if a taxon cycle is operating, and alternative predictions if a taxon cycle is not operating
Prediction 1
Observations and/or phylogeny-based inference: Inference of ancestral geographical ranges (standard BioGeoBEARS models)
If the taxon cycle model is operating in a clade, we expect: Range expansions (e.g., from 1 area to 2+ areas) will last long enough to be detectable on the branches of a phylogeny. This will be shown by estimated values of parameter d greater than 0 in DEC-type biogeography models, and the commonality of multi-area ranges in Biogeographical Stochastic Mapping
If the taxon cycle is not operating: Dominance of single-area ancestral ranges, and dispersal events dominated by jump-dispersal/founder-event speciation controlled by parameter j, thus significantly improving the fit of the model to the data set; alternatively, a scenario dominated by ancient widespread ranges that break up through vicariance

Prediction 2
Observations and/or phylogeny-based inference: Timing of dispersal to/from the “source area” (here, New Guinea)
If the taxon cycle model is operating in a clade, we expect: Dispersals out of the “source area” are older than dispersals into the “source area” by taxa at Stage I
If the taxon cycle is not operating: No significant difference in timing of dispersal events into and out of the source area

Prediction 3
Observations and/or phylogeny-based inference: Inferred transition rates between ecological states (forest interior vs. open/disturbed habitat)
If the taxon cycle model is operating in a clade, we expect: Transition rates from “open” habitats to forest interiors higher than the opposite, because taxa occupying open/disturbed habitats encounter two “evolutionary fates”: (a) extinction or (b) survival in the forest interior. Such “evolutionary fates” may be controlled by biological interactions such as competition and parasitism
If the taxon cycle is not operating: No significant difference in transition rates between habitat preferences

Prediction 4
Observations and/or phylogeny-based inference: Trait-dependent dispersal model
If the taxon cycle model is operating in a clade, we expect: Dispersal occurs entirely or primarily in lineages occupying open/disturbed habitats, such as island coasts, because they are more prone to colonize other archipelagos than taxa restricted only to forest interiors. Habitat preference for forest interior thus has low or zero dispersal
If the taxon cycle is not operating: No significant difference in dispersal rates between lineages with distinct ecologies. Dispersal modulated by neutral processes such as geographical distance and area

Prediction 5
Observations and/or phylogeny-based inference: Timing of range expansion and ecological shift events
If the taxon cycle model is operating in a clade, we expect: At different times across clades
If the taxon cycle is not operating: At approximately the same times across clades (environment-shift mediated; e.g., Pleistocene sea-level fluctuations)

Prediction 6
Observations and/or phylogeny-based inference: Species ranges and ecology
If the taxon cycle model is operating in a clade, we expect: A few widespread taxa with broad ecological niches, potentially within (a) the infandus group; (b) the ruficeps group; and (c) O. saevissimus. Most taxa single-area endemics
If the taxon cycle is not operating: Widespread taxa regardless of ecological preference

Notes

  • These are predictions about what will be inferred from model-based inference of ancestral ecological traits (habitat preferences), ancestral geographical ranges or both (in the case of a joint trait-dependent dispersal and biogeographical model).
  • Prediction 1: Importance of range-expansion dispersal. Stage I range expansion is a population-level process, and Stage II and Stage III range contraction trigger speciation at a macroevolutionary level. Evidence to support this prediction at Stage I should be anagenetic range expansion (i.e., along branches in a phylogenetic tree), and at Stage II and III should be within-area cladogenesis (similar to “subset sympatry” in Figure 1). Given the phylogeny of Odontomachus, model-based inference would support parametric biogeographical models that include widespread ancestral taxa at cladogenesis and dominance of within-area speciation, as opposed to vicariance and jump-dispersal speciation.
  • Prediction 2: Timing of dispersal events. Stage I initiates with expanding ranges from large and diverse areas (i.e., source regions). The total number of dispersal events among Indo-Pacific archipelagos should reveal that expansion out of New Guinea, the assumed primary source region for Indo-Pacific Odontomachus based on morphological affinities (Brown, 1976; Wilson, 1959b), patterns of distribution (Wilson, 1959a, 1961) and previous biogeographical analyses (Larabee et al., 2016), is older than range expansions back into New Guinea from the Pacific region. Notably, this prediction does not contradict young dispersal out of source regions; instead, we acknowledge that dispersal events out of/into source regions continue to the present. The timing of dispersal events on the phylogeny of Odontomachus can be estimated using biogeographical stochastic mapping; when this is done, the ages of all dispersal events out of New Guinea should be significantly older than dispersal into New Guinea.
  • Prediction 3: Rates of transition in habitat preference. The transition rate from open habitat (e.g., littoral and disturbed environments) to undisturbed forest interior should be higher than in the reverse direction, because taxa occupying open and disturbed habitats encounter two evolutionary fates: (a) extinction or (b) habitat shift and survival in the forest interior. Evidence to support this pattern includes most extant species occupying forest interior habitat, while the open habitat state would be estimated at deeper nodes in the phylogeny. A trait-dependent dispersal model that jointly infers biogeography and habitat state transitions would estimate state transition rates from open habitat to forest interior to be higher than from forest interior to open habitat.
  • Prediction 4: Dispersal rates in each habitat. Range expansion by Stage I taxa is linked to their ecological preference for open habitat, while taxa occupying only forest interior have negligible dispersal across archipelagos. Higher macroevolutionary dispersal probabilities are expected for taxa associated with open habitat, and extant taxa occupying open habitat should be geographically widespread (Stage I). Therefore, a trait-dependent dispersal model should accrue higher likelihood than a model where dispersal rates are the same between open vs. forest interior habitats, and the open habitat state should increase dispersal probability among archipelagos.
  • Prediction 5: Timing of dispersal events. Stage I taxa disperse readily due to their preference for open habitats and are thus not dependent on external environmental drivers such as sea-level drop and subsequent increased landmass size and connectivity. Therefore, dispersal events by Stage I taxa should be asynchronous across clades, whereas if dispersal was caused by a major external driver, approximately simultaneous dispersal events should be expected. Given the time-calibrated phylogeny of Odontomachus, ancestral character state shifts to open habitat (Stage I) should be estimated at different times.
  • Prediction 6: Widespread taxa should have broad ecological niches. Stage I taxa with broad geographical ranges belong to lineages having broad ecological preferences, including associations with open and disturbed habitat. Widespread Stage I taxa would belong to clades dominated by single-area endemics having a preference for forest interior. Ancestral state inference should indicate that clades that contain widespread taxa have a high probability for the open habitat state at the crown node.
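
Prediction 3 concerns an asymmetry between two habitat transition rates. As a hedged illustration only, a two-state continuous-time Markov chain has a closed-form transition probability, and the sketch below shows how a higher open-to-forest rate shifts the long-run expectation toward the forest interior state. The function name, argument names and rate values are hypothetical and are not estimates from this study.

```python
import math

def two_state_transition_probs(q_open_to_forest, q_forest_to_open, t):
    """Transition probabilities over elapsed time t for a two-state Markov chain.

    Returns a nested dict probs[start_state][end_state].
    Rates are hypothetical illustration values, not results from the paper.
    """
    total = q_open_to_forest + q_forest_to_open
    if total == 0:
        # No transitions are possible: each state stays where it is.
        return {"open": {"open": 1.0, "forest": 0.0},
                "forest": {"open": 0.0, "forest": 1.0}}
    decay = math.exp(-total * t)
    # Stationary (long-run) frequencies of each state.
    pi_forest = q_open_to_forest / total
    pi_open = q_forest_to_open / total
    return {
        "open": {"open": pi_open + pi_forest * decay,
                 "forest": pi_forest * (1.0 - decay)},
        "forest": {"open": pi_open * (1.0 - decay),
                   "forest": pi_forest + pi_open * decay},
    }

# Hypothetical rates: open -> forest three times faster than the reverse.
probs = two_state_transition_probs(0.3, 0.1, t=5.0)
print(probs["open"]["forest"], probs["forest"]["open"])
```

Under these assumed rates, the long-run probability of the forest interior state approaches 0.3 / (0.3 + 0.1) = 0.75 regardless of the starting state, which is the kind of pattern Prediction 3 describes.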

2 MATERIALS AND METHODS

2.1 The study taxon

The trap-jaw ant genus Odontomachus has recently been the subject of taxonomic reviews (Fisher & Smith, 2008; MacGown, Boudinot, Deyrup, & Sorger, 2014; Satria, Kurushima, Herwina, Yamane, & Eguchi, 2015; Sorger & Zettel, 2011; Yoshimura, Onoyama, & Ogata, 2007) and higher-level molecular systematic studies (Larabee et al., 2016). Odontomachus is a monophyletic genus comprising 68 valid extant species (AntWeb, 2017; Satria et al., 2015; Schmidt & Shattuck, 2014), which are unevenly distributed across continents. The Oriental region (i.e., tropical and subtropical East Asia extending through the Malay Archipelago region west to Wallace's Line) and the Australasian region (i.e., the Malay Archipelago region east of Wallace's Line and Australia) together harbour the largest number of Odontomachus species, with 31 valid taxa. The New World has 27 described species, whereas the Afrotropics-Malagasy and the Asian Palearctic regions have only three and seven species, respectively. The lower species diversity found in the last two regions is not a sampling artefact, given recent research efforts there (e.g., Fisher & Smith, 2008; Janicki, Narula, Ziegler, Guénard, & Economo, 2016; Liu et al., 2015).

2.2 Taxon sampling and molecular data set

In the recent phylogeny of Larabee et al. (2016), the species-rich Malay Archipelago region lacked comprehensive taxonomic sampling, thus hindering a thorough phylogenetic test of the taxon cycle model on these ants. Larabee et al. (2016) studied 43 Odontomachus specimens, each representing one species, and only 17 individuals were from the Indo-Pacific. To overcome this, we conducted extensive sampling in the Indo-Pacific during 2002–2015, with emphasis on Melanesia (Supporting Information Figure S1 and Supporting Information Appendix S1). In addition, we included in our analyses specimens deposited at the Museum of Comparative Zoology (MCZ) at Harvard University, the Smithsonian National Museum of Natural History (USNMENT) and the CSIRO Tropical Ecosystems Research Centre, Darwin, Australia (TERC). We sorted specimens primarily based on morphology, using published taxonomic keys (Brown, 1976; Sorger & Zettel, 2011; Wilson, 1959b) and comparing most ant individuals directly to type collections at MCZ and at the USNM. To further corroborate our morphological identifications and find taxa with substantial genetic variation that may represent cryptic species, molecular-based species determination using a multilocus data set was carried out only for the Indo-Pacific clade (Supporting Information Figures S2–S5). The photographs of the specimen vouchers and associated distribution data are available at the public database Ants of New Guinea (http://www.newguineants.org/).

We followed standard laboratory protocols for genomic DNA isolation, amplification and Sanger sequencing (Clouse et al., 2015; Schmidt, 2013; Schultz & Brady, 2008). We also retrieved publicly available DNA sequences in GenBank of Odontomachus and its sister genus Anochetus. We generated a molecular data set of about 3.8-kb aligned gene sequences, encompassing the protein-coding mitochondrial gene COI, and the nuclear gene fragments CAD (including one intron), EF-1αF1, LWR (including one intron) and wingless, as well as a fragment of the ribosomal gene 28S. The alignment of the two intronic regions was straightforward within Odontomachus and the outgroup taxon Anochetus. Sequencing of both DNA strands was carried out by Macrogen (South Korea), and the editing and alignment of sequences were conducted in Geneious R7. All DNA sequences are available in GenBank and in BOLD (Barcode of Life) under the Ants of Papua New Guinea (ASPNA) project (Supporting Information Appendix S1).

2.3 Phylogenetic analyses

Single-gene data sets and a concatenated data set consisting of 93 taxa having at least three sequenced gene fragments were analysed in MrBayes version 3.2.3 (Ronquist et al., 2012). The best partitioning strategy for the concatenated data set was suggested by PartitionFinder version 1.1.1 (Lanfear, Calcott, Ho, & Guindon, 2012), using 18 data blocks considering each gene's coding position, with parameters set to branchlengths=linked (better likelihood than the unlinked option), model_selection=bic and search=greedy. Phylogenetic analyses in MrBayes were carried out through CIPRES (Miller, Pfeiffer, & Schwartz, 2010) and consisted of two independent runs each for 50 million generations, sampling every 5,000 generations. The nucleotide substitution scheme “mixed + Γ” was set to each partition, which permits the exploration of best-fitting reversible models in the MCMC sampling (Huelsenbeck, Larget, & Alfaro, 2004). The first 25% of sampled parameters were discarded as burn-in. We checked that the final average standard deviation of split frequencies was below 0.01, PSRFs were approaching unity, ESS values were higher than 200, and log probabilities reached a stationary distribution. We summarized the MCMC sampled trees using the 50% majority rule consensus approach.
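
The run settings above fix how many posterior samples remain after burn-in. The following minimal sketch works through that arithmetic; the helper name is hypothetical, and the count ignores the initial state that some programs also write to the sample files.

```python
def retained_samples(generations, sample_every, burnin_fraction, runs=1):
    """Number of MCMC samples kept after discarding burn-in, summed over runs.

    Hypothetical helper for illustration; ignores the sample at generation 0.
    """
    samples_per_run = generations // sample_every
    discarded = int(samples_per_run * burnin_fraction)
    return (samples_per_run - discarded) * runs

# Settings reported in the text: 2 runs, 50 million generations each,
# sampled every 5,000 generations, first 25% discarded as burn-in.
kept = retained_samples(50_000_000, 5_000, 0.25, runs=2)
print(kept)  # 15000
```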

We also ran maximum parsimony (MP) and maximum-likelihood (ML) analyses on the same concatenated data set, with ML partitioned under the best-fit strategy as above. We used RAxML version 8.1.11 (Stamatakis, 2014) to conduct the ML analyses under the “rapid bootstrapping” algorithm with 1,000 iterations. We set the GTRGAMMA model for each partition and we searched for the best-scoring tree using the command “-f a”. We computed an extended majority rule consensus tree from the 1,000 bootstrap trees using the command “-J” in RAxML. The program TNT version 1.5 (Goloboff, Farris, & Nixon, 2008) was used to conduct the MP analysis using “new technology search” algorithms (Goloboff, 1999; Nixon, 1999) with the following parameters: RSS, CSS and XSS sectorial searches, Ratchet with 10 iterations, Drift with 10 cycles and Tree Fusing with 10 rounds. Node support was assessed using standard bootstrapping with 1,000 pseudo-replicates.

2.4 Divergence time estimation

Five extinct species, two within Anochetus and three within Odontomachus, were used as fossil-based calibrations following the guidelines of Parham et al. (2012). All five fossils are well-diagnosed, and their affinities within infrageneric groups were proposed in the light of well-preserved, apomorphic characters (Table 2). The compression fossil of O. paleomyagra was found in the Bílina Mine coal seam, Most Basin, Czechia (Wappler, Dlussky, Engel, Prokop, & Knor, 2014), with an age of about 20 Ma in the Burdigalian Stage, Early Miocene (Knor, Skuhravá, Wappler, & Prokop, 2013; Kvaček et al., 2004). The remaining four extinct species were found as inclusions in Dominican amber (de Andrade, 1994; Baroni Urbani, 1980), but their ages were not accurately correlated with stratigraphic levels. However, a mid-Miocene origin of Dominican amber has been proposed (Grimaldi & Engel, 2005; Iturralde-Vinent, 2001; Iturralde-Vinent & MacPhee, 1996). We used the concatenated multilocus data set with one specimen per species and fossil data to infer time-calibrated species divergences using the fossilized birth–death (FBD) model (Heath, Huelsenbeck, & Stadler, 2014). We enforced monophyly of extant and extinct taxa (the latter with missing nucleotide data in the alignment) for the taxonomic groups shown in Table 2, and we set the following parameters: “diversification rate” (Exponential; mean = 1.0), “sampling proportion” (Beta; alpha = 2.0, beta = 2.0) and “turnover rate” (Uniform; from 0 to 1), whereas the proportion of sampled extant species was set to 0.5 based on the species checklist for Odontomachus (AntWeb, 2017). We applied a soft lognormal prior on the root age, that is, the divergence of Anochetus and Odontomachus, with lower and upper 2.5% quantiles at 20.9 and 67.4 Ma (Larabee et al., 2016) and a mean age of 31 Ma based on a previous fossil-calibrated Ponerinae phylogeny (Schmidt, 2013). Alternatively, we calculated divergence times using a calibration-density approach with minimum clade ages modelled as exponential distributions reflecting the fossil dates (Supporting Information Figure S6).

Table 2. Brief description and taxonomic affinities of the five fossil taxa used in the calibration of Odontomachus phylogeny
Fossil taxon: Anochetus corayi Baroni Urbani, 1980
Taxonomic affinity and synapomorphies: A. mayri lineage; Squamiform excise petiole
Holotype: Gyne; No. Do-834-K-1, State Museum of Natural History, Stuttgart

Fossil taxon: Anochetus exstinctus De Andrade, 1994
Taxonomic affinity and synapomorphies: A. emarginatus lineage; Serially dentate mandibles, petiolar and propodeal spines
Holotype: Worker; No. Do-5479, State Museum of Natural History, Stuttgart

Fossil taxon: Odontomachus paleomyagra Wappler et al., 2014
Taxonomic affinity and synapomorphies: O. assiniensis-rixosus clade; Mandibles constriction
Holotype: Gyne; No. ZD0136, Bílina Mine Enterprises, Czechia

Fossil taxon: Odontomachus pseudobauri De Andrade, 1994
Taxonomic affinity and synapomorphies: O. haematodus species group; Metasternal process
Holotype: Worker; Amber Sample A, Natural History Museum, London

Fossil taxon: Odontomachus spinifer De Andrade, 1994
Taxonomic affinity and synapomorphies: O. haematodus species group; Smooth vertex
Holotype: Worker; No. Do-2215, State Museum of Natural History, Stuttgart

Notes

  • Odontomachus paleomyagra was found in a coal seam, Most Basin locality, Czech Republic, and the remaining fossil taxa were found in Dominican amber inclusions.

The analysis was run in BEAST version 2.3.1 (Bouckaert et al., 2014) using the uncorrelated lognormal relaxed-clock model (Drummond, Ho, Phillips, & Rambaut, 2006). Substitution models as suggested by jModelTest version 2.1.7 (Darriba, Taboada, Doallo, & Posada, 2012) were unlinked across partitions, and MCMC approximations were run four independent times for 200 million generations, sampling every 20,000 generations and discarding the first 25% of sampled parameters in each run. We summarized sampled trees into a maximum clade credibility (MCC) tree, after verifying convergence and mixing of chains (ESS >200). DNA alignment and time-calibrated phylogenies were deposited at TreeBase (Study ID 20232) and Dryad (https://doi.org/10.5061/dryad.5542pr8).

2.5 Probabilistic inference of historical biogeography

Ancestral geographical ranges within the genus Odontomachus were inferred using the R version 3.4.2 package BioGeoBEARS version 0.2.1 (Matzke, 2013a, 2014; R Core Team, 2017), with updates to the code as posted at http://phylo.wikidot.com/biogeobears. Ranges were inferred on the MCC tree from BEAST under the FBD model. We used the following biogeographical base models in our analyses (Matzke, 2013b): (a) Dispersal-Extinction-Cladogenesis (DEC) (Ree, Moore, Webb, & Donoghue, 2005; Ree & Smith, 2008); (b) DIVA-like (a likelihood implementation of the biogeographical processes assumed in DIVA, Ronquist, 1997); and (c) BayArea-like (a likelihood implementation of the biogeographical processes allowed in BayArea, Landis, Matzke, Moore, & Huelsenbeck, 2013). The main differences among the three biogeographical base models are related to the assumed scenarios at cladogenesis: (1) vicariance (allowed only by DEC and DIVA-like models in slightly different ways; Matzke, 2013b); (2) within-area speciation, wherein (2a) both daughter lineages inherit the same ancestral geographical range (allowed in all three base models) or (2b) one of the two daughter lineages inherits only a subset of a wider distributional range of the parent node (allowed only in DEC; see Figure 1). Only the BayArea-like model allows “range copying” (where a species occupying multiple areas gives rise to two daughters copying the same widespread range, i.e., widespread sympatry). In terms of expectations (Prediction 1; Table 1), we do not foresee vicariance playing a major role in the Odontomachus data set, but we do for within-area speciation, either within the same areas (e.g., many speciation events only within the New World or New Guinea), or within a subset of the parental geographical range (e.g., rising of island endemics from ancestral widespread taxa in the Indo-Pacific). All of these models effectively assume that a Yule process generated the phylogeny, that is, a pure-birth process with no lineage extinction (Matzke, 2014). The “extinction” referred to in these models is actually the process of range contraction or local extirpation, which is modelled using the parameter e, but e is typically dramatically underestimated (Matzke, 2014; Ree & Smith, 2008).

To test Prediction 1 using model-based inference, each base model was run four times, with or without null ranges in the state space (i.e., BioGeoBEARS’ default models vs. “*” models sensu Massana, Beaulieu, Matzke, & O'Meara, 2015), and with or without the jump-dispersal/founder-event speciation parameter j (Matzke, 2014). It is expected that such modifications have a strong impact on inference of dispersal events along branches (i.e., by disallowing null ranges, the anagenetic range contraction rate e can have a much higher estimate, thus raising the probability that a daughter lineage has a different range than the parental lineage) and at cladogenetic events (by adding the parameter j). To find the best-supported model given the data, we calculated Akaike weights from the sample-size corrected Akaike information criterion (AICc; Wagenmakers & Farrell, 2004) for all twelve biogeographical models. We carried out two separate historical biogeography analyses, one at the genus level and the second only using the Indo-Pacific clade.
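
The model comparison described above can be sketched with the standard AICc and Akaike weight formulas. This is a minimal illustration, not the BioGeoBEARS code path: the log-likelihoods and the sample size below are hypothetical placeholders, and only the free-parameter counts (two for DEC, d and e; three once j is added) follow from the text.

```python
import math

def aicc(log_likelihood, num_params, sample_size):
    """Sample-size corrected AIC: -2 lnL + 2k + 2k(k + 1) / (n - k - 1)."""
    if sample_size - num_params - 1 <= 0:
        raise ValueError("sample size too small for the AICc correction")
    aic = -2.0 * log_likelihood + 2.0 * num_params
    correction = 2.0 * num_params * (num_params + 1) / (sample_size - num_params - 1)
    return aic + correction

def akaike_weights(scores):
    """Convert a dict of AICc scores into Akaike weights that sum to 1."""
    best = min(scores.values())
    relative = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}

# Hypothetical log-likelihoods for illustration only (not results from the study).
candidates = {"DEC": (-120.0, 2), "DEC+J": (-115.0, 3)}
n = 50  # hypothetical sample size (e.g., number of tips)
scores = {name: aicc(lnl, k, n) for name, (lnl, k) in candidates.items()}
weights = akaike_weights(scores)
for name in candidates:
    print(name, round(scores[name], 2), round(weights[name], 3))
```

Subtracting the best score before exponentiating keeps the computation numerically stable when AICc differences between models are large.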

First, we inferred ancestral ranges using the worldwide phylogeny of Odontomachus to better estimate the ancestral areas at the stem and crown node of the Indo-Pacific clade. Wilson (1961) hypothesized a SE Asian origin of extant Melanesian Odontomachus, but Larabee et al. (2016) proposed a New World origin. However, it remained unclear in Larabee et al. (2016) whether trans-Pacific dispersal directly to New Guinea (e.g., via stepping-stone through archipelagos in Polynesia and eastern Melanesia) or other geographical routes involving SE Asia have shaped the extant diversity in the Indo-Pacific. Resolving this issue is important to better understand the origin of both Indo-Pacific Odontomachus and the hypothetical taxon cycles. We defined the following geographical areas: (a) Oriental (and Palearctic), encompassing Eurasia, continental SE Asia and the Indo-Malayan region west to Wallace's Line; (b) Wallacea, extending from Wallace's Line to Lydekker's Line; (c) Philippines, because this region cannot confidently be placed to either side of Wallace's Line; (d) New Guinea; (e) the Bismarck Archipelago and Solomon Islands; (f) Fiji; (g) Australia; (h) New World (Nearctic and Neotropics); and (i) Afrotropics and Malagasy regions. This analysis was time-stratified at 5, 15, 25 and 35 Ma to assign different dispersal rate multipliers and area connectivity through time (Supporting Information Table S1) following paleogeographic models of Earth and the Indo-Pacific (Hall, 2012, 2013). The fossils Odontomachus pseudobauri and O. spinifer were coded as “New World” taxa, and O. paleomyagra was assigned to the “Oriental-Palearctic” area. Biogeographic Stochastic Mapping (Dupin et al., 2017) was conducted using the best-fit biogeographical base model (i.e., BayArea-like*) to count stochastically-mapped dispersal events out of and into New Guinea after the origin of the Indo-Pacific clade (testing Prediction 2).
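
To make the size of the range state space concrete, the sketch below enumerates every possible geographical range as a combination of the nine areas defined above, with and without the null (empty) range. It is an illustration under stated assumptions: the short area labels and the function are hypothetical, and no cap on the maximum range size is applied because this section does not state one.

```python
from itertools import combinations

# Short labels for the nine areas (a)-(i); the labels themselves are hypothetical.
AREAS = ["Oriental", "Wallacea", "Philippines", "NewGuinea", "BismarckSolomon",
         "Fiji", "Australia", "NewWorld", "AfrotropicsMalagasy"]

def enumerate_ranges(areas, include_null=True, max_range_size=None):
    """List all ranges (tuples of areas), optionally including the null range."""
    limit = len(areas) if max_range_size is None else max_range_size
    smallest = 0 if include_null else 1
    ranges = []
    for size in range(smallest, limit + 1):
        ranges.extend(combinations(areas, size))
    return ranges

with_null = enumerate_ranges(AREAS, include_null=True)
without_null = enumerate_ranges(AREAS, include_null=False)
print(len(with_null), len(without_null))  # 512 511
```

Dropping the null range removes only one state, yet as the text notes it can change the estimate of the range contraction rate e substantially, since lineages can no longer contract into an empty range.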

Second, we used a trait-dependent dispersal model (Klaus & Matzke, 2018; for a simulation test of the model, see the section “Supplementary analyses” in Supporting Information) to allow the influence of a discrete ecological trait on both anagenetic and cladogenetic dispersal. This was done by assigning rate multipliers to dispersal depending on the ecological state of a lineage and by allowing transitions between ecological states. For this exercise, we used only the Indo-Pacific clade. The code for the BioGeoBEARS trait-dependent model is available by sourcing the file http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_traits_v1.R after loading base BioGeoBEARS. Habitat occurrence for each Indo-Pacific clade species was compiled from the literature (Brown, 1976; Sarnat & Economo, 2012; Sorger & Zettel, 2011; Wilson, 1959a, 1959b, 1961), public databases (AntWeb, 2017) and our own field records (136 ant colonies, each assigned to a habitat type) (Supporting Information Table S2). We followed the methodology of recent studies investigating the taxon cycle at a macroevolutionary scale (Economo et al., 2015; Sukumaran et al., 2016) to categorize habitat associations into two states: “forest interior” was assigned to those taxa occurring only in primary interior forests (excluding coastal vegetation), and “open habitat” was assigned to taxa occurring in highly degraded (secondary) forests and open edge habitats such as savannah and littoral areas. We categorized any species occurring in both primary forest interior and open habitat as the “open habitat” state. Creating a three-state character by adding a state for “both forest and open habitat” is conceivable, but it would create even more free parameters to estimate on a data set of limited size.
Three extra parameters were added in a trait-based model variant, namely t12 and t21 (forward and backward transition rates between state 1, “open habitat,” and state 2, “forest interior”), and m2, a multiplier on dispersal probability for the “forest interior” state. The multiplier on dispersal probability for the “open habitat” state, m1, was fixed to 1.0, whereas m2 was allowed to range between 0 and 10.0. We conducted extra analyses with m2 fixed to 1.0, which allowed us to estimate the trait transition rates independently from the biogeography parameters and provided null models to compare with the trait-dependent models. Estimates of transition rates were used to test Prediction 3 and estimates of the multiplier on dispersal probability to test Prediction 4.
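The way the multipliers m1 and m2 and the transition rates t12 and t21 enter a combined range-and-trait model can be illustrated with a toy example. The following is a minimal sketch, not the BioGeoBEARS implementation: it assumes only two hypothetical areas (A and B), excludes the null range, ignores cladogenesis, and the function name `build_rate_matrix` is invented here. The parameter values are the rounded estimates reported for the DEC + 2rates + m2 model in Table 5.

```python
from itertools import product


def build_rate_matrix(d, e, t12, t21, m1=1.0, m2=1.0):
    """Toy anagenetic rate matrix over (range, trait) states.

    Trait 1 = open habitat, trait 2 = forest interior.
    Range expansion occurs at rate d times the trait's multiplier,
    range contraction at rate e, and trait switches at t12 / t21.
    """
    ranges = ["A", "B", "AB"]  # hypothetical two-area system, no null range
    multiplier = {1: m1, 2: m2}
    states = list(product(ranges, (1, 2)))
    index = {s: i for i, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for (rng, trait), i in index.items():
        if rng == "AB":
            # Losing one of the two occupied areas (range contraction).
            Q[i][index[("A", trait)]] = e
            Q[i][index[("B", trait)]] = e
        else:
            # Gaining the other area; dispersal is scaled by the trait state.
            Q[i][index[("AB", trait)]] = d * multiplier[trait]
        # Trait change while the range stays the same.
        if trait == 1:
            Q[i][index[(rng, 2)]] = t12
        else:
            Q[i][index[(rng, 1)]] = t21
        # Diagonal makes each row sum to zero.
        Q[i][i] = -sum(Q[i])
    return states, Q


states, Q = build_rate_matrix(d=0.016, e=0.0, t12=0.072, t21=0.0, m2=0.0)
idx = {s: i for i, s in enumerate(states)}
print("open habitat, A -> AB:   ", Q[idx[("A", 1)]][idx[("AB", 1)]])
print("forest interior, A -> AB:", Q[idx[("A", 2)]][idx[("AB", 2)]])
```

With m2 set to 0, the forest-interior copies of each range cannot expand, which is the sense in which a multiplier near zero restricts dispersal in this simplified setting.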

2.6 Ancestral state estimation

We used the function ace in the R package ape v4.1 (Paradis, Claude, & Strimmer, 2004) and the functions make.simmap and describe.simmap in phytools v0.6-20 (Revell, 2012) to estimate ancestral discrete character states (“forest interior” and “open habitat”) on the Indo-Pacific Odontomachus clade. Maximum-likelihood estimation was conducted under two different models: Equal-rate and All-rates-different. We used two approaches, a continuous-time Markov chain and stochastic character mapping with 1,000 simulations. The ancestral state estimates were used to test Prediction 6, wherein clades having expanded ecological niches (i.e., “open habitat”) are the most geographically widespread. Such clades are hypothesized to be: (a) the infandus group; (b) the ruficeps group; and (c) O. saevissimus. The infandus group consists of taxa at Stage I (O. malignus) and Stage III (single-island endemics: O. floresensis, O. angulatus, O. banksi, O. infandus). The ruficeps group consists of taxa at Stage I or II (O. cephalotes, O. ODON015, O. ODON019) and Stage III (endemics to Australia: O. turneri, O. ruficeps). Odontomachus saevissimus is in Stage I, and we recognize a new species, sister to O. saevissimus, which is in Stage III (the Halmahera-endemic O. MOLU001). The classification of taxa into Stages of the taxon cycle follows Wilson (1959a, 1961).

3 RESULTS

The characteristics of our molecular data set are presented in Table 3. Overall, the new time-calibrated molecular phylogeny of the genus Odontomachus includes 161 ant specimens classified within 36 described species and 16 lineages that most likely represent unrecognized species (Figure 3; see Supporting Information Figure S5 for a graphical summary of molecular species delimitation and Supporting Information Figure S11 for an updated species checklist). This includes six times more Indo-Pacific specimens and the identification of 18 more species than in the most recent phylogenetic work on this group (Larabee et al., 2016). We found evidence to support each of the six taxon cycle predictions outlined in Table 1, suggesting that non-neutral processes have been important in determining the extant species diversity and distribution in the Indo-Pacific.

Table 3. Molecular composition of the data matrix used in the time-calibrated divergence analyses
Genes     Specimens (%)  Length (bp)  Variable sites (Pis)  Missing data (%)  GC content (%)
28S       46 (89)        889          113 (48)              27.0              63.3
CAD       39 (75)        844          212 (130)             41.4              48.6
EF-1αF1   29 (56)        359          47 (23)               50.4              62.6
COI       46 (89)        659          299 (214)             17.9              24.3
LWR       46 (89)        583          184 (108)             16.5              52.3
Wingless  50 (96)        421          116 (80)              6.3               61.9
Total     52             3,755        971 (603)             26.9              50.8

Notes

  • Six loci and 52 specimens representing 52 species were analysed for testing the phylogenetic predictions of the taxon cycle hypothesis. (Pis) = number of parsimony-informative sites.

3.1 Potential biases in phylogenetic inference

Comparing phylogenies inferred with different data and methods allowed us to determine the robustness of our time-calibrated phylogeny and rule out potential conflicts during the estimation of tree topology and branch lengths. First, the single-gene analyses showed no significant discordance in tree topologies as compared to the concatenated multilocus data set. Second, similar tree topologies were inferred using different phylogenetic approaches on the same concatenated data set (Supporting Information Figure S2), and the node support tended to agree among analyses (Figure 2). Critical nodes having low support values were identified in the crown rixosus group (Oriental region) and haematodus group (New World region). These may indicate either a lack of phylogenetic signal in our data set or poor taxonomic sampling in those regions. Either way, the Indo-Pacific clade, the group we are focusing on in this study, showed relatively good support at critical nodes. Third, the tree topology, node support and divergence time estimates were not significantly different between two fundamentally distinct time-calibration approaches, that is, the FBD model (Figure 3) and the calibration-density approach (Supporting Information Figure S6). Moreover, the estimated root age was not driven by the secondary calibration, as evidenced by a separate analysis carried out only with priors (Supporting Information Figure S7). We thus did not detect any of the potential biases from conflicting tree topology and branch length estimations in our time-calibrated phylogeny.

Figure 2. Phylogeny of Odontomachus plotted on the 50% majority-rule consensus tree from MrBayes using the concatenated multilocus dataset. Support values are depicted as coloured stars in the following order: MrBayes posterior probability (PP), RAxML maximum-likelihood bootstraps (BP) and TNT bootstraps (BP). Black stars represent high support (PP >0.95 and BP >90), blue stars represent moderate support (PP 0.90–0.94 and BP 70–89), and red stars indicate low support (PP <0.89 and BP <69). Lower support is indicated with dashes. The Indo-Pacific species groups defined by Brown (1976) based on morphology were recovered with moderate to high support, except for the infandus group, which originally included the papuanus group.

Figure 3. Maximum Clade Credibility tree inferred in BEAST using the fossilized birth–death model. Posterior probability colour code follows Figure 2. The 95% dating confidence intervals are depicted as blue horizontal bars. Extant distributions of species are depicted as coloured squares following the legend and map in the figure. Inferred ancestral ranges based on the coarse-scale biogeographical analysis (BayArea-like* model) are displayed as colour blocks on the nodes, with probabilities of the most likely ranges as black in the associated pie chart. Panels a: Odontomachus rixosus, b: O. simillimus, c: O. tyrannicus, d: O. malignus; scale bar in red represents 1 mm.

3.2 Historical biogeography of Indo-Pacific Odontomachus: Predictions 1 & 2

To test Predictions 1 and 2 on the ancestral ranges of Odontomachus, we used maximum likelihood to infer ancestral ranges under different biogeographical models, each making different assumptions about vicariance, within-area speciation and dispersal. The BayArea-like* model had the best fit to the worldwide phylogeny of Odontomachus (AICc weight = 0.52; Table 4). This was expected because there were a significant number of within-area cladogenetic events when the defined geographical areas were large and coarse (e.g., the New World; see Figure 3 and Supporting Information Figure S8). On the other hand, when using only the Indo-Pacific clade and a biogeographical model that incorporates habitat-dependent dispersal, the DEC + t12 + t21 + m2 model was preferred (AICc weight = 0.66; Table 5). For Prediction 1, we found that (a) neither vicariance-heavy models nor models including founder-event speciation were the best fit (i.e., DIVA-like models and models including parameter j had lower AICc weights than the best-fit models), and (b) anagenetic range expansions lasted long enough on the phylogeny to be detectable by the biogeographical models. For Prediction 2, based on biogeographical histories sampled with Biogeographic Stochastic Mapping (Supporting Information Appendix S2), we showed that dispersal events into New Guinea by taxa that have passed through the taxon cycle (i.e., habitat preference for open environments) are significantly younger than dispersal events out of New Guinea (t test p-value = 5.6e–8).

Table 4. BioGeoBEARS analyses of biogeography alone (no traits included), at a coarse geographical scale (across multiple continents)
Biogeographical models  d      e      j      loglik    AICc     df  Akaike weights
DEC                     0.022  0.006  0.000  −116.640  237.547  2   0.00
DEC + J                 0.021  0.005  0.018  −116.170  238.885  3   0.00
DEC*                    0.050  0.305  0.000  −107.620  219.507  2   0.15
DEC* + J                0.051  0.305  0.000  −107.620  221.785  3   0.05
BayArea-like            0.025  0.039  0.000  −137.660  279.587  2   0.00
BayArea-like + J        0.019  0.012  0.102  −125.990  258.525  3   0.00
BayArea-like*           0.044  0.268  0.000  −106.390  217.047  2   0.52
BayArea-like* + J       0.045  0.286  0.000  −106.370  219.285  3   0.17
DIVA-like               0.030  0.010  0.000  −126.770  257.807  2   0.00
DIVA-like + J           0.023  0.008  0.069  −123.870  254.285  3   0.00
DIVA-like*              0.053  0.360  0.000  −108.250  220.767  2   0.08
DIVA-like* + J          0.052  0.339  0.000  −108.240  223.025  3   0.03

Notes

  • Twelve models were evaluated, including the DEC, DIVA-like and BayArea-like base models, either allowing or not (* models) null ranges, and either including or not an extra parameter j. Estimated parameters: d: anagenetic range expansion; e: anagenetic range contraction; j: cladogenetic founder-event. AICc: Sample-size corrected Akaike information criterion, df: degrees of freedom. The best-fit biogeographical model to the time-calibrated tree of the genus Odontomachus was the BayArea-like* (i.e., disallowing null ranges). The shading indicates the most credible models, discussed in the main text, as suggested by Akaike weights.
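The Akaike weights in Table 4 can be recomputed directly from its AICc column. The sketch below is a plain-Python illustration rather than the code used in the study; the function name `akaike_weights` and the dictionary layout are choices made here, while the AICc values are copied from Table 4.

```python
import math


def akaike_weights(aicc_by_model):
    """Convert AICc values into Akaike weights that sum to 1."""
    best = min(aicc_by_model.values())
    # Relative likelihood of each model given its AICc difference to the best.
    rel = {m: math.exp(-0.5 * (v - best)) for m, v in aicc_by_model.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


# AICc values as reported in Table 4.
table4_aicc = {
    "DEC": 237.547, "DEC + J": 238.885,
    "DEC*": 219.507, "DEC* + J": 221.785,
    "BayArea-like": 279.587, "BayArea-like + J": 258.525,
    "BayArea-like*": 217.047, "BayArea-like* + J": 219.285,
    "DIVA-like": 257.807, "DIVA-like + J": 254.285,
    "DIVA-like*": 220.767, "DIVA-like* + J": 223.025,
}

weights = akaike_weights(table4_aicc)
for model, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{model:18s} {w:.2f}")
```

Because the weights depend only on AICc differences, the same function applies unchanged to the AICc column of any other model set.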
Table 5. BioGeoBEARS analyses of biogeography + trait data (comparing models with trait-independent dispersal and trait-dependent dispersal), at a fine geographical scale (the Indo-Pacific clade, 26 extant tips)
Biogeographical models            d      e      j      t12    t21    m2     loglik   AICc     df  Akaike weights
DEC + 2rates, m2 = 1              0.007  0.000  0.000  0.074  0.000  1.000  −67.167  144.238  4   0.02
DEC + J + 2rates, m2 = 1          0.007  0.000  0.006  0.074  0.000  1.000  −66.547  146.094  5   0.01
DEC + 2rates + m2                 0.016  0.000  0.000  0.072  0.000  0.000  −61.883  136.767  5   0.66
DEC + J + 2rates + m2, run1       0.014  0.000  0.035  0.074  0.000  0.000  −60.941  138.302  6   0.31
DIVA + 2rates, m2 = 1             0.012  0.000  0.000  0.074  0.000  1.000  −75.615  161.136  4   0.00
DIVA + J + 2rates, m2 = 1         0.009  0.000  0.021  0.074  0.000  1.000  −73.756  160.513  5   0.00
DIVA + 2rates + m2                0.024  0.000  0.000  0.072  0.000  0.000  −66.810  146.620  5   0.00
DIVA + J + 2rates + m2, run1      0.017  0.000  0.069  0.073  0.000  0.000  −64.327  145.076  6   0.01
BAYAREA + 2rates, m2 = 1          0.006  0.076  0.000  0.074  0.000  1.000  −79.442  168.789  4   0.00
BAYAREA + J + 2rates, m2 = 1      0.007  0.000  0.038  0.074  0.000  1.000  −78.110  169.220  5   0.00
BAYAREA + 2rates + m2             0.013  0.075  0.000  0.072  0.000  0.000  −75.434  163.869  5   0.00
BAYAREA + J + 2rates + m2, run2   0.014  0.000  0.131  0.074  0.000  0.000  −68.018  152.456  6   0.00

Notes

  • Twelve models were evaluated, using the DEC, DIVA-like and BayArea-like base models either including or not an extra parameter j. Estimated parameters: d: anagenetic range expansion; e: anagenetic range contraction; j: cladogenetic founder-event; t12: transition rate from open habitat (open/disturbed environments) to forest interior; t21: transition rate from forest interior to open habitat; m1: multiplier on dispersal rate when in “open habitat” state was fixed to 1.0 in all models; m2: multiplier on dispersal rate when in “forest interior” state. AICc: Sample-size corrected Akaike information criterion, df: degrees of freedom (number of free parameters). The best-fit biogeographical model was DEC + t12 + t21 + m2. The shading indicates the most credible models, discussed in the main text, as suggested by Akaike weights.

3.3 Trait-dependent dispersal of Indo-Pacific Odontomachus: Predictions 3 & 4

To test Predictions 3 and 4, we used a trait-dependent dispersal model that allows quantitative estimation of the influence of trait states on macroevolutionary dispersal rates in a single analysis. The best-fit model was the trait-dependent dispersal model DEC + t12 + t21 + m2, which allows different transition rates between preference for open habitat and forest interior, and a rate multiplier on dispersal influenced by the “forest interior” state (AICc weight = 0.66; Table 5 and Supporting Information Figure S9). Related to Prediction 3, we showed that under the best-fit model the transition rate from open habitat to forest interior (t12 = 0.07) is at least two orders of magnitude higher than the transition rate from forest interior to open environments (t21 = 0.0001) (Table 5). For Prediction 4, we showed that there is significant support for m2 being different from m1 = 1.0, and in fact, the estimate of m2 was close to 0. This means that habitat preference for forest interior has a dramatic, restrictive influence on macroevolutionary dispersal.
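To give a feel for what the asymmetry between t12 and t21 implies, the sketch below evaluates the closed-form transition probabilities of a two-state continuous-time Markov chain. It is illustrative only: it assumes the rates are expressed per million years (the tree is time-calibrated in Ma), it treats the trait in isolation from geography, and the function name `two_state_probs` is invented here.

```python
import math


def two_state_probs(t12, t21, time):
    """Transition probability matrix of a two-state Markov chain.

    Row/column 0 = open habitat (state 1), 1 = forest interior (state 2).
    Uses the closed form P12(t) = t12 / (t12 + t21) * (1 - exp(-(t12 + t21) t)).
    """
    total = t12 + t21
    decay = math.exp(-total * time)
    p12 = t12 / total * (1.0 - decay)
    p21 = t21 / total * (1.0 - decay)
    return [[1.0 - p12, p12], [p21, 1.0 - p21]]


T12, T21 = 0.07, 0.0001  # point estimates quoted in the text
for elapsed in (1.0, 5.0, 10.0):
    P = two_state_probs(T12, T21, elapsed)
    print(f"{elapsed:>4} Myr  open->forest {P[0][1]:.3f}  forest->open {P[1][0]:.5f}")
```

Under these assumptions a lineage starting in open habitat has a substantial chance of having shifted to forest interior within a few million years, whereas the reverse shift stays very unlikely over the same interval.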

3.4 Ancestral state estimation of habitat preference: Predictions 5 & 6

To test Predictions 5 and 6 on potential correlations between ecological states related to habitat preference and expanding clades, and their ages, we estimated the ancestral character states using maximum likelihood and stochastic character mapping approaches. When modelling the habitat trait alone, the model assuming equal transition rates could not be rejected (p-value >0.05; the all-rates-different model did not reject the simpler equal-rates model), but the ancestral state estimates overall agree regardless of the chosen model (Figure 4 and Supporting Information Figure S10). For Prediction 5, we found that range-expanding taxa as defined here based on Wilson (1959a, 1961) have significant probabilities for a character state shift towards open habitats (Figure 4), and that Stage I taxa were detected throughout the late Miocene (crown ruficeps group), Pliocene (O. saevissimus, crown infandus group) and Pleistocene (O. malignus). For Prediction 6, we found that taxa with habitat preference for open environments, specifically the crown node of infandus and ruficeps groups, can potentially give rise to species that further re-initiate the taxon cycle (such as O. malignus, O. ODON019) and island endemics such as O. floresensis (Flores), O. angulatus (Fiji), O. banksi and O. infandus (the Philippines).
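The text does not state which test produced the p-value for the equal-rates versus all-rates-different comparison. One common choice for nested models is a likelihood-ratio test, sketched below under that assumption. For a binary trait the two models differ by one free rate, so the statistic is compared with a chi-square distribution with one degree of freedom. The log-likelihood values are hypothetical placeholders, not results from the study, and `lrt_pvalue_1df` is a name invented here.

```python
import math


def lrt_pvalue_1df(loglik_simple, loglik_complex):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns the test statistic and the chi-square (1 df) upper-tail
    probability, which equals erfc(sqrt(stat / 2)).
    """
    stat = max(0.0, 2.0 * (loglik_complex - loglik_simple))
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods for the equal-rates and all-rates-different fits.
stat, p_value = lrt_pvalue_1df(loglik_simple=-12.4, loglik_complex=-11.6)
print(f"LRT statistic = {stat:.2f}, p = {p_value:.3f}")
if p_value > 0.05:
    print("The simpler equal-rates model is not rejected at the 5% level.")
```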

Figure 4. Ancestral area inference on the Indo-Pacific clade using a trait-dependent dispersal model implemented in BioGeoBEARS. The tree topology, divergence times, and posterior probabilities are identical to those of the phylogeny shown in Figure 3. Extant distributions of species are depicted as coloured square(s). Inferred ancestral ranges based on the Dispersal-Extinction-Cladogenesis + t12 + t21 + m2 model are displayed on main nodes. The probability of the most probable inferred range is indicated by black in the associated pie chart, and habitat preference is depicted as coloured circles. Ancestral habitat preferences were estimated in phytools under the continuous-time Markov chain and stochastic character mapping with 1,000 simulations. Inferred nodes with the open habitat state (open/disturbed environments; “A.S. O/D”) are shown as coloured arrows in blue (p > 0.5) and light-blue (p > 0.25).

4 DISCUSSION

By using a multilocus phylogeny of the ant genus Odontomachus, we found support for patterns consistent with six phylogenetic predictions made by the taxon cycle model (Table 1). Comparisons among biogeographical models suggest that range expansions and broad ancestral geographical distributions lasted long enough to be captured by the biogeographical models. For example, the estimation of within-area “subset” speciation (parameter s in BioGeoBEARS; see Figure 1) better fit the predicted pattern of widespread ancestral taxa. Moreover, trait-dependent dispersal played an important role in the biogeography of Odontomachus in the Indo-Pacific. The habitat-dependent dispersal model estimated transition rates from open habitats to forest interior higher than from forest interior to open habitats. Biological interactions, such as competition (Wilson, 1961) and parasitism (Ricklefs & Cox, 1972), are hypothesized to drive such a pattern and explain the transition from Stage I to Stage II and III and the rise of single-island endemics. Furthermore, lineages having preference for forest interior experience drastically reduced probabilities of macroevolutionary dispersal compared to lineages having preference for open habitats (Table 5). Overall, our results provide evidence of the important role of non-neutral processes, such as trait-modulated dispersal, on the biogeography of this clade.

4.1 Dispersal events into the Indo-Pacific

Melanesian Odontomachus and the hypothetical taxon cycles in the Indo-Pacific originated from a New World lineage and not from SE Asian rainforest ancestors as previously proposed (Wilson, 1959a, 1961). The initial dispersal event into Melanesia took place in the early Miocene, most likely as a direct long-distance dispersal event across the Pacific Ocean (see Biogeographic Stochastic Mapping in Supporting Information Appendix S2). Dispersal through Beringia during the mid-Miocene warming followed by extinction of high-latitude taxa cannot be ruled out though, because the biogeographical models used here neglect the process of lineage extinction. However, the early-Pliocene origin of the species O. simillimus (New World haematodus group), currently found in littoral and degraded environments across the Indo-Pacific, possibly reinforces the hypothesis that sporadic trans-Pacific dispersal has contributed to the present-day assemblage of Melanesian fauna. A Gondwanan vicariant origin of the Indo-Pacific and New World clades, which would have left a similar biogeographical pattern, is ruled out by evidence from both fossil (Barden, 2017) and molecular divergence times (see also Moreau & Bell, 2013; Schmidt, 2013). Dispersal from South America to Australia via Antarctica (see Boudinot, Probst, Brandão, Feitosa, & Ward, 2016) is also ruled out given our biogeographical inferences; Australian Odontomachus derive from Melanesian lineages and not the other way around, as would be expected if the ants had arrived first in Australia (Figure 3).

There is a growing body of evidence supporting trans-Pacific long-distance dispersal in other insect taxa (e.g., Birch & Keeley, 2013; Gillespie et al., 2012; Keppel, Lowe, & Possingham, 2009; Michalak, Zhang, & Renner, 2010). Although there are no estimates of the flight dispersal capabilities of Odontomachus, indirect east-to-west dispersal by means of rafts, wood or debris may have been possible along the South Equatorial Current (Gillespie et al., 2012). Human-mediated dispersal from the New World to the Pacific islands is only likely in the case of Odontomachus ruginodis which we report for the first time in Guam and in New Caledonia (Supporting Information Appendix S1) and given their small genetic divergences among populations (Figure 2). That O. ruginodis has adapted to urban areas, including docks in the Nearctic region (MacGown et al., 2014), might explain its range expansion aided by commerce. Overall, given the global biogeographical history of the genus Odontomachus, we hypothesize that trap-jaw ants in general have had the ability to disperse across remote archipelagos, even along oceanic basins, throughout their evolutionary history. However, despite their high dispersal ability, we infer that only taxa exploiting resources in open and disturbed habitats have been successful in colonizing remote areas. This finding departs from neutral expectations where the observed species distributions are influenced mostly by geography (e.g., distance and area).

4.2 The taxon cycle disentangled

The species O. simillimus has been considered the prototypical Stage I species, initially expanding from New Guinea towards the Pacific (Wilson, 1961). However, we found a New World origin for this species, and, given the large phylogenetic divergence between it and closely related taxa in the New World, this odd pattern was not likely mediated by human activity. Population genetic data also suggest that early-divergent populations of O. simillimus are found in Fiji, suggesting a westward colonization route of the species in Melanesia (Janda, Zima, Borovanska, & Matos-Maraví, 2014). Although O. simillimus did not originate in New Guinea as previously hypothesized (Wilson, 1961), other aspects of its biology are consistent with its Stage I status, including wide distribution across biogeographical barriers linked to adaptations to occupy broad habitat types (Economo et al., 2015). The species occurs in disturbed habitats, gardens, forest edges and coastal areas in New Guinea, but in undisturbed primary rainforest on small to medium sized Indo-Pacific islands (e.g., Guadalcanal) (Wilson, 1959a, 1959b; Brown, 1976; M. Janda & D.M. Sorger pers. obs.). Notably, O. simillimus is not established in Madagascar, perhaps due to the presence of the ecologically similar O. troglodytes, also a member of the haematodus group (Fisher & Smith, 2008; Larabee et al., 2016). This is in accordance with Wilson's replacement hypothesis, whereby Stage I taxa might also colonize large source islands where no other ecologically similar species exists.

Contrary to Wilson's view of SE Asia and New Guinea as the main sources of Indo-Pacific Stage I taxa, our results suggest that the New World has also been a source region of Indo-Pacific taxa. This pattern suggests a more important role of eastern Melanesia, namely Fiji and the Solomon Islands, in macroevolutionary dispersal of Stage I taxa in the Indo-Pacific. These areas might have acted as a hub area for New World taxa entering the Indo-Pacific region; a hypothesis that has been put forward in ants (Clouse et al., 2015) and other arthropod groups (Clouse et al., 2017; Sharma & Giribet, 2012). However, our interpretation does not challenge the importance of SE Asia as a source region for Indo-Pacific taxa. Instead, we propose that an eastern gateway to the Indo-Pacific since the past 15 Ma has also shaped the region's biodiversity.

4.3 Neutral and non-neutral models fitting the biogeography of Odontomachus

Our coarse- and fine-scale biogeographical analyses were best fit by different base models, BayArea-like* and DEC + t12 + t21 + m2. There is no contradiction between these results as these analyses were done with different phylogenies, and at different temporal and geographical scales. We expect that the finer-scale analysis gives a better approximation to the patterns predicted by the taxon cycle model, because it includes trait-dependent dispersal parameters, and its focus on the Indo-Pacific region allowed the modelling of range expansion between geographical areas. Unlike many previous studies of island clades, our statistical model comparison did not indicate improved fit for +j variants. Several observed and inferred ancestral ranges in this study are widespread (i.e., occupying multiple areas) suggesting that anagenetic range-expansion dispersal is common in this group. As predicted by the taxon cycle, Odontomachus Stage I taxa have been widespread across archipelagos for a considerable time, and thus, the range-expansion process (modelled by DEC's d parameter) is detectable by biogeographical models. Furthermore, the DEC model allows for “within-area subset speciation” (see Figure 1), which is consistent with the taxon cycle, in that: (a) speciation begins with peripheral population differentiation of a widespread species, therefore the Stage I in a phylogeny would be represented by ancestral nodes (common ancestors) having widespread distributions; and (b) after a cladogenetic event (i.e., the “instantaneous” speciation on a phylogeny), one of the two daughter lineages would inherit the widespread range of the parent node, and the second daughter lineage would inherit a peripheral area, thus entering the process as a single-island endemic.

Under a neutral scenario, geographical distances and stochastic dispersal would have shaped the extant distribution of Odontomachus. However, we found little support for the role of geographical distance as a primary regulator of Odontomachus macroevolutionary dispersal and distribution. Moreover, we found a strong distributional delimitation at Wallace's Line, which is in partial agreement with the hypothesis of biotic interactions (e.g., competition among closely related taxa) influencing geographical distribution (Wilson, 1961); the Oriental rixosus group dominates on the western side of Wallace's Line, whereas the Indo-Pacific clade dominates on the eastern side (Satria et al., 2015; Sorger & Zettel, 2011). In other insects, however, it has been reported that Wallace's Line has been highly permeable, given the strong dispersal abilities of winged animals (e.g., Balke et al., 2009; Condamine et al., 2013; Matos-Maraví et al., 2018; Müller, Matos-Maraví, & Beheregaray, 2013; Tänzler, Toussaint, Suhardjono, Balke, & Riedel, 2014), and perhaps due to the continual turnover of species (i.e., immigration rate minus extinction rate) across geographically close islands, as expected by equilibrium theory (Gillespie & Roderick, 2002). For Odontomachus, and consistent with Wilson's narrative, we suggest that ecology (e.g., habitat preference and interspecific interactions such as competition) has been stronger at delimiting faunas and influencing dispersal than geography alone at Wallace's Line. In fact, the only two observed Odontomachus species that have spread and established across the c. 35 km distance of Wallace's Line, O. simillimus (New World origin) and O. malignus (Melanesian origin), were possibly aided by their ecological preference for open and disturbed habitats.

Furthermore, under neutral assumptions, we expect that species and populations are governed by the same evolutionary and migration rates (Missa, Dytham, & Morlon, 2016). However, the geographical spread of Odontomachus species has likely been determined by their occupancy of open and disturbed habitats, as shown in the trait-dependent dispersal analyses and in studies using other Indo-Pacific ants (Economo & Sarnat, 2012; Economo et al., 2015; Janda et al., 2016; Matos-Maraví et al., 2018). As an exemplary case in Odontomachus, the crown node of the infandus group was likely at Stage I in the late Miocene given the ancestral state inference of habitat preferences, and was followed by range contraction and speciation giving rise to single-island endemics with narrower habitat preferences in Fiji (O. angulatus) (Sarnat & Economo, 2012), Flores (O. floresensis) and in the Philippines (Sorger & Zettel, 2011). Recent adaptation to exploit resources in coastal habitats occurred in the late Pliocene/early Pleistocene, which favoured the entry of O. malignus into Stage I (Brown, 1976; Olsen, 2009; Wilson, 1959b). Such a state transition likely favoured the later expansion of the species across the entire Indo-Pacific, colonizing New Guinea wherein O. malignus only occupies intertidal zones. This biogeographical pattern was expected under the taxon cycle model, where younger colonizers of “source areas” are necessarily in Stage I. Although this pattern somewhat resembles the concept of priority effects, whereby the ability of species to colonize an island is influenced by preceding colonizers, the priority effect is considered to be stochastic at a macroevolutionary level. This would result in disparate diversity dynamics over time and potentially cause island communities to deviate from equilibrium (Lim & Marshall, 2017).
But as observed in Odontomachus, the role of priority effects at the macroevolutionary scale may be stronger on species with habitat preference for forest interior and weaker for geographically expanding taxa occupying open and disturbed habitats.

However, it is still premature to draw generalizations about the importance of the taxon cycles for insular biogeography. Alternatives to the taxon cycle have been proposed (Liebherr & Hajek, 1990; Losos, 1992; Pregill & Olson, 1981), but its rejection often relies on qualitative approaches. Future studies, including meta-analyses using other animal and plant insular clades, need to address additional ecological predictions, such as species abundances of island endemics and trait-dependent speciation. For example, species abundances might be a predictor of nonuniform dispersal among species at large regional scales (Shmida & Wilson, 1985), which can be modelled by recent quantitative methods (Rosindell & Phillimore, 2011). However, the most critical limitation currently is the lack of records from field expeditions/experiments. In this study, for example, we were unable to recover data on species abundances of Indo-Pacific lineages from either literature or databases.

Our proposed framework is a useful attempt to study the relative importance of non-neutral and neutral processes in a single biogeographical analysis, but it has several limitations. First, the ancestral state inference for geography and trait evolution does not take into account missing lineages, including extinct and unsampled described species. Extinction erases a clade's history (Marshall, 2017), and assuming a Yule (pure-birth) process may mislead if extinction is highly nonrandom by trait or geographical range. Nevertheless, there is no evidence of mass extinction events in Odontomachus, our taxonomic sampling is comprehensive (c. 70% of Indo-Pacific species in the phylogeny), and simulation tests indicate that moderate amounts of missing lineages do not severely bias inference (Matzke, 2014).

Second, combining a trait transition matrix with the standard biogeographical transition matrix creates computational challenges. Adding a single binary trait doubles the number of states and quadruples the size of the rate matrix, substantially slowing the calculation of the likelihood, a calculation that must be repeated many times in an ML analysis. Adding a 3-state trait would increase the size of the transition matrix by nine times and add six transition-rate parameters to the model. Without much larger data sets, this impedes the evaluation of other habitat preferences. For example, in our study we did not explicitly model the influence of the generalist state on dispersal (i.e., creating a third state for taxa occupying both forest interiors and open habitats), nor have we attempted to measure the effects of littoral habitat vs. noncoastal open/disturbed environments, despite the fact that coastal taxa might have even greater probabilities of dispersal (Wilson, 1961).
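The scaling described above can be checked with a few lines of arithmetic. The sketch below counts geographical ranges for a hypothetical five-area system (the number of areas and the helper names `n_ranges` and `model_size` are assumptions made for illustration, not values or functions from the study) and then multiplies by the number of trait states.

```python
from math import comb


def n_ranges(n_areas, max_range_size=None, include_null=True):
    """Number of geographical ranges (subsets of areas) in the state space."""
    top = n_areas if max_range_size is None else min(max_range_size, n_areas)
    total = sum(comb(n_areas, k) for k in range(1, top + 1))
    return total + (1 if include_null else 0)


def model_size(n_areas, n_trait_states=1, **range_options):
    """States, rate-matrix cells and trait transition rates of a combined model."""
    states = n_ranges(n_areas, **range_options) * n_trait_states
    return {
        "states": states,
        "matrix_cells": states ** 2,
        # Every ordered pair of distinct trait states gets its own rate.
        "trait_rates": n_trait_states * (n_trait_states - 1),
    }


base = model_size(5)                       # geography only
binary = model_size(5, n_trait_states=2)   # plus a two-state trait
ternary = model_size(5, n_trait_states=3)  # plus a three-state trait
for label, size in (("no trait", base), ("binary", binary), ("3-state", ternary)):
    print(label, size)
```

The ratios between the three cases do not depend on the number of areas chosen, because the trait states multiply whatever range state space is used.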

5 CONCLUSION

Non-neutral processes and nonequilibrium dynamics have long been postulated to be important in island biogeography. The recent development of quantitative and integrative models that explicitly incorporate non-neutral processes allows the testing of long-standing, nonequilibrium hypotheses. In this study, we evaluated six phylogenetic predictions made by the taxon cycle model. We inferred ancestral habitat preferences and investigated the impact of habitat preference on macroevolutionary dispersal. We found strong support for each of the predictions made by the taxon cycle, suggesting that non-neutral processes, such as trait-dependent dispersal, have indeed been important for the assemblage and distribution of observed species in the Indo-Pacific. Future studies incorporating other ecological predictions, such as island-endemic abundances over time and trait-dependent diversification, will further clarify whether the taxon cycle is valid and a commonly observed pattern. Moreover, the predictions outlined in this study can be used in future work, including meta-analyses of island communities, to weigh the relative contribution of neutral and non-neutral processes in the biogeography and diversification of insular landscapes.

ACKNOWLEDGEMENTS

We are indebted to the staff of the New Guinea Binatang Research Center and to V. Novotny and S.E. Miller for assistance with our research projects, and to the Papua New Guinea Department of Environment and Conservation for assistance with research permits. We are grateful to S. Cover and N.E. Pierce for assistance and access to specimens in MCZ, to A. Andersen and B. Hoffman for kindly providing voucher specimens and DNA sequences for Australian species, to M. Borovanska for assistance in the laboratory, to Jesse Czekanski-Moir for contributing Palau samples and to D. General and G. Alpert for contributing the Philippine samples. We thank Rosemary Gillespie, Evan P. Economo, Søren Faurby and six anonymous reviewers for constructive comments on previous versions of the manuscript. Funding was provided by the Czech Science Foundation (P505/12/2467), the Putnam Expedition Grants (Museum of Comparative Zoology), GAJU (003/2015/P; 152/2016/P), CONACYT DICB-2016 No. 282471 and UNAM PAPIIT IN206818. NJM was supported by a NIMBioS Fellowship under NSF award #EFJ0832858 and ARC DECRA Fellowship DE150101773. FJL was supported by the National Science Foundation (DDIG DEB-1407279) and the Smithsonian Institution (Peter Buck Fellowship). AVS was supported by the National Geographic Society's Explorers Grant. WCW was supported by funds from the American Museum of Natural History and the US Army Research Laboratory and the US Army Research Office under contract/grant number W911NF-05-1-0271. Computational resources were provided by the MetaCentrum under the program LM2010005 and the CERIT-SC under the program Centre CERIT Scientific Cloud, part of the Operational Program Research and Development for Innovations, Reg. no. CZ.1.05/3.2.00/08.0144.

AUTHOR CONTRIBUTIONS

P.M.M. and M.J. designed the study; P.M.M., F.J.L., R.M.C., W.C.W., D.M.S., A.V.S. and M.J. collected specimens; P.M.M. and F.J.L. performed laboratory work; P.M.M. and N.J.M. conducted analyses; P.M.M. wrote the first draft of the manuscript and all co-authors wrote the final version of the article.

DATA ACCESSIBILITY

DNA sequences: GenBank Accession nos KU145821-KU146453; BOLD: under ASPNA project. Data sets and time-calibrated phylogenies: TreeBase study ID 202312. Input files for phylogenetic analyses and R code for biogeographical analyses; sampling localities: Dryad Digital Repository, https://doi.org/10.5061/dryad.5542pr8. Specimens deposited: Harvard Museum of Comparative Zoology (MCZ), Smithsonian National Museum of Natural History (USNMENT), CSIRO Tropical Ecosystems Research Centre, Darwin, Australia (TERC), Institute of Entomology, Czech Academy of Sciences (EntU-CAS).
