Volume 59, Issue 5 pp. 1013-1027
ORIGINAL ARTICLE

Systematics of the oil bee genus Lanthanomelissa (Apidae: Tapinotaspidini) and its implications for the biogeography of South American grasslands

Taís M. A. Ribeiro


Departamento de Zoologia, Programa de Pós-graduação em Zoologia, Universidade de Brasília, Brasília, Distrito Federal, Brazil

Department of Entomology, University of Maryland, College Park, Maryland, USA

Aline C. Martins


Departamento de Ecologia, Universidade de Brasília, Brasília, Distrito Federal, Brazil

Daniel P. Silva


Instituto Federal Goiano, Campus Urutaí, Urutaí, Goiás, Brazil

Antonio J. C. Aguiar

Corresponding Author


Departamento de Zoologia, Programa de Pós-graduação em Zoologia, Universidade de Brasília, Brasília, Distrito Federal, Brazil

Correspondence

Antonio J. C. Aguiar, Departamento de Zoologia, Programa de Pós-graduação em Zoologia, Universidade de Brasília, Brasília, Distrito Federal, Brazil.

Email: [email protected]

First published: 19 April 2021
Citations: 6

Taís M.A. Ribeiro ([email protected]), Aline C. Martins ([email protected]), Daniel P. Silva ([email protected]), Antonio J. C. Aguiar ([email protected])

Abstract


Lanthanomelissa has an uncertain taxonomic history, and was formerly treated either as an independent genus or as a subgenus of Chalepogenus. It is endemic to southern South American grasslands, an endangered and poorly known environment. We aimed to understand the origin of these bees in time and space, the influence of Quaternary climatic fluctuations on their current distribution, and the possible link to the southern South American grasslands’ history. We inferred phylogenetic relationships in Lanthanomelissa using 37 specimens from all six species and 3430 nucleotides of three mitochondrial and two nuclear genes and estimated divergence times and ancestral geographic range. We modeled species distribution for the current and two past climatic scenarios (LIG, ~120 kya, and LGM, ~21 kya), performing an ensemble with three algorithms in a dataset of 192 georeferenced occurrence points using 19 WorldClim bioclimatic variables. The results support the monophyly of the genus and taxonomic changes, including the species Lanthanomelissa parva n. comb., and the treatment of the goeldianus group of Chalepogenus as the genus Lanthanella. Lanthanomelissa originated at the Oligocene–Miocene border in the Chacoan–Pampean region, and the glacial–interglacial models indicate expansion in the Last Glacial Maximum and retraction in the Last Interglacial. This origin was approximately synchronized with their exclusive floral host, Sisyrinchium (Iridaceae). The diversification of Lanthanomelissa supports the estimated austral expansion of the ancestral southern grasslands in South America before the origin of Cerrado during the late Miocene. Expansion and retraction in distribution during the last glacial–interglacial indicate grasslands distributional shifts through climate cooling and warming periods.

Resumo


A posição taxonômica das abelhas do gênero Lanthanomelissa é controversa, sendo tratadas tanto como um gênero independente ou como um sugênero de Chalepogenus. Estas abelhas são endêmicas dos Campos Sulinos na América do sul, um ecossistema ameaçado e pouco estudado. Neste contexto, este trabalho tem como objetivo estudar a origem de Lanthanomelissa no tempo e espaço, a influência das flutuações climáticas do quaternário na sua distribuição atual e suas possíveis relações com a história biogeográfica dos Campos Sulinos. Nós inferimos as relações filogenéticas em Lanthanomelissa usando 37 espécimes de todas as seis espécies e 3430 nucleotídeos de três genes mitocondriais e dois genes nucleares, e estimamos os tempos de divergência e a sua distribuição geográfica ancestral. A distribuição das espécies foi modelada para o presente e para dois cenários climáticos passados - Último Interglacial (LIG), ~120 kya, e Último Glacial Máximo (LGM), ~21 kya) – usando uma conjunção de três algoritmos e um banco de dados com 192 pontos de ocorrência georreferenciados, usando as 19 variáveis bioclimáticas do WorldClim. Os resultados suportam a monofilia do gênero e as mudanças taxonômicas, incluindo a espécie Lanthanomelissa parva n. comb., e o tratamento do grupo de espécies goeldianus de Chalepogenus como o gênero Lanthanella. Lanthanomelissa teve origem na transição Oligoceno-Mioceno, em uma região Chaco-Pampeana. Modelos glaciais-interglaciais indicam expansão no Último Máximo Glacial e retração no Último Interglacial. Esta origem foi aproximadamente sincrônica com a distribuição de suas plantas hospedeiras, Sisyrinchium (Iridaceae). A diversificação de Lanthanomelissa suporta uma expansão austral das savanas ancestrais aos Campos Sulinos da América do Sul antes da origem do Cerrado, durante o final do Mioceno. 
Expansão e retração na distribuição geográfica durante o último período glacial-interglacial indicam trocas na distribuição dos campos durante períodos de resfriamento e aquecimento climáticos.

1 INTRODUCTION

Bees are the major angiosperm pollinators, and most species rely entirely on flower resources (Michener, 2007; Ollerton et al., 2011; Vogel, 1974), presenting distinct levels of specialization (Fenster et al., 2004). Bees and plants with specialized relationships tend to restrict their distribution according to their mutualistic partners (Vamosi & Vamosi, 2010). Floral oil specialist bees nest solitarily, usually on the ground or in deadwood, and show different specialization levels on their lipid source (Rozen et al., 2006).

The species of the oil-collecting bee genus Lanthanomelissa Holmberg, 1903 (Apidae: Tapinotaspidini) maintain strict mutualistic interaction with plants of the genus Sisyrinchium L. (Iridaceae), which offer oil and pollen to these bees (Cocucci & Vogel, 2001). These plants occur widely in the Neotropical Region's open vegetation environments, but their diversity hotspot overlaps with the distribution of Lanthanomelissa (Chauveau et al., 2011, 2012; Cocucci & Vogel, 2001). Tapinotaspidini bees originated in the Paleocene 60 Mya (Aguiar et al., 2020) and are distributed widely in the Neotropical Region, utilizing a vast array of oil-producing plant families.

The taxonomic history of Tapinotaspidini bees has been uncertain since the initial treatment by Michener and Moure (1957), especially regarding Lanthanomelissa and the closely related genera Chalepogenus Holmberg, 1903 and Arhysoceble Moure, 1948. Lanthanomelissa has been treated in the past as a genus distinct from Chalepogenus, with two subgenera: Lanthanella Michener & Moure, 1957 and Lanthanomelissa (Michener & Moure, 1957). However, the classification of Chalepogenus and Lanthanomelissa at the generic or subgeneric level was initially considered artificial by Michener (1963), who later treated Lanthanomelissa as a subgenus of Chalepogenus (Michener, 2007). Urban (1995), advancing the taxonomy of the genus Lanthanomelissa, recognized Lanthanomelissa discrepans Holmberg, 1903 as a species distinct from Chalepogenus goeldianus (Friese, 1899), and proposed four new species in this genus, in addition to one new species in Lanthanella. Still, the closest relatives of Lanthanomelissa are species from a paraphyletic group within the genus Chalepogenus (Aguiar et al., 2020).

Lanthanomelissa bees are endemic to the southern South American grasslands, a large open vegetation area across parts of southern Brazil, Argentina, and Uruguay included in two biomes: Atlantic Forest (Subtropical Highland Grasslands) and Pampa (Cabrera & Willink, 1973; IBGE, 2004; Morrone, 2014), endangered by anthropogenic actions (Overbeck et al., 2007). In the Atlantic Forest, the grasslands correspond to the Subtropical Highland Grasslands, forming a mosaic with the Araucaria Forest at elevations from 700 to 1300 m a.s.l. (Overbeck et al., 2015). The Pampa is part of the Chacoan biogeographic subregion, together with the biomes included in the South American diagonal of open formations: Chaco, Cerrado, and Caatinga (Morrone, 2000). The origins and relationships of the Pampa and Subtropical Highland Grasslands are still barely known. The Miocene climatic cooling, the events of Andean Uplift, and the evolution of C4 grasses have been argued to be the main factors influencing open vegetation worldwide and the formation of the Dry Diagonal in South America (Pennington & Hughes, 2014; Simon et al., 2009).

The current biome configuration of forested and open vegetation has remained relatively stable since the Pliocene (5.3–2.5 Ma). During the last 2.5 Ma in the Quaternary, however, fluctuations between global climate cooling and warming events (Hewitt, 2004) possibly led to periods of expansion and retraction of grasslands in southeastern South America (Behling, 1997, 2002; Behling & Pillar, 2007; Behling et al., 2004; Safford, 1999, 2007). Biogeographic studies with solitary bees (Ramos & Melo, 2010; Zanella, 2002), spiders (Ferretti et al., 2014), and birds (Porzecanski & Cracraft, 2005) suggest that the Pampa is related to Chaco, Atlantic Forest, and Monte vegetation. This is reinforced by the occurrence of xerophytic elements from Chaco, such as the cactus Opuntia (L.) Mill., and patches of the Atlantic Forest (Boldrini, 2009). Since bees are intrinsically connected to the vegetation, their origin and diversification could help understand biome evolution.

The high level of floral host specialization and the endemicity of the understudied genus Lanthanomelissa show its potential for informing the biogeography of the South American grasslands, while its taxonomic controversies call for further investigation of the systematics and evolution of the group. In this context, we seek to understand the origin, distribution, and persistence of these bees in southeastern South American grasslands. Moreover, we were interested in finding out whether Lanthanomelissa originated synchronously with its unique oil host Sisyrinchium L. To answer those questions, we reconstructed the diversification of Lanthanomelissa in time with molecular phylogenetic tools coupled with ancestral range reconstructions and species distribution modeling to find out when and under which climatic conditions this bee group arose and how recent climatic fluctuations influenced its distribution. We hypothesized that Lanthanomelissa's origin and diversification are directly associated with the origin of South American grasslands and of its oil hosts: the plants of the genus Sisyrinchium, along with Quaternary climatic fluctuations. We also offer here a phylogenetic context to solve generic level conflicts in Tapinotaspidini taxonomy.

2 MATERIAL AND METHODS

2.1 Taxon sampling, DNA sequencing, and phylogenetic analysis

Our data matrix consists of 93 newly generated DNA sequences and 16 sequences from Aguiar et al. (2020), all deposited in GenBank (Appendix 1). Vouchers are deposited in the University of Brasilia's Entomological Collection. We analyzed multiple specimens of all five species recognized in Lanthanomelissa (sensu Urban, 1995), to maximize the known variation within the species and investigate each species’ dispersal history. We tried to sample all the known distribution of Lanthanomelissa, including those from highland and lowland grasslands. The genera Chalepogenus, Arhysoceble, and Trigonopedia Moure, 1941 were used as an outgroup, since they are the closest relatives to Lanthanomelissa, according to the comprehensive analysis for the tribe Tapinotaspidini (Aguiar et al., 2020). We extracted DNA mostly from specimens preserved in 95% ethanol (EtOH) and included pinned specimens collected less than 10 years ago (Table S1). We tested several commercial DNA extraction kits (Table S1), following their recommended protocols, but adopting the following modifications for old museum specimens: incubation in 70% EtOH for 24 h for decontamination and rehydration, and incubation in elution buffer for 24 h before digestion (Evangelista et al., 2017). To improve the quality and quantity of DNA extracted, we used different body parts (Table S1), maintaining the body integrity to assure their usefulness for further morphological studies.

We sequenced sections of five genes (gene abbreviation, amplicon, and alignment lengths, respectively): the mitochondrial genes cytochrome c oxidase subunit I (COI, 700–1279; 721 bp) and cytochrome B (CytB, 600; 494 bp); the ribosomal 16S rRNA gene (500–528; 594 bp); and the nuclear genes for long-wavelength rhodopsin (LW-rhodopsin, 800; 798 bp), and the F2 copy of the elongation factor 1α gene (EF-1α, 800; 1151 bp). Thus, our data comprise both fast- and slow-evolving regions to provide useful information and differentiation between species (Danforth et al., 2005, 2013). PCR conditions for all primers were 95°C for 60 s, 50–56°C for 60 s, and 68°C for 90 s (36 cycles). Table S2 shows primers, amplicon lengths, and specific annealing temperatures for each primer.

We conducted all extraction and amplification procedures in the Molecular Biology Laboratory at the Zoology Department of the University of Brasilia, but sample purification and sequencing were outsourced to the company Macrogen (South Korea). Sequences were assembled, edited, and BLAST-searched in Geneious 8.1.9 (Kearse et al., 2012). They were aligned with the MAFFT extension (Katoh & Standley, 2013) using the following parameters: 200PAM/k = 2 for the nucleotide scoring matrix; 1.53 for gap opening penalty; and 0.123 offset value. Depending on the characteristics of each region, we selected the appropriate algorithm: G-INS-i for COI and CytB, recommended for sequences with global homology; E-INS-i for the protein-coding nuclear genes EF-1α and LW-rhodopsin, recommended for sequences with multiple conserved domains and long gaps; and Q-INS-i for the 16S rRNA gene, recommended for sequences with secondary structure.

The final matrix comprises 37 taxa and 3430 nucleotides. We divided this matrix by gene, codon, and, for the nuclear genes, also by intron and exon and searched for best models and partition schemes in PartitionFinder 2.1.1 (Lanfear et al., 2017), selecting with corrected Akaike information criterion. All schemes and models are available in Table S3. Relying on RAxML 7.4.2 (Stamatakis, 2006) using the graphical user interface raxmlGUI 1.3 (Silvestro & Michalak, 2012), we performed maximum-likelihood searches with 1000 non-parametric bootstrap replicates (Felsenstein, 1985). We conducted Bayesian tree searches in MrBayes (Ronquist et al., 2012) with 10 million generations and four chains sampled every 1000 generations. We discarded the first 25% of trees as burn-in, analyzed the convergence of Markov chain Monte Carlo (MCMC) in Tracer (Rambaut et al., 2018), and edited all trees in FigTree 1.4.3 (Rambaut, 2016).

2.2 Divergence times estimation

Since there is no known fossil of Tapinotaspidini, we used a secondary calibration point from the widely sampled phylogeny of the tribe Tapinotaspidini (Aguiar et al., 2020). The estimated 95% HPD interval for the age of the most recent common ancestor of Lanthanomelissa + Arhysoceble + Chalepogenus is 22.98–34.89 Mya. Therefore, we applied a normal distribution prior (with values of mean: 28, stdev: 4.0). Using the same matrix of 37 taxa and 3430 nucleotides, we estimated the divergence times in BEAST 1.8.4 (Drummond et al., 2012) via the CIPRES server (Miller et al., 2012). We used an uncorrelated lognormal relaxed clock model with a Yule speciation tree prior, which is recommended when considering sequences from different species (Drummond et al., 2012; Heled & Drummond, 2012), and an HKY substitution model with empirical base frequencies. The Markov chain Monte Carlo (MCMC) was run for 50 million generations and sampled every 10,000 generations. We assessed convergence of the two chains in Tracer 1.6 (Rambaut et al., 2018), considering an effective sample size for all parameters >200, and produced the maximum clade credibility tree in TreeAnnotator 1.8.4 (part of the BEAST package), with a burn-in of the first 25% of trees. We visualized and edited the trees in FigTree 1.4.3 (Rambaut, 2016).
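As a quick check on the calibration described above, the central 95% interval implied by a normal prior with mean 28 and standard deviation 4 can be compared with the 95% HPD interval taken from Aguiar et al. (2020). The snippet below is only an illustrative sketch using Python's standard library; it is not part of the BEAST analysis itself.

```python
from statistics import NormalDist

# Secondary calibration prior on the root node (mean and sd from the text).
prior = NormalDist(mu=28.0, sigma=4.0)

# Central 95% interval implied by the prior, in Mya.
lower = prior.inv_cdf(0.025)
upper = prior.inv_cdf(0.975)

# 95% HPD reported for the same node in the source phylogeny.
hpd_low, hpd_high = 22.98, 34.89

print(f"prior 95% interval: {lower:.2f}-{upper:.2f} Mya")
print("prior interval contains the source HPD:",
      lower <= hpd_low and hpd_high <= upper)
```

Under these values the prior interval (about 20.16 to 35.84 Mya) is slightly wider than the source HPD, so the calibration does not exclude any age within that HPD.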

2.3 Ancestral area reconstruction

We derived six biogeographic areas from the provinces in the classification of Neotropical Region defined in Morrone (2014), representing the geographic range of Lanthanomelissa and outgroup: (A) Atlantic, (B) Cerrado, (C) Chaco, (D) Caatinga, (E) Pampa, and (F) Araucaria Forest. For the outgroup, we coded the representative of the geographic area of all species of each lineage. For ancestral area estimation, we relied on the R package “BioGeoBEARS” (Matzke, 2013), which evaluates the contribution of evolutionary processes (i.e., range expansion, range extinctions, vicariance, founder-event speciation) through biogeographic models: DEC (Ree & Smith, 2008), and modified versions of DIVA (Ronquist, 1997) and BayArea (Landis et al., 2013), named DIVA-like and BayArea-like. We did not implement the DEC+J model since it has been demonstrated to be a poor model of geographic range evolution (Ree & Sanmartín, 2018). We conducted likelihood ratio tests based on AICc scores to assess the fitness of the models (Table S4).
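The AICc-based comparison of biogeographic models can be illustrated with a minimal sketch. It uses the standard formula AICc = 2k − 2 lnL + 2k(k + 1)/(n − k − 1), where k is the number of free parameters and n the sample size. The model names, log-likelihoods, and parameter counts below are hypothetical placeholders, not values from Table S4, and taking n as the number of terminals is an assumption.

```python
def aicc(ln_likelihood: float, n_params: int, n_samples: int) -> float:
    """Corrected Akaike information criterion (lower is better)."""
    aic = 2 * n_params - 2 * ln_likelihood
    correction = (2 * n_params * (n_params + 1)) / (n_samples - n_params - 1)
    return aic + correction


# Hypothetical candidate models, for illustration only.
candidates = {
    "model_A": {"lnL": -35.0, "k": 2},
    "model_B": {"lnL": -31.0, "k": 3},
}
n_tips = 37  # assumption: sample size taken as the number of terminals

scores = {name: aicc(m["lnL"], m["k"], n_tips) for name, m in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AICc = {score:.2f}")
print("best-fitting model:", best)
```

The extra parameter in the second hypothetical model is penalized, but its higher log-likelihood still gives it the lower AICc in this example.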

We coded the geographic area according to the vegetation type of the collection locality, even when representing a minor patch of vegetation located within a biome. For instance, although the classification from Morrone (2014) suggested the Atlantic Province covering all the coastal and highland areas of the Brazilian states of Santa Catarina and São Paulo, respectively, we considered the records on Cotia (São Paulo) and Maracajá (Santa Catarina) as occurring in patches of grasslands in the Atlantic Forest and have coded them as Araucaria Forest Province. We did the same for records in Parana Province. For discussion purposes, we also considered the regionalization introduced by Olson et al. (2001), as modified by Antonelli et al. (2018).

2.4 Occurrence and environmental data sampling

We obtained 192 georeferenced points (Table S5) for all Lanthanomelissa species from the literature and voucher labels deposited in entomological collections. When geographic coordinates were not available on the labels, we used Google Maps (Google Inc. 2018) to acquire approximate geographical coordinates from the recorded city center. We have also obtained data from the online databases speciesLink (www.splink.org.br) and GBIF (Global Biodiversity Information Facility, www.gbif.org/). The following entomological collections provided geographic location points for Lanthanomelissa: Departamento de Zoologia da Universidade de Brasília (DZUB), Departamento de Zoologia da Universidade do Paraná (DZUP), Museu de Ciências e Tecnologia da PUCRS (MCTP), Fundação Zoobotânica do Rio Grande do Sul (FZB/RS), American Museum of Natural History (AMNH), Coleção Sersic-Cocucci, Departamento de Botânica da Universidade Nacional de Cordoba (UNC), Museu Argentino de La Plata—Universidade Nacional de La Plata (UNLP), Universidade do Extremo Sul Catarinense (UNESC), Faculdade de Ciências e Letras de Ribeirão Preto USP—Coleção Camargo (RPSP), and Museu de Zoologia da Universidade de São Paulo (MZSP). We compiled 170 unique points, 52 for Lanthanomelissa betinae Urban, 1995, 48 for Lanthanomelissa clementis Urban, 1995, 35 for Lanthanomelissa discrepans Holmberg 1903, 21 for Lanthanomelissa magaliae Urban, 1995, and 14 for Lanthanomelissa pampicola Urban, 1995 (see Figure S1 for distribution map).

We built species distribution models from bioclimatic variables for the current climate (1970 to 2000), the Last Glacial Maximum (LGM, 22 kya), and the Last Interglacial (LIG, 120 kya), all obtained from WorldClim (Fick & Hijmans, 2017; Hijmans et al., 2005; Otto-Bliesner et al., 2006). We used variables at a 2.5 arcminute resolution to estimate the potential geographic range of Lanthanomelissa species across a glacial–interglacial cycle.

For the Last Glacial Maximum and current climatic scenarios, we performed simulations based on the Community Climate System Model (CCSM) general circulation model at 2.5 arcminute resolution. We resampled the Last Interglacial variables from 30 arcseconds to 2.5 arcminutes using the “raster” package (Hijmans, 2016) in R 3.4.2 (R Core Team, 2017), implemented in RStudio 1.0.153 (RStudio Team, 2016), to match the other environmental layers. We standardized all 19 bioclimatic variables by subtracting the variable's mean from each cell's value and then dividing the result by the variable's standard deviation (a z-transformation). After this transformation, every variable has a mean of zero and a variance of one. We then ran a principal component analysis (PCA) to create new orthogonal spatialized principal components (PCs) and avoid collinearity and model overfitting (Jiménez-Valverde et al., 2011). We performed this analysis first for the current climate and then projected its linear coefficients onto the past climates (LGM and LIG), so that the PCs generated for the past were dependent on the current scenario.
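The standardization and projection steps can be sketched as follows. This is a minimal illustration with NumPy on simulated data, not the authors' R workflow: the arrays stand in for climate layers flattened to cells by variables, only four variables are simulated instead of 19, and scaling the past layer with the current-climate mean and standard deviation is an assumption made here so that both scenarios share one coordinate system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for climate layers: rows are grid cells, columns are
# bioclimatic variables (four simulated here; the study used 19).
current = rng.normal(loc=[20.0, 1200.0, 5.0, 60.0],
                     scale=[4.0, 300.0, 2.0, 15.0], size=(500, 4))
past = current + rng.normal(loc=[-3.0, -100.0, 0.5, 0.0], scale=1.0, size=(500, 4))

# z-transformation: subtract the mean and divide by the standard deviation.
mean = current.mean(axis=0)
std = current.std(axis=0)
current_z = (current - mean) / std
# Assumption: past layers are scaled with the current-climate statistics.
past_z = (past - mean) / std

# PCA on the current climate via eigen-decomposition of the covariance matrix.
cov = np.cov(current_z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # sort components by explained variance
loadings = eigvecs[:, order]               # linear coefficients of the PCs

# Project both scenarios with the coefficients fitted on the current climate.
current_pcs = current_z @ loadings
past_pcs = past_z @ loadings

print("explained variance ratio:", np.round(eigvals[order] / eigvals.sum(), 3))
```

Because the loadings come from the current climate only, the past PCs are expressed on the same axes and can be fed to models trained on the current PCs.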

Since the models consider only climate data, they do not acknowledge biotic factors or species dispersal ability (Soberón, 2007), so we restricted the modeled area with minimum and maximum parameter constraints using the occurrence points as parameters (longitude: −70, −45; latitude: −40, −20).
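The spatial restriction above amounts to a simple bounding-box test. The sketch below uses the longitude and latitude limits given in the text; the example coordinates are hypothetical and serve only to show which locations fall inside the modeled area.

```python
# Bounding box of the modeled area (decimal degrees, from the text).
LON_MIN, LON_MAX = -70.0, -45.0
LAT_MIN, LAT_MAX = -40.0, -20.0


def in_study_area(lon: float, lat: float) -> bool:
    """Return True when a coordinate lies inside the modeled extent."""
    return LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX


# Hypothetical (longitude, latitude) pairs, for illustration only.
points = [(-51.2, -30.0), (-47.9, -15.8), (-58.4, -34.6)]
kept = [p for p in points if in_study_area(*p)]

print(kept)  # [(-51.2, -30.0), (-58.4, -34.6)]
```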

2.5 Species distribution modeling

We performed all ecological niche modeling analyses with the R package ENMTML (Andrade et al., 2020). We partitioned the Lanthanomelissa species occurrence dataset with a checkerboard partition, in which we divided the occurrence data equally into two subsets. Each subset generated the potential distribution and was validated by the other subset. Then, we built an ensemble as a consensus of the algorithms with TSS values above the mean, drawing on three machine learning algorithms: Maxent (maximum entropy, Phillips et al., 2004), random forest (Breiman, 2001), and SVM (support vector machines, Schölkopf et al., 2001; Tax & Duin, 2004). We evaluated model performance based on the AUC (area under the curve, Allouche et al., 2006), a threshold-independent metric varying from 0 to 1. Values above 0.8 are considered acceptable. We also evaluated the true skill statistic (TSS, Allouche et al., 2006), a threshold-dependent metric varying between −1 and +1 in which values above 0.5 are acceptable and values above 0.7 are considered good. We multiplied the binary ensemble rasters from the three scenarios (LIG, LGM, and current) to find their intersection and estimate stable areas for each species.
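The evaluation and stable-area steps can be illustrated with a small sketch in plain Python. It computes TSS as sensitivity + specificity − 1, keeps the algorithms whose TSS exceeds the mean, and multiplies binary presence maps cell by cell to find areas suitable in all three scenarios. The confusion-matrix counts and the tiny rasters are hypothetical, and the code does not reproduce ENMTML's internals.

```python
def tss(tp: int, fp: int, fn: int, tn: int) -> float:
    """True skill statistic: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


# Hypothetical confusion-matrix counts (tp, fp, fn, tn) per algorithm.
counts = {
    "maxent": (40, 10, 5, 45),
    "random_forest": (42, 6, 3, 49),
    "svm": (30, 20, 15, 35),
}
scores = {name: tss(*c) for name, c in counts.items()}
mean_tss = sum(scores.values()) / len(scores)

# Algorithms retained for the ensemble: TSS above the mean.
selected = [name for name, score in scores.items() if score > mean_tss]

# Hypothetical binary suitability rasters (1 = suitable, 0 = unsuitable).
current = [[1, 1, 0], [0, 1, 1]]
lgm = [[1, 0, 0], [0, 1, 1]]
lig = [[1, 1, 0], [1, 1, 0]]

# Cell-wise product: 1 only where all three scenarios are suitable.
stable = [
    [c * g * i for c, g, i in zip(row_c, row_g, row_i)]
    for row_c, row_g, row_i in zip(current, lgm, lig)
]

print(selected)  # ['maxent', 'random_forest']
print(stable)    # [[1, 0, 0], [0, 1, 0]]
```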

3 RESULTS

3.1 Molecular phylogeny

The aligned data matrix comprised 109 sequences and 3430 nucleotides, of which 88 are Lanthanomelissa sequences and 21 are outgroup sequences (Alignment S1). Both Bayesian and maximum-likelihood inference trees recovered similar highly supported clades (Figure 1, Figure S2). Trees recovered Arhysoceble as sister to a clade containing Chalepogenus goeldianus, Chalepogenus parvus Roig-Alsina, 1997, and Lanthanomelissa species (Bayesian Posterior Probability (BPP) = 1 and bootstrap support (BS) = 97). In the Bayesian tree, C. goeldianus is sister to C. parvus + Lanthanomelissa (1 BPP). However, in the maximum-likelihood tree, these relations are inverted, such that C. parvus is sister to C. goeldianus + Lanthanomelissa (97% BS). Lanthanomelissa constitutes a monophyletic group (81% BS, 0.99 BPP), and all the species are well supported as monophyletic (BS > 88%, BPP > 0.95). In this clade, L. betinae is sister to all the other species (81% BS, 0.99 BPP). In both trees, L. discrepans and L. magaliae together form a monophyletic group (73% BS, 0.97 BPP). These two species constitute a clade sister to another containing L. pampicola and L. clementis (87% BS, 1 BPP). However, in the maximum-likelihood tree, these two species are weakly supported as sister groups (54% BS), while in the Bayesian tree this relation is highly supported (0.97 BPP).

Figure 1. Bayesian tree of Lanthanomelissa with representatives of Arhysoceble, Chalepogenus, and Trigonopedia as outgroup based on a concatenated matrix comprising 37 specimens and 3430 nucleotides obtained from mitochondrial COI and CytB, ribosomal 16S rRNA gene, and nuclear EF-1α and LW-rhodopsin using 14 partitions selected by PartitionFinder using the corrected Akaike information criterion. Names at tips indicate species and locality of the samples

3.2 Divergence times and ancestral range reconstruction

According to the Bayesian time tree (Figure 2, Figure S3), the crown group Arhysoceble + Chalepogenus + Lanthanomelissa had its origin estimated in the Oligocene, at 26.91 (18.56–34.68, 95% HPD) Mya. The crown group of Chalepogenus goeldianus + C. parvus + Lanthanomelissa had its age estimated at 20.91 (13.24–28.38, 95% HPD) Mya, in the Miocene. Crown age for C. parvus + Lanthanomelissa is estimated at 17.47 (11.16–24.79, 95% HPD) Mya. Crown age for Lanthanomelissa is estimated at 14.31 (8.58–20.56, 95% HPD) Mya. Individual Lanthanomelissa species had their crown ages estimated between 6.14 (2.59–10.16, 95% HPD) Mya for L. pampicola and 3.12 (1.07–6.01, 95% HPD) Mya for L. magaliae.

Figure 2. Time-calibrated phylogeny and historical biogeography of Lanthanomelissa with representatives of Arhysoceble, Chalepogenus, and Trigonopedia as outgroup based on a concatenated matrix comprising 37 specimens and 3430 nucleotides with 50 million generations, age of the root node sampled from a normal distribution (mean = 28 and s.d. = 4). The map shows provinces from Morrone (2014) used for biogeographic reconstruction. Colored squares after species names and at nodes represent current and historical distributions inferred in BioGeoBEARS under the BayArea-like+j model (lnL = −28.65). Horizontal gray bars at nodes indicate 95% HPD of estimated divergence times and the bottom bar indicates epochs and Quaternary period. Abbreviations: Eo: Eocene, Plio: Pliocene; Qua: Quaternary. The picture shows a pinned specimen of Lanthanomelissa discrepans. Bottom: Temperature variation for the last 35 Mya adapted from Zachos et al. (2001), indicating Mid-Miocene Climatic Optimum and estimated origin of Sisyrinchium and Nierembergia (95% HPD, Martins et al., 2015)

Multi-model analysis in BioGeoBEARS yielded the BayArea-like+j model (including founder-event speciation) as the best-fitting model for our data, as it showed the highest log-likelihood and lowest AICc (LnL = −29.95, AICc = 64.24, Table S4). According to this analysis (Figure 2), the ancestral area for the most recent common ancestor of the lineage Arhysoceble + C. goeldianus + C. parvus comprised Chacoan, Pampean, and Araucaria Forest provinces. However, for the most recent common ancestor of the two Chalepogenus + Lanthanomelissa, the area comprised only Chacoan + Pampean provinces (Figure 2). For Lanthanomelissa, the ancestral area inferred is Araucaria Forest + Pampean, and this is the current area for L. betinae and L. clementis, while the other three species are limited to the Pampean Province.

3.3 Distribution modeling of Lanthanomelissa

The potential distribution of Lanthanomelissa species, based on current climate (Figure 3, Figure S4), indicates that all species could occur in larger areas of grassland vegetation, but either the collecting effort has not yet been exhaustive, or other factors, such as the distribution of a mutualistic partner, limit their distribution (see blue areas in Figure 3). For all Lanthanomelissa species, the Last Interglacial (LIG) distribution model extended to the south, including mostly southern Brazil and Uruguay, almost reaching the latitude of −40 (Figure 3, Figure S4). In the Last Glacial Maximum (LGM), this suitability shifted northwards. For all species in the LGM, the suitable area extended beyond current continental borders because the sea level was lower at that time (Figure 3, Figure S4). All evaluation statistics are listed in Table S6.

Figure 3. Potential range shifts for Lanthanomelissa betinae (a), L. clementis (b), L. discrepans (c), L. magaliae (d), and L. pampicola (e). Stable areas are shown in purple, areas predicted as suitable in the Last Interglacial (LIG) are shown in orange, areas predicted in the Last Glacial Maximum (LGM) are in yellow, and areas in current distribution are shown in blue

Stable areas predicted for L. betinae and L. clementis (Figure 3a, b) included south Brazil extending slightly to the southeastern region, but only a small portion of the Pampa. L. discrepans (Figure 3c), on the other hand, shows larger stable areas, including parts of eastern Argentina, and probably expanding southwards past latitude −40. Northwards, it reaches a part of Paraguay and a small portion of the Brazilian state of Santa Catarina. Stable areas predicted for L. magaliae and L. pampicola (Figure 3d, e) are very restricted compared with the other species, limited to only a small portion of south Brazil for L. magaliae, while that for L. pampicola includes patches in Argentina and Paraguay.

4 DISCUSSION

4.1 Phylogenetic relationships and taxonomic considerations

Our phylogenetic results indicate the monophyly of Lanthanomelissa, sister to a clade consisting of species currently placed in the genus Chalepogenus. This genus is divided into two paraphyletic groups: one with the Chalepogenus related to Lanthanomelissa, occurring in Pampas and Southern grasslands, as shown here (group goeldianus and C. parvus); and a more speciose group (Chalepogenus s.s.) mostly including species occurring in the Monte vegetation in Argentina (Aguiar et al., 2020). These results reinforce the need for a taxonomic reclassification of the species of Chalepogenus as related to Lanthanomelissa, as presented below. Additionally, the species grouped in the Chalepogenus s.s. should be subject to further taxonomic review.

As indicated by our Bayesian phylogeny (Figure 1), C. goeldianus is sister to all Lanthanomelissa plus C. parvus. These relationships agree with morphology and genomic trees for Tapinotaspidini (Aguiar, unpublished data). Therefore, we propose that C. goeldianus and the remaining species from the group goeldianus (Roig-Alsina, 1999) should be reclassified in Lanthanella Michener and Moure (1957). This last name was originally proposed as a subgenus of Lanthanomelissa diagnosed by having mesoscutum hairs longer than the antennal diameter, and fore basitarsus without a distinct comb of setae on the outer margin. Later Moure (1992) elevated Lanthanella to the genus level, and Urban (1995) described Lanthanella luciane Urban, 1995. Roig-Alsina (1999) transferred L. goeldiana and L. luciane to Chalepogenus, also describing a new species in this group, which he named C. neffi. These three species were proposed by Roig-Alsina (1999) as the goeldianus species group, supported by morphological characters such as identical genital capsules, and morphology of the hind tibia, and our work supports the significance of this clade as the early proposed distinct genus Lanthanella comprising Lanthanella goeldiana (Friese, 1899), Lanthanella luciane Urban, 1995, and Lanthanella neffi n. comb. (Roig-Alsina, 1999).

Due to its sister position to Lanthanomelissa, we propose the designation Lanthanomelissa parva n. comb. for C. parvus. Even though designating a new monotypic genus could also be a solution for C. parvus taxonomy, this would be unsuitable due to the morphological similarity between Lanthanomelissa parva and the remaining Lanthanomelissa species. As a result, Lanthanomelissa now comprises six species, supported by morphological characters such as the two submarginal cells in the forewing, reduced body size, and similar terminalia of males.

4.2 Grassland distribution in South America based on Lanthanomelissa origin

The origin of the stem group Lanthanomelissa was estimated to be at the transition from Oligocene to Miocene (20.91 Mya; 13.24–28.38 Mya) in an ancestral area comprising the Pampean and Chacoan biogeographical provinces. Further, the Patagonian Sea, an ocean transgression extending from southern Patagonia to Bolivia and facilitated by Andean orogeny, possibly fragmented the Pampean and Chacoan areas (Ortiz-Jaureguizar & Cladera, 2006). The crown group Lanthanomelissa originated circa 14.31 Mya (8.58–20.56 Mya 95% HPD) in the Middle Miocene Climatic Optimum, which was the peak of a global warming phase that started in the late Oligocene (Zachos et al., 2001) and favored the rise of many animal and plant groups, especially in South America (Hoorn et al., 2010).

The origin of the crown group Lanthanomelissa is associated with the beginning of a phase of climatic cooling and expansion of open vegetation environments in many parts of the world, including South America. From 15 Mya, Andean Cordillera elevations created a rain-shadow effect against humid winds from the west, increasing the aridity in eastern South America (Hartley, 2003). In the late Miocene, the low CO2 pressure led to a worldwide expansion of C4 grasslands, more flammable than common grasses, and the origin of savannas (Pennington & Hughes, 2014), which rose to dominance in the Pliocene (Simon et al., 2009). Earlier evidence of savannas was found by Aguiar et al. (2020), reconstructing the diversification of Tapinotaspidini bees, whose origin is undoubtedly linked to open vegetation environments. The higher aridity associated with the C4 grass expansion led to the predominance of open vegetation habitats in southern South America (Hoetzel et al., 2013; Lehmann et al., 2011; Pennington & Hughes, 2014; Strömberg, 2011; Werneck, 2011). Similarly, the origin of Pampa inferred here at ca. 15 Mya could have been a result of the expansion of the grasslands associated with Araucaria forest, before the northern expansion that originated the Cerrado grassland. This suggests that the Cerrado and the Pampa possibly experienced similar diversification processes related to the expansion of grasslands. However, the Pampa expanded earlier due to the cooling and drier conditions of the austral portions of South America. Although there is evidence of savanna-like habitats in the Paleocene (Aguiar et al., 2020), those habitats do not correspond to the present northern distribution of Cerrado.

Despite the origin of South American savannas in the Paleocene (Aguiar et al., 2020), the austral and northern expansions in distinct periods can explain some endemic taxa of Pampa and Cerrado. The austral expansion that led to the southern grasslands occurred first, around 20 to 15 Mya in the early Miocene, and the northern expansion occurred later, around 10 to 5 Mya in the late Miocene (Simon et al., 2009). We also inferred that the expansion of grasslands influenced the origin of the clade L. pampicola + L. discrepans + L. magaliae, which arose already within a Pampa area, unlike L. betinae and L. clementis, which maintained their distribution in the Araucaria Forest Province.

4.3 Synchronous origin of oil source plants and Lanthanomelissa bees

The crown age of Lanthanomelissa coincides with the stem origin of its preferred oil host: plant species of the genus Sisyrinchium. Similarly, the origin of the stem node, including Lanthanomelissa parva and Lanthanella goeldiana, coincides with the origin of their oil host, Nierembergia Ruiz & Pav. (Solanaceae) (see Martins et al., 2015 for Nierembergia and Sisyrinchium ages). The coincident age, spatial distribution, and close mutualistic association (Chauveau et al., 2011; Cocucci & Vogel, 2001) support our suggestion that Lanthanomelissa and Sisyrinchium could have played important roles in each other's diversification and biogeographic history in South America.

The gain of glandular trichomes in Sisyrinchium species, especially the oil-secreting ones, has been hypothesized as one of the main drivers of diversification of this genus in South America (Chauveau et al., 2011). Although the presence of oil glandular trichomes is not ancestral in Sisyrinchium, it occurred in early-diverging clades (Chauveau et al., 2011), suggesting an ancient relationship with oil bee pollinators. Further analysis of the diversification of these plant clades is necessary to understand the role of oil bee pollinators in speciation or extinction.

Lanthanomelissa's sister lineages (Lanthanella and Arhysoceble) would have exploited other plants for oils, as suggested by the time of origin we estimated, the estimated origin of oil-producing Solanaceae and Plantaginaceae (Martins et al., 2014, 2015), and their current oil host preferences (Cocucci, 1991; Martins & Alves-dos-Santos, 2013; Renner & Schaefer, 2010). The relationship of the lineage Arhysoceble-Lanthanomelissa with floral oil families is intimately associated with the expansion of open vegetation areas in South America (Aguiar et al., 2020). The origin of floral oil glands in these younger plant clades probably benefited from the pre-existence of oil bee specialists, which first evolved in association with oil-producing Malpighiaceae in the Paleocene (Aguiar et al., 2020; Renner & Schaefer, 2010).

4.4 Lanthanomelissa distribution shifts due to climatic fluctuations in the Quaternary

According to our species distribution modeling, most species’ suitability shifted north- and eastwards in the LGM. This is corroborated by the presence of a mosaic of grasslands and Araucaria forest on the coast of southern Brazil, from São Paulo to Rio Grande do Sul (Behling, 2002; Behling et al., 2004; Leite et al., 2016; Overbeck et al., 2007; Pessenda et al., 2009; Silva et al., 2018). The distribution of L. discrepans and L. pampicola is divided by a central gap, suggesting a division into inland western and eastern clusters and a coastal cluster in the LGM. This structure was also observed in this area for plants endemic to the southern grasslands (Longo et al., 2014; Pinheiro et al., 2011; Silva et al., 2018). The species could have dispersed through the “Portal de Torres” (Rambo, 1950), a migratory route from the coastal Atlantic forest to the inland grasslands (Barros et al., 2018; Pinheiro et al., 2011; Silva et al., 2018).
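
One simple way to quantify a directional shift in suitability between two climatic scenarios is to compare suitability-weighted centroids. The sketch below is only an illustration under stated assumptions: the function name `weighted_centroid`, the toy grids, and the coordinates are hypothetical and are not the study's data or method, and the calculation ignores the change of cell area with latitude.

```python
import numpy as np

def weighted_centroid(suitability, lats, lons):
    """Suitability-weighted mean latitude and longitude of a grid.

    Rows of `suitability` correspond to `lats`, columns to `lons`.
    Cell-area differences across latitudes are ignored (simplification).
    """
    grid = np.asarray(suitability, dtype=float)
    total = grid.sum()
    lat_c = (grid.sum(axis=1) * np.asarray(lats)).sum() / total
    lon_c = (grid.sum(axis=0) * np.asarray(lons)).sum() / total
    return lat_c, lon_c

# Hypothetical coordinates (degrees); rows run north to south, columns west to east.
lats = [-25.0, -30.0, -35.0]
lons = [-60.0, -55.0, -50.0]

# Made-up suitability grids for illustration only.
current = np.array([[0, 0, 0],
                    [1, 0, 0],
                    [1, 0, 0]])
lgm = np.array([[0, 1, 1],
                [0, 1, 1],
                [0, 0, 0]])

lat_now, lon_now = weighted_centroid(current, lats, lons)
lat_lgm, lon_lgm = weighted_centroid(lgm, lats, lons)

# Positive d_lat means a northward shift; positive d_lon means an eastward shift.
d_lat = lat_lgm - lat_now
d_lon = lon_lgm - lon_now
```

In this toy case both differences are positive, which corresponds to a north- and eastward displacement of suitability in the past scenario relative to the present.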

At the LGM, grasslands were dominant in southeastern South America. The climate was drier and colder than at present (Behling, 1997), and grasslands possibly extended to the coast of Uruguay, an area also predicted as suitable for Lanthanomelissa. At about 4 kya, with increasing temperature, the Araucaria Forest expanded, once again limiting the occurrence of the grasslands (Behling et al., 2004) and the distribution of Lanthanomelissa, as is evident in current distribution models for L. clementis. The fluctuation between grasslands and Araucaria forests has been suggested as an important driver of speciation in Subtropical Highland Grasslands, but not in Pampa, where Araucaria Forest records are too recent (Fregonezi et al., 2013). Instead, abiotic factors (e.g., soil diversity) and ecological relations (e.g., intraspecific competition and pollination) could have been more substantial selective pressures in this region. However, few studies use species that cohabit both environments to address the interactions of these abiotic factors and how the two grassland areas relate (Fregonezi et al., 2013; Peres et al., 2015).

The Last Interglacial (LIG, ~120 kya) had the highest global temperatures of the last 250 ky and is characterized by wet conditions, higher sea levels, and the expansion of forest biomes over arid open formations (Otto-Bliesner et al., 2006). This could have caused the observed restriction of the distribution of most Lanthanomelissa species at this time (Figure 3), suggesting a restriction in the grasslands, which were probably more extensive toward the south.

The stable areas indicated by Lanthanomelissa species in the last glacial–interglacial cycle vary among species, but a clear pattern of stability of highland grasslands can be observed in L. betinae and L. clementis. In colder periods, the highlands act as refugia for those species, since the colder climate favored the expansion of grasslands (Behling, 1997). Other species respond differently in the stable-area analysis, but we can see an overlap between stable areas that approximately correspond to the Southern Atlantic Forest refugia described in Costa et al. (2018). This region has been dynamic over the Quaternary climatic fluctuations, presenting forest refugia in the grassland phase, which rapidly expanded over grasslands when the climate became suitable (Costa et al., 2018).
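
The idea of a stable area can be illustrated as the set of grid cells that remain suitable in every climatic scenario considered (current, LGM, and LIG). The following is a minimal sketch under assumptions: the helper `stable_area`, the 0.5 threshold, and the toy grids are hypothetical, and the binarization rule actually used for the models in this study may differ.

```python
import numpy as np

def stable_area(suitability_by_period, threshold=0.5):
    """Boolean grid that is True where every period is at or above `threshold`.

    `suitability_by_period` maps a period label to a 2-D suitability grid;
    all grids must share the same shape. The threshold is an assumption.
    """
    layers = [np.asarray(grid) >= threshold
              for grid in suitability_by_period.values()]
    # Logical AND across periods: a cell is stable only if suitable in all of them.
    return np.logical_and.reduce(layers)

# Made-up 3x3 suitability grids for illustration only.
toy = {
    "current": np.array([[0.9, 0.2, 0.1],
                         [0.8, 0.7, 0.1],
                         [0.3, 0.6, 0.2]]),
    "LGM":     np.array([[0.8, 0.6, 0.4],
                         [0.9, 0.8, 0.5],
                         [0.2, 0.7, 0.6]]),
    "LIG":     np.array([[0.7, 0.1, 0.0],
                         [0.6, 0.3, 0.1],
                         [0.1, 0.5, 0.1]]),
}

stable = stable_area(toy)
n_stable_cells = int(stable.sum())
```

Intersecting binary maps is a conservative choice: a cell that drops below the threshold in a single scenario is excluded, so the result depends strongly on the threshold chosen.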

The potential distribution based on the current climate indicated large areas as suitable for most Lanthanomelissa species, including grasslands and Chaco areas. Although the collecting effort has not yet been exhaustive, other factors, such as the distribution of Sisyrinchium, possibly limit their distribution range.

Lanthanomelissa is dependent on Sisyrinchium flowers, although the contrary is not always true (Chauveau et al., 2012; Cocucci & Vogel, 2001). However, the congruence of the diversification of those bees with their oil host's origin indicates that those groups could have driven each other's diversification. Integrative modeling approaches using occurrence data of the oil host plant Sisyrinchium as a variable limiting the distribution of Lanthanomelissa, together with the phylogeography of one of these bee species, would also bring valuable insights into the biogeographic history of the southeastern South American grasslands. Inferences on the relationship of the origin of the genus Lanthanomelissa with the origin of the South American Grasslands show that studies on the systematics and the evolutionary history of bees can help to understand biome evolution.

ACKNOWLEDGMENTS

This study is dedicated to Danuncia Urban who vastly contributed to our knowledge of bee fauna from the Neotropical Region, including Lanthanomelissa and Lanthanella. For providing specimens for this study, we thank Andrea Cocucci, Alicia Sérsic, Birgit Harter Marques, Clemens Schlindwein, Danielle Parizotto, Reisla Oliveira, Eduardo Almeida, Gabriel Melo, Jerome Rozen, Juan Pablo Torreta, Mabel Lizaraso, Kelli Ramos, Leopoldo Alvarez, Rafael Ferrari, Rodrigo Gonçalves, and Vicent Lee. We also thank Lilian Giugliano, Kelli Ramos, and Anahí Espíndola and the EspindoLab for suggestions on previous versions of this manuscript and the reviewers for their time and effort to improve our manuscript. We thank P. Cedro for designing figures, M. Cavalcante and V. Silva for helping with R programming, and F. Duque and the English Editing for International Graduate Students (EEIGS) program of the University of Maryland for grammar revision. TMAR received scholarships from Fundação de Apoio à Pesquisa do Distrito Federal (FAPDF), and ACM received scholarships from CAPES and CNPq during the development of this research. AJCA thanks CAPES and FAPDF for grant support (Proc. Number: 193.000.893/2015). DPS was supported by a productivity grant from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq—Proc. Number: 304494/2019-4).

    APPENDIX 1:

    Taxon sampling used in this study from A. Ingroup (Lanthanomelissa) and B. Outgroup (Arhysoceble, Chalepogenus, Trigonopedia), with voucher information, and GenBank accession number for sequences of five gene sections: the mitochondrial genes cytochrome c oxidase subunit I (COI), cytochrome B (CytB), the ribosomal 16S rRNA gene (16S), and the nuclear genes elongation factor 1α (EF-1α) and long-wavelength rhodopsin (LW-rhodopsin). Accession numbers in bold indicate the sequences obtained from Aguiar et al. (2020). Voucher: entomological collection number from DZUB (Department of Zoology, University of Brasilia). Lab Code: DNA sample number deposited in DZUB DNA collection

    Species Voucher Lab Code COI CytB 16S EF-1α LW-rhodopsin
    A. Ingroup
    L. betinae AA179 KX064547
    L. betinae AA253 MH213652 MH213625
    L. betinae TR012 MH213653 MH213626 MG894379
    L. betinae DZUB003456 TR020 MH213654 MH213627 MG894381
    L. betinae DZUB003468 TR032 MG894388
    L. betinae DZUB003471 TR033 MH213655 MH213628
    L. betinae DZUB003488 TR035 MH213656 MH213629 MG894390
    L. betinae DZUB003518 TR055 MH213657
    L. betinae DZUB003525 TR056 MH213658 MH213630 MG888653 MG889473
    L. clementis AA143 MH213660 MH213632 KX064535
    L. clementis TR013 MH213661 MH213633 MG894380
    L. clementis DZUB003457 TR021 MH213662 MH213634 MG894382
    L. clementis DZUB003458 TR022 MH213663 MG894383
    L. clementis DZUB003459 TR023 MG894384
    L. clementis DZUB003460 TR024 MH213664 MG894385
    L. clementis DZUB003478 TR034 MH213665 MH213635 MG894389
    L. clementis DZUB003491 TR036 MH213666 MH213636
    L. clementis DZUB003492 TR037 MH213637
    L. discrepans DZUB003545 TR057 MH213667 MH213639 MG888654 MG889474
    L. discrepans DZUB003529 TR059 MH213668 MH213640 MH213618 MG888656 MG889476
    L. discrepans DZUB003555 TR064 MH213669 MH213641 MH213619
    L. discrepans DZUB003550 TR065 MH213670 MH213642 MH213613
    L. magaliae DZUB003463 TR027 MH213643 MG894386
    L. magaliae DZUB003464 TR028 MH213644
    L. magaliae DZUB003509 TR058 MH213671 MH213645 MH213620 MG888655 MG889475
    L. magaliae DZUB003513 TR066 MH213672 MH213614
    L. magaliae DZUB003548 TR067 MH213673 MH213615
    L. pampicola DZUB003465 TR029 MH213674 MH213646
    L. pampicola DZUB003466 TR030 MH213675 MH213647 MG894387
    L. pampicola DZUB003554 TR069 MH213676 MH213649
    B. Outgroup
    A. huberi DZUB089479 AA159 KX064542
    A. huberi DZUB089489 AA169 KX064648 MH213621
    A. melampoda DZUB089487 AA167 KX064649 MH213622 KX064544
    C. goeldianus DZUB000924 AA002 KX064671 KX064518 KX064561
    C. goeldianus DZUB003559 TR060 MH213650 MG888657 MG889477
    C. parvus AA106 KX064661 KX064531
    C. parvus DZUB089519 AA202 KX064580
    C. parvus DZUB089562 TR061 MH213651 MH213624 MH213610
    Trigonopedia sp. DZUB089530 AA217 KX064636 KX064557 KX064583
