Length–Dry Mass Relationships of Aquatic Insects: Geographic and Taxonomic Variation in a Digital Database
Funding: Funding was provided by the H2020 European Research and Innovation action grant agreement no. 869226 (DRYvER).
ABSTRACT
- Aquatic insects are an abundant, yet declining, taxonomically heterogeneous group with special importance in both aquatic and terrestrial ecosystems. Accurate estimations of insect biomass during their aquatic life stages are essential to advance our fundamental knowledge about insects, their roles in ecosystems, and their vulnerability to human impact. However, assessing insect biomass from samples using classical drying techniques is time-consuming and prohibits the use of samples for other analyses.
- A widely applied method is therefore to use length–dry mass power regressions to obtain dry mass (DM) from body lengths (BL) using literature-derived parameter values. However, the application of this method relies on reliable and accessible parameter values, preferably matching the studied specimens both taxonomically and geographically.
- Here, we aimed to increase parameter accessibility in the literature to (1) facilitate researchers in employing more appropriate length–mass regressions in their studies, (2) identify knowledge gaps that can direct future research towards unexplored regions and understudied taxonomic groups, and (3) visualise the relative contribution of geographic variation (differences among continents) and taxonomic variation (differences among families within each order) to regression lines.
- We compiled a parameter dataset based on 25 publications for eight insect orders with aquatic life stages: Coleoptera, Diptera, Ephemeroptera, Hemiptera, Megaloptera/Neuroptera, Odonata, Plecoptera, and Trichoptera, and made the dataset available digitally. This parameter dataset is derived from over 15,000 measured insects of 84 (sub)families and 233 genera from all continents, except Africa and Antarctica. We found parameter values to be widely available at the order level, but at the resolution of family and genus levels, values were missing for 66% and 94% of the taxa, respectively.
- Identified knowledge gaps were the need for (1) more data on variation among families, collected in a standardised manner within the same geographic regions, (2) targeted collections of data for different orders within the same study areas, to reveal variation among families and genera, and (3) careful reporting of the exact methodologies used, to identify variation introduced by methodological dissimilarities. Geographic and taxonomic variation is presented visually in figures for further interpretation.
- We conclude that length–mass regressions can be a powerful method, but due to data shortage at the genus and family levels, less reliable order-level regressions are necessarily applied. By providing parameters in a new digital dataset, we hope to help users assess parameter availability more efficiently for studied taxa in any geographic region. The identified knowledge gaps can be used to direct future research efforts. More accessible parameter data will facilitate more reliable assessments of aquatic insect biomass and benefit future studies on this important and abundant group of organisms, bridging aquatic and terrestrial ecosystems.
1 Introduction
Many insect orders have life stages as aquatic larvae and then leave the aquatic environment as flying adults, which makes them important components of both aquatic and terrestrial food webs (Recalde et al. 2020; van Klink et al. 2020). Insect biomass currently shows strong declines, with possible cascading effects on other trophic levels (Hallmann et al. 2017). To address this challenge, reliable biomass data on insects is required for progress in various research fields, including population demography, community assembly, trophic interactions within food webs, secondary production estimates, and overall ecosystem functioning (Benke et al. 1999; Sabo et al. 2002). However, direct measurement of insect dry body masses to assess biomass is time-consuming and requires drying of specimens, which may limit other future use, such as morphological or molecular analyses. Therefore, a widely used and efficient method in ecological studies is to derive invertebrate dry biomass from linear body dimensions (Smock 1980; Meyer 1989; Burgherr and Meyer 1997).
Predictive equations for relationships between linear body dimensions and dry mass of aquatic insects have been obtained using linear, exponential, quadratic, and power regressions (Smock 1980; Meyer 1989). The best and most widely used type of equation is the power function $\mathrm{DM} = a \times \mathrm{BL}^{b}$, in which DM is the dry mass of specimens weighed on a scale in mg, a is the scaling factor, b is the exponent determining the growth rate in the relationship, and BL is the body length of the specimen in mm, measured as the distance between the anterior of the head capsule and the posterior of the last abdominal segment (Smock 1980; Meyer 1989; Burgherr and Meyer 1997). The value of the exponent parameter b determines the slenderness of the insect bodies in relation to their lengths, with lower exponents indicating more slender insects (Schoener 1980). This power function is often linearised in its (natural or 10-based) logarithmic form: $\log(\mathrm{DM}) = \log(a) + b \times \log(\mathrm{BL})$. Although the power model is typically superior in predicting mass over linear, quadratic, or exponential regressions (Shahbaz-Gahroee et al. 2021), regressions with linear dimensions other than body length, such as head capsule width for Ephemeroptera and Plecoptera (Benke et al. 1999), the length of the last abdominal segment for Coleoptera (Meyer 1989), or the interocular distance for Trichoptera (Coelho et al. 2023), have also been used to estimate dry mass (Nakagawa and Takemon 2014). However, body length is typically preferred to other linear dimensions because, for most taxa, it explains more of the variance in body mass and is available for a wider range of taxa (Benke et al. 1999; Stoffels et al. 2003; Coelho et al. 2023).
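As an illustration of the power function and its linearised form described above, the following minimal Python sketch evaluates both. The function names and the parameter values (a = 0.01, b = 3.0) are hypothetical round numbers chosen for readability; they are not taken from any of the cited publications.

```python
import math


def dry_mass_mg(body_length_mm: float, a: float, b: float) -> float:
    """Dry mass (mg) from body length (mm) via the power function DM = a * BL**b."""
    if body_length_mm <= 0:
        raise ValueError("body length must be positive")
    return a * body_length_mm ** b


def log_dry_mass(body_length_mm: float, a: float, b: float, base: float = math.e) -> float:
    """Linearised form: log(DM) = log(a) + b * log(BL), for a natural or base-10 logarithm."""
    return math.log(a, base) + b * math.log(body_length_mm, base)


# Hypothetical parameters, for illustration only.
A_EXAMPLE, B_EXAMPLE = 0.01, 3.0

power_estimate = dry_mass_mg(10.0, A_EXAMPLE, B_EXAMPLE)
# Back-transforming the linearised prediction recovers the same estimate.
linear_estimate = math.exp(log_dry_mass(10.0, A_EXAMPLE, B_EXAMPLE))
```

Both routes give the same dry mass for a given set of parameters; the two forms differ only in how the parameters are fitted and reported.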
Parameter values of the power regressions can best be obtained for the specific population that is studied because they can vary by ecological context, among species, number of specimens and size range involved, preservation methods, sample processing procedures, and among geographic regions (Johnston and Cunjak 1999; Bried and Ervin 2007; Martin et al. 2014). However, usually it is unfeasible to obtain parameters for the precise population under study; hence, many studies rely on parameters obtained from literature sources. In the literature, parameters are available at different taxonomic resolutions (species, genus, family or order level) and for different geographic regions (Towers et al. 1994; Burgherr and Meyer 1997; Benke et al. 1999; Johnston and Cunjak 1999; Miyasaka et al. 2008). Parameter values with finer taxonomic resolutions (e.g., species or genus) tend to perform better than values at family or order levels (Smock 1980; Meyer 1989; Poepperl 1998; Bried and Ervin 2007), and choosing parameter values from the same geographic region should be preferred (Schoener 1980; Martin et al. 2014). Despite this, few studies are able to apply parameter values with an exact geographic and taxonomic match. Consequently, an overview that allows researchers to readily find and select parameter values of taxa that show the best phylogenetic and geographic match with the taxa of their study may increase cost-efficiency and accuracy of their work.
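The selection logic described above (prefer the finest taxonomic match, and prefer the same geographic region) could be sketched as follows. This is only one possible reading: the `Regression` record, its field names, the rule that taxonomic resolution takes priority over geography, and all parameter values are assumptions made for illustration, not part of the database or of any cited study.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Regression:
    """One hypothetical parameter record (field names are illustrative)."""
    order: str
    family: Optional[str]   # None for order-level regressions
    genus: Optional[str]    # None for family- or order-level regressions
    continent: str
    a: float
    b: float


def match_score(rec: Regression, order: str, family: Optional[str],
                genus: Optional[str], continent: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (taxonomic level, same-continent flag), or None if the record is unusable."""
    if rec.order != order:
        return None
    if genus is not None and rec.genus == genus:
        level = 3   # genus-level match
    elif family is not None and rec.family == family and rec.genus is None:
        level = 2   # family-level match
    elif rec.family is None and rec.genus is None:
        level = 1   # order-level fallback
    else:
        return None
    return (level, int(rec.continent == continent))


def best_match(records: Iterable[Regression], order: str, family: Optional[str] = None,
               genus: Optional[str] = None, continent: Optional[str] = None) -> Optional[Regression]:
    """Pick the finest taxonomic match; ties are broken by a matching continent."""
    best, best_score = None, None
    for rec in records:
        score = match_score(rec, order, family, genus, continent)
        if score is not None and (best_score is None or score > best_score):
            best, best_score = rec, score
    return best


# Made-up rows, not values from the database.
ROWS = [
    Regression("Ephemeroptera", None, None, "Europe", 0.007, 2.9),
    Regression("Ephemeroptera", "Baetidae", None, "Asia", 0.006, 2.8),
    Regression("Ephemeroptera", "Baetidae", None, "Europe", 0.008, 2.7),
]

chosen = best_match(ROWS, "Ephemeroptera", family="Baetidae", genus="Baetis", continent="Europe")
```

In this example no genus-level record exists, so the sketch falls back to the European family-level record. Whether geography should ever outrank taxonomy is a judgement call that the sketch does not settle.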
Here, we provide a taxonomy and geography-based digital database of published parameter values for power regressions of body length to body mass for eight insect orders with aquatic life stages: the Coleoptera, Diptera, Ephemeroptera, Hemiptera, Odonata, Plecoptera, Megaloptera/Neuroptera, and Trichoptera. Our aims were (1) to facilitate researchers in employing more appropriate length–mass regressions in their studies, (2) to identify knowledge gaps that can direct future research towards unexplored regions and understudied taxonomic groups, and (3) to visualise the relative contribution of geographic variation (differences among continents) and taxonomic variation (differences among families within each order) to regression parameter values. We specifically focus on insect taxa with aquatic life stages because the parameters of length–mass regressions can differ between aquatic and (semi)terrestrial life stages of insects (Martin et al. 2014). For insects with only terrestrial life stages, we refer to the terrestrial literature and references provided in the meta-analysis of Martin et al. (2014) (e.g., Rogers et al. 1977; Gowing and Recher 1984; Ganihar 1997; Wardhaugh 2013).
2 Methods
2.1 Data Collection and Construction of the Database
To obtain regression parameters for insects from different orders and geographic regions, we searched the Web of Science, Scopus and Google Scholar for relevant literature in July 2024. We used the following search phrases: “length–mass AND relationship AND insect”, “length AND dry AND mass AND regression”, “body size AND biomass AND aquatic insects”, and “linear AND body AND dimensions AND mass AND aquatic AND insect”. The obtained literature was scanned for relevance, and for relevant publications, we subsequently used cross-referencing to obtain more publications.
All literature was checked for reliability by selecting peer-reviewed publications only, followed by checking for the presence of data that satisfied the following three criteria: (1) regression parameters of body length to dry mass (i.e., parameters for regressions of body lengths to wet mass or ash free dry mass (AFDM) were not included; e.g., Rosati et al. 2012); (2) regression parameters were only included for body lengths, in which body length was defined as the linear length from the front edge of the head to the tip of the abdomen, without antennae, cerci, legs, etc. Other linear body dimensions were not included, as body length is the most frequently measured variable (Burgherr and Meyer 1997); (3) regressions other than the power regression (such as linear, quadratic or exponential regression models) were not included in the database, but for readers interested in these values, we refer to the list of original publications in Table 1. After applying these selection criteria, we collated parameter values from all 25 publications that contained this type of information directly, or in their supplementary information, in the case of the reviews by Benke et al. (1999) and Johnston and Cunjak (1999). Although for some regions and specific taxa there may be more information available—for instance, in non-English or in literature not digitally available—we believe that we captured the majority of the most used information currently available in the peer-reviewed international literature.
Source | # Orders | # Families | # Genera | Continent |
---|---|---|---|---|
Benke et al. (1999)a | 8 | 52 | 117 | N-America |
Smock (1980) | 8 | 33 | 39 | N-America |
Johnston and Cunjak (1999)a | 5 | 30 | 53 | Europe and N-America |
Meyer (1989) | 5 | 25 | 30 | Europe |
Chung (2008) | 4 | 20 | 22 | Asia |
Miserendino (2001) | 6 | 18 | 34 | S-America |
Towers et al. (1994) | 5 | 14 | 16 | Oceania |
Dekanová et al. (2022) | 4 | 12 | 12 | Oceania |
Baumgärtner and Rothhaupt (2003) | 4 | 11 | 15 | Europe |
Burgherr and Meyer (1997) | 3 | 10 | 13 | Europe |
Mährlein et al. (2016) | 2 | 8 | 11 | Europe |
Giustini et al. (2008) | 2 | 6 | 8 | Europe |
Sabo et al. (2002) | 3 | 6 | 0 | N-America |
Hajiesmaeili et al. (2019) | 3 | 5 | 0 | Asia |
Miyasaka et al. (2008) | 3 | 5 | 4 | Asia |
Stoffels et al. (2003) | 3 | 5 | 15 | Oceania |
Poepperl (1998) | 2 | 4 | 8 | Europe |
Shahbaz-Gahroee et al. (2021) | 2 | 3 | 0 | Asia |
Genkai-Kato and Miyasaka (2007) | 1 | 1 | 3 | Asia |
Nolte (1990) | 1 | 1 | 8 | Europe |
Coelho et al. (2023) | 1 | 1 | 1 | S-America |
Marchant et al. (2015) | 1 | 1 | 2 | Oceania |
Méthot et al. (2012) | 1 | 3 | 0 | N-America |
Balibrea et al. (2017) | 1 | 1 | 1 | Europe |
Dekanová et al. (2023) | 1 | 1 | 1 | Europe |
- a Note that these publications are previous reviews that include both primary and review data from other sources.
From these 25 studies, we extracted for each taxon the following key data: the continent on which data had been collected, the country or local region of study, and the parameters for the scaling factor and exponent of the length–mass regressions. Linearised regressions were converted to power regressions to create a standardised database across all taxa, but all original parameter values are also included (to avoid under- or over-estimations due to transformations, Mährlein et al. 2016). Data were obtained from eight insect orders with aquatic life stages (Coleoptera, Ephemeroptera, Diptera, Hemiptera, Megaloptera/Neuroptera (in figures further referred to as Megaloptera), Odonata, Plecoptera and Trichoptera). We separated the data based on geographic regions representing five continents (“Asia”, “Europe”, “North America” excluding lower Mexico, “Oceania” including Australia, and “South America” including Central America), which combines an anthropogenic and geographic viewpoint on the performed research effort in different parts of the world. A full overview of extracted variables with metadata is provided in Table 2.
Variable | Description |
---|---|
Order | Taxonomic information about the analysed sample |
Family | Taxonomic information about the analysed sample |
Genus | Taxonomic information about the analysed sample |
Species | Taxonomic information about the analysed sample |
Tax_resolution | The taxonomic resolution provided in the publication |
a | Standardised value of a to use in power regressions |
b | Standardised value of b to use in power regressions |
N | Number of specimens measured to determine the regression |
Minimum size | Minimum specimen size used in the regression in mm |
Maximum size | Maximum specimen size used in the regression in mm |
Continent | Continent where the samples were collected |
Region_detail | More detailed information about where the analysed samples were collected |
Source publication | Source publication from which the original data was extracted |
Relation | The type of logarithm used in the formula of the original publication |
Original_a | Original a value in the source publication's power regression |
Orginal_a_SE | Original a standard error in the source publication's power regression |
Orginal_ln_a | Original a value in the source publication's linearised logarithmic regression |
Orginal_ln_a_SE | Original a standard error in the source publication's linearised logarithmic regression |
Orginal_b | Original b value in the source publication's power regression |
Orginal_b_SE | Original b standard error value in the source publication's power regression |
Notes | Whether the original publications contain more detailed information, e.g., if multiple values were given per taxonomic level based on multiple samples or if cross references were used |
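The conversion of linearised regressions to power regressions mentioned above amounts to back-transforming the intercept, while the exponent b is unchanged by the choice of logarithm. A minimal sketch follows, assuming that the `Relation` column distinguishes natural from base-10 logarithms; the labels `"ln"` and `"log10"` and the function name are hypothetical, and the sketch applies a plain back-transformation without any bias-correction factor.

```python
import math


def scaling_factor_from_intercept(intercept: float, relation: str) -> float:
    """Back-transform a linearised intercept log(a) into the scaling factor a.

    `relation` names the logarithm used in the source publication; the labels
    "ln" and "log10" are assumed here and may differ from those in the database.
    """
    if relation == "ln":
        return math.exp(intercept)
    if relation == "log10":
        return 10.0 ** intercept
    raise ValueError(f"unknown logarithm type: {relation!r}")


# Two hypothetical intercepts that describe the same scaling factor a = 0.01.
a_from_ln = scaling_factor_from_intercept(math.log(0.01), "ln")
a_from_log10 = scaling_factor_from_intercept(-2.0, "log10")
```

Keeping the original values alongside the standardised ones, as the database does, lets users repeat or adjust this step themselves.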
Regression parameters were typically derived from several specimens measured to the nearest 0.1 mm. The database includes 523 regressions, each based on between n = 3 and n = 1423 specimens (median n = 31; mean n = 63 ± 124 SD; only 34 of the regressions were based on fewer than 10 specimens). To enable readers to assess the robustness of the regressions, the number of measured specimens is provided in the database. All reported relationships were significant at the alpha = 0.05 level, and R² values were judged as representative by the original authors. After their lengths were measured, specimens were typically dried at temperatures between 40°C and 100°C for at least 24 h and then cooled in a desiccator. Subsequently, they were weighed individually with accuracies varying between 0.1 and 10 μg (Burgherr and Meyer 1997; Benke et al. 1999). We did not assess the effects of preservation methods on the regression parameters, as this was not our primary focus, but it is important to emphasise that chemical preservation methods can affect insect lengths (Benke et al. 1999; Mährlein et al. 2016). Such preservation effects may differ among taxa, types of chemicals used, duration of storage, and the size of the specimens. For more in-depth reading, we refer to Johnston and Cunjak (1999).
To assess the current geographic and taxonomic coverage of the body length–mass regression parameter dataset and identify knowledge gaps, we compiled a comprehensive list that includes the currently known families for each relevant order and, within families, the number of known genera per continent. Since taxonomy, as well as the number of species, genera, and even families, is continuously changing, this list should be considered the best possible estimate reflecting current taxonomic richness. We were unable to accurately determine the number of genera for families in the order Diptera, as most families regarded as “aquatic” also encompass partly aquatic and fully terrestrial species, and this variation can also occur within genera. Genus-level analyses for Diptera are therefore excluded. The compilation of the list of taxa was based on taxonomic expertise supported by up-to-date catalogues, other literature sources, and reliable online databases. The list is available as Table S1 or on Figshare (Van Leeuwen et al. 2025).
2.2 Data Analyses
Data were not statistically analysed because data availability for most families and/or geographic regions was too low to perform meaningful analyses. Instead, we visualised relative data availability and relations in figures and provided the database itself directly for readers to assess and further study possible variation for their specific taxa of interest. The parameter values for Haliplidae in Shahbaz-Gahroee et al. (2021) were not included in the database because the values were identified as strong outliers.
3 Results
The literature search yielded 25 publications with information about eight insect orders and 84 (sub)families, published between 1980 and 2023 (Table 1). Key publications and the largest datasets were Smock (1980), Meyer (1989), Johnston and Cunjak (1999) and Benke et al. (1999), which provided parameter data for 33, 25, 30, and 52, respectively, of the total 84 insect (sub)families in the eight covered orders. Regression parameters for these (sub)families (including 233 genera, or 196 not including Diptera) of aquatic insects were made digitally available with metadata as described in Table 2. The complete database is available as Table S2 or in comma-delimited file format (.csv) on Figshare (Van Leeuwen et al. 2025). Summarised means are available in Table 3.
Taxon (order, family or subfamily) | a | b | Range (mm) | N (equations) |
---|---|---|---|---|
Coleoptera | 0.0315 | 2.603 | 0.9–17.2 | 20 |
Chrysomelidae | 0.0392 | 3.111 | 0.9–4.1 | 1 |
Dytiscidae | 0.0618 | 2.501 | 3.1–6.5 | 2 |
Elmidae | 0.0178 | 2.550 | 1.2–7.0 | 11 |
Gyrinidae | 0.0531 | 2.588 | 11.0–17.2 | 2 |
Haliplidae | 0.0271 | 2.742 | 4.4–6.0 | 2 |
Hydrophilidae | 0.0024 | 2.200 | 3.0–8.0 | 1 |
Hydraenidae | 0.0191 | 2.530 | 2.2–2.7 | 1 |
Diptera | 0.0042 | 2.659 | 0.5–53.5 | 102 |
Athericidae | 0.0062 | 2.529 | 3.0–17.1 | 3 |
Blephariceridae | 0.0067 | 2.850 | 2.0–15.5 | 1 |
Ceratopogonidae | 0.0042 | 2.455 | 4.0–9.1 | 6 |
Chaoboridae | 0.0005 | 2.430 | | 1 |
Chironomidae | 0.0029 | 2.688 | 0.5–19.0 | 38 |
Chironomidae (non-Tanypodinae) | 0.0028 | 2.518 | 2.0–6.4 | 1 |
Chironominae (Chironomidae) | 0.0043 | 2.216 | 2.0–14.0 | 3 |
Empididae | 0.0047 | 2.729 | 1.8–6.2 | 1 |
Limoniidae | 0.0056 | 2.485 | 2.4–24.0 | 4 |
Muscidae | 0.0003 | 3.550 | 6.5–13.5 | 1 |
Orthocladiinae (Chironomidae) | 0.0016 | 2.705 | 2.3–9.9 | 4 |
Pediciidae | 0.0013 | 3.404 | 3.5–32.3 | 3 |
Simuliidae | 0.0075 | 2.809 | 0.7–7.0 | 21 |
Tabanidae | 0.0037 | 2.808 | 1.9–38.1 | 5 |
Tanypodinae (Chironomidae) | 0.0050 | 2.193 | 3.0–10.5 | 5 |
Tipulidae | 0.0094 | 2.889 | 3.2–53.5 | 5 |
Ephemeroptera | 0.0072 | 2.945 | 0.5–61.0 | 135 |
Ameletidae | 0.0040 | 2.886 | 2.6–18.5 | 3 |
Ameletopsidae | 0.0155 | 2.430 | | 1 |
Baetidae | 0.0074 | 2.820 | 1.0–15.0 | 31 |
Baetiscidae | 0.0116 | 2.905 | 1.9–9.8 | 1 |
Caenidae | 0.0053 | 2.748 | 1.3–6.4 | 5 |
Coloburiscidae | 0.0261 | 2.470 | | 1 |
Ephemerellidae | 0.0085 | 2.749 | 0.7–61.0 | 18 |
Ephemeridae | 0.0037 | 2.906 | 3.3–25.5 | 4 |
Heptageniidae | 0.0096 | 2.889 | 0.8–16.0 | 29 |
Isonychiidae | 0.0031 | 3.040 | 2.2–13.2 | 3 |
Leptophlebiidae | 0.0057 | 3.008 | 1.0–12.8 | 19 |
Nesameletidae | 0.0010 | 3.527 | | 1 |
Polymitarcyidae | 0.0020 | 3.048 | 1.0–14.0 | 6 |
Siphlonuridae | 0.0008 | 3.650 | 3.0–17.0 | 2 |
Tricorythidae | 0.0038 | 3.344 | 0.5–13.0 | 11 |
Hemiptera | 0.0161 | 2.678 | 2.8–17.5 | 8 |
Corixidae | 0.0307 | 2.608 | 3.4–6.8 | 2 |
Gerridae | 0.0150 | 2.598 | 9.0–17.5 | 2 |
Veliidae | 0.0130 | 2.714 | 2.8–5.5 | 4 |
Megaloptera | 0.0034 | 2.865 | 2.4–85.0 | 11 |
Corydalidae | 0.0033 | 2.914 | 2.4–85.0 | 7 |
Sialidae | 0.0038 | 2.804 | 3.4–21.3 | 4 |
Odonata | 0.0163 | 2.806 | 1.3–38.1 | 41 |
Aeshnidae | 0.0213 | 2.876 | 3.0–38.0 | 4 |
Calopterygidae | 0.0050 | 2.742 | 2.0–16.1 | 1 |
Coenagrionidae | 0.0251 | 2.635 | 1.3–22.0 | 10 |
Cordulegastridae | 0.0067 | 2.782 | 3.2–38.1 | 1 |
Corduliidae | 0.0303 | 2.920 | 1.7–29.3 | 7 |
Gomphidae | 0.0063 | 2.952 | 2.3–37.1 | 8 |
Lestidae | 0.0075 | 2.970 | 4.9–21.0 | 1 |
Libellulidae | 0.0478 | 2.702 | 2.1–19.8 | 9 |
Plecoptera | 0.0313 | 2.619 | 0.5–37.3 | 95 |
Austroperlidae | 0.0070 | 2.565 | 4.5–25.0 | 2 |
Capniidae | 0.0116 | 2.292 | 1.0–7.0 | 3 |
Chloroperlidae | 0.0049 | 2.601 | 2.2–12.2 | 4 |
Eustheniidae | 0.0011 | 3.278 | | 1 |
Geometridae | 0.0060 | 2.852 | 4.1–23.2 | 1 |
Gripopterygidae | 0.0476 | 2.359 | 2.0–17.0 | 9 |
Leuctridae | 0.0019 | 2.903 | 1.8–10.7 | 8 |
Nemouridae | 0.0070 | 2.764 | 0.9–9.2 | 12 |
Peltoperlidae | 0.0216 | 2.372 | 0.5–11.0 | 4 |
Perlidae | 0.0122 | 2.813 | 0.5–37.3 | 30 |
Perlodidae | 0.0175 | 2.751 | 1.0–22.2 | 12 |
Pteronarcyidae | 0.1959 | 2.335 | 8.6–34.9 | 2 |
Taeniopterygidae | 0.0059 | 2.448 | 1.5–8.7 | 7 |
Trichoptera | 0.0104 | 2.721 | 0.5–50.0 | 85 |
Brachycentridae | 0.0080 | 2.891 | 0.7–13.0 | 3 |
Conoesucidae | 0.0045 | 2.922 | | 2 |
Ecnomidae | 0.0034 | 2.418 | 2.3–10.8 | 1 |
Glossosomatidae | 0.0182 | 2.483 | 1.6–6.9 | 4 |
Goeridae | 0.0133 | 3.410 | 1.8–11.3 | 2 |
Hydrobiosidae | 0.0070 | 2.393 | 2.0–13.0 | 4 |
Hydropsychidae | 0.0122 | 2.885 | 0.5–25.0 | 16 |
Hydroptilidae | 0.0081 | 2.907 | 1.3–4.3 | 5 |
Lepidostomatidae | 0.0067 | 2.723 | 1.0–9.1 | 2 |
Leptoceridae | 0.0062 | 2.776 | 1.2–13.9 | 8 |
Limnephilidae | 0.0060 | 2.867 | 1.0–25.5 | 10 |
Molannidae | 0.0025 | 2.490 | | 1 |
Odontoceridae | 0.0140 | 2.766 | 1.2–13.6 | 3 |
Philopotamidae | 0.0049 | 2.597 | 1.6–11.5 | 4 |
Phryganeidae | 0.0038 | 2.700 | 3.8–28.2 | 2 |
Polycentropodidae | 0.0053 | 2.737 | 1.8–14.2 | 5 |
Psychomyiidae | 0.0028 | 2.787 | 2.2–14.6 | 3 |
Rhyacophilidae | 0.0047 | 2.997 | 2.4–24.0 | 6 |
Sericostomatidae | 0.0203 | 2.367 | 1.5–15.0 | 3 |
Stenopsychidae | 0.0560 | 2.290 | 2.0–50.0 | 1 |
- Note: Includes parameter values of a and b for power regressions as means weighted by the number of equations reported, the length range of specimens measured in mm, and N (the number of equations used in calculating the parameters).
Data availability worldwide was scarce, with regression parameters available for 34% of the insect families (84 of the 246 described families in the eight covered orders) and for only 6% of the genera (196 of the estimated worldwide total of 3151 insect genera in the studied orders, not including Diptera). Availability differed strongly among continents. No regression parameters were found for Africa and Antarctica, and only 12% of the South American families were covered, which increased to 16% for Asia, 20% for Oceania, 36% for Europe and 40% for North America (Figure 1). At the genus level, parameter values were available for 2% of the genera in Asia, 2% in South America, 4% in Oceania, 9% in Europe and 9% in North America. Outside Europe and North America, Odonata, Hemiptera and Megaloptera/Neuroptera are particularly understudied (Figure 1). The parameter values that were available showed considerable variation among geographic regions and among families within orders (visualised in Figures 2 and 3).
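The worldwide coverage percentages above follow directly from the reported counts. A short Python check of that arithmetic is given below; the function name is hypothetical, and only the counts stated in the text are used.

```python
def coverage_percent(covered: int, total: int) -> int:
    """Share of taxa with at least one regression, rounded to a whole percent."""
    return round(100 * covered / total)


# Counts as reported in the text.
family_coverage = coverage_percent(84, 246)    # families in the eight covered orders
genus_coverage = coverage_percent(196, 3151)   # genera, not including Diptera

family_missing = 100 - family_coverage
genus_missing = 100 - genus_coverage
```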



4 Discussion
Length–mass relationships are a powerful tool to derive mass estimates of invertebrates, but obtaining suitable model parameters from literature sources can be time-consuming. We provide a digital database that contains regression parameters for eight insect orders, 84 families, and 233 insect genera across five continents. The database provides the length ranges of the measured specimens so that users can avoid estimating the dry mass of individuals beyond these ranges, because such extrapolation can lead to serious errors when mass increases are size-dependent (Johnston and Cunjak 1999). The database revealed considerable variation in data availability among geographic regions and taxonomic groups.
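One way to respect the reported length ranges when applying a regression is to refuse to extrapolate. The sketch below is a hypothetical helper, not part of the database. For illustration it uses the Baetidae row of Table 3 (a = 0.0074, b = 2.820, range 1.0–15.0 mm); note that Table 3 reports means across several equations, so a real application would more likely use a single regression with its own range.

```python
def dry_mass_checked(body_length_mm: float, a: float, b: float,
                     min_mm: float, max_mm: float) -> float:
    """Apply DM = a * BL**b only inside the size range the regression was based on."""
    if not (min_mm <= body_length_mm <= max_mm):
        raise ValueError(
            f"body length {body_length_mm} mm is outside the supported "
            f"range {min_mm}-{max_mm} mm; refusing to extrapolate"
        )
    return a * body_length_mm ** b


# Baetidae summary row from Table 3, used here purely as an example.
BAETIDAE = {"a": 0.0074, "b": 2.820, "min_mm": 1.0, "max_mm": 15.0}

within_range = dry_mass_checked(10.0, **BAETIDAE)   # accepted: 10 mm lies within 1.0-15.0 mm
```

Raising an error is a deliberately strict choice; returning a flagged estimate instead would be an equally reasonable design.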
4.1 Variation in Data Availability
Parameter data were completely lacking for Africa and Antarctica. Of the remaining continents, Oceania and South America had the fewest available parameter values at the family level. The poor representation of genus-level parameter data across our entire dataset (and the even poorer representation at the species level) explains the frequent use of less precise order- or family-level parameters in the literature. Their lower precision may reflect extensive morphological variation among families within orders, or among genera within families. Order-level parameter values can provide relevant body mass estimates (Smock 1980; Meyer 1989; Poepperl 1998; Bried and Ervin 2007), but it is important to note that regression parameters can add considerable variation to data. Still, any inaccuracy of regression parameters should be seen in the light of other sources of variation, such as inaccurate density estimates that may arise from variation in sampling effort (Méthot et al. 2012).
Information on genus-level parameter availability can help guide the focus of future studies. For example, for the aquatic insects described for Midwest North America (Bouchard et al. 2004), data were available for all 7 Odonata, all 8 Plecoptera and both Megaloptera families, 15 of the 17 Trichoptera, and 11 of 13 Ephemeroptera families. However, data at the family level were only available for 3 of the 13 Hemiptera families, 5 of 8 Coleoptera, and 8 of 19 Diptera families. In contrast, for insects in Northern Europe (Nilsson 1997), no European data were available for the 11 Hemiptera families. Only 2 of the 16 Coleoptera, 1 of the 3 Megaloptera/Neuroptera, and 9 of the 24 Diptera families were represented. Data availability was much better for the Plecoptera (6 of 7), Odonata (5 of 9), Trichoptera (13 of 19) and Ephemeroptera (6 of 9). This type of information can be used a priori during study designs or the selection of target species.
4.2 Variation in Regression Parameter Values
In the dataset, we detected considerable variation in parameter values, which likely reflects (1) actual variation among taxa and regions (e.g., Benke et al. 1999; Hajiesmaeili et al. 2019), (2) possible variation due to differences in the handling of specimens, such as variation in preservation methods (Johnston and Cunjak 1999), and (3) variation in research focus between particular well-studied taxa and regions, versus those that are understudied. Regarding the latter, our study highlights a highly uneven representation of taxa and geographical regions in the available dataset, which prohibits a reliable statistical analysis of parameter variation across taxa and regions at a global scale, beyond what has been done (e.g., Smock 1980; Schoener 1980; Meyer 1989; Poepperl 1998; Bried and Ervin 2007; Martin et al. 2014). However, visual inspection of the relationships suggests some overarching patterns that may help steer future research priorities.
For instance, based on the currently available data, Coleoptera seems to show more variation in the scaling parameter a than in the exponent parameter b. This implies that the use of a less precise exponent parameter is less critical than the use of a less precise scaling parameter: borrowing an exponent from the order level (instead of the genus or family level) or from another geographic region may be more appropriate than borrowing a scaling parameter in the same way. However, given the great variety of body shapes in the aquatic life stages of the large order of Coleoptera, using the order-level or even family-level equation may be less appropriate than in other orders. It is important for this order that studies clearly report the life stage of the specimens to avoid confusion, since both the larvae and adults are aquatic and exhibit highly different body shapes and sizes. We therefore report the size ranges of the captured specimens for all data, as some parameters may not represent a species throughout its entire life stage, and regressions should only be considered valid for the size ranges on which they are based. Furthermore, variation within larger orders such as Trichoptera (50 families, 605 genera), Odonata (46 families, 670 genera) and Ephemeroptera (44 families, 484 genera) may introduce more variance in order-level estimates than in smaller orders such as Plecoptera (17 families, 310 genera) or Megaloptera/Neuroptera (5 families, 50 genera), although current data availability prohibits testing this assumption. The fact that only very few studies on Hemiptera are available outside North America may be useful to know a priori for researchers planning work on the biomass of Hemiptera on other continents (i.e., they should realise that accurate regressions will be difficult to obtain for their study area).
Geographic variation in parameter values seemed particularly strong in Coleoptera, Diptera, Ephemeroptera, and Plecoptera. However, these are also the orders with the most data available, providing more power to detect geographic variation. This geographic variation is also expected to be related to temperature, following the temperature–size rule (Atkinson 1995), which states that in ectotherms, higher temperatures can lead to smaller adult or metamorphic body sizes, and may affect growth and consequently morphology. Variation in length–mass relationships within the same taxonomic group can thus also be driven by environmental variation within regions, as a consequence of, for example, altitude or latitude. All in all, our data emphasise that there is a need for (1) more data on variation among families, collected in a standardised manner within the same geographic regions, (2) targeted collections of different orders in the same study areas, to reveal variation among genera, and (3) careful reporting of the exact methodologies used, to allow detection and mitigation of variation introduced by methodological differences (e.g., preservation methods, sample sizes, drying and cooling processes). Consortia of researchers from multiple biogeographic regions would be valuable to assess geographic variation at local, regional, and continental scales within the same genera and families.
Concluding, we here provide a digital database with currently available regression parameters that can be applied to existing body length data and may be of use during the design of new studies. The database can be used to identify knowledge gaps and pave the way for additional data collection. It remains important to keep updating the database, as new variation can also arise due to changes in habitat conditions, global change, or species introductions. This is especially true given the low proportion of existing taxonomic richness that has thus far been covered at fine taxonomic resolutions.
Author Contributions
Casper H.A. van Leeuwen and Steven A.J. Declerck: conceptualisation. Casper H.A. van Leeuwen and Zoltán Csabai: developing methods. Casper H.A. van Leeuwen, Zoltán Csabai, and Steven A.J. Declerck: data analysis, data interpretation. Casper H.A. van Leeuwen, Anita Szloboda, and Arnold Móra: preparation of figures and tables. Casper H.A. van Leeuwen, Zoltán Csabai, Anita Szloboda, Arnold Móra, and Steven A.J. Declerck: conducting the research, writing.
Acknowledgements
We thank Gabriel Singer, Thibault Datry, and Nuria Bonada for enabling the work within the EU project DRYvER. We thank Bea Bartalovics (UP), Edurne Estévez, Dorottya Hárságyi (UP), Patrik Kis (UP), Zsolt Kovács (UP) and Zsuzsanna Pap (UP) for practical assistance and one anonymous reviewer for constructive comments.
Conflicts of Interest
The authors declare no conflicts of interest.
Open Research
Data Availability Statement
All data used in this manuscript are available through Figshare (Van Leeuwen et al. 2025).