EUNIS Habitat Classification: Expert system, characteristic species combinations and distribution maps of European habitats
Funding information
The previous versions of the expert system and related reports were produced within a contract from the European Environment Agency to Wageningen Environmental Research and Masaryk University. The opinions expressed are those of the contractor and do not represent the Agency's official position. EVA data management and preparation of this paper were supported by the Czech Science Foundation (project no. 19-28491X to MC, LT, IK, TP, CM, JDa, MH, PN, DZ, GB, AJ, AKu, ZL and DV). IB and JAC were supported by the Basque Government (project no. T936-16). TB, ET, and LK were supported by the Ministry of Science and Higher Education of the Russian Federation (TB and ET project no. AAAA-A18-118052590019-7; LK project no. AAAA-A19-119012490096-2).
Abstract
Aim
The EUNIS Habitat Classification is a widely used reference framework for European habitat types (habitats), but it lacks formal definitions of individual habitats that would enable their unequivocal identification. Our goal was to develop a tool for assigning vegetation-plot records to the habitats of the EUNIS system, use it to classify a European vegetation-plot database, and compile statistically derived characteristic species combinations and distribution maps for these habitats.
Location
Europe.
Methods
We developed the classification expert system EUNIS-ESy, which contains definitions of individual EUNIS habitats based on their species composition and geographic location. Each habitat was formally defined as a formula in a computer language combining algebraic and set-theoretic concepts with formal logical operators. We applied this expert system to classify 1,261,373 vegetation plots from the European Vegetation Archive (EVA) and other databases. Then we determined diagnostic, constant and dominant species for each habitat by calculating species-to-habitat fidelity and constancy (occurrence frequency) in the classified data set. Finally, we mapped the plot locations for each habitat.
Results
Formal definitions were developed for 199 habitats at Level 3 of the EUNIS hierarchy, including 25 coastal, 18 wetland, 55 grassland, 43 shrubland, 46 forest and 12 man-made habitats. The expert system classified 1,125,121 vegetation plots to these habitat groups and 73,188 to other habitats, while 63,064 plots remained unclassified or were classified to more than one habitat. Data on each habitat were summarized in factsheets containing habitat description, distribution map, corresponding syntaxa and characteristic species combination.
Conclusions
EUNIS habitats were characterized for the first time in terms of their species composition and distribution, based on a classification of a European database of vegetation plots using the newly developed electronic expert system EUNIS-ESy. The data provided and the expert system have considerable potential for future use in European nature conservation planning, monitoring and assessment.
1 INTRODUCTION
Comprehensive systems of classification of natural, semi-natural and man-made habitat types (hereafter also “habitats”) are essential tools for nature conservation. They are important for designing networks of protected areas, conducting inventories of natural areas, monitoring, management planning, environmental impact assessment and setting targets for ecological restoration. The EUNIS (European Nature Information System) Habitat Classification, developed by the European Topic Centre for Biodiversity for the European Environment Agency (EEA) in the 1990s and early 2000s (Davies and Moss, 1998; Davies et al., 2004; Moss, 2008), is the main comprehensive pan-European hierarchical classification of habitats covering both the marine and terrestrial realms (Evans, 2012; Rodwell et al., 2018). It is extensively used in research and for various applications, including the implementation of European Community directives related to environmental protection (Vilà et al., 2007; Chytrý et al., 2008; De Graaf et al., 2009; Strasser and Lang, 2015; Adamo et al., 2016; Gigante et al., 2018; Hämmerle et al., 2018). It has also become one of the key elements for the European Directive 2007/2/EC on Infrastructure for Spatial Information in the European Union (INSPIRE, 2013) and the updated version of Resolution 4 of the Bern Convention on the Conservation of European Wildlife and Natural Habitats, which is the legislative basis for the Emerald network — a complement of the Natura 2000 network in the European countries that are not members of the European Union (Council of Europe, 2018).
Terrestrial habitats in EUNIS are often based on phytosociological vegetation types, such as those defined in EuroVegChecklist (Mucina et al., 2016; Rodwell et al., 2018). However, while phytosociological classification is mainly based on species composition and vegetation structure (De Cáceres et al., 2015), the EUNIS Habitat Classification also emphasizes the abiotic environment and geographic location as classification criteria. It also includes habitats in which plants are nearly or entirely absent. Still, most of the terrestrial habitats of EUNIS can be successfully defined using methods of vegetation science.
In recent years, the EEA recognized the EUNIS Habitat Classification as a key tool for assessing progress towards the European Union biodiversity targets and global Aichi targets. EUNIS became a European reference to which national and regional classifications and various data sets could be linked in the framework of the INSPIRE Directive. As such, EUNIS enables structured dialogue between different networks of experts, including those describing habitats through in-situ vegetation sampling, those working with satellite imagery, and those developing and evaluating various policies.
To improve these uses of the EUNIS Habitat Classification, the EEA initiated a process of its revision at Level 3 (for the terrestrial realm) and 4 (for the marine realm) of the classification hierarchy. This revision established more consistency, removed ambiguity and overlaps in definitions of types, and extended the typology to the entire European continent and adjacent seas, although still with some gaps, especially in eastern Europe (Russia and some adjacent countries). The proposals for revision of grassland, shrubland and forest habitat classification were summarized in a series of reports (Schaminée et al., 2012, 2013, 2014, 2016a), and a preliminary version of the revised EUNIS Habitat Classification was used in the project European Red List of Habitats (Janssen et al., 2016). The revisions included additions of new units, splitting or merging of existing units and changes in habitat names and definitions. The revised EUNIS classification then underwent public consultations with international experts and country representatives of Eionet, a partnership network of the European Environment Agency (https://www.eionet.europa.eu/). The public consultations resulted in further changes in the delimitation of individual habitats and their names. Based on the consultation proposals, a refinement of the classification for grassland, shrubland and forest habitats was made by Schaminée et al. (2018), for coastal and wetland habitats by Schaminée et al. (2019) and for vegetated man-made habitats by Schaminée et al. (2020). The work on the remaining sections is under way.
The recent compilation of the European Vegetation Archive (EVA; Chytrý et al., 2016), a continent-wide integrated electronic database of vegetation-plot records, and the development of computer expert systems for classifying very large data sets of this kind (Bruelheide, 1997, 2000; Kočí et al., 2003; Chytrý, 2007–2013; Landucci et al., 2015; Mucina et al., 2016; Tichý et al., 2019) have opened up an avenue towards characterizing European habitats based on in-situ data. Classification expert systems assign individual vegetation plots to units of already established classification systems. This type of classification can also be called identification. It is particularly relevant for the EUNIS Habitat Classification because once a large number of vegetation plots from different parts of Europe are consistently assigned to habitats, the species composition, distribution and environmental relationships of these habitats can be characterized precisely. This is of great importance for practitioners because the EUNIS habitats have so far been characterized only by brief and often rather unclear textual descriptions and lists of units taken without revision from previous classifications such as CORINE Biotopes or the Palaearctic Habitat Classification (Rodwell et al., 2018). Such superficial, and in some cases inconsistent, characterization obscured the meaning of the EUNIS habitats. As a result, the current interpretation of the same habitat type can vary among European countries.
Our aims here are to: (a) develop a classification expert system for automatic assignment of vegetation-plot records to coastal, wetland, grassland, shrubland, forest and man-made habitats of the revised EUNIS Habitat Classification at Level 3 of the classification hierarchy; (b) base this system on algebraic and set-theoretic concepts combined using formal logic; (c) assign all available European vegetation plots to EUNIS habitats; (d) define the characteristic species combination for each habitat based on a statistical analysis of the plots assigned to this habitat by the expert system; and (e) provide distribution maps of individual habitats based on the location of vegetation plots assigned to these habitats.
2 METHODS
2.1 Revised EUNIS habitat classification
- N — Coastal habitats
- Q — Wetlands
- R — Grasslands and lands dominated by forbs, mosses or lichens (called “Grasslands” in this paper)
- S — Heathlands, scrub and tundra (called “Shrublands” in this paper)
- T — Forests and other wooded land (called “Forests” in this paper)
- V — Vegetated man-made habitats (called “Man-made habitats” in this paper)
A list of the individual habitats belonging to these six groups is provided in Table 1, and habitat factsheets with their descriptions and corresponding phytosociological alliances of EuroVegChecklist (Mucina et al., 2016) are in Appendix S1. We prepared the lists of corresponding alliances based on expert judgement by comparing the basic characteristics of the EUNIS habitats with the EuroVegChecklist alliances. These lists can help scientists and practitioners who are familiar with European phytosociological classification to understand the content of individual EUNIS habitats. However, although the EUNIS habitat classification was, to a large extent, inspired by the phytosociological classification system, it developed independently of it, which implies that the phytosociological alliances are often not nested within habitat types. The “one-to-many” or “many-to-one” relationships of the EUNIS habitats to the EuroVegChecklist alliances are much more common than simple one-to-one matches.
EUNIS2020 code | EUNIS2007 code | Red List code | EUNIS2020 habitat name | No. of plots |
---|---|---|---|---|
N | B | B | Coastal habitats | 32,399 (+0) |
N1 | B1 | B1 | Coastal dunes and sandy shores | 28,923 |
N11 | B1.1; B1.2 | B1.1a | Atlantic, Baltic and Arctic sand beach | 558 |
N12 | B1.1; B1.2 | B1.1b | Mediterranean and Black Sea sand beach | 1,417 |
N13 | B1.31; B1.311; B1.321 | B1.3a | Atlantic and Baltic shifting coastal dune | 4,479 |
N14 | B1.3 | B1.3b | Mediterranean, Macaronesian and Black Sea shifting coastal dune | 5,542 |
N15 | B1.4 | B1.4a | Atlantic and Baltic coastal dune grassland (grey dune) | 3,102 |
N16 | B1.4 | B1.4b | Mediterranean and Macaronesian coastal dune grassland (grey dune) | 4,797 |
N17 | B1.4 | B1.4c | Black Sea coastal dune grassland (grey dune) | 967 |
N18 | B1.5; B1.51 | B1.5a | Atlantic and Baltic coastal Empetrum heath | 185 |
N19 | B1.5 | B1.5b | Atlantic coastal Calluna and Ulex heath | 147 |
N1A | B1.6 | B1.6a | Atlantic and Baltic coastal dune scrub | 1,650 |
N1B | B1.6 | B1.6b | Mediterranean and Black Sea coastal dune scrub | 368 |
N1C | B1.6 | B1.6c | Macaronesian coastal dune scrub | 71 |
N1D | B1.7; B1.72 | B1.7a | Atlantic and Baltic broad-leaved coastal dune forest | 657 |
N1E | B1.7 | B1.7b | Black Sea broad-leaved coastal dune forest | 16 |
N1F | B1.7; B1.71 | B1.7c | Baltic coniferous coastal dune forest | 482 |
N1G | B1.7; B1.74 | B1.7d | Mediterranean coniferous coastal dune forest | 184 |
N1H | B1.8 | B1.8a | Atlantic and Baltic moist and wet dune slack | 4,010 |
N1J | B1.8 | B1.8b | Mediterranean and Black Sea moist and wet dune slack | 291 |
N2 | B2 | B2 | Coastal shingle | 617 |
N21 | B2.1; B2.2; B2.3; B2.4 | B2.1a | Atlantic, Baltic and Arctic coastal shingle beach | 540 |
N22 | B2.1; B2.2; B2.3; B2.4 | B2.1b | Mediterranean and Black Sea coastal shingle beach | 77 |
N23 | B2.5 | – | *Shingle and gravel beach with scrub | – |
N24 | B2.6 | – | *Shingle and gravel beach forest | – |
N3 | B3 | B3 | Rock cliffs, ledges and shores, including the supralittoral | 2,859 |
N31 | B3.2; B3.3 | B3.1a | Atlantic and Baltic rocky sea cliff and shore | 1,287 |
N32 | B3.2; B3.3 | B3.1b | Mediterranean and Black Sea rocky sea cliff and shore | 1,343 |
N33 | B3.2; B3.3 | B3.1c | Macaronesian rocky sea cliff and shore | 61 |
N34 | B3.4 | B3.4a | Atlantic and Baltic soft sea cliff | 71 |
N35 | B3.4 | B3.4b | Mediterranean and Black Sea soft sea cliff | 97 |
Q | D | D | Wetlands | 84,066 (+28,864) |
Q1 | D1 | D1 | Raised and blanket bogs | 5,496 |
Q11 | D1.1 | D1.1 | Raised bog | 4,488 |
Q12 | D1.2 | D1.2 | Blanket bog | 1,008 |
Q2 | D2 | D2 | Valley mires, poor fens and transition mires | 20,579 |
Q21 | D2.1 | D2.1 | Oceanic valley mire | 1,786 |
Q22 | D2.2; D2.3 | D2.2a | Poor fen | 5,766 |
Q23 | D2.2 | D2.2b | Relict mire of Mediterranean mountains | 170 |
Q24 | D2.2 | D2.2c | Intermediate fen and soft-water spring mire | 6,885 |
Q25 | D2.3 | D2.3a | Non-calcareous quaking mire | 5,972 |
Q3 | D3 | D3 | Palsa and polygon mires | 298 |
Q31 | D3.1 | D3.1 | *Palsa mire | – |
Q32 | D3.3 | D3.3 | *Polygon mire | – |
Q4 | D4 | D4 | Base-rich fens and calcareous spring mires | 12,285 |
Q41 | D4.1 | D4.1a | Alkaline, calcareous, carbonate-rich small-sedge spring fen | 4,811 |
Q42 | D4.1 | D4.1a | Extremely rich moss–sedge fen | 2,840 |
Q43 | D4.1 | D4.1b | Tall-sedge base-rich fen | 1,768 |
Q44 | D4.1 | D4.1c | Calcareous quaking mire | 1,727 |
Q45 | D4.2 | D4.2 | Arctic–alpine rich fen | 1,102 |
Q46 | D6.14 | D.6.14 | Carpathian travertine fen with halophytes | 37 |
Q5 | Q5 | Q5 | Helophyte beds | 45,408 |
Q51 | C3.2; D5.1 | C5.1a | Tall-helophyte bed | 24,003 |
Q52 | C3.1; C3.4 | C5.1b | Small-helophyte bed | 13,000 |
Q53 | C3.2; D5.2 | C5.2 | Tall-sedge bed | 7,632 |
Q54 | D6.2 | C5.4 | Inland saline or brackish helophyte bed | 773 |
R | E | E | Grasslands and lands dominated by forbs, mosses or lichens | 291,558 (+187,471) |
R1 | E1 | E1 | Dry grasslands | 94,045 |
R11 | E1.1 | E1.1a | Pannonian and Pontic sandy steppe | 706 |
R12 | E1.1 | E1.1b | Cryptogam- and annual-dominated vegetation on siliceous rock outcrops | 589 |
R13 | E1.1 | E1.1d | Cryptogam- and annual-dominated vegetation on calcareous and ultramafic rock outcrops | 2,577 |
R14 | E1.1 | E1.1e | Perennial rocky grassland of the Italian Peninsula | 690 |
R15 | E1.1 | E1.1f | Continental dry rocky steppic grassland and dwarf scrub on chalk outcrops | 543 |
R16 | E1.1 | E1.1g | Perennial rocky grassland of central and southeastern Europe | 6,326 |
R17 | E1.1 | E1.1h | Heavy-metal dry grassland of the Balkans | 75 |
R18 | E1.1 | E1.1i | Perennial rocky calcareous grassland of subatlantic–submediterranean Europe | 3,485 |
R19 | E1.1 | E1.1j | Dry steppic submediterranean pasture of the Amphi-Adriatic region | 374 |
R1A | E1.2 | E1.2a | Semi-dry perennial calcareous grassland (meadow steppe) | 43,885 |
R1B | E1.2 | E1.2b | Continental dry grassland (true steppe) | 15,775 |
R1C | E1.2 | E1.2c | Desert steppe | 959 |
R1D | E1.3 | E1.3a | Mediterranean closely grazed dry grassland | 680 |
R1E | E1.3 | E1.3b | Mediterranean tall perennial dry grassland | 2,347 |
R1F | E1.3 | E1.3c | Mediterranean annual-rich dry grassland | 1,142 |
R1G | E1.5 | E1.5a | Iberian oromediterranean siliceous dry grassland | 564 |
R1H | E1.5 | E1.5b | Iberian oromediterranean basiphilous dry grassland | 1,193 |
R1J | E1.5 | E1.5c | Cyrno-Sardean oromediterranean siliceous dry grassland | 80 |
R1K | E1.5 | E1.5d | Balkan and Anatolian oromediterranean dry grassland | 137 |
R1L | E1.5 | E1.5e | Madeiran oromediterranean siliceous dry grassland | 14 |
R1M | E1.7 | E1.7 | Lowland to montane, dry to mesic grassland usually dominated by Nardus stricta | 2,109 |
R1N | E1.8 | E1.8 | Open Iberian supramediterranean dry acid and neutral grassland | 148 |
R1P | E1.9 | E1.9a | Oceanic to subcontinental inland sand grassland on dry acid and neutral soils | 5,254 |
R1Q | E1.9 | E1.9b | Inland sanddrift and dune with siliceous grassland | 2,043 |
R1R | E1.A | E1.A | Mediterranean to Atlantic open, dry, acid and neutral grassland | 2,209 |
R1S | E1.B | E1.B | Heavy-metal grassland in western and central Europe | 134 |
R1T | E1.E | E1.F | Azorean open, dry, acid to neutral grassland | 7 |
R2 | E2 | E2 | Mesic grasslands | 93,085 |
R21 | E2.1 | E2.1 | Mesic permanent pasture of lowlands and mountains | 24,859 |
R22 | E2.2 | E2.2 | Low and medium altitude hay meadow | 63,036 |
R23 | E2.3 | E2.3 | Mountain hay meadow | 5,018 |
R24 | E2.4 | E2.4 | Iberian summer pasture (vallicar) | 172 |
R3 | E3 | E3 | Seasonally wet and wet grasslands | 52,393 |
R31 | E3.1 | E3.1a | Mediterranean tall humid inland grassland | 1,443 |
R32 | E3.2 | E3.2a | Mediterranean short moist grassland of lowlands | 133 |
R33 | E3.2 | E3.2b | Mediterranean short moist grassland of mountains | 785 |
R34 | E3.3 | E3.3 | Submediterranean moist meadow | 800 |
R35 | E3.4 | E3.4a | Moist or wet mesotrophic to eutrophic hay meadow | 26,209 |
R36 | E3.4 | E3.4b | Moist or wet mesotrophic to eutrophic pasture | 13,664 |
R37 | E3.5 | E3.5 | Temperate and boreal moist or wet oligotrophic grassland | 9,359 |
R4 | E4 | E4 | Alpine and subalpine grasslands | 21,962 |
R41 | E4.1 | E4.1 | Snow-bed vegetation | 1,562 |
R42 | E4.3 | E4.3a | Boreal and Arctic acidophilous alpine grassland | 247 |
R43 | E4.3 | E4.3b | Temperate acidophilous alpine grassland | 13,668 |
R44 | E4.4 | E4.4a | Arctic–alpine calcareous grassland | 5,193 |
R45 | E4.4 | E4.4b | Alpine and subalpine calcareous grassland of the Balkans and Apennines | 1,292 |
R5 | E5 | E5 | Woodland fringes and clearings and tall forb stands | 25,242 |
R51 | E5.2 | E5.2a | Thermophilous forest fringe of base-rich soils | 1,230 |
R52 | E5.2 | E5.2b | Forest fringe of acidic nutrient-poor soils | 571 |
R53 | E5.2 | E5.2c | Macaronesian thermophilous forest fringe | 46 |
R54 | E5.3 | E5.3 | Pteridium aquilinum vegetation | 1,235 |
R55 | E5.4 | E5.4 | Lowland moist or wet tall-herb and fern fringe | 16,903 |
R56 | E5.5 | E5.5 | Montane to subalpine moist or wet tall-herb and fern fringe | 2,936 |
R57 | E5.6 | E5.6 | Herbaceous forest clearing vegetation | 2,321 |
R6 | E6 | E6 | Inland salt steppes | 4,831 |
R61 | E6.1 | E6.1 | Mediterranean inland salt steppe | 410 |
R62 | E6.2 | E6.2 | Continental inland salt steppe | 2,286 |
R63 | D6.1 | E6.3 | Temperate inland salt marsh | 1,753 |
R64 | E6.4 | E6.4 | Semi-desert salt pan | 298 |
R65 | E6.5 | E6.5 | Continental subsaline alluvial pasture and meadow | 84 |
R7 | E7 | E7 | Sparsely wooded grasslands | – |
R71 | E7.1 | E7.1 | *Temperate wooded pasture and meadow | – |
R72 | E7.2 | E7.2 | *Hemiboreal and boreal wooded pasture and meadow | – |
R73 | E7.3 | E7.3 | *Mediterranean wooded pasture and meadow | – |
S | F | F | Heathlands, scrub and tundra | 49,089 (+25,462) |
S1 | F1 | F1 | Tundra | 1,065 |
S11 | F1.1 | F1.1 | Shrub tundra | 1,027 |
S12 | F1.2 | F1.2 | Moss and lichen tundra | 38 |
S2 | F2 | F2 | Arctic, alpine and subalpine scrub | 12,812 |
S21 | F2.1 | F2.1 | Subarctic and alpine dwarf Salix scrub | 1,832 |
S22 | F2.2 | F2.2a | Alpine and subalpine ericoid heath | 6,466 |
S23 | F2.2 | F2.2b | Alpine and subalpine Juniperus scrub | 1,412 |
S24 | F2.2 | F2.2c | Subalpine genistoid scrub of the Amphi-Adriatic region | 116 |
S25 | F2.3 | F2.3 | Subalpine and subarctic deciduous scrub | 1,164 |
S26 | F2.4 | F2.4 | Subalpine Pinus mugo scrub | 1,822 |
S27 | – | – | *Krummholz with conifers other than Pinus mugo | – |
S3 | F3 | F3 | Temperate and Mediterranean montane scrub | 11,885 |
S31 | F3.1 | F3.1a | Lowland to montane temperate and submediterranean Juniperus scrub | 792 |
S32 | F3.1 | F3.1b | Temperate Rubus scrub | 2,203 |
S33 | F3.1 | F3.1c | Lowland to montane temperate and submediterranean genistoid scrub | 1,670 |
S34 | F3.1 | F3.1d | Balkan-Anatolian submontane genistoid scrub | 5 |
S35 | F3.2 | F3.1e | Temperate and submediterranean thorn scrub | 4,371 |
S36 | F3.2 | F3.1f | Low steppic scrub | 673 |
S37 | F3.2 | F3.1g | Corylus avellana scrub | 1,286 |
S38 | F3.2 | F3.1h | Temperate forest clearing scrub | 885 |
S4 | F4 | F4 | Temperate heathland | 7,568 |
S41 | F4.1 | F4.1 | Wet heath | 1,973 |
S42 | F4.2 | F4.2 | Dry heath | 5,452 |
S43 | F4.3 | F4.3 | Macaronesian heath | 143 |
S5 | F5 | F5 | Maquis, arborescent matorral and thermo-Mediterranean scrub | 4,303 |
S51 | F5.1 | F5.1 | Mediterranean maquis and arborescent matorral | 3,149 |
S52 | F5.3 | F5.3 | Submediterranean pseudomaquis | 656 |
S53 | F5.4 | F5.4 | Spartium junceum scrub | 190 |
S54 | F5.5 | F5.5 | Thermomediterranean arid scrub | 308 |
S6 | F6 | F6 | Garrigue | 2,474 |
S61 | F6.1 | F6.1a | Western basiphilous garrigue | 1,169 |
S62 | F6.1 | F6.1b | Western acidophilous garrigue | 143 |
S63 | F6.2 | F6.2 | Eastern garrigue | 495 |
S64 | F6.6 | F6.5 | Macaronesian garrigue | 87 |
S65 | F6.7 | F6.7 | Mediterranean gypsum scrub | 93 |
S66 | F6.8 | F6.8a | Mediterranean halo-nitrophilous scrub | 214 |
S67 | F6.8 | F6.8b | Aralo-Caspian semi-desert | 225 |
S68 | F6.8 | F6.8c | Semi-desert sand dune with sparse scrub | 48 |
S7 | F7 | F7 | Spiny Mediterranean heaths | 1,743 |
S71 | F7.1 | F7.1 | Western Mediterranean spiny heath | 114 |
S72 | F7.3 | F7.3 | Eastern Mediterranean spiny heath (phrygana) | 322 |
S73 | F7.4 | F7.4a | Western Mediterranean mountain hedgehog-heath | 134 |
S74 | F7.4 | F7.4b | Central Mediterranean mountain hedgehog-heath | 458 |
S75 | F7.4 | F7.4c | Eastern Mediterranean mountain hedgehog-heath | 518 |
S76 | F7.4 | F7.4d | Canarian mountain hedgehog-heath | 197 |
S8 | F8 | F8 | Thermo-Atlantic xerophytic scrub | 413 |
S81 | F8.1 | F8.1 | Canarian xerophytic scrub | 391 |
S82 | F8.2 | F8.2 | Madeiran xerophytic scrub | 22 |
S9 | F9 | F9 | Riverine and fen scrub | 6,826 |
S91 | F9.1 | F9.1 | Temperate riparian scrub | 2,272 |
S92 | F9.2 | F9.2 | Salix fen scrub | 3,880 |
S93 | F9.3 | F9.3 | Mediterranean riparian scrub | 626 |
S94 | F9.4 | F9.4 | Semi-desert riparian scrub | 48 |
T | G | G | Forests and other wooded land | 246,926 (+91,345) |
T1 | G1 | G1 | Broadleaved deciduous forests | 154,277 |
T11 | G1.1 | G1.1 | Temperate Salix and Populus riparian forest | 3,171 |
T12 | G1.2 | G1.2a | Alnus glutinosa–Alnus incana forest on riparian and mineral soils | 9,731 |
T13 | G1.2 | G1.2b | Temperate hardwood riparian forest | 9,478 |
T14 | G1.3 | G1.3 | Mediterranean and Macaronesian riparian forest | 981 |
T15 | G1.4 | G1.4 | Broadleaved swamp forest on non-acid peat | 2,148 |
T16 | G1.5 | G1.5 | Broadleaved mire forest on acid peat | 3,665 |
T17 | G1.6 | G1.6a | Fagus forest on non-acid soils | 37,719 |
T18 | G1.6 | G1.6b | Fagus forest on acid soils | 9,721 |
T19 | G1.7 | G1.7a | Temperate and submediterranean thermophilous deciduous forest | 20,452 |
T1A | G1.7 | G1.7b | Mediterranean thermophilous deciduous forest | 658 |
T1B | G1.8 | G1.8 | Acidophilous Quercus forest | 13,212 |
T1C | G1.9 | G1.9a | Temperate and boreal mountain Betula and Populus tremula forest on mineral soils | 609 |
T1D | G1.9 | G1.9b | Southern European mountain Betula and Populus tremula forest on mineral soils | 375 |
T1E | G1.A | G1.Aa | Carpinus and Quercus mesic deciduous forest | 30,529 |
T1F | G1.A | G1.Ab | Ravine forest | 8,005 |
T1G | G1.B | G1.Ba | Alnus cordata forest | 100 |
T1H | G1.C | – | Broadleaved deciduous plantation of non-site-native trees | 3,723 |
T1J | – | – | *Deciduous self-sown forest of non-site-native trees | – |
T1K | G1.C | – | *Broadleaved deciduous plantation of site-native trees | – |
T2 | G2 | G2 | Broadleaved evergreen forests | 12,109 |
T21 | G2.1 | G2.1 | Mediterranean evergreen Quercus forest | 10,222 |
T22 | G2.2 | G2.2 | Mainland laurophyllous forest | 229 |
T23 | G2.3 | G2.3 | Macaronesian laurophyllous forest | 94 |
T24 | G2.4 | G2.4 | Olea europaea-Ceratonia siliqua forest | 1,220 |
T25 | G2.5 | G2.5a | Phoenix theophrasti vegetation | 27 |
T26 | G2.5 | G2.5b | Phoenix canariensis vegetation | 3 |
T27 | G2.6 | G2.6 | Ilex aquifolium forest | 223 |
T28 | G2.7 | G2.7 | Macaronesian heathy forest | 56 |
T29 | G2.8 | – | Broadleaved evergreen plantation of non-site-native trees | 35 |
T2A | G2.8 | – | *Broadleaved evergreen plantation of site-native trees | – |
T3 | G3 | G3 | Coniferous forests | 80,540 |
T31 | G3.1 | G3.1a | Temperate mountain Picea forest | 11,531 |
T32 | G3.1 | G3.1b | Temperate mountain Abies forest | 10,039 |
T33 | G3.1 | G3.1c | Mediterranean mountain Abies forest | 422 |
T34 | G3.2 | G3.2 | Temperate subalpine Larix, Pinus cembra and Pinus uncinata forest | 3,136 |
T35 | G3.4; G3.5 | G3.4a | Temperate continental Pinus sylvestris forest | 8,402 |
T36 | G3.4; G3.5 | G3.4b | Temperate and submediterranean montane Pinus sylvestris–Pinus nigra forest | 2,171 |
T37 | G3.4; G3.5 | G3.4c | Mediterranean montane Pinus sylvestris–Pinus nigra forest | 700 |
T38 | G3.9 | G3.4d | Mediterranean montane Cedrus forest | 424 |
T39 | G3.6 | G3.6 | Mediterranean and Balkan subalpine Pinus heldreichii–Pinus peuce forest | 339 |
T3A | G3.7 | G3.7 | Mediterranean lowland to submontane Pinus forest | 7,179 |
T3B | G3.8 | G3.8 | Pinus canariensis forest | 659 |
T3C | G3.9 | G3.9a | Taxus baccata forest | 261 |
T3D | G3.9 | G3.9b | Mediterranean Cupressaceae forest | 654 |
T3E | G3.9 | G3.9c | Macaronesian Juniperus forest | 25 |
T3F | G3.A | G3.A | Dark taiga | 8,422 |
T3G | G3.B | G3.B | Pinus sylvestris light taiga | 8,538 |
T3H | G3.C | G3.C | Larix light taiga | 86 |
T3J | G3.D; G3.E | G3.Da | Pinus and Larix mire forest | 5,233 |
T3K | G3.D; G3.E | G3.Db | Picea mire forest | 2,388 |
T3L | – | – | *Coniferous self-sown forest of non-site-native trees | – |
T3M | G3.F1 | – | Coniferous plantation of non-site-native trees | 9,931 |
T3N | G3.F2 | – | *Coniferous plantation of site-native trees | – |
T4 | G5 | – | Lines of trees, small anthropogenic forests, recently felled forest, early-stage forest and coppice | – |
T41 | G5.6 | – | *Early-stage natural and semi-natural forest and regrowth | – |
T42 | G5.7 | – | *Coppice and early-stage plantation | – |
T43 | G5.8 | – | *Recently felled areas | – |
V | I | – | Vegetated man-made habitats | 79,139 (+8,802) |
V1 | I1 | – | Arable land and market gardens | 35,615 |
V11 | I1.1 | – | Intensive unmixed crops | 7,268 |
V12 | I1.2 | – | Mixed crops of market gardens and horticulture | 431 |
V13 | I1.3 | – | Arable land with unmixed crops grown by low-intensity agricultural methods | 3,439 |
V14 | I1.4 | – | Inundated or inundatable cropland, including rice fields | 116 |
V15 | I1.5 | – | Bare tilled, fallow or recently abandoned arable land | 24,361 |
V2 | I2 | – | *Cultivated areas of gardens and parks | – |
V21 | I2.1 | – | *Large-scale ornamental garden areas | – |
V22 | I2.2 | – | *Small-scale ornamental and domestic garden areas | – |
V23 | I2.3 | – | *Recently abandoned garden areas | – |
V3 | I3 | – | Artificial grasslands and herb-dominated habitats | 43,524 |
V31 | E2.6 | – | *Agriculturally improved, re-seeded and heavily fertilised grassland, including sports fields and grass lawns | – |
V32 | E1.6 | – | Mediterranean subnitrophilous annual grasslands | 7,664 |
V33 | E1.C | – | Dry Mediterranean lands with unpalatable non-vernal herbaceous vegetation | 419 |
V34 | E1.E | – | Trampled xeric grassland with annuals | 1,922 |
V35 | E2.8 | – | Trampled mesophilous grassland with annuals | 4,668 |
V36 | E4.5 | – | *Alpine and subalpine enriched grassland | – |
V37 | E5.1 | – | Annual anthropogenic herbaceous vegetation | 11,489 |
V38 | E5.1 | – | Dry perennial anthropogenic herbaceous vegetation | 13,819 |
V39 | E5.1 | – | Mesic perennial anthropogenic herbaceous vegetation | 3,543 |
V4 | FA | – | *Hedgerows | – |
V41 | FA.1 | – | *Hedgerows of non-native species | – |
V42 | FA.2 | – | *Highly-managed hedgerows of native species | – |
V43 | FA.3 | – | *Species-rich hedgerows of native species | – |
V44 | FA.4 | – | *Species-poor hedgerows of native species | – |
V5 | FB | – | *Shrub plantations | – |
V51 | FB.1 | – | *Shrub plantations for whole-plant harvesting | – |
V52 | FB.2 | – | *Shrub plantations for leaf or branch harvest | – |
V53 | FB.3 | – | *Shrub plantations for ornamental purposes or for fruit, other than vineyards | – |
V54 | FB.4 | – | *Vineyards | – |
V6 | G5 | – | *Tree dominated man-made habitats | – |
V61 | G1.D | – | *Broadleaved fruit and nut tree orchards | – |
V62 | G2.9 | – | *Evergreen orchards and groves | – |
V63 | G5.1 | – | *Lines of planted trees | – |
V64 | G5.2 | – | *Small deciduous broadleaved planted other wooded land | – |
V65 | G5.3 | – | *Small evergreen broadleaved planted other wooded land | – |
V66 | G5.4 | – | *Small coniferous planted other wooded land | – |
Within the six revised habitat groups, we selected only those habitats that could be defined based on floristic criteria (Table 1, Appendix S1). We did not develop formal definitions for those habitat types that represent mosaics of several different habitats (e.g., wooded pastures) because their complete structure cannot be represented by single vegetation plots. We also did not consider those forest habitats that are defined based on the management practice or successional stage but do not differ floristically from related types with different management or in different successional stages (e.g., T41 Early-stage natural and semi-natural forest and regrowth or T42 Coppice and early-stage plantation). We were able to define plantations of non-site-native trees (those planted at sites where they would not occur naturally) but unable to define plantations of site-native trees because their floristic composition is usually indiscernible from that of natural forests.
2.2 Data sources
The primary source for producing characteristic species combinations and maps for EUNIS habitats was a data set of European vegetation-plot records (henceforth “vegetation plots” or “plots”). Such plots typically contain a full list of vascular plant species, often also a list of bryophytes and lichens, estimates of cover abundance of each species and various additional sources of information on vegetation structure, location and environmental features in the plot (Dengler et al., 2011). These plots were extracted from the EVA database (Chytrý et al., 2016; accessed on 19 May 2020), and several other databases not included in EVA (see the full list of databases used in this study in Appendix S2). The geographical scope was the whole of Europe (including the European part of Russia), the Azores, Madeira, the Canary Islands, Anatolia, Cyprus, Georgia, Armenia and Azerbaijan. The data set contained plots representing both the target habitat groups (coastal, wetlands, grasslands, shrublands, forests and man-made) and non-target habitat groups (marine, inland surface water and inland sparsely vegetated habitats). The latter groups are not the focus of this study because the revision of their classification is not yet complete or published; still, plots sampled in these types were needed to ensure correct classification of the habitats that are transitional between the target and non-target habitat groups. We excluded the plots that reported only species composition without cover-abundance information for individual species. Further, we excluded plots smaller than 1 m², plots larger than 1,000 m², plots without geographical coordinates and plots with a reported coordinate uncertainty larger than 10 km. The plots with a missing indication of location uncertainty were retained, assuming most of them were within 10 km of the indicated coordinates. The resulting data set contained a total of 1,261,373 georeferenced plots.
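The plot selection criteria listed above can be expressed as a simple filter. The following Python sketch is only an illustration under stated assumptions: the `Plot` record, its field names and the `is_retained` function are hypothetical and are not part of Turboveg, Juice or EVA, and the sketch assumes that plot area is always known because the text does not say how plots with missing area were handled.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the text
MIN_AREA_M2 = 1.0
MAX_AREA_M2 = 1000.0
MAX_UNCERTAINTY_KM = 10.0


@dataclass
class Plot:
    """Hypothetical minimal plot record (field names are assumptions)."""
    area_m2: float
    latitude: Optional[float]          # None = no coordinates
    longitude: Optional[float]
    uncertainty_km: Optional[float]    # None = uncertainty not reported
    has_cover: bool                    # cover-abundance recorded per species


def is_retained(plot: Plot) -> bool:
    """Return True if the plot passes all selection criteria from the text."""
    if not plot.has_cover:
        return False
    if not (MIN_AREA_M2 <= plot.area_m2 <= MAX_AREA_M2):
        return False
    if plot.latitude is None or plot.longitude is None:
        return False
    # Plots with unreported uncertainty are kept, as stated in the text
    if plot.uncertainty_km is not None and plot.uncertainty_km > MAX_UNCERTAINTY_KM:
        return False
    return True
```

Note that the bounds are inclusive here: a plot of exactly 1 m² or 1,000 m², or with an uncertainty of exactly 10 km, is retained, which matches the wording "smaller than", "larger than" and "larger than 10 km".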
The data set was prepared using the Turboveg 3 program (Hennekens, 2015) and analysed using the Juice 7.1 program (Tichý, 2002). It is the most extensive data set of vegetation plots ever analysed (compare Bruelheide et al., 2019).
The taxon names in this data set originated from various national and thematic international databases, most of them managed in Turboveg 2 (Hennekens and Schaminée, 2001), which use different taxon lists with partly inconsistent taxon concepts and names. Taxonomy and nomenclature were unified using Turboveg 3 in two steps. Firstly, the names from the original databases were interpreted by regional botanists, considering the taxonomic concepts and nomenclature used in the focal region of each database. This step was important because it solved regional differences in the use and meaning of some taxon names. In this step, taxon lists of most of the European vegetation-plot databases were matched to accepted names in the SynBioSys Taxon Database, an unpublished working database of taxon names and concepts used in the EVA project (Chytrý et al., 2016). Secondly, the names of vascular plants from the SynBioSys Taxon Database and the names from the original databases that did not match any name in the SynBioSys Taxon Database were translated to the nomenclature of the Euro+Med PlantBase (Euro+Med, 2006–2020; ww2.bgbm.org/EuroPlusMed), using a complete list of accepted names and synonyms of European and Mediterranean vascular plant taxa provided by the Berlin-Dahlem Botanical Garden and Botanical Museum in February 2020. The names of bryophytes and lichens followed the SynBioSys Taxon Database.
The cover of individual species was, in most vegetation plots, recorded using a cover-abundance scale (70% of plots were recorded using a variant of the Braun-Blanquet scale; Westhoff and van der Maarel, 1973). We transformed all of these scales to the arithmetic mid-point percent cover values corresponding to the individual cover-abundance classes following the default conversion of the Turboveg 2 program (Hennekens and Schaminée, 2001). For the seven-grade Braun-Blanquet scale the transformation was r + 1 2 3 4 5 → 1% 2% 3% 13% 38% 63% 88%, for the nine-grade Braun-Blanquet scale r + 1 2m 2a 2b 3 4 5 → 1% 2% 3% 4% 8% 18% 38% 63% 88%, and for the Domin scale + 1 2 3 4 5 6 7 8 9 10 → 1% 2% 3% 4% 13% 23% 29% 42% 63% 88% 99%. The occurrences of the same species in different layers of the same plot were merged, and their percentage covers recorded in different layers were combined using an algorithm that assumes random overlap of covers of different species, resulting in cover values that cannot exceed 100% (Chytrý et al., 2005; Jennings et al., 2009; Fischer, 2015; further referred to as the “Jennings‒Fischer formula”). As a result, each species was present only once in the data set. This step was necessary because information on layers was not recorded in many plots.
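The two steps described above (scale transformation and merging of layers) can be illustrated with a short Python sketch. The conversion tables reproduce the values given in the text. The combination function is written under the assumption that random overlap is expressed as 100 × (1 − Π(1 − c_i/100)), which is the usual form of such a formula; the function and variable names are hypothetical and are not taken from Turboveg or Juice.

```python
from math import prod

# Mid-point percentage covers as listed in the text
BRAUN_BLANQUET_7 = {"r": 1, "+": 2, "1": 3, "2": 13, "3": 38, "4": 63, "5": 88}
BRAUN_BLANQUET_9 = {"r": 1, "+": 2, "1": 3, "2m": 4, "2a": 8, "2b": 18,
                    "3": 38, "4": 63, "5": 88}


def combine_covers(covers_pct):
    """Combine percentage covers assuming random overlap (assumed formula).

    The result never exceeds 100 and equals 0 for an empty input.
    """
    return 100.0 * (1.0 - prod(1.0 - c / 100.0 for c in covers_pct))


def merge_layers(records, scale):
    """Merge (species, layer, cover_code) records into one cover per species."""
    per_species = {}
    for species, _layer, code in records:
        per_species.setdefault(species, []).append(scale[code])
    return {sp: combine_covers(cs) for sp, cs in per_species.items()}


records = [
    ("Fagus sylvatica", "tree", "4"),    # 63%
    ("Fagus sylvatica", "shrub", "2"),   # 13%
    ("Oxalis acetosella", "herb", "+"),  # 2%
]
merged = merge_layers(records, BRAUN_BLANQUET_7)
```

In this example, the covers of Fagus sylvatica in two layers (63% and 13%) are combined to approximately 67.8% rather than summed to 76%.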
2.3 EUNIS-ESy: An expert system for identifying EUNIS habitats in vegetation-plot databases
The new classification expert system EUNIS-ESy (= EUNIS Expert System) was developed for identifying coastal, wetland, grassland, shrubland, forest and vegetated man-made habitats of the EUNIS Habitat Classification based on species composition and cover abundances of particular species or species groups. In the habitats that are difficult to distinguish based on purely floristic criteria, plot-location criteria were added.
EUNIS-ESy is based on formal definitions of habitats written as logical formulas in an editable script stored as a TXT file (Appendix S3). The computer program that runs the expert system evaluates all the plots of a vegetation database and checks for each of them whether it meets the conditions of one or more of the formal definitions of habitats included in this script. If a plot matches a definition of one habitat, it is assigned to this habitat. In an ideal case, habitats should be mutually exclusive, and each plot should be assigned to one and only one habitat. However, in reality, some plant communities have a transitional composition corresponding to two or even more habitats. The plots representing such communities are simultaneously assigned to all of these habitats. Other plant communities with idiosyncratic or impoverished species composition may not meet the conditions of any definition and remain unclassified by the expert system. Nevertheless, the expert system was designed so that a large majority of plots are assigned unequivocally to a single habitat.
The expert system script of EUNIS-ESy (Appendix S3) is divided into three sections, which represent successive steps in the analysis. Section 1 merges selected taxon names. In most cases, it merges subspecies, varieties or forms to the species level. We did so in order to improve the consistency of taxonomic concepts across the data set because some authors recorded only species while others also recorded infraspecific taxa. Further, we merged taxonomically difficult groups of species containing many misidentifications into species aggregates.
Section 2 enumerates species belonging to individual species groups that characterize particular habitats or groups of habitats and are used in the formal definitions of habitats. A single species can be assigned to more than one group. The initial lists of species included in the groups were compiled from relevant phytosociological literature, national habitat handbooks, personal field experience, data from our previous EUNIS reports (Schaminée et al., 2013, 2014, 2016a, 2016b), factsheets of the European Red List of Habitats (https://forum.eionet.europa.eu/european-red-list-habitats/library/terrestrial-habitats; see Janssen et al., 2016) and other sources. These initial lists were critically revised and extensively modified based on multiple classification trials with successive versions of EUNIS-ESy (Section 2.4).
Section 3 comprises definitions of habitats written as logical formulas that combine taxonomic specifiers, relational operators and threshold abundance criteria, which can be joined as required by the logical operators AND, OR or NOT. A technical description of the syntax of the expert system is provided by Tichý et al. (2019: their Appendix S1). The definitions use two types of assignment rules:
- Species-based assignment rules (internal classification criteria according to De Cáceres et al., 2015): plant species composition or cover abundances of plant species in vegetation plots, often applied to functional (e.g., dwarf shrubs, trees), biogeographical (e.g., western Mediterranean species) or ecological (e.g., calcareous fen) species groups.
- Location-based assignment rules (external classification criteria according to De Cáceres et al., 2015): information about plot location, such as geographical coordinates, which can be used for assigning plots to biogeographical regions (e.g., boreal vs temperate), landscape types (e.g., coastal vs inland) or altitudinal belts.
2.3.1 Species-based assignment rules
Species-based (or in general, taxon-based) assignment rules typically consist of three components: (a) a taxon specifier; (b) a criterion; and (c) a relational operator. The taxon specifier can be a single species or a pre-defined species group. The criteria are based on either occurrence or cover. The occurrence criterion is the presence or absence of a species or a species group in a plot. The cover criterion is the percentage cover of specific species or the total cover of a species group occurring in a plot. The criteria are combined using the relational operators GR (greater than) or GE (greater than or equal to).
When applied to a single species, an assignment rule with the occurrence criterion is “Species name GR 00,” meaning that the species is present (its percentage cover is greater than zero). A non-zero percentage value defines a cover criterion; for example, “Fagus sylvatica GR 50” denotes that the cover of Fagus sylvatica in the plot should be greater than a preselected threshold cover of 50%. Alternatively, the cover of a species can be compared with the covers of the other species occurring in the plot. For example, “Erica tetralix GR #$$” means that the cover of Erica tetralix should be greater than the cover of any other species in the plot, and “Erica tetralix GR $50” means that the cover of Erica tetralix should be greater than 50% of the total percentage cover of all species in the plot (see Tichý et al., 2019, their Appendix S1, for syntax details).
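The four single-species rules described above can be paraphrased in Python as follows. This is a sketch rather than the implementation used in Juice: a plot is represented as a hypothetical dictionary mapping species names to percentage covers, the function names are invented for illustration, and the total cover of all species is assumed to be computed with the random-overlap formula (the text does not specify this for the `$nn` notation).

```python
from math import prod


def total_cover(covers_pct):
    """Total cover under random overlap (assumption; see the lead-in)."""
    return 100.0 * (1.0 - prod(1.0 - c / 100.0 for c in covers_pct))


def is_present(plot, species):
    """'Species GR 00': cover greater than zero."""
    return plot.get(species, 0.0) > 0.0


def cover_greater_than(plot, species, threshold):
    """'Species GR nn': cover greater than a fixed percentage threshold."""
    return plot.get(species, 0.0) > threshold


def dominates(plot, species):
    """'Species GR #$$': cover greater than that of any other species."""
    others = [c for s, c in plot.items() if s != species]
    return plot.get(species, 0.0) > max(others, default=0.0)


def exceeds_share(plot, species, share_pct):
    """'Species GR $nn': cover greater than nn% of the total cover."""
    return plot.get(species, 0.0) > share_pct / 100.0 * total_cover(plot.values())


# Hypothetical wet-heath plot
heath = {"Erica tetralix": 63.0, "Molinia caerulea": 13.0, "Sphagnum compactum": 3.0}
```

For the hypothetical plot `heath`, all of “Erica tetralix GR 00”, “Erica tetralix GR 50”, “Erica tetralix GR #$$” and “Erica tetralix GR $50” would be met.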
When applied to species groups, the occurrence criterion assesses whether the number of species of the target group in the plot exceeds a pre-selected threshold, or whether the number of species of one group is greater than the number of species of another group (or other groups). The cover criterion for species groups assesses whether the total cover of the species belonging to the group is greater than a preselected threshold, or whether it is greater than the total cover of species belonging to another group (or other groups). The total cover of a species group is computed by combining percentage covers of individual species of the group following the Jennings‒Fischer formula, which returns values that do not exceed 100%. Alternatively, in discriminating species groups (see below), the total cover can be computed as a simple sum of percentage covers of individual species, or a sum of square-rooted percentage covers of individual species.
EUNIS-ESy contains two basic types of species groups called “functional species groups” and “discriminating species groups.” All the groups of both types are defined in Section 2 of the expert system script by listing species belonging to them. In that section, functional groups are indicated by the symbols ### and discriminating species groups by ##D. The groups of both types are used to define assignment rules in Section 3 of the expert-system script (Appendix S3). Most relational operators can be applied to both functional and discriminating groups, but one set of operators can only be applied to the discriminating groups.
Functional species groups
The concept of the functional species groups follows Landucci et al. (2015). These groups comprise species with similar traits (e.g., life form, morphology or phenology), but also species with similar distribution ranges, affinity to the same habitat, or species characterized by a combination of these properties. Functional species groups can be used in the assignment rules in the following ways:
- The plot should contain at least n species of the group (“#nn Group-name” in the script, where nn is a two-digit number of required species; for example, “#03 Dwarf-shrubs” means that at least three species of the functional group Dwarf shrubs should be present in the plot).
- The plot should contain more species from one group than from another group (“### Group-name1 GR ### Group-name2” in the script; for example, “### Wet-grassland-herbs GR ### Mesic-grassland-herbs” means that the plot should contain more herb species of wet grassland than of mesic grassland).
- The total cover of a functional species group in the plot should be greater than a threshold (“#TC Group-name GR nn” in the script, where #TC means the total cover of the species of the group calculated using the Jennings–Fischer formula and GR nn means greater than a percentage threshold; for example, “#TC Dwarf-shrubs GR 50” means that the total cover of dwarf shrubs in the plot should be greater than 50%).
- The total cover of a functional species group should be greater than that of another functional group (“#TC Group-name1 GR #TC Group-name2” in the script; for example, “#TC Wet-grassland-herbs GR #TC Mesic-grassland-herbs” means that the plot should contain a greater total cover of wet grassland herbs than that of mesic grassland herbs).
- The total cover of a functional species group should be greater than that of another functional group, excluding the species of the former group from the latter group (“#TC Group-name1 GR #TC Group-name2 EXCEPT #TC Group-name1” in the script; for example, “#TC Dark-taiga-trees GR #TC Trees EXCEPT #TC Dark-taiga-trees” means that the total cover of the dark taiga trees in the plot should be greater than that of other trees). This formula can also be used for comparing the cover of a single species with the total cover of a group, e.g., “Picea abies GR #TC Trees EXCEPT Picea abies” means that Picea abies should have a greater cover than the total cover of the other trees.
- The total cover of a functional species group in a plot should be greater than nn% of the total cover of all the species in a plot (“#TC Group-name GR $50” in the script, where $50 means 50% of the total cover of all species; for example, “#TC Dwarf-shrubs GR $50” means that the total cover of dwarf shrubs should be greater than 50% of the total cover of all species in the plot).
- A general group containing all the species occurring in the plot can be created, and its total cover computed using the #T$ notation in the script. Such a group can be used to identify whether the total cover in the plot is greater than a given threshold (e.g., “#T$ GR 30” means that the total vegetation cover in the plot should be greater than 30%). Alternatively, this general group can be used to define that the total cover of a functional species group should be greater than the total cover of all the other species in the plot (“#TC Group-name GR #T$” in the script; in this case, #T$ means the total cover of all the other species in the plot excluding the species of the group involved in the comparison; for example, “#TC Dwarf-shrubs GR #T$” means that the total cover of dwarf shrubs should be greater than the total cover of all the other species, i.e. non-dwarf-shrubs, in the plot).
- Finally, the assignment rules can consider the cover of only one species of the group, specifically the one that has the highest cover in the plot, using the symbol #SC (single-species cover). For example, “#SC Phrygana-shrubs GE #$$” means that the cover of at least one species belonging to the group of phrygana shrubs is greater than the cover of any other species in the plot or at least it is equal to the cover of the species with the highest cover value of those not belonging to this group; “Corylus avellana GR #SC Shrubs” means that the species Corylus avellana has a greater cover than the cover of any single species in the functional species group of shrubs except Corylus avellana.
In all cases, two or more functional species groups can be merged in the logical formulas. For example, “#TC Trees|#TC Shrubs|#TC Dwarf-shrubs” represents the total cover of all woody plants in the plot. The following formula is an example of a habitat definition based on functional species groups:
- (<#TC Temperate-submediterranean-deciduous-shrubs GR 25> AND (<#TC Temperate-submediterranean-deciduous-shrubs GR $50> OR <#SC Temperate-submediterranean-deciduous-shrubs GR #$$>)) NOT (<#TC Mesomediterranean-maquis-shrubs GR 05> OR <#TC Trees GR 10>)
This means that the total cover of the functional species group “Temperate-submediterranean deciduous shrubs”, calculated using the Jennings‒Fischer formula, should be greater than 25% and, at the same time, either the total cover of this group should be greater than 50% of the total cover of all the species in the plot or the cover of any species of this group should be greater than the highest cover in the plot of a single species that does not belong to this functional species group. In addition, the total cover of the functional species group “Mesomediterranean maquis shrubs” should not be greater than 5% and the total cover of the functional species group “Trees” should not be greater than 10%.
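The logic of the example definition above can be traced in a small Python sketch. It is a paraphrase under stated assumptions, not the Juice implementation: plots are hypothetical dictionaries of species covers, the helper names are invented, total covers use the random-overlap formula, and the group memberships are illustrative placeholders rather than the species lists of Appendix S3.

```python
from math import prod

# Illustrative members only; the real lists are in Appendix S3
GROUPS = {
    "Temperate-submediterranean-deciduous-shrubs": {"Prunus spinosa", "Crataegus monogyna"},
    "Mesomediterranean-maquis-shrubs": {"Pistacia lentiscus"},
    "Trees": {"Quercus robur"},
}


def combined(covers_pct):
    """Total cover assuming random overlap."""
    return 100.0 * (1.0 - prod(1.0 - c / 100.0 for c in covers_pct))


def tc(plot, group):
    """'#TC group': total cover of the group's species in the plot."""
    return combined(c for s, c in plot.items() if s in GROUPS[group])


def sc(plot, group):
    """'#SC group': highest cover of a single species of the group."""
    return max((c for s, c in plot.items() if s in GROUPS[group]), default=0.0)


def n_species(plot, group):
    """Number of species of the group present (used by '#nn group')."""
    return sum(1 for s, c in plot.items() if s in GROUPS[group] and c > 0)


def max_other(plot, group):
    """'#$$': highest cover of a single species outside the group."""
    return max((c for s, c in plot.items() if s not in GROUPS[group]), default=0.0)


def matches_example_definition(plot):
    g = "Temperate-submediterranean-deciduous-shrubs"
    positive = tc(plot, g) > 25 and (
        tc(plot, g) > 0.5 * combined(plot.values())   # GR $50
        or sc(plot, g) > max_other(plot, g)           # #SC ... GR #$$
    )
    negative = tc(plot, "Mesomediterranean-maquis-shrubs") > 5 or tc(plot, "Trees") > 10
    return positive and not negative                  # (...) NOT (...)


scrub = {"Prunus spinosa": 63.0, "Crataegus monogyna": 13.0,
         "Brachypodium pinnatum": 38.0, "Quercus robur": 3.0}
wooded = dict(scrub, **{"Quercus robur": 38.0})
```

The hypothetical plot `scrub` meets the definition, whereas `wooded` is excluded by the negative condition because its tree cover exceeds 10%.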
Discriminating species groups
The concept of discriminating species groups follows the proposals of Dengler et al. (2006) and Willner (2011) and the principles of the expert system developed by L. Tichý for the identification of European vegetation classes (Mucina et al., 2016: their Appendix S12). One discriminating species group is compiled for each habitat. Such a group includes a subset of diagnostic species of this habitat that can be, as a group, used for reliable discrimination of this habitat against the other habitats. A species can be included in more than one discriminating species group. For each plot, the quantitative representation of all the discriminating groups present is compared, and the plot is assigned to the habitat whose discriminating species group has the highest representation in this plot. In EUNIS-ESy, the quantitative representation of the discriminating species groups is measured as the sum of square-rooted percentage covers of individual species of the group (##Q). In terms of the relative weights given to dominants vs species with low cover, this measure is intermediate between a measure based on the number of species and a measure based on the sum of untransformed percentage species covers (Tichý et al., 2019).
EUNIS-ESy includes several independent sets of discriminating species groups, which are identified by a two-digit number after the + sign at the beginning of the group name. Any comparison of species covers is made only among the groups with the same number. In the current version of the expert system (ver. 2020-06-08), groups marked as +01 are compared only among Atlantic, Baltic and Arctic coastal habitats, +02 among Mediterranean and Black Sea coastal habitats, +03 among inland sparsely vegetated habitats, +04 among all the other non-forest habitats, +05 among a few specific habitats of temperate broadleaved deciduous forests, +06 to +10 among different groups of mire habitats, and +11 among broad habitat groups. For example, the expression “##Q +04 R1A-Semi-dry-perennial-calcareous-grassland” in the expert system script means that the sum of square-rooted percentage covers of the species belonging to the discriminating species group of semi-dry perennial calcareous grassland in the plot should be greater than the sum of square-rooted percentage covers of the species of the discriminating species group of any other non-forest habitat which has a discriminating group with a name starting with +04.
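The comparison of discriminating species groups within one numbered set can be sketched in Python as follows. The sketch assumes that a plot is a dictionary of species covers and that a strict inequality is required, as the wording “greater than” suggests. The function names, the second group and all group members are hypothetical and serve only to make the example runnable.

```python
from math import sqrt


def q_score(plot, members):
    """'##Q': sum of square-rooted percentage covers of the group's species."""
    return sum(sqrt(c) for s, c in plot.items() if s in members)


def wins_q(plot, target, groups):
    """True if the target group scores higher than every other group
    of the same numbered set (e.g., all groups starting with '+04')."""
    target_score = q_score(plot, groups[target])
    return all(
        target_score > q_score(plot, members)
        for name, members in groups.items()
        if name != target
    )


# One set of discriminating groups; members are illustrative placeholders
SET_04 = {
    "+04 R1A-Semi-dry-perennial-calcareous-grassland": {"Bromus erectus", "Brachypodium pinnatum"},
    "+04 Hypothetical-mesic-meadow": {"Arrhenatherum elatius", "Trifolium pratense"},
}

plot = {"Bromus erectus": 38.0, "Brachypodium pinnatum": 13.0,
        "Arrhenatherum elatius": 13.0, "Trifolium pratense": 3.0}
```

In this example both groups are represented by two species, so a count of species would not separate them, whereas the square-root measure favours the first group without letting the single dominant decide alone.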
The relative importance of functional and discriminating species groups for habitat classification varies among habitat groups. In the habitats defined by the presence of certain dominant species, especially forest and shrubland habitats, the use of functional groups in combination with threshold cover values is often sufficient and most effective to define the particular habitat (e.g., heathland is a habitat determined by the dominance of ericoid or genistoid dwarf-shrub species). In contrast, for the habitats characterized by a weak or irregular dominance of specific species, such as most grassland and coastal habitats, this method of habitat definition rarely provides satisfactory classification. For such habitats, we based the classification mainly on the discriminating species groups. The following formula is an example of a definition that uses discriminating species groups:
- ((<##Q +11 Mire-species> AND <##Q +04 Fen-species>) AND (<##Q +09 Acidophilous-fen-species> AND (<##Q +07 Poor-fen> AND <#TC +09 Acidophilous-fen-species GR 25>))) NOT (<#TC Trees GR 15> OR <#TC Shrubs GR 15>)
This means that the sum of square-rooted percentage covers of the discriminating species group of all mire habitats should be greater than that of any other broad habitat group, and at the same time the sum of square-rooted percentage covers of the discriminating species groups of fens, acidophilous fens and poor fens should be greater than those of their contrasting groups with the same number at the beginning of their name, and the cover of acidophilous fen species should be greater than 25% and the cover of either trees or shrubs should not be greater than 15%. Note that in this example, the group “+09 Acidophilous-fen-species” is used both as a discriminating species group (##Q) and a functional species group (#TC).
2.3.2 Location-based assignment rules
Some habitats in the EUNIS classification are defined partly by their occurrence in specific latitudinal vegetation zones, altitudinal vegetation belts or habitat complexes. For example, some groups of coniferous forests are divided into boreal and temperate types, or some habitat types are defined by their occurrence on coastal dunes, although similar habitats also occur on inland dunes. In several cases, it is impossible to distinguish such habitats by plant species composition and cover alone, especially if they are species-poor, because at least in some places, their species composition can be the same in the boreal and temperate zones, or in coastal and inland dunes. Therefore, we included several location-based assignment rules into the formal definitions of habitats in EUNIS-ESy, complementing the species-based assignment rules. Nevertheless, we could define most of the habitats purely based on species composition, and we added the location-based assignment rules only when the species-based classification was unable to separate some types or would have required very complex definitions.
The location-based criteria are either qualitative or quantitative: in the expert-system script, they are indicated as $$C (C stands for “character”) or $$N (N stands for “numeric”). The qualitative criteria are defined using the relational operator EQ (equal to), e.g., “$$C Country EQ Belgium” (the plot was located in Belgium). The quantitative criteria can also use the operator EQ, but more often they use the operators GR (greater than) or GE (greater than or equal to), e.g., “$$N Altitude (m) GR 1,000” (altitude of the plot was higher than 1,000 m a.s.l.). A range of the quantitative criteria can be defined by a combination of two statements, e.g., “<$$N Altitude (m) GE 500> NOT <$$N Altitude (m) GR 1,000>” defines an altitude from 500 m to 1,000 m a.s.l. The following variables are used in the location-based assignment rules:
- Country — a qualitative variable containing country names
- Ecoreg — a quantitative variable containing the three-digit codes of the terrestrial ecoregions (Dinerstein et al., 2017; see https://ecoregions2017.appspot.com/). Ecoregions were only used in the definitions of some shrubland and forest types
- Coast_EEA — a qualitative variable indicating whether the location is on the coastline, including a buffer distance of up to 5,000 m from the coast. A digital coast map provided by the European Environment Agency (https://www.eea.europa.eu/data-and-maps/data/eea-coastline-for-analysis-1/gis-data/europe-coastline-shapefile) was used to identify coastal plots based on their geographic coordinates. The categories are as follows: Arctic (ARC_COAST), Atlantic (ATL_COAST), Baltic (BAL_COAST), Black Sea (BLA_COAST), Mediterranean (MED_COAST) and Not on the coast (N_COAST)
- Dunes_Bohn — a qualitative variable indicating whether the location is on coastal dunes. A digital version of the Map of Natural Vegetation of Europe (Bohn et al., 2003) was used, and the plots with geographic coordinates corresponding to the mapping units from P1 to P16 (coastal dune vegetation) were given the value Y_DUNES, whereas the others were given the value N_DUNES
- DEG_LAT, DEG_LON — a quantitative variable containing the degrees of latitude and longitude of the plot in the coordinate system WGS 84, format DD.DDDD. Western longitudes are indicated with a minus sign
- Altitude (m) — a quantitative variable containing the altitude of the plot in metres above sea level.
- ((<#TC Arctic-alpine-bryophytes-lichens GR #T$> AND <#02 Arctic-alpine-bryophytes-lichens>) NOT (<#TC Sphagnum GR 05> OR <#TC Trees GR 05>)) AND (<$$N DEG_LAT GR 65> OR <$$C Country EQ Iceland>)
This means that a plot is assigned to the habitat type F12 if the total cover of the functional group of the Arctic and alpine bryophytes and lichens is greater than the cover of all the other species in the plot, at the same time at least two species of this group are present, the total cover of Sphagnum species or trees is not greater than 5%, and either the latitude is greater than 65°N or the plot is from Iceland.
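The combination of species-based and location-based rules in the example above, together with the altitude range mentioned earlier in this section, can be paraphrased in Python. This is a sketch under assumptions: the plot header is a hypothetical dictionary keyed by the variable names listed above, total covers use the random-overlap formula, the function names are invented, and the group members are illustrative placeholders rather than the lists of Appendix S3.

```python
from math import prod

# Illustrative members only; the real lists are in Appendix S3
GROUPS = {
    "Arctic-alpine-bryophytes-lichens": {"Racomitrium lanuginosum", "Cetraria islandica"},
    "Sphagnum": {"Sphagnum fuscum"},
    "Trees": {"Betula pubescens"},
}


def tc(covers, members):
    """Total cover of the given species assuming random overlap."""
    return 100.0 * (1.0 - prod(1.0 - c / 100.0 for s, c in covers.items() if s in members))


def in_altitude_range(header):
    """<$$N Altitude (m) GE 500> NOT <$$N Altitude (m) GR 1,000>"""
    return header["Altitude (m)"] >= 500 and not header["Altitude (m)"] > 1000


def matches_f12(covers, header):
    group = GROUPS["Arctic-alpine-bryophytes-lichens"]
    others = set(covers) - group                       # '#T$' in a comparison
    n_present = sum(1 for s in group if covers.get(s, 0.0) > 0)
    species_ok = tc(covers, group) > tc(covers, others) and n_present >= 2
    excluded = tc(covers, GROUPS["Sphagnum"]) > 5 or tc(covers, GROUPS["Trees"]) > 5
    location_ok = header["DEG_LAT"] > 65 or header["Country"] == "Iceland"
    return species_ok and not excluded and location_ok


covers = {"Racomitrium lanuginosum": 63.0, "Cetraria islandica": 13.0, "Empetrum nigrum": 13.0}
iceland = {"Country": "Iceland", "DEG_LAT": 64.5, "Altitude (m)": 600}
south_norway = {"Country": "Norway", "DEG_LAT": 61.0, "Altitude (m)": 1200}
north_norway = {"Country": "Norway", "DEG_LAT": 69.5, "Altitude (m)": 300}
```

The same species composition is accepted for the hypothetical Icelandic and northern Norwegian plots but rejected for the southern Norwegian plot, which shows how the location-based rule separates floristically identical plots.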
2.3.3 The hierarchical structure of the expert system
The expert system was developed hierarchically. Each habitat definition was assigned a priority degree. When the expert system is running, the definitions with the highest priority are applied to the data set first, and the plots that meet the requirements of these definitions are assigned to the habitats, while other plots remain unclassified. Then, the definitions with lower priority are applied to the remaining unclassified plots.
The current study deals only with the EUNIS habitat groups that were revised so far, i.e. N (Coastal), Q (Wetlands), R (Grasslands), S (Shrublands), T (Forests) and V (Man-made). However, the hierarchy of the expert system has been designed to include the vegetated habitats from the other groups (A — Marine, C — Inland surface waters, and H — Inland sparsely vegetated) once their revision is finished and published. Preliminary definitions of these habitats were developed and included in the expert system, which is important for separation of the target habitat groups from the non-target groups. The codes and concepts of these habitats in the expert system correspond to those used in the European Red List of Habitats (Janssen et al., 2016). However, these preliminary definitions were not tested and therefore not included in the results of the current study.
In some cases, we created two definitions with different priority levels to define a single habitat. The narrower definition is applied at a higher priority level. It is usually based on the occurrence or a high total cover of species from a functional group that comprises species narrowly specialized to the habitat. This definition classifies the plots that are typical examples of the particular habitat, but it leaves many less typical plots of this habitat unclassified. Subsequently, a broader definition is applied at a lower priority level to the unclassified plots. This definition is based on a discriminating species group and classifies the plots that are less typical examples of the habitat but still possess more features of this habitat than of any other habitat. Such a two-step approach is needed for habitats in which the occurrence of narrowly specialized species is a sufficient criterion for habitat assignment even if such species have a low cover. If only definitions based on discriminating species groups were used, some of the plots of these habitats could be misclassified. For example, the habitat R11 Pannonian and Pontic sandy steppe is defined first by a narrower and then by a broader definition:
- <#TC R11-Pannonian-and-Pontic-sandy-steppe-specialists GR 15> NOT <#TC Trees|#TC Shrubs GR 15>
- (<##Q +04 R11-Pannonian-and-Pontic-sandy-steppe> AND <#03 +04 R11-Pannonian-and-Pontic-sandy-steppe>) NOT <#TC Trees|#TC Shrubs GR 15>
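The first, narrower definition means that the total cover of the functional species group of specialists of the Pannonian and Pontic sandy steppe should be greater than 15% and the total cover of trees and shrubs should not exceed 15%. The second, broader definition means that the sum of square-rooted percentage covers of a discriminating species group of the Pannonian and Pontic sandy steppe (including both the narrow specialists and frequently occurring less-specialized species) should be greater than the sum of square-rooted percentage covers of the discriminating species groups of any other habitat, and the plot should contain at least three species of this group, and the total cover of trees and shrubs should not exceed 15%.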
The definitions are applied at the following priority levels, with higher numbers indicating higher priority:
- Level 8: Coastal habitats dominated by woody plants (i.e. coastal heaths, dune scrub and dune forests; habitats N18 to N1G) and Macaronesian heath (S43) are defined based on a high cover of dominant species or a high total cover of functional groups of dominant species (dwarf shrubs, shrubs and trees) in combination with the occurrence on the coast, on coastal dunes, or in Macaronesia, respectively.
- Level 7: Other (i.e. herbaceous) coastal habitats (N11–N17 and N1H–N1J) and marine habitats of the tidal zone dominated by vascular plants (A25a–A25d) are defined based on the discriminating species groups of habitats within these two groups, in combination with occurrence on the coast or in coastal dunes.
- Level 5–6: The habitat group H (inland sparsely vegetated habitats) is defined based on a cover not greater than 30% in combination with the occurrence of at least one species specialized to the habitats of this group. Individual habitats within this group are provisionally defined based on their preliminary discriminating species groups at level 5. In the future, level 6 with narrower definitions of these habitats based on specialist species will be added.
- Level 4: Some coastal herbaceous (group N), some grassland (group R), all shrubland except the Macaronesian heath (group S), all forest (group T) and some man-made (group V) habitats are classified using definitions based on a high cover of characteristic dominant species or functional groups of characteristic dominant species, or the presence of a specified minimum number of species narrowly specialized to individual habitats.
- Level 3: Habitat groups of shrublands (group S) and forests (group T) are defined based on the dominance of shrubs, dwarf shrubs or trees. Vegetation plots that have not been previously classified to specific habitats within these groups are classified directly to these broad groups.
- Level 2: All the non-shrubland and non-forest habitats that were not defined before are defined based on the discriminating species groups.
- Level 1: Habitat groups of wetland (Q, separated into the groups of Qa — mires and Qb — helophyte beds), grassland (R), man-made (V), inland surface water (C) and inland sparsely vegetated (H) habitats, without assignment to any specific habitat, are defined based on discriminating species groups. Vegetation plots that have not been previously classified to specific habitats within these groups are classified directly to these habitat groups.
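The stepwise application of definitions by priority level can be sketched as a simple loop. The following Python code is an illustration under assumptions, not the algorithm implemented in Juice: definitions are hypothetical (level, habitat, rule) triples, plots are toy dictionaries of precomputed values, and the level numbers given to the example definitions are chosen for illustration only.

```python
def classify(plots, definitions):
    """Assign plots to habitats, applying higher priority levels first.

    plots: dict plot_id -> plot data
    definitions: list of (level, habitat, rule), where rule(plot) -> bool
    A plot matching several definitions of the same level is assigned to
    all of them; once classified, it is skipped at lower levels.
    """
    assigned = {plot_id: set() for plot_id in plots}
    levels = sorted({level for level, _, _ in definitions}, reverse=True)
    for level in levels:
        pending = [pid for pid, habitats in assigned.items() if not habitats]
        for pid in pending:
            for lvl, habitat, rule in definitions:
                if lvl == level and rule(plots[pid]):
                    assigned[pid].add(habitat)
    return assigned


# Toy definitions mimicking the two-step approach (levels are illustrative)
definitions = [
    (4, "R11", lambda p: p["specialist_cover"] > 15 and not p["woody_cover"] > 15),
    (2, "R11", lambda p: p["r11_q_wins"] and p["n_r11_species"] >= 3
        and not p["woody_cover"] > 15),
    (1, "R", lambda p: p["grassland_q_wins"]),
]

plots = {
    "typical": {"specialist_cover": 20, "woody_cover": 5, "r11_q_wins": True,
                "n_r11_species": 6, "grassland_q_wins": True},
    "less_typical": {"specialist_cover": 3, "woody_cover": 5, "r11_q_wins": True,
                     "n_r11_species": 4, "grassland_q_wins": True},
    "generic": {"specialist_cover": 0, "woody_cover": 0, "r11_q_wins": False,
                "n_r11_species": 1, "grassland_q_wins": True},
    "other": {"specialist_cover": 0, "woody_cover": 60, "r11_q_wins": False,
              "n_r11_species": 0, "grassland_q_wins": False},
}
result = classify(plots, definitions)
```

In this toy run, the typical plot is captured by the narrow definition, the less typical plot by the broader one, the generic grassland plot falls back to the habitat group, and the last plot remains unclassified.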

2.4 Iterative evaluation and optimization of the expert system
The expert system and formal definitions of individual habitats therein were created based on expert opinion combined with iterative improvement, which used information from the evaluation of the results of successive classification trials. A preliminary version of the expert system contained the initial species groups (Section 2.3) and the first version of formal definitions for a subset of habitats belonging to the same habitat group (e.g., Forests). The formal definitions were proposed by the authors of this paper, who considered the content of each target habitat and the options provided by the formal language of the expert system. The aim was to propose such definitions that would include most plots belonging to the habitat and none or very few plots not belonging to the habitat. This preliminary version of the expert system was applied to a data set of European vegetation plots in the Juice program. The resulting classification was evaluated by the experts, focusing on false positive (a plot does not belong to the habitat but is assigned to it) and false negative (a plot belongs to the habitat but is not assigned to it) classification results. The classification could only be validated based on the judgement of human experts because there is no standard of correct plot-level classification. Therefore the plots that the expert system assigned to individual habitats were checked by several experts from different countries, who were specialists in different habitats. These plots were also mapped, and the experts paid special attention to geographically outlying plots and to the absence of plots in areas where the habitat was expected. If the experts identified misclassified plots, the expert system was modified to avoid such misclassifications. The modifications were made either to the content of the species groups or to the assignment rules in the formal definitions. 
For species groups, the species that contributed to misidentifications were identified and removed from the groups, while other species that might contribute to the correct habitat identification were added to the relevant groups. For formal definitions, the structure of the formulas or thresholds used in the formulas were changed. This process was repeated many times until the misclassifications identifiable by the experts were eliminated. Once the final classification was achieved for one habitat group, the first versions of formal definitions of another habitat group were added, and the iterative optimization process was repeated.
2.5 Characteristic species combination
In phytosociology, “characteristic species combination” is defined as a combination of diagnostic species and species with higher constancy that together define a vegetation unit (Braun-Blanquet, 1964). Here we use this term as an umbrella for the three types of species that Chytrý and Tichý (2003) introduced to characterize vegetation types: diagnostic, constant and dominant species. Diagnostic species (Whittaker, 1962; Westhoff and van der Maarel, 1973) are species with occurrences concentrated in a particular habitat, being absent or rare in other habitats. As such, they are useful as positive indicators of the habitat. However, diagnostic species may be absent from the habitat at many sites. Constant species are species that occur frequently but not necessarily exclusively in a particular habitat: some of them may be generalist species that are also frequent in other habitats. Dominant species are those that often reach high cover in a particular habitat, thus determining the habitat physiognomy.
For the purposes of computing the characteristic species combination for each EUNIS habitat at Level 3 of the classification hierarchy, we used a data set of all the plots classified at this level, including the vegetated marine (A), inland surface water (C) and inland sparsely vegetated (H) habitats, which were defined provisionally in the expert system. We performed a stratified resampling of this data set (Knollová et al., 2005) to balance the spatially uneven sampling effort across Europe, i.e. a high concentration of vegetation plots in relatively small areas contrasting with low density or absence of vegetation plots in other, often large areas. This procedure should reduce the bias in identification of diagnostic, constant and dominant species, especially for those species that are frequent in heavily sampled areas but rare or absent elsewhere. The stratification was applied to a data set of those vegetation plots that were classified by the expert system to Level 3 habitats, excluding the plots classified to Level 1 (habitat groups) but not to Level 3 habitats. All the plots in this data set were assigned to geographical grid cells of 5 min × 3 min of longitude × latitude (corresponding to approximately 6.0 km × 5.5 km at 50° N). If a cell contained more than one plot belonging to the same habitat, one randomly selected plot was retained, while the others were removed. If such resampling resulted in <20 plots per habitat across the whole data set, some of the previously removed plots were selected randomly and returned to the data set, ensuring that the total number of plots of the habitat was 20. Habitats with fewer than 20 plots in the whole data set were not resampled. The resampled data set contained 233,352 plots, i.e. 28% of all the plots classified to Level 3 habitats, but it was more balanced and more representative than the original data set.
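The resampling steps described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the plot record keys (`id`, `habitat`, `lon`, `lat`), the function names and the fixed random seed are assumptions made for illustration, and only the cell size (5 min × 3 min) and the minimum of 20 plots per habitat are taken from the text.

```python
import math
import random
from collections import defaultdict

CELL_LON_MIN = 5.0  # cell width in minutes of longitude (from the text)
CELL_LAT_MIN = 3.0  # cell height in minutes of latitude (from the text)
MIN_PLOTS = 20      # minimum number of plots per habitat after resampling


def grid_cell(lon_deg, lat_deg):
    """Return the integer (column, row) index of the grid cell holding a point."""
    col = math.floor(lon_deg * 60.0 / CELL_LON_MIN)
    row = math.floor(lat_deg * 60.0 / CELL_LAT_MIN)
    return col, row


def resample(plots, seed=0):
    """Keep one random plot per habitat and grid cell.

    plots: list of dicts with the hypothetical keys 'id', 'habitat', 'lon', 'lat'.
    """
    rng = random.Random(seed)
    by_habitat = defaultdict(list)
    for plot in plots:
        by_habitat[plot["habitat"]].append(plot)

    kept = []
    for members in by_habitat.values():
        # Habitats with fewer than 20 plots are not resampled.
        if len(members) < MIN_PLOTS:
            kept.extend(members)
            continue
        by_cell = defaultdict(list)
        for plot in members:
            by_cell[grid_cell(plot["lon"], plot["lat"])].append(plot)
        selected, removed = [], []
        for cell_plots in by_cell.values():
            rng.shuffle(cell_plots)
            selected.append(cell_plots[0])   # one randomly chosen plot stays
            removed.extend(cell_plots[1:])   # the rest are set aside
        # If fewer than 20 plots remain, return random removed plots up to 20.
        shortfall = MIN_PLOTS - len(selected)
        if shortfall > 0:
            selected.extend(rng.sample(removed, shortfall))
        kept.extend(selected)
    return kept
```

In this sketch a habitat whose plots all fall in a single cell ends up with exactly 20 plots, whereas a habitat spread over many cells keeps one plot per occupied cell.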
Diagnostic species were determined based on species fidelity, i.e. the degree of concentration of species occurrences in each group of plots representing a Level 3 EUNIS habitat. Fidelity was calculated using the phi coefficient of association (Sokal and Rohlf, 1995; Chytrý et al., 2002) standardized as if each habitat was represented by the same number of plots (Tichý and Chytrý, 2006). The species with a value of phi greater than 0.15 for a particular habitat were considered as diagnostic for this habitat. This threshold was selected arbitrarily as a compromise between a stringent selection of few species with high diagnostic value (if phi was higher) and a lax selection of many species with weak diagnostic value (if phi was lower). However, the concentration of species occurrences in the habitat, even if expressed by a high value of the phi coefficient, may not be statistically significant for some habitats represented by a low number of plots in the data set. Therefore, the statistical significance of the species–habitat association was tested using Fisher's exact test (Sokal and Rohlf, 1995), and if not significant at p < 0.05, the species was excluded from the list of diagnostic species (Tichý and Chytrý, 2006).
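The fidelity calculation can be sketched in Python as follows. This is an illustrative reading of the procedure, not the implementation used in the study. The function names are hypothetical. The standardization is implemented by giving every habitat the same weight, which is our interpretation of "as if each habitat was represented by the same number of plots". The significance test is written as a one-sided hypergeometric tail computed on raw plot counts, and both the sidedness and the use of raw counts are assumptions of the sketch. The thresholds (phi > 0.15, p < 0.05) are those given in the text.

```python
import math


def phi_equalized(freqs, target):
    """Phi coefficient with all habitats standardized to equal size.

    freqs: dict mapping habitat -> relative frequency (0..1) of the species.
    """
    k = len(freqs)
    N = 1.0                       # standardized total number of plots
    N_p = 1.0 / k                 # standardized size of the target habitat
    n_p = freqs[target] / k       # standardized occurrences in the target
    n = sum(freqs.values()) / k   # standardized occurrences overall
    denom = math.sqrt(n * N_p * (N - n) * (N - N_p))
    if denom == 0:
        return 0.0
    return (N * n_p - n * N_p) / denom


def fisher_upper_tail(N, N_p, n, n_p):
    """One-sided Fisher's exact test: P(X >= n_p) under the hypergeometric law.

    N: all plots, N_p: plots of the target habitat,
    n: plots with the species, n_p: target plots with the species.
    """
    total = math.comb(N, N_p)
    upper = min(n, N_p)
    tail = sum(math.comb(n, x) * math.comb(N - n, N_p - x)
               for x in range(n_p, upper + 1))
    return tail / total


def is_diagnostic(freqs, target, N, N_p, n, n_p, phi_min=0.15, alpha=0.05):
    """Apply both criteria: phi above the threshold and a significant test."""
    return (phi_equalized(freqs, target) > phi_min
            and fisher_upper_tail(N, N_p, n, n_p) < alpha)
```

A species that occurs in every plot of the target habitat and nowhere else yields phi = 1, and a species equally frequent in all habitats yields phi = 0.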
Constant species were defined as those with a constancy (= percentage occurrence frequency) of at least 10% in the target habitat. This threshold is much lower than usually used for constant species of vegetation types in phytosociology. However, a lower value is needed for EUNIS habitats than for vegetation types, because many habitat types comprise several vegetation types occurring across broad geographic ranges with varying species composition; as a result, few species have a higher constancy across the whole habitat.
Dominant species were defined as those that occurred with a cover greater than 25% in at least 5% of vegetation plots classified to the target habitat. This means that a species is considered as dominant even if it does not belong to the tallest vegetation layer, and a single plot can have more than one dominant species. Conversely, a habitat can have no dominant species, especially if it has sparse vegetation cover.
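The two threshold rules for constant and dominant species can be written as a short Python sketch. The data layout (one dictionary of species covers per plot) and the function names are assumptions made for illustration. The thresholds are those stated above.

```python
def constant_species(plots, min_constancy=10.0):
    """Species occurring in at least `min_constancy` percent of the plots.

    plots: list of dicts mapping species name -> percentage cover in one plot.
    """
    total = len(plots)
    counts = {}
    for covers in plots:
        for species, cover in covers.items():
            if cover > 0:
                counts[species] = counts.get(species, 0) + 1
    return sorted(sp for sp, k in counts.items()
                  if 100 * k >= min_constancy * total)


def dominant_species(plots, min_cover=25.0, min_share=5.0):
    """Species with cover above `min_cover` in at least `min_share` percent of plots."""
    total = len(plots)
    counts = {}
    for covers in plots:
        for species, cover in covers.items():
            if cover > min_cover:
                counts[species] = counts.get(species, 0) + 1
    return sorted(sp for sp, k in counts.items()
                  if 100 * k >= min_share * total)
```

Because the two rules are independent, a species can be dominant without being constant (high cover in a few plots) and a plot can contribute to the dominance count of several species at once.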
Records of taxa identified only to the genus level and records of epiphytic lichen species were removed from the characteristic species combinations. Records of other non-vascular plants (bryophytes and non-epiphytic lichens) were retained because many of these species are important ecological indicators. However, as they were not recorded in all plots, their calculated constancy values were likely underestimated. Their fidelity can be either underestimated (if they were sampled only in some proportion of plots of the habitat) or overestimated (if bryophytes and lichens were more often sampled in some habitats than in others). A solution would be to compute constancy values only for plots where bryophytes and lichens were recorded (or would have been recorded if present). However, this was not possible because vegetation plots without records of bryophytes and lichens in most cases do not contain information whether these species were really absent or just not recorded. Therefore, we reported the values calculated for bryophytes and lichens based on all the plots, but we emphasize that these values can be inaccurate and have to be interpreted with caution.
As a quality test of our results, we made a formal comparison of characteristic species combinations computed for the EUNIS forest habitats with an earlier established list of indicator species for French forest habitats prepared on the basis of a different data set (Gégout et al., 2009). This exercise, performed by national experts (J.-C. Gégout and L. Maciejewski), revealed a high degree of correspondence between both lists, thereby indicating the reliability of the characteristic species combinations computed in our study.
2.6 Habitat distribution mapping
Distribution maps for individual habitats were prepared by plotting the location of all vegetation plots classified to individual habitats (before stratified resampling) on a map. All the maps were checked for outlying locations, which in most cases pointed out either an error in coordinates or misidentification of an important species that led to an erroneous classification to a different habitat. Errors were corrected, new classification prepared, and both the characteristic species combinations and distribution maps were updated.
Because of a strong geographic bias in the available European vegetation plots, especially their low density in northern and eastern Europe, we indicated the locations of plots belonging to individual habitats on grid maps showing the regional density of plots belonging to the particular habitat group. Such maps indicate whether the absence of occurrences of the habitat in a region is likely real or caused by the absence of data from the region (Figure 2).

2.7 Expert system software tools
A software tool to apply the expert system script to a data set of vegetation plots was developed within the Juice 7 program and, in a simpler form that does not contain all the functions, also in the Turboveg 3 program. The syntax of the expert system was described by Tichý et al. (2019: their Appendix S1). The code for applying the expert system in the R program (www.r-project.org) was developed by Bruelheide et al. (https://git.loe.auf.uni-rostock.de/misc/ESy).
3 RESULTS
We developed formal definitions for 199 EUNIS habitats, including 25 coastal (group N), 18 wetland (Q), 55 grassland (R), 43 shrubland (S), 46 forest (T) and 12 man-made (V) habitats (Table 1), and included them in the expert system (Appendix S3). We were unable to develop formal definitions for 2 coastal, 3 grassland, 1 shrubland, 8 forest and 19 man-made habitats. Some of these are defined in EUNIS by features not associated with species composition and cover (e.g., abiotic habitat features, vegetation structure, successional age or the origin as plantation of site-native trees). Others are mosaics of trees or shrubs and herbaceous vegetation (e.g., wooded pastures, orchards or vineyards), in which only the herbaceous component is usually recorded in vegetation plots. Although the habitats were defined at hierarchical Level 3 of the EUNIS classification, one habitat (Q3 Palsa and polygon mires) was defined at Level 2, because its two subordinate Level 3 habitats could not be distinguished based on species composition. In addition, the expert system contains 46 preliminary definitions of habitats from other groups: A (marine; coastal salt marshes), C (inland surface water) and H (inland sparsely vegetated habitats). However, these have not been tested and will require considerable revision in the future.
Of all 1,261,373 vegetation plots in the data set, 1,125,121 were classified to one of the six habitat groups N, Q, R, S, T or V. Of those, 784,901 were classified directly to one of the habitats at hierarchical Level 3 (or Level 2 for Q3) and 341,944 were classified directly to habitat groups (i.e. Level 1 habitats). A further 73,188 plots were preliminarily classified to the habitat groups A, C and H or their Level 3 habitats, 59,745 plots remained unclassified and 3,319 plots were classified to more than one habitat.
The resulting characteristic species combinations for the EUNIS coastal, wetland, grassland, shrubland, forest and man-made habitats, divided into diagnostic, constant and dominant species, are listed in habitat factsheets (Appendix S1) and also provided in a spreadsheet format (Appendix S4).
The distribution maps of these habitats are also included in habitat factsheets (Appendix S1). These maps include only localities of the vegetation plots identified by the expert system as belonging to the habitat. Therefore, they can be biased by the distribution of the available plots for some habitats (Figure 2). To estimate the magnitude of the potential bias, the densities of plots available for the corresponding habitat group are shown in the background.
4 DISCUSSION
4.1 EUNIS-ESy and the development of the EUNIS Habitat Classification
EUNIS-ESy is the first tool that automatically classifies vegetation plots across Europe to habitat types of the EUNIS Habitat Classification. The development of this expert system represents a major step forward in the applicability of the EUNIS Habitat Classification for nature conservation survey, planning, monitoring and reporting on the international, national and regional levels.
EUNIS comprises concepts of individual habitat types that resulted from discussions of international teams of experts and public consultations with national experts and practitioners, organized by the European Environment Agency (Rodwell et al., 2018). Therefore, the aim of the current study was not (and could not be) to revise this classification or concepts of individual habitats within it. Our aim was to develop formal definitions that would closely match the concepts of individual habitats and enable correct assignment of vegetation plots to these habitats. However, the work on this expert system was done in parallel with the EUNIS revision process in 2013–2019, and various experiences from developing formal definitions were fed back to the revision process and influenced its outcome (Schaminée et al., 2012, 2013, 2014, 2016a, 2016b, 2018, 2019, 2020). Further refinements of the formal definitions and expert system were made during the preparation of the current paper, based on the feedback from an international team of co-authors. Therefore, the results presented here are an update of the work that was previously summarized in the reports cited above.
The present paper deals with the six habitat groups for which the EUNIS classification has already been revised (Schaminée et al., 2018, 2019, 2020): coastal, wetland, grassland, shrubland, forest and man-made habitats. These habitat groups represent a large majority of the European terrestrial habitat types. Revisions have not yet been completed or published for the other groups, including marine habitats, inland surface waters, inland unvegetated or sparsely vegetated habitats, constructed, industrial and other artificial habitats, and habitat complexes. Many habitats of these remaining habitat groups are based entirely on the abiotic or non-plant features or are habitat complexes comprising several different plant communities. Therefore, the expert system approach based on floristically defined vegetation types cannot be used to identify them. Still, there are some habitats in these remaining habitat groups for which it will be possible to develop formal definitions and add them to the expert system, once these habitats are revised in the process guided by the European Environment Agency. In the marine habitats, the expert system would presumably also work with non-plant benthic species if data were available in a suitable form. These tasks remain for the future. Nevertheless, the current expert system includes preliminary definitions of the non-revised habitats from the remaining groups. In this way, its structure is prepared for the inclusion of new habitat definitions, which will replace the current preliminary definitions.
4.2 Comparison with other expert systems
The expert system EUNIS-ESy presented here is not the first one developed for the classification of European vegetation plots. Mucina et al. (2016) provided expert-based lists of diagnostic species for European vegetation classes defined in EuroVegChecklist and included them in an expert system. In this EuroVegChecklist expert system, diagnostic species were used directly as discriminating species in our terminology (see section 2 Methods), which provides an acceptable classification for many plots. However, unless diagnostic species lists are optimized for the purpose of identification of vegetation types, the classification error rate in expert systems is relatively high (Tichý et al., 2019), which is the case for the EuroVegChecklist expert system. Also, the EuroVegChecklist expert system does not consider cover or dominance of individual species or groups of species belonging to specific layers (e.g., trees or shrubs). Therefore it often fails to discriminate between vegetation types with similar species composition but different physiognomy (e.g., heathland vs forest with heath-like undergrowth).
In our work on EUNIS-ESy, we found that habitat/vegetation types defined by species of the dominant growth form or the growth form of the highest layer are most accurately defined by threshold covers of the dominant species or a dominant species group, in some cases in combination with other criteria. This holds true especially for shrubland and forest habitats, which are often defined by the dominance of a single tree or a small species group of trees or shrubs. In contrast, most types of grasslands, which often have a single layer of vascular plants with the dominant species changing from place to place, are better defined by discriminating species groups. Using different approaches to defining habitats within different habitat groups, EUNIS-ESy provides a more accurate (as assessed by expert judgement) classification of individual vegetation plots than the EuroVegChecklist expert system, which applies the same classification approach across all vegetation formations.
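The contrast between the two definition styles can be illustrated with a small Python analogue. The actual EUNIS-ESy definitions are written in the expert system language described by Tichý et al. (2019) and are not reproduced here. Everything in the sketch is hypothetical: the function names, the species and group names, the plain summation of covers within a group and the scoring of discriminating groups by the number of species present.

```python
def group_cover(covers, group):
    """Summed percentage cover of the group's species in one plot.

    A plain sum is an assumption of this sketch.
    """
    return sum(c for sp, c in covers.items() if sp in group)


def dominance_rule(covers, group, min_cover):
    """Forest or shrubland style: the group's combined cover exceeds a threshold."""
    return group_cover(covers, group) > min_cover


def best_discriminating_group(covers, groups):
    """Grassland style: pick the group with the most species present.

    Returns None if no group has any species present or if the best score is tied.
    """
    scores = {name: sum(1 for sp in members if covers.get(sp, 0) > 0)
              for name, members in groups.items()}
    best = max(scores.values(), default=0)
    winners = [name for name, score in scores.items() if score == best]
    return winners[0] if best > 0 and len(winners) == 1 else None


# Hypothetical usage with placeholder species names.
forest_plot = {"tree_a": 60.0, "herb_x": 5.0}
grass_plot = {"herb_x": 10.0, "herb_y": 5.0, "herb_z": 1.0}
groups = {"group_dry": {"herb_x", "herb_y"}, "group_wet": {"herb_z", "herb_w"}}
is_forest = dominance_rule(forest_plot, {"tree_a", "tree_b"}, 25.0)
grass_type = best_discriminating_group(grass_plot, groups)
```

In the first rule a single dominant tree species is enough to satisfy the definition regardless of the undergrowth. In the second, no single species decides the outcome, which suits habitats whose dominants change from place to place.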
Another expert system on the European scale was developed by Giannetti et al. (2018) for the classification of European Forest Types produced by forestry experts as a tool for sustainable forest management (EEA, 2006; Barbati et al., 2014). This expert system classifies European forests to 14 types based on the dominant tree species, their basal area and information derived from plot location, e.g., altitude, biogeographical region or occurrence in wetland areas. Like EUNIS-ESy, it is a rule-based expert system (Grosan and Abraham, 2011) using the information provided by human experts, both on the dominant species and location, as classification criteria. However, EUNIS-ESy can identify more forest types (46), partly due to the use of herb-layer species in addition to trees in the definitions of individual habitats.
Other currently available pan-European expert systems were designed to identify phytosociological alliances or associations within a specific vegetation type, e.g., floodplain forests and alder carrs (Douda et al., 2016), beech forests (Willner et al., 2017), fens (Peterka et al., 2017), coastal dune grasslands (Marcenò et al., 2018), Mediterranean Lygeum spartum grasslands (Marcenò et al., 2019) and marshes (Landucci et al., 2020). Other expert systems have a more restricted geographic scope (see an overview with source codes at http://www.sci.muni.cz/botany/juice/?idm=25). These expert systems provided useful resources, and some species groups and decision rules proposed in some of them were used, with modifications, in EUNIS-ESy.
Some of the mentioned expert systems are designed to be applied only to the vegetation plots belonging to the broad habitat/vegetation type for which the expert system was developed. The plots not belonging to this scope have to be removed before classification; otherwise, they might be erroneously assigned to some of the types defined in the expert system. In contrast, EUNIS-ESy includes, in addition to the formal definitions of the coastal, wetland, grassland, shrubland, forest and man-made habitats, also preliminary definitions of all the other European vegetated habitat types. As a result, it can be applied to any vegetation plot from Europe.
Some expert systems for vegetation classification were also developed outside Europe. The expert system for national forest vegetation classification of Taiwan (Li et al., 2013), provided in a code executable in the R program, used an approach similar to that of the European expert systems, following the principles outlined by Bruelheide (1997, 2000) and Kočí et al. (2003). A different approach, based on supervised or semi-supervised fuzzy classification performed using the noise clustering algorithm, was applied for matching new plots to the units of existing national vegetation classification in New Zealand (Wiser and De Cáceres, 2013; Wiser et al., 2016).
4.3 Remarks on the practical application of EUNIS-ESy
- Taxonomic harmonization. Taxon nomenclature and taxonomic concepts in the input data set of vegetation plots should correspond to those of the Euro+Med PlantBase. If an export from the EVA database is used for the analysis, nomenclature can be automatically converted to the Euro+Med standard in Turboveg 3 using the in-built SynBioSys Taxon Database. Standard data exports from EVA include this conversion. If an export from another database (e.g., a database managed in Turboveg 2) is used, taxon nomenclature used in this database has to be first converted to the Euro+Med PlantBase standard. For most European vegetation-plot databases, we provide a tool for automatic nomenclature conversion to Euro+Med in a series of files that contain conversion instructions for species lists used in individual databases (http://doi.org/10.5281/zenodo.3841729). These files can be applied to the species-by-plot tables using the expert system functions in the Juice program. After they convert species names in the table to the desired standard of Euro+Med, another modification of species names must be done using EUNIS-ESy, which guarantees that the taxon names and concepts in the analysed table are the same as used for classification by EUNIS-ESy. This second step follows the instructions in Section 1 of the EUNIS-ESy script. In particular, it merges subspecies and varieties to the species level and closely related species to broader taxonomic entities such as aggregates.
- Species cover vs presence data. Because EUNIS-ESy uses the information on species cover, it cannot reliably classify data in which only species presences (not covers) are recorded. If applied to such data, this expert system can correctly classify some plots of grasslands and other open-land habitats, but it consistently fails to provide the correct classification of shrubland, forest and some other habitats. Therefore, we do not recommend applying EUNIS-ESy to presence-only data.
- Plot location information. Because EUNIS-ESy classifies habitats (and not purely vegetation types), it requires that input data contain information on the location and some environmental features, in addition to species composition and covers. This information includes plots' geographical coordinates, altitude, and their location in specific countries, ecoregions (Dinerstein et al., 2017), on the coast, or in coastal dunes. Location-based criteria are used in 90 (48%) habitat definitions. For some habitats, they are essential, whereas for others they are only used for removing geographical outliers. If the information on location was missing in the input data, the expert system would classify the plots, but the classification might be wrong, especially for the habitats for which the location criteria are essential to the definition. If the location information (except coordinates) is not available in the input data, the Turboveg 3 export function can derive it from an overlay of plot coordinates with relevant GIS layers and store the values (e.g., location on coastal dunes or in a certain ecoregion) to the header data of vegetation plots. However, the expert system itself does not extract this information from plot coordinates.
- Tested vs not-tested habitats. The current version of the expert system was tested for the coastal, wetland, grassland, shrubland, forest and man-made habitats (EUNIS habitat groups N, Q, R, S, T and V). It also contains preliminary definitions of vegetated marine habitats, inland surface water habitats and inland sparsely vegetated habitats. However, these definitions have not been tested, and the proportion of plots misclassified by these definitions may be high. Indeed, the concept and delimitation of these habitats may change considerably in the process of EUNIS revisions.
- Classification accuracy at the regional level. EUNIS-ESy was designed for use in Europe and adjacent areas including Macaronesia, Anatolia, Cyprus and the Caucasus region. It may work well also in adjacent parts of Siberia, the non-desert part of Kazakhstan or in the biome of Mediterranean sclerophyllous vegetation in the Near East and northern Africa. However, the misclassification risk is higher there because the expert system was not tested for these regions. Misclassifications can also be more common in some regions within the geographical scope of this expert system such as Turkey, Cyprus and the Caucasus. These regions have high habitat and vegetation diversity but sparse data, which did not allow the same level of testing of the classification accuracy as in other parts of Europe.
- Classification accuracy at the local level. Even in Europe, some misclassifications can occur because the formal definitions of the habitats are optimized for the whole of Europe, which does not allow all local between-habitat differences in species composition to be considered. There are many pairs of species that clearly belong to different habitats in some European regions while sharing the same habitat in other regions. Therefore, in local applications of the expert system, the classification of specific vegetation plots should be considered as their suggested classification rather than their correct classification.
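The second harmonization step mentioned under "Taxonomic harmonization" above (merging subspecies and varieties to the species level, and closely related species to aggregates) can be sketched in Python as follows. This is not the procedure coded in Section 1 of the EUNIS-ESy script. The taxon names are placeholders, the aggregate lookup table is hypothetical, and combining the covers of merged taxa by a sum capped at 100% is an assumption of the sketch.

```python
RANK_MARKERS = ("subsp.", "var.")  # infraspecific rank abbreviations handled here


def to_species_level(name):
    """Drop everything from the first infraspecific rank marker onwards."""
    parts = name.split()
    for i, token in enumerate(parts):
        if token in RANK_MARKERS:
            return " ".join(parts[:i])
    return name


def harmonize(covers, aggregates):
    """Merge taxa of one plot to species level, then to aggregates.

    covers: dict mapping taxon name -> percentage cover.
    aggregates: hypothetical lookup, species name -> aggregate name.
    Covers of merged taxa are summed and capped at 100 (an assumption).
    """
    merged = {}
    for name, cover in covers.items():
        species = to_species_level(name)
        target = aggregates.get(species, species)
        merged[target] = min(100.0, merged.get(target, 0.0) + cover)
    return merged


# Hypothetical usage with placeholder taxon names.
plot = {
    "Exampleus albus subsp. minor": 10.0,
    "Exampleus albus var. major": 15.0,
    "Exampleus niger": 5.0,
    "Otherus ruber": 90.0,
    "Otherus ruber subsp. parvus": 20.0,
}
aggregates = {
    "Exampleus albus": "Exampleus albus agg.",
    "Exampleus niger": "Exampleus albus agg.",
}
harmonized = harmonize(plot, aggregates)
```

Applying the same merging rules to every input table is what ensures that the taxon names and concepts seen by the classification rules match those used in the habitat definitions.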
4.4 Characteristic species combinations and distribution maps of habitats
Characteristic species combinations provided in this study are based on a statistical analysis of a large database of European vegetation plots, following the approach originally proposed by Chytrý and Tichý (2003) for an analysis of the Czech National Phytosociological Database and subsequently used for analyses of vegetation-plot databases in other countries (Jarolímek and Šibík, 2008; Kącki et al., 2013). The formal division of the characteristic species combination into diagnostic (concentrated in the habitat), constant (frequent, but not necessarily concentrated) and dominant (often attaining a high cover) provides a comprehensive characterization of each habitat through its plant species.
Species lists for European vegetation/habitat types are also provided in the electronic factsheets of the European Red List of Habitats (Janssen et al., 2016; https://forum.eionet.europa.eu/european-red-list-habitats/library/terrestrial-habitats) and the Supplementary material of EuroVegChecklist (Mucina et al., 2016, their Appendix S6; see also https://www.synbiosys.alterra.nl/evc/). However, both of these compilations are based on data from various sources and concepts developed independently by various experts, which introduces some inconsistencies. Moreover, these lists do not distinguish between diagnostic, constant and dominant species. In contrast, our lists are consistent across habitats, clearly discriminate the three categories of species included in the characteristic species combination, and provide a numerical ranking of the importance of each species within each category. Therefore, they can be used for various analyses and practical applications as the most reliable source of information so far on the species composition of different European habitats. However, it is important to note that although we used a geographically stratified resampling of the data set, these species lists are, to some extent, biased due to considerable differences in vegetation plot density among European regions. In particular, species from northern and eastern Europe can be underrepresented in these lists. Moreover, information on bryophyte and lichen species is affected by the lack of their recording in many vegetation plots. Nevertheless, once data sets from undersampled regions and with a more consistent recording of non-vascular plants become available, an extended data set can be classified by the current expert system and the species lists of characteristic species combinations can be updated.
The maps of habitat distribution based on the available European vegetation plots appear realistic for the habitats restricted to western, central and southern Europe, but have many gaps for the habitats occurring in or extending to the north and east. Distribution ranges of habitats can be successfully modelled if the original records of habitat occurrence cover a large part of the real range (Jiménez-Alfaro et al., 2018). However, our modelling exercise (Schaminée et al., 2014) yielded unstable and often incorrect results, especially in extrapolations to data-poor areas in eastern Europe. Therefore we refrained from complementing the maps with models here. Further work should be focused on identifying the areas with the most important data gaps for individual habitats and collecting data from such areas. The number of plots classified to each habitat reported in Table 1 partly reflects the occurrence frequency of each habitat in Europe, but it also reflects the research intensity. Special attention should be paid to the habitats that are so far represented by very few plots.
5 CONCLUSIONS
The expert system EUNIS-ESy introduced in this paper has been shown to effectively assign vegetation-plot records to EUNIS habitats with a high level of accuracy as evaluated by expert judgement. The novel possibility of combining floristic data with plot-specific geographic or environmental data as classification criteria allows enormous flexibility, which makes it possible to apply the expert system not only to floristically defined vegetation types but also to other habitat types that are defined through a combination of vegetation type and other criteria.
The expert system approach to habitat identification has several advantages: (a) it enables identification of vegetation plots even if they have not been labelled with the name of a habitat or a syntaxon by human experts; (b) it applies habitat classification based mainly (but not only) on floristic criteria consistently across the whole of Europe and adjacent areas, unlike assignments by human experts, which may differ from place to place depending on varying regional traditions; (c) it provides explicit species-based and location-based assignment rules that are intelligible to human experts, enabling them to understand the meaning, delimitation and content of each habitat; (d) it can classify not only the plots that are currently available but also those that will be obtained in the future, using the same criteria; and (e) it can be modified or improved by adding new types or adjusting definitions of already included types.
EUNIS-ESy can serve multiple purposes. Its primary purpose is to provide clearly defined units for conservation assessment, planning, decision-making, monitoring and management, both within the European Union and beyond, and to support initiatives such as Natura 2000, Emerald, INSPIRE or habitat Red List assessments. It also provides operational habitat units for research focused on European biodiversity, ecosystem services and global change. Practitioners in nature conservation can either use the expert system directly to classify vegetation plots (e.g., in the Juice program) or use the outputs provided in the Appendices of this paper, i.e. characteristic species combinations and distribution maps.
We intend to continue the development of EUNIS-ESy. Its structure is designed to allow additions of formal definitions of new habitats, modifications of existing definitions and replacement of current provisional definitions of some habitats by improved and tested definitions. The missing or provisionally defined habitats will be added in the near future, depending on the process of the EUNIS revision. We posted the current version of the expert system to the Zenodo repository (http://doi.org/10.5281/zenodo.3841729), where we will be releasing updated versions in the future, labelled with the date of release.
ACKNOWLEDGEMENTS
Vegetation-plot data for this study were provided, in addition to the authors of this paper, by Alicia Acosta, Iva Apostolova, Ariel Bergamini, Henry Brisse, János Csiky, Iris de Ronde, Patrice De Ruffray, Michele De Sanctis, Panayotis Dimopoulos, Federico Fernández González, Úna FitzPatrick, Xavier Font, Deniz Işık Gürsoy, Jonathan Lenoir, Thomas Michl, Vladimir Onipchenko, Eszter Ruprecht, Pavel Shirokikh, Irina Tatarenko, Ioannis Tsiripidis, Emin Uğurlu, Roberto Venanzoni, Thomas Wohlgemuth, the teams of the Finnish, French and Swedish National Forest Inventories and many others. We thank Eckhard von Raab-Straube for providing the Euro+Med PlantBase, and the CERIT Scientific Cloud of Masaryk University for access to their computing facilities. Useful comments on our work were kindly provided by Nicola Alessi, Annemarie Bastrup-Birk, Dave Roberts and three anonymous reviewers. Previous versions of different parts of the text and appendices presented in this paper were released earlier in non-reviewed reports by our team (Schaminée et al., 2012, 2013, 2014, 2016a, 2016b, 2018, 2019, 2020). In this paper, these partial results including the expert system script were revised with a significant input of numerous co-authors, the analyses were recalculated using the up-to-date version of the EVA database, and the results were summarized across all the six habitat groups and organized into factsheets.
AUTHOR CONTRIBUTIONS
JHJS, MC, LT, SMH, JSR, DE, RS and MPL conceived the idea; JHJS, JAMJ, JSR, MC, LM, SMH, DE and MPL, with contributions of other co-authors and technical assistance of ET, revised the EUNIS classification system and concepts of individual habitat types; JSR, JHJS, JAMJ and MC prepared or revised the descriptions of habitat types; SMH, IK, BJA and MC managed the EVA database and related data sets; SMH, MC and JDa unified taxon nomenclature; LT developed the expert system language with contributions from MC, FL, SMH, HB and FJ; LT implemented the software tool for the expert system in Juice; SMH implemented a corresponding tool in Turboveg 3; MC prepared and tested the expert system with contributions especially from TP, CM, FL, JHJS, JDe, PN and DZ; MC prepared the output data sets of characteristic species combinations and habitat distributions with technical assistance from SMH, LT and IK; ET performed a quality check of the results; MC wrote the paper; all the other authors provided vegetation-plot data and commented on the content of the expert system, characteristic species combinations, maps and the text of the paper.
Open Research
DATA AVAILABILITY STATEMENT
Vegetation plot data used in this study are stored in the EVA (http://euroveg.org/eva-database). The expert system script is available in Appendix S3 and in the Zenodo repository (http://doi.org/10.5281/zenodo.3841729).