Gullies and ravines are common landforms in raised marine fine-grained deposits in Norway. Gullies in marine clay are significant landforms indicative of soil erosion and natural hazards and are of high conservation value. As a result of the substantial impact of human intervention over the past century, marine clay gullies are now red-listed. To monitor the condition of these landforms, we need to improve our understanding of their spatial extent, complexity and morphology. We explore the applicability of automated approaches that use a methodology of combining deep learning (DL), fully convolutional neural networks (FCNNs) and an unmodified U-Net model with ArcPy libraries and ground truth data to derive a high-resolution map of gullies in raised marine fine-grained deposits. Predictors used comprise solely terrain derivatives to broaden the usage of the pre-trained model to other regions. Our best model achieved a precision score of 0.82 and a recall of 0.75. We find that our pre-trained model can successfully predict gullies, also in blind-test areas. The model performs better in regions with similar geological settings, scoring a length-weighted overlap of >70% with reference datasets. The novelty of this study is that we demonstrate that the model's applicability for mapping routines increases when we post-process the predictions by eliminating noise, especially by using the predictions derived from ensembled models. We, therefore, conclude that the pre-trained models can effectively be used to supplement the geomorphological mapping of marine clay gullies in Norway. The outcome of this research contributes towards mapping the spatial extent and condition of red-listed landforms in Norway, as well as the development of monitoring systems for future landscape change.

1 INTRODUCTION

During and after the rapid deglaciation of the Scandinavian ice sheet, the retreating outlet glaciers fed fjords and sea inlets with large quantities of fine-grained glaciomarine and marine sediments. As a result of the ongoing post-glacial isostatic rebound after deglaciation, these marine deposits, commonly consisting of clay stratified with silt and sand, gradually emerged above sea level (Hansen et al., 2007; Reite, Sveian, & Erichsen, 1999). During regression, these emerging flats of marine deposits tend to develop into a characteristic landscape called marine clay landscapes (Erikstad, 1992) or gully landscapes (Bergqvist, 1990; Hamre et al., 2021). The gully landscapes in Norway have a rough topography mainly controlled by gully erosion, incision by rivers and landsliding in quick clay.

Gullies and ravines are characteristic of marine clay landscapes. These narrow, often V- or U-shaped landforms have steep sides and head scarps incised into unconsolidated material (Higgins & Coates, 1990). Erosion along these gullies involves the removal of sediment by concentrated flow converging towards lower points of the watershed and is often associated with groundwater seepage and shallow sliding (Bridge, 2003). As the channels increase in size, depth and branching, gullies gradually transition into ravines. Gullies and ravines may have permanent or intermittent streams that control the local drainage network and influence the direction of groundwater flow. For simplicity, in this paper, we do not differentiate between gully and ravine and use the term marine clay gully to describe the narrow channel incised into fine-grained glaciomarine and marine deposits.

Gullies in marine fine-grained deposits are of high conservation value due to the marine clay's high nutrient content and moisture-holding capacity (Erikstad, 1992; Hamre et al., 2021). Networks of the gully systems are essential wildlife corridors (Blindheim & Abel, 2002) and facilitate a large diversity of habitat types (Blindheim et al., 2018; Jansson & Høitomt, 2013). Agricultural policies of levelling and ploughing, along with urban development, caused the gully landscape to be subjected to substantial landscape change over the past century (Erikstad, 1992; Hamre et al., 2021), which resulted in the red-listing of the landform marine clay gully (Erikstad et al., 2018).

As part of developing a nationwide conservation plan for preserving the gully landscape, there is a need to map and monitor the change and condition of marine clay gullies. Moreover, because many quick-clay landslides are initiated in marine clay gullies, an overview of the spatial extent and development of these landforms contributes to improving hazard assessment. Monitoring systems based on repeated mapping and comparison of time series can reveal areas of comprehensive vertical erosion and migration of gullies, and may thereby draw attention to hazard mitigation (Ryan et al., 2022) and soil erosion (Barneveld, Stolte, & van der Zee, 2022; Kværnø & Krzeminska, 2021).

Earlier studies on the delineation and condition of marine clay gullies in Norway have relied on manual mapping of aerial images (Hamre et al., 2021) or high-resolution terrain models and surficial geological maps (Christoffersen et al., 2021; van Boeckel et al., 2022; van Boeckel et al., 2023). Approaches to automated delineation of gullies outside of Norway have also developed rapidly but have mostly focused on gully erosion susceptibility and comparison of different machine-learning algorithms (Arabameri et al., 2020; Arabameri et al., 2022; Band et al., 2020; Chen et al., 2021; Chuma et al., 2023; Gayen et al., 2019; Mohebzadeh et al., 2022; Setargie et al., 2023). Setargie et al. (2023) used a Random Forest-based approach in Ethiopia, combining 164 manually mapped gullies with 20 predictors. The predictors used in that study included elevation, slope, Topographic Position Index (TPI), Terrain Roughness (TR), profile curvature, convergence index, soil type and distance from streams. Band et al. (2020) applied a deep learning approach using 132 gully erosion locations with 13 independent variables, comprising lithology, rainfall, Stream Power Index (SPI), Topographic Wetness Index (TWI) and terrain derivatives similar to those used by Setargie et al. (2023). Liu et al. (2022) also tested the applicability of automated approaches in new blind-test areas by applying U-net for image segmentation using satellite images (QuickBird-2, Pleiades and WorldView-3) together with additionally obtained image data (WorldView-2 and PHANTOM 4 RTK UAV). Here, the authors successfully used vector lines of gullies as ground truth data but provided limited information on the accuracy of delineation of the depressions. On the other hand, Arabameri et al. (2021) and Roy and Saha (2022) applied ensemble models with conventional machine learning algorithms.
Lately, U-net and deep learning have been applied in different variations to map gully erosion (Aouragh et al., 2023; Chen et al., 2023; Gholami et al., 2023; Malik et al., 2021). These studies used topographical and hydrological gully erosion conditioning factors such as rainfall, distance from the river, surface runoff, length of overland flow and TWI. Although the studies mentioned above succeeded in identifying and delineating the gullies, little is known about the automatic differentiation of gullies impacted by human interventions, such as agricultural levelling, and how these predictions can further be used for geomorphological mapping routines.

In this study, we address these issues by (1) exploring automated differentiation of intact and impacted gullies, (2) carefully choosing predictors that are typically used in manual mapping approaches of marine clay gullies, (3) testing the applicability of pre-trained models to blind-test areas in similar and different geological settings and (4) discussing the applicability of deep learning predictions for geomorphological mapping routines. We do so by assessing the automatic differentiation of gullies with high precision, using U-net architecture and fully convolutional neural networks (FCNNs). The selected study areas are in Romerike and in Trøndelag (Figures 1 and 2), where intact and impacted gully systems are found in different geological and geomorphological settings. We evaluate the predictions statistically by (a) calculating precision, recall and F1 score with ground truth data and (b) comparing the predictions to a reference vector line dataset (Christoffersen et al., 2021; van Boeckel et al., 2022).

FIGURE 1. (a) Overview of the two study areas in Norway: Trøndelag and Romerike. (b) Elevation below the marine limit in Romerike, in shaded relief, using a slope map from Kartverket. (c) Surficial geology map of Romerike (NGU, 2024). The blue line marks the marine limit, and the thick black line marks the division between Romerike North and Romerike South. The marine clay gullies are mapped as vector lines by van Boeckel et al. (2022).

2 STUDY AREAS

The study areas represent different marine clay landscapes in two regions: in Romerike, South-East Norway, divided into Romerike North and Romerike South (Figure 1b), and in Mid-Norway, comprising Byneset, Orkdal and Stadsbygd (Figure 2). All study areas are located below the marine limit, which represents a modelled elevation of the highest relative sea level after deglaciation (Høgaas et al., 2022). The marine limit varies throughout the country; in South-East Norway, the marine limit reaches up to 220 m a.s.l., and in Mid-Norway, it reaches up to 190 m a.s.l. (NGU, 2024). The vast majority of the raised marine fine-grained deposits, hosting the marine clay gullies, are found below this limit.

All the study areas comprise large raised marine clay deposits, reflecting a near-horizontal surface of the old seabed before the inception of gully and river erosion. This near-horizontal surface can be regarded as a reference surface for estimating erosion depth in gullies and differs in elevation above sea level for each study area. Therefore, the base level of erosion along gullies is located at different elevations but is also controlled by bedrock, rivers, or the sea. An overview of the different characteristics of each study area and how the different areas have been involved in this study is shown in Table 1. More detailed descriptions of the different study areas are given below.

TABLE 1. Overview of the different study areas.

Study area	Landscape characteristics	Usage	Reference data
Romerike South	Large flats of fine-grained marine deposits incised by large parallel and dendritic gully networks	Training and testing. Predictions evaluated with reference data	Vector lines (van Boeckel et al., 2023)
Romerike North	Large flats of fine-grained marine deposits incised by predominantly dendritic gully networks	Blind-test area, predictions evaluated with reference data	Vector lines (van Boeckel et al., 2022)
Byneset	Large flats of fine-grained marine deposits incised by a few large dendritic gully networks	Blind-test area, predictions evaluated with reference data	Vector lines (NGU, 2024)
Orkdal	Narrow fjord valley with large fluvial plains and closely spaced gullies	Blind-test area, predictions evaluated with reference data	Vector lines (NGU, 2024)
Stadsbygd	Low and flat-lying marine deposits with short and shallow gullies	Blind-test area, predictions evaluated with reference data	Vector lines (Christoffersen et al., 2021)

Romerike and Byneset are characterised by bedrock hills protruding through large flats of fine-grained marine deposits (Figures 1b and 2a). These deposits have widely been incised by large, often dendritic networks of gullies and/or substantial quick-clay landsliding. The extensive networks of gullies vary in length from nearly a kilometre up to 10 km and depths between a few metres up to 40 m. Romerike has two large rivers, Glomma and Vorma. The gullies along these river systems, adjacent to the fluvial bars and plains, are relatively parallel and straight, oriented perpendicular to the main river, with less dendritic branching. Romerike also has four other smaller river systems, and large gully networks are connected to three of them. The main stream in Byneset is small, and its outlet is on bedrock that acts as the erosional basis.

Orkdal is a relatively narrow fjord valley with steep bedrock sides. Along the valley runs a large meandering river, Orkla, with fluvial plains and terraces in the valley bottom. The marine clay deposits are exposed at higher elevations on the valley sides but lie stratigraphically beneath the fluvial deposits in the valley bottom (Figure 2b). The gullies in Orkdal are steep, closely spaced and oriented perpendicular to the main river. Here, the gullies are relatively short, with only a few branched into networks and longer than 1 km.

Stadsbygd has one main, small stream in relatively flat-lying marine deposits confined by bedrock and with an outlet in the sea. The stream has only a few attached short and shallow gullies with depths of less than 10 m (Figure 2b).

3 METHODS

We aimed to train our models to identify and delineate gullies using a minimum number of predictor variables, comprising solely terrain derivatives, and ground truth data from Romerike S. We then assessed the performance of our models against sampled ground truth data and compared our predictions to reference datasets (Figure 3 and Table 1).

3.1 Data input and preparation

3.1.1 Data preparation—ground truth data

Marine clay gullies were digitised manually and prepared as ground truth data for training the model. The landforms were digitised using light detection and ranging (LiDAR) data, orthophotos provided by Kartverket (the Norwegian Mapping Authority) and existing Quaternary geological maps (NGU, 2024). The landforms were mapped with a minimum size of 2500 m². The shape of many individual gullies, or sections of more extensive gully networks, is partly impacted by human activity, often as a result of agricultural levelling and ploughing, and infilling with construction material (Erikstad, 1992; van Boeckel et al., 2022). Because the morphology of impacted, often smoothened, gullies differs markedly from that of unimpacted, steep and V-shaped gullies, we divided the ground truth dataset into two categories: sharp and smoothened gullies.

The width of sharp gullies scales with length but is typically less than 100 m and rarely exceeds 200 m. The slopes of both flanks are steep (>20°), often symmetrical, with abrupt boundaries to a surrounding near-horizontal surface (Figure 4f). The centreline along the base of the gully is gradually inclined with deeper incisions in the downstream direction. In contrast, smooth gullies are typically wider, ranging between 100 and 250 m, irrespective of the length of the gullies. The boundary of smooth gullies is more gradual, with shallow slopes along the flanks (<10°). Relative to the surrounding surface elevation, the base of smooth gullies is less deeply incised than that of sharp gullies. Due to levelling and/or infill of material, the base of smooth gullies can sometimes be undulating in the long-profile direction (Figure 4f).

Because the morphology of gullies also differs between near-parallel gullies and dendritic gully systems, we selected subsets of training data that covered both types of gully networks in the regions RS 1 and RS 2 in Romerike South (Figures 3 and 4). In order to test the amount of training data needed to predict gullies, we used two different settings: Setting 1, which included training data of the whole Romerike South, including ground truth data from RS 1 and RS 2, and Setting 2, which only used training data from RS 1 (Figures 3 and 4). In total, 186 smooth and 147 sharp gullies of different sizes, randomly spread across the training area, were used for training (Table 3).

3.1.2 Predictor variables—terrain derivatives from DEM

To target the above-mentioned morphological characteristics of marine clay gullies, we chose to use solely terrain derivatives describing the relative topography as predictor variables. We calculated the terrain derivatives from a high-resolution digital elevation model (DEM), derived from LiDAR, accessed January 2023 at https://hoydedata.no. The DEM is clipped to the area below the marine limit (Høgaas et al., 2022) and has a spatial resolution of 1 m with a vertical accuracy of approximately 0.1 m (Terratec, 2022). The terrain derivatives, comprising slope, TPI and terrain roughness (TR), were stacked into one composite raster (Figure 3). We used a moving window of 100 m for calculating TPI with the tool ‘DiffFromMeanElev’ in WhiteboxTools (Lindsay, 2014) and a moving window of 5 m for calculating TR as the standard deviation of the surrounding topography (Grohmann, Smith, & Riccomini, 2009).
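
The three predictors can be illustrated with a short sketch. The code below is a minimal pure-Python illustration, not the WhiteboxTools or ArcGIS implementation used in the study. It assumes a small DEM stored as a list of rows with square cells, a square moving window clipped at the grid edges, and hypothetical function names (`window_values`, `tpi`, `roughness`, `slope_deg`). The `radius` parameter in cells is also an assumption, since the text gives window sizes in metres only. TPI is taken as the cell elevation minus the window mean, TR as the standard deviation of elevation in the window, and slope from central differences.

```python
import math
from statistics import mean, pstdev


def window_values(dem, row, col, radius):
    """Elevations in a square window around (row, col), clipped at the grid edges."""
    rows, cols = len(dem), len(dem[0])
    return [
        dem[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]


def tpi(dem, row, col, radius):
    """Difference from mean elevation: negative in depressions such as gullies."""
    return dem[row][col] - mean(window_values(dem, row, col, radius))


def roughness(dem, row, col, radius):
    """Terrain roughness as the standard deviation of elevation in the window."""
    return pstdev(window_values(dem, row, col, radius))


def slope_deg(dem, row, col, cell_size=1.0):
    """Slope in degrees from central differences (interior cells only)."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Synthetic 1 m DEM (hypothetical data): a flat surface at 150 m a.s.l.
# cut by a 3 m deep V-shaped gully whose floor runs along column 5.
dem = [[150.0 - max(0, 3 - abs(c - 5)) for c in range(11)] for _ in range(11)]

print(tpi(dem, 5, 5, radius=3))        # negative: the floor lies below the window mean
print(slope_deg(dem, 5, 4))            # about 45 degrees on the 1:1 gully flank
print(roughness(dem, 5, 0, radius=1))  # 0.0 on the flat surrounding surface
```

Because all three quantities depend only on elevation differences within the window, shifting the whole DEM up or down leaves them unchanged, which is the property that makes them transferable between study areas at different elevations.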

Initial test runs also included additional categorical data, such as the land-use map Arealressurskart (AR5, 1:5 000) (Ahlstrøm, Bjørkelo, & Fadnes, 2019) and surficial deposit maps at 1:50 000 (NGU, 2024), as well as continuous elevation data. Because the pre-trained model then failed to predict gullies in the blind-test areas, and because the scope of this study was to apply the pre-trained model to other regions, we dropped these predictor variables from further analysis; these test runs are not presented in the results section.

3.2 Method and evaluation

3.2.1 Semi-automated mapping

We used a convolutional neural network (CNN) to predict gullies using training data consisting of predictor variables (slope, TPI and TR) and ground truth data. Models based on CNN can distinguish patterns exceptionally well in applications that deal with image data (Chen et al., 2019; Zaidi et al., 2022). Convolutions are matrix calculations based on a moving window, usually using 3 × 3 cells, to compile geospatial information into classified tiles (Albawi, Mohammed, & Al-Zawi, 2017). The spatial dimension of the classified tiles is crucial; therefore, we apply U-net (Ronneberger, Fischer, & Brox, 2015) and CNN architecture for semantic segmentation and pixel-based classification (Prakash, Manconi, & Loew, 2021; Zhang, Zhang, & Du, 2016). We used the backbone ResNet34 to create the UnetClassifier base, using ArcPy libraries to build a dynamic U-Net. U-Net is known for being fast, effective and precise in segmentation, recognising objects based on local information in the ground truth (Leng et al., 2019). This approach requires two types of data sources for training: ground truth data, with vector-based manually mapped features that are to be predicted, and terrain derivatives used for recognising these features (Nodjoumi et al., 2023). To train robust models (Shelhamer, Long, & Darrell, 2017; Ye, Ni, & Yi, 2017), we used training datasets for Settings 1 and 2 containing both a raster stack of terrain derivatives and classified ground truth polygons for the corresponding areas; see also Figure 3. After testing the training data with different numbers of classified tiles, we found that 20 000 randomly generated samples, exported as classified tiles, gave the best training results. The randomly generated samples were exported as classified tiles using an Image Analyst licence (‘Export Training Data for Deep Learning’) from ArcGIS Pro (ArcGIS Pro, 2022).
The most suitable classified tile size in our case was 256 × 256 pixels. To obtain a 50% overlap between neighbouring tiles when creating the image chips, the stride, which describes the distance of movement in the x- and y-directions, was set to 128 × 128 pixels. The entire process of training, evaluating and exporting the model was conducted using Jupyter Notebook and ArcPy libraries. During the training process, an input image in the form of classified tiles flows through the CNN, which processes it with a set of trainable kernels, resulting in a group of feature maps (Liu, 2018). The models were trained for up to 50 epochs. To avoid overfitting, we applied early stopping when the training did not improve for 10 epochs. We also applied the Adam optimizer (Kingma & Ba, 2015; Malik et al., 2021). The other parameters were maintained at their default values. The trained model was saved as a ‘Deep Learning Package’ (‘.dlpk’ format), which is the standard format used to deploy deep learning models on the ArcGIS Pro platform and can be used further as a pre-trained model (ESRI, 2023; Ma & Mei, 2021; Miranda & Von Zuben, 2015). The trained models were then used to predict gullies in the other study areas for the blind test (Figures 1b and 2).
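
The relation between tile size, stride and overlap, and the early-stopping rule, can be sketched as follows. This is a minimal illustration under stated assumptions, not the ArcGIS export or training code used in the study: the function names (`tile_origins`, `overlap_fraction`, `epochs_run`) are hypothetical, tiles are laid out on a regular grid rather than sampled randomly, and the monitored quantity for early stopping is assumed to be a validation loss.

```python
def tile_origins(width, height, tile=256, stride=128):
    """Upper-left pixel coordinates of all full tiles on a regular grid."""
    xs = range(0, width - tile + 1, stride)
    ys = range(0, height - tile + 1, stride)
    return [(x, y) for y in ys for x in xs]


def overlap_fraction(tile=256, stride=128):
    """Share of a tile's width shared with its neighbour in one direction."""
    return max(0, tile - stride) / tile


def epochs_run(val_losses, max_epochs=50, patience=10):
    """Number of epochs completed before early stopping halts training.

    Training stops once the loss has not improved for `patience` epochs in a row.
    """
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return min(len(val_losses), max_epochs)


print(len(tile_origins(width=1024, height=512)))  # 7 columns x 3 rows = 21 tiles
print(overlap_fraction())                         # 0.5, i.e. 50% overlap

# Hypothetical loss curve: best value at epoch 3, no improvement afterwards.
losses = [1.0, 0.8, 0.7] + [0.75] * 30
print(epochs_run(losses))                         # 13 = 3 + patience of 10
```

With a stride of half the tile size, every interior pixel is covered by four tiles, so features cut by one tile boundary appear whole in a neighbouring tile.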

3.2.2 Evaluation

The resulting predictions of smooth and sharp gullies were evaluated quantitatively by comparing pixels of the sampled ground truth data of gullies to the automated predictions for the same areas. We calculated precision, recall and F1 score to evaluate the performance of the two proposed models. Precision is the share of positive predictions that are correct (true positives) (Table 2a), while recall is the share of all positive cases in the data that were correctly predicted. The F1 score combines precision and recall as their harmonic mean (Table 2b). A high F1 score, close to 1, means that there are few false positives and few false negatives (Lipton, Elkan, & Naryanaswamy, 2014).

TABLE 2. (a) Definitions of true positive (TP), false positive (FP), false negative (FN) and true negative (TN) (Safari, 2015; Skaik, 2008). (b) The performance metrics precision, recall and F1 score, calculated from the values in (a).

(a)
Prediction	Actual value	Type	Explanation
1	1	True positive (TP)	Predicted positive and was positive
0	0	True negative (TN)	Predicted negative and was negative
1	0	False positive (FP)	Predicted positive but was negative
0	1	False negative (FN)	Predicted negative but was positive

(b)
Metric	Formula
Precision	$\frac{TP}{TP + FP}$
Recall	$\frac{TP}{TP + FN}$
F1 score	$\frac{2 TP}{2 TP + FP + FN}$
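
The definitions in Table 2 translate directly into code. The sketch below is a minimal illustration, assuming predictions and ground truth are given as flat sequences of 0/1 pixel labels; the function names and the example labels are hypothetical and are not data from this study.

```python
def confusion_counts(predicted, actual):
    """Count TP, FP, FN and TN for paired binary labels (Table 2a)."""
    tp = fp = fn = tn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif a:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    """TP / (TP + FP); defined as 0.0 when nothing was predicted positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """TP / (TP + FN); defined as 0.0 when there are no positive cases."""
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(tp, fp, fn):
    """2TP / (2TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical pixel labels: 1 = gully, 0 = no gully.
predicted = [1, 1, 1, 0, 0, 1, 0, 0]
actual = [1, 1, 0, 0, 1, 1, 0, 0]

tp, fp, fn, tn = confusion_counts(predicted, actual)
print(tp, fp, fn, tn)        # 3 1 1 3
print(precision(tp, fp))     # 0.75
print(recall(tp, fn))        # 0.75
print(f1_score(tp, fp, fn))  # 0.75
```

True negatives do not enter any of the three metrics, which keeps the scores meaningful when gully pixels cover only a small part of the evaluated area.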

When trained properly, pre-trained models can be used for similar problems in similar settings to save time and reduce the need for more ground truth data (Ma & Mei, 2021; Tehrani et al., 2022). For this reason, we test our pre-trained models' applicability to the four blind-test areas: Romerike N, Byneset, Orkdal and Stadsbygd (Figures 1 and 2). Here, we regard Romerike N and Byneset to have a similar geological setting with large networks of dendritic gullies. In Orkdal and Stadsbygd, the geological setting is different, with shorter, often shallower and more individual gullies. We compare the predictions in all study areas with a reference dataset comprising manually mapped gullies as vector lines from the Geological Survey of Norway (Christoffersen et al., 2021; van Boeckel et al., 2022) (Figures 1c and 2 and Table 1). First, we post-processed the predicted delineations to remove noise, so that they can readily be incorporated into manual geomorphological mapping routines. The post-processing comprised (1) transforming the pixels into vector shapes, (2) buffering and dissolving the vector shapes with a 5 m radius, and (3) applying a filter by removing polygons smaller than 5000 m². Then, we compared the post-processed predictions with the reference dataset using an overlay and intersect analysis to calculate the length-weighted overlap and the coverage of intersecting gullies. The latter represented the relative surface area of post-processed predictions intersecting with the reference dataset. We regard the intersecting predictions as true positives, which we can then use as a first-order indication of the agreement between post-processed predictions and the reference dataset. The length-weighted overlap represented the relative length of the vector lines overlapping with the post-processed predictions. The cumulative length of lines that did not overlap with the post-processed predictions can be regarded as an indicator of false negatives (Figure 5).
We note that comparing the reference dataset with post-processed predictions does not give any information about the accuracy of the delineation of the landforms, which was done visually.
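
The post-processing and the length-weighted overlap can be illustrated with a raster analogue. The study worked with vector polygons and lines in ArcGIS; the sketch below instead stays on a binary grid, and is only an approximation under stated assumptions: dilation with a square neighbourhood stands in for buffering, 4-connected patches stand in for dissolved polygons, a reference line is given as a list of cells with one cell per unit length, and all function names and thresholds are hypothetical.

```python
from collections import deque


def dilate(mask, radius):
    """Grow predicted cells by `radius` cells, a raster stand-in for buffering."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = True
    return out


def remove_small_patches(mask, min_cells):
    """Keep only 4-connected patches with at least `min_cells` cells."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            patch, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first search over one patch
                cr, cc = queue.popleft()
                patch.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(patch) >= min_cells:
                for pr, pc in patch:
                    out[pr][pc] = True
    return out


def length_weighted_overlap(line_cells, mask):
    """Share of a rasterised reference line that falls inside the mask."""
    inside = sum(1 for r, c in line_cells if mask[r][c])
    return inside / len(line_cells)


# Hypothetical prediction: a 10-cell gully along row 4 and one isolated noise pixel.
mask = [[False] * 20 for _ in range(10)]
for c in range(2, 12):
    mask[4][c] = True
mask[8][18] = True

cleaned = remove_small_patches(dilate(mask, radius=1), min_cells=20)
reference_line = [(4, c) for c in range(16)]  # hypothetical reference gully line

print(sum(map(sum, cleaned)))                            # 36 cells kept, noise removed
print(length_weighted_overlap(reference_line, cleaned))  # 0.75
```

On a 1 m grid, the 5 m buffer and the 5000 m² filter described above would correspond to `radius=5` and `min_cells=5000`; the smaller values here only keep the example readable.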

4 RESULTS

In this section, we present the performance of the U-net model in Romerike S for two data settings (Settings 1 and 2), obtained in a Jupyter Notebook environment with ArcPy libraries. The quantitative evaluation of the sampled pixels between ground truth data and predictions is presented in Table 3. Sharp gullies achieve higher precision (0.75–0.82), recall (0.69–0.73) and F1 score (0.72–0.74). Smooth gullies tend to achieve lower scores for precision (0.70–0.72), recall (0.66–0.72) and F1 score (0.68–0.72) (Table 3). Our results show that by using the same number of training samples (20 000) but from a more extensive and more diverse study area, the performance of Setting 1 (Figure 1) increased only slightly for sharp gullies and decreased for smooth gullies, with F1 score changes of +0.02 and −0.04, respectively.

TABLE 3. The statistics presented show the performances when applying training in two different settings: Setting 1 with training data from RS 1 and RS 2 and Setting 2 with training data only from RS 1 (Figure 1).

	Setting 1		Setting 2	
	Smooth gullies	Sharp gullies	Smooth gullies	Sharp gullies
GTPs	186	147	95	47
Precision	0.70	0.75	0.72	0.82
Recall	0.66	0.73	0.72	0.69
F1	0.68	0.74	0.72	0.72

Abbreviation: GTPs, ground truth polygons.

Even though the statistics show minor differences in the overall performance using the different data settings, visual inspection reveals that the different models pick up different sections along the same gullies. This can also be observed when comparing the predictions to the reference datasets. When combining the predictions of Settings 1 and 2, the length-weighted overlap and coverage of intersecting gullies score higher than when using Setting 2 alone, with the respective average scores increasing by about 10 and 2.4 percentage points (Table 4).

TABLE 4. The comparison of the reference dataset and post-processed predictions for Setting 2 and Settings 1 and 2 (the combined products of both Settings 1 and 2).

	Setting 2			Settings 1 and 2		
	Predictions (no.)	Coverage of intersecting gullies	Length-weighted overlap	Predictions (no.)	Coverage of intersecting gullies	Length-weighted overlap
Romerike S	2117	82.8%	67.8%	1918	87.4%	70.5%
Romerike N	1245	94.7%	76.6%	954	94.1%	86.1%
Byneset	175	91.7%	64.5%	160	91.4%	72.7%
Orkdal	120	94.9%	55.1%	105	94.1%	67.2%
Stadsbygd	59	71.2%	38.4%	72	80.2%	56.1%
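
The average gains quoted for Table 4 can be reproduced with a few lines of arithmetic. The sketch below transcribes the percentages from Table 4 and averages them over all five areas (Romerike S plus the four blind-test areas); the variable and function names are hypothetical.

```python
from statistics import mean

# Values transcribed from Table 4 (percent), in the order:
# Romerike S, Romerike N, Byneset, Orkdal, Stadsbygd.
coverage_s2 = [82.8, 94.7, 91.7, 94.9, 71.2]    # Setting 2
coverage_s12 = [87.4, 94.1, 91.4, 94.1, 80.2]   # Settings 1 and 2 combined
overlap_s2 = [67.8, 76.6, 64.5, 55.1, 38.4]     # Setting 2
overlap_s12 = [70.5, 86.1, 72.7, 67.2, 56.1]    # Settings 1 and 2 combined


def mean_gain(before, after):
    """Difference between the area-averaged scores, in percentage points."""
    return mean(after) - mean(before)


print(round(mean_gain(overlap_s2, overlap_s12), 1))    # 10.0
print(round(mean_gain(coverage_s2, coverage_s12), 1))  # 2.4
```

The gains are differences between percentages and are therefore expressed in percentage points rather than as relative changes.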

The next step was to compare the predictions of the pre-trained models to the reference datasets in Romerike S and the four blind-test areas: Romerike N, Byneset, Orkdal and Stadsbygd (Figures 1 and 2 and Table 4). We applied both pre-trained models (Settings 1 and 2) to all the study areas and post-processed the predicted pixels, as explained in Section 3.2.2. The first category of blind-test areas, Byneset and Romerike N, which have a geological setting similar to that of Romerike S, shows promising results, scoring 91.7% and 94.7% for coverage of intersecting gullies and 64.5% and 76.6% for length-weighted overlap, respectively. The length-weighted overlap increased when combining the post-processed predictions of Settings 1 and 2, whereas the coverage of intersecting gullies remained nearly unchanged (Table 4). The blind-test areas Stadsbygd and Orkdal scored poorly in length-weighted overlap with 38.4% and 55.1%, respectively, when only using the post-processed products of Setting 2. The low length-weighted overlap values indicate that the pre-trained models did not pick up many vector lines from the reference dataset. The coverage of intersecting gullies scored relatively high (>71%) for all the blind-test areas, which indicates that the post-processed predictions of the pre-trained models largely identified the gullies successfully.

5 DISCUSSION

5.1 Automated differentiation of intact and impacted gullies

The delineation of landforms is the fundamental process of mapping the spatial extent and condition of landscape change, which conventionally is performed manually using high-resolution optical remote-sensing images or LiDAR data. Our results show that using only three terrain derivatives and 142 manually mapped gullies (Setting 2), the U-net model successfully predicted and differentiated intact sharp gullies from impacted smooth gullies. Quantitative pixel evaluation of sampled ground truth data revealed that more than doubling the ground truth data (Setting 1) only slightly improved the F1 score for sharp gullies (+0.02) but decreased it for smooth gullies (−0.04). Overall, both automated identification models revealed promising results in differentiating gullies impacted by agricultural levelling from intact gullies, as also seen in Roy and Saha (2022).

5.2 The minimal amount of predictors

For the blind-test areas, the best predictions were achieved using predictors indicative of relative elevation, for example, the terrain derivatives slope, TR and TPI, as opposed to absolute elevation from DEMs. We explain this by the fact that the gullies are found in raised marine fine-grained deposits at varying elevations between the study areas (Figures 1 and 2). Because the model was trained on ground truth data located at elevations between approximately 100 and 180 m a.s.l., the pre-trained model was unable to detect gullies at lower elevations. We therefore stress that elevation data should be used with caution as a predictor when predicting blind-test areas.

Unlike similar studies delineating gullies with over a dozen independent predictors (Band et al., 2020; Setargie et al., 2023), we show that a promising delineation of gullies can be achieved by only using three predictor layers derived from high-resolution elevation data. Although we recognize that adding further predictor variables, such as curvature or slope-of-slope terrain derivatives and categorical land-use maps, could potentially improve the performance of our models, we argue that having few predictors makes our approach more accessible and applicable in other areas for future mapping.

5.3 The applicability of pre-trained models to blind-test areas

The robustness of a model increases when successful predictions are not limited to trained areas but extend to blind-test areas in other regions (Sarker, 2021). As we do not have ground truth data for our blind-test areas, we rely on overlay analysis between the vector-line reference dataset and the post-processed predictions. Our results show that the post-processed predictions broadly intersect with the reference dataset (≥71.2%). If we only regard the blind-test areas with similar geological settings, namely, Romerike N and Byneset, the coverage of intersecting gullies increases to ≥91.7%, indicating that the model manages to accurately identify gullies. Similarly, Romerike N and Byneset score considerably higher in length-weighted overlap compared with Orkdal and Stadsbygd, reflecting that large stretches of the reference dataset overlap with the predictions of the pre-trained models. The difference in geological setting can explain the lower length-weighted overlap values for Orkdal and Stadsbygd. In these areas, the gullies are much shorter and less branched into networks compared with the gullies used in the ground truth dataset. Future efforts to train the model specifically for these settings or to include them in the training dataset might increase the model's performance.

We noticed that the delineation of the predictions was improved by using the combined predictions of Settings 1 and 2. This improved performance is also reflected by higher length-weighted overlap values for all blind-test areas, with an average increase of about 10 percentage points. Similar to the studies of Arabameri et al. (2021) and Roy and Saha (2022), which used ensemble models for forecasting areas vulnerable to gully erosion, our findings confirm that the combined products of the pre-trained models improve the overall delineation of the gullies.

5.4 Applicability of DL in geomorphological mapping

One of the advantages of automated approaches over manual mapping is that the automatic delineation of landforms can be evaluated quantitatively against ground truth data. For example, predictions can be evaluated by positively identified pixels (e.g. Setargie et al., 2023) and positively identified vector lines (e.g. Band et al., 2020). Even though quantitative evaluations can give satisfactory results, they provide little information about the correctness of the delineation of the predicted gullies. A gully can, for example, be identified with a pixel accuracy of 75%, but this does not necessarily mean that the outer extent of the predicted gully corresponds to the actual landform. As manual mapping routines often involve the delineation of individual landforms, an inaccurate outer delineation still requires substantial adjustment before it satisfies the prerequisites for use in geomorphological maps. We found that post-processing the pixel-based predictions into coherent polygons and reducing noise with a minimum size filter significantly increased the applicability of the product for mapping routines (see also Figure 6). We point out that the post-processing of prediction delineations should be considered when implementing automated approaches in manual mapping routines.
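The noise-reduction step can be illustrated with a minimal raster sketch of a minimum size filter: connected patches of predicted cells are collected, and patches smaller than a chosen size are dropped. The 4-connectivity, the threshold of three cells and the function name are assumptions made for this example; they are not the parameters of our post-processing, which operates on polygons.

```python
from collections import deque


def remove_small_patches(mask, min_cells):
    """Drop 4-connected patches of predicted cells that have fewer than min_cells cells."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects all cells of one connected patch.
            patch, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                pr, pc = queue.popleft()
                patch.append((pr, pc))
                for nr, nc in ((pr - 1, pc), (pr + 1, pc), (pr, pc - 1), (pr, pc + 1)):
                    if (
                        0 <= nr < rows
                        and 0 <= nc < cols
                        and mask[nr][nc]
                        and not seen[nr][nc]
                    ):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Only patches that reach the minimum size are written to the output.
            if len(patch) >= min_cells:
                for pr, pc in patch:
                    cleaned[pr][pc] = 1
    return cleaned


# Hypothetical prediction: one 2 x 2 gully patch plus two isolated noise cells.
prediction = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

for row in remove_small_patches(prediction, min_cells=3):
    print(row)
# [1, 1, 0, 0, 0]
# [1, 1, 0, 0, 0]
# [0, 0, 0, 0, 0]
# [0, 0, 0, 0, 0]
```

The filter does not change the outline of the patches it keeps, so it addresses isolated false positives only and leaves the delineation issue described above to the polygon conversion and manual adjustment.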

6 CONCLUSION

Developments in computing and deep learning algorithms, together with the increased availability of free high-resolution data, have the potential to automate many mapping problems in Earth sciences. However, the application of deep learning techniques to differentiating and delineating landforms in mapping routines has not been thoroughly investigated. We contribute by exploring the automated differentiation of intact (sharp) and impacted (smooth) gullies with high precision, combining a deep learning FCNN model with only three terrain derivatives (slope, TR and TPI). Our best model achieved precision scores of 0.82 and 0.72 for sharp and smooth gullies, respectively. Our pre-trained models successfully predicted gullies in blind-test areas, scoring >70% length-weighted overlap and >82% coverage of intersecting gullies for regions with similar geological settings. We find that combining model predictions, along with post-processing the predictions, increases the agreement between automated delineations and reference datasets. We therefore stress the importance of post-processing steps to enhance the applicability of deep learning models in geomorphological mapping routines. The outcome of this research contributes towards better implementing automated approaches in manual mapping routines, as well as the development of monitoring systems for future landscape change.

AUTHOR CONTRIBUTIONS

Conceptualization: Alexandra Jarna Ganerød and Mikis van Boeckel. Methodology: Alexandra Jarna Ganerød and Mikis van Boeckel. Software: Alexandra Jarna Ganerød and Mikis van Boeckel. Validation: Alexandra Jarna Ganerød and Mikis van Boeckel. Formal analysis: Alexandra Jarna Ganerød and Mikis van Boeckel. Investigation: Alexandra Jarna Ganerød and Mikis van Boeckel. Resources: Alexandra Jarna Ganerød, Mikis van Boeckel and Inger-Lise Solberg. Data curation: Alexandra Jarna Ganerød and Mikis van Boeckel. Writing—original draft preparation: Alexandra Jarna Ganerød, Mikis van Boeckel and Inger-Lise Solberg. Visualization: Alexandra Jarna Ganerød and Mikis van Boeckel. All authors have read and agreed to the published version of the manuscript.

ACKNOWLEDGEMENTS

We are grateful to all those with whom we have had the pleasure to work during this and other related projects. This manuscript benefited from the copy read by Danielle Robert. We thank Gabriela Spakman-Tánásescu for introducing ArcGIS Pro and deep learning possibilities, and the DEEP: Norwegian Research School for Dynamics and Evolution of Earth and Planets.

Open Research

DATA AVAILABILITY STATEMENT

The source code is available for download here: https://github.com/alexandra-jarna/Ravines-Norway. Programming language: Python. Software required: data preparation (ArcGIS Pro/QGIS). Pretrained models are available for download here: https://github.com/alexandra-jarna/Ravines-Norway/blob/main/pre-trained-models.

REFERENCES

Ahlstrøm, A., Bjørkelo, K., & Fadnes, K. D. (2019) AR5 Klassifikasjonssystem Klassifisering av arealressurser.
Albawi, S., Mohammed, T.A. & Al-Zawi, S. (2017) Understanding of a convolutional neural network. In: Proceedings of 2017 International Conference on Engineering and Technology, ICET 2017, 2018-Janua (April 2018), pp. 1–6. Available at: https://doi.org/10.1109/ICEngTechnol.2017.8308186
Aouragh, M.H., Ijlil, S., Essahlaoui, N., Essahlaoui, A., el Hmaidi, A., el Ouali, A. et al. (2023) Remote sensing and GIS-based machine learning models for spatial gully erosion prediction: a case study of Rdat watershed in Sebou basin, Morocco. Remote Sensing Applications: Society and Environment, 30(February), 100939. Available from: https://doi.org/10.1016/j.rsase.2023.100939
Arabameri, A., Chandra Pal, S., Costache, R., Saha, A., Rezaie, F., Seyed Danesh, A. et al. (2021) Prediction of gully erosion susceptibility mapping using novel ensemble machine learning algorithms. Geomatics, Natural Hazards and Risk, 12(1), 469–498. Available from: https://doi.org/10.1080/19475705.2021.1880977
Arabameri, A., Chandra Pal, S., Santosh, M., Chakrabortty, R., Roy, P. & Moayedi, H. (2022) Drought risk assessment: integrating meteorological, hydrological, agricultural and socio-economic factors using ensemble models and geospatial techniques. Geocarto International, 37(21), 6087–6115. Available from: https://doi.org/10.1080/10106049.2021.1926558
Arabameri, A., Chen, W., Loche, M., Zhao, X., Li, Y., Lombardo, L. et al. (2020) Comparison of machine learning models for gully erosion susceptibility mapping. Geoscience Frontiers, 11(5), 1609–1620. Available from: https://doi.org/10.1016/j.gsf.2019.11.009
ArcGIS Pro. (2022) Export training data for deep learning (image analyst). Available at: https://pro.arcgis.com/en/pro-app/latest/tool-reference/image-analyst/export-training-data-for-deep-learning.htm
Band, S.S., Janizadeh, S., Chandra Pal, S., Saha, A., Chakrabortty, R., Shokri, M. et al. (2020) Novel ensemble approach of deep learning neural network (DLNN) model and particle swarm optimization (PSO) algorithm for prediction of gully erosion susceptibility. Sensors (Switzerland), 20(19), 1–28. Available from: https://doi.org/10.3390/s20195609
Barneveld, R.J., Stolte, J. & van der Zee, S. (2022) Estimating ephemeral gully erosion rates in a Norwegian agricultural catchment using low-altitude Uav imagery. SSRN Electronic Journal, 1, 1–34. Available from: https://doi.org/10.2139/ssrn.4085344
Blindheim, T. & Abel, K. (2002) Vilt i Skedsmo kommune [Wildlife in Skedsmo municipality]. Siste Sjan.
Blindheim, T. et al. (2018) Kartlegging av arter i raviner i Skedsmo kommune 2017 [Mapping of species in gullies in Skedsmo municipality 2017].
Bridge, J.S. (2003) Rivers and floodplains: forms, processes, and sedimentary record. Journal of Quaternary Science, 19(6), 618–619. Available from: https://doi.org/10.1002/jqs.856
Chen, R., Zhou, Y., Wang, Z., Li, Y., Li, F. & Yang, F. (2023) Towards accurate mapping of loess waterworn gully by integrating google earth imagery and DEM using deep learning. International Soil and Water Conservation Research, 12(1), 13–28. Available from: https://doi.org/10.1016/j.iswcr.2023.06.006
Chen, W., Lei, X., Chakrabortty, R., Chandra Pal, S., Sahana, M. & Janizadeh, S. (2021) Evaluation of different boosting ensemble machine learning models and novel deep learning and boosting framework for head-cut gully erosion susceptibility. Journal of Environmental Management, 284, 112015. Available from: https://doi.org/10.1016/j.jenvman.2021.112015
Chen, W. et al. (2019) Evaluation of different machine learning methods and deep-learning convolutional neural networks for landslide detection. Remote Sensing, 195(October), 104777. Available from: https://doi.org/10.3390/rs11020196
Christoffersen, M. et al. (2021) Kartlegging av rødlistede landformer: videreføring av pilotprosjekt 2019, NGU rapport nr. 2021.001.
Chuma, G.B., Mugumaarhahama, Y., Mond, J.M., Bagula, E.M., Ndeko, A.B., Lucungu, P.B. et al. (2023) Gully erosion susceptibility mapping using four machine learning methods in Luzinzi watershed, eastern Democratic Republic of Congo. Physics and Chemistry of the Earth, 129, 103295. Available from: https://doi.org/10.1016/j.pce.2022.103295
Erikstad, L. (1992) Recent changes in the landscape of the marine clays, Østfold, southeast Norway. Norsk Geografisk Tidsskrift - Norwegian Journal of Geography, 46(1), 19–28. Available from: https://doi.org/10.1080/00291959208552279
Erikstad, L. et al. (2018) Landformer. Norsk rødlista for naturtyper 2018. Available at: https://www.artsdatabanken.no/Pages/259126
ESRI. (2023) Pretrained deep learning models. Available at: https://www.esri.com/en-us/arcgis/deep-learning-models
Gayen, A., Pourghasemi, H.R., Saha, S., Keesstra, S. & Bai, S. (2019) Gully erosion susceptibility assessment and management of hazard-prone areas in India using different machine learning algorithms. Science of the Total Environment, 668, 124–138. Available from: https://doi.org/10.1016/j.scitotenv.2019.02.436
Gholami, H., Mohammadifar, A., Golzari, S., Song, Y. & Pradhan, B. (2023) Interpretability of simple RNN and GRU deep learning models used to map land susceptibility to gully erosion. Science of the Total Environment, 904(July), 166960. Available from: https://doi.org/10.1016/j.scitotenv.2023.166960
Grohmann, C.H., Smith, M.J. & Riccomini, C. (2009) Surface roughness of topography: a multi-scale analysis of landform elements in Midland Valley, Scotland. Proceedings of Geomorphometry, 2009, 140–148. Available at: https://doi.org/citeulike-article-id:8857982
Hamre, L.N., Rydgren, K., Incerti, C., Hjorth-Johansen, I. & Simonsen, K.S. (2021) Paradise lost—transformation of the gully landscape in South-East Norway. Landscape Research, 46(3), 377–389. Available from: https://doi.org/10.1080/01426397.2020.1847263
Hansen, L., Eilertsen, R., Solberg, I.L. & Rokoengen, K. (2007) Stratigraphic evaluation of a Holocene clay-slide in Northern Norway. Landslides, 4(3), 233–244. Available from: https://doi.org/10.1007/s10346-006-0078-4
Higgins, C. G. & Coates, D. R. (1990) Groundwater geomorphology: the role of subsurface water in Earth-surface processes and landforms. Geological Society of America (Geological Society of America Special Paper). Available at: https://books.google.no/books?id=4jFmf4cGyRYC
Høgaas, F., Hansen, L., Mølmann, K., Nordahl, B., Pettersen, E., Romundset, A., Seternes, L. & Wesche, J. G. (2022) Datasett for registrering av marin grense (MG) i Norge. NGU-Rapport nr. 2022.005, 35.
Jansson, U. & Høitomt, T. (2013) Ravinekartlegging i Nannestad kommune 2012 [Mapping of gullies in Nannestad municipality 2012].
Kingma, D.P. & Ba, J.L. (2015) Adam: A method for stochastic optimization. In: 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, pp. 1–15.
Kværnø, S. H. & Krzeminska, D. (2021) Agricat2-beregninger av jord- og fosfortap i vannområdet PURA, basert på arealbruk i 2020, 7(178).
Leng, J., Liu, Y., Zhang, T., Quan, P. & Cui, Z. (2019) Context-aware U-Net for biomedical image segmentation. In: Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018, (February), pp. 2535–2538. Available at: https://doi.org/10.1109/BIBM.2018.8621512
Lindsay, J.B. (2014) The whitebox geospatial analysis tools project and open-access GIS. In: Proceedings of the GIS research UK 22nd annual conference. Glasgow, UK, pp. 16–18.
Lipton, Z.C., Elkan, C. & Naryanaswamy, B. (2014) Optimal thresholding of classifiers to maximize F1 measure. In: R. Meo (Ed.) Machine learning and knowledge discovery in databases. Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 225–239. Available from: https://doi.org/10.1007/978-3-662-44851-9_15
Liu, B., Zhang, B., Feng, H., Wu, S., Yang, J., Zou, Y. et al. (2022) Ephemeral gully recognition and accuracy evaluation using deep learning in the hilly and gully region of the Loess Plateau in China. International Soil and Water Conservation Research, 10(3), 371–381. Available from: https://doi.org/10.1016/j.iswcr.2021.10.004
Liu, Y.H. (2018) Feature extraction and image recognition with convolutional neural networks. Journal of Physics: Conference Series, 1087, 062032. Available from: https://doi.org/10.1088/1742-6596/1087/6/062032
Ma, Z. & Mei, G. (2021) Deep learning for geological hazards analysis: data, models, applications, and opportunities. Earth-Science Reviews, 223, 103858. Available from: https://doi.org/10.1016/j.earscirev.2021.103858
Malik, K., Robertson, C., Braun, D. & Greig, C. (2021) U-Net convolutional neural network models for detecting and quantifying placer mining disturbances at watershed scales. International Journal of Applied Earth Observation and Geoinformation, 104(May), 102510. Available from: https://doi.org/10.1016/j.jag.2021.102510
Miranda, C. S. & Von Zuben, F. J. (2015) Reducing the training time of neural networks by partitioning, pp. 1–10. Available at: http://arxiv.org/abs/1511.02954
Mohebzadeh, H., Biswas, A., Rudra, R. & Daggupati, P. (2022) Machine learning techniques for gully erosion susceptibility mapping: a review. Geosciences (Switzerland), 12(12), 429. Available from: https://doi.org/10.3390/geosciences12120429
NGU. (2024) Quaternary geological map from Geological Survey of Norway. Available at: https://geo.ngu.no/kart/losmasse_mobil
Nodjoumi, G., Pozzobon, R., Sauro, F. & Rossi, A.P. (2023) DeepLandforms: a deep learning computer vision toolset applied to a prime use case for mapping planetary skylights. Earth and Space Science, 10(1), e2022EA002278. Available from: https://doi.org/10.1029/2022EA002278
Prakash, N., Manconi, A. & Loew, S. (2021) A new strategy to map landslides with a generalized convolutional neural network. Scientific Reports, 11(1), 1–15. Available from: https://doi.org/10.1038/s41598-021-89015-8
Reite, A. J., Sveian, H., & Erichsen, E. (1999) Trondheim fra istid til nåtid – landskapshistorie og løsmasser. Norges geologiske undersøkelse Gråsteinen, 40 p.
Ronneberger, O., Fischer, P. & Brox, T. (2015) U-Net: convolutional networks for biomedical image segmentation. In: N. Navab, et al. (Eds.) Medical image computing and computer-assisted intervention -- MICCAI 2015. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-24574-4_28
Roy, J. & Saha, S. (2022) Ensemble hybrid machine learning methods for gully erosion susceptibility mapping: K-fold cross validation approach. Artificial Intelligence in Geosciences, 3(March), 28–45. Available from: https://doi.org/10.1016/j.aiig.2022.07.001
Ryan, I., Bruvoll, A., Foldal, K. M., Hæreid, G. O., Muthanna, T. M., Nordal, S., Ottesen, H. B., Solberg, I. L. (2022) På trygg grunn — Bedre håndtering av kvikkleirerisiko. Available at: https://www.regjeringen.no/no/dokumenter/nou-2022-3/id2905694/?ch=1
Safari, S., Baratloo, A., Elfil, M. & Negida, A.S. (2015) Evidence based emergency medicine part 2: positive and negative predictive values of diagnostic tests. Emergency (Tehran, Iran), 3(3), 87–88. Available from: http://www.ncbi.nlm.nih.gov/pubmed/26495390 and http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC4608333
Sarker, I.H. (2021) Machine learning: algorithms, real-world applications and research directions. SN Computer Science, 2(3), 1–21. Available from: https://doi.org/10.1007/s42979-021-00592-x
Setargie, T.A., Tsunekawa, A., Haregeweyn, N., Tsubo, M., Fenta, A.A., Berihun, M.L. et al. (2023) Random Forest–based gully erosion susceptibility assessment across different agro-ecologies of the Upper Blue Nile basin, Ethiopia. Geomorphology, 431, 108671. Available from: https://doi.org/10.1016/j.geomorph.2023.108671
Shelhamer, E., Long, J. & Darrell, T. (2017) Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4), 640–651. Available from: https://doi.org/10.1109/TPAMI.2016.2572683
Skaik, Y. (2008) Understanding and using sensitivity, specificity and predictive values. Indian Journal of Ophthalmology, 56(4), 341. Available from: https://doi.org/10.4103/0301-4738.41424
Tehrani, F.S., Calvello, M., Liu, Z., Zhang, L. & Lacasse, S. (2022) Machine learning and landslide studies: recent advances and applications. Natural Hazards, 114(2), 1197–1245. Available from: https://doi.org/10.1007/s11069-022-05423-7
Terratec. (2022) Viken Laser 2022. Laserskanning rapport.
van Boeckel, M., Solberg, I. L., Christoffersen, M., & Nordahl, B. (2022) Kartlegging av rødlistede landformer, resultater fra kartlegging i 2022. NGU rapport 2022.028.
van Boeckel, M., Solberg, I.L., Christoffersen, M., & Nordahl, B. (2023) Kartlegging av rødlistede landformer, resultater fra kartlegging i 2023.
Ye, J., Ni, J. & Yi, Y. (2017) Deep learning hierarchical representations for image steganalysis. IEEE Transactions on Information Forensics and Security, 12(11), 2545–2557. Available from: https://doi.org/10.1109/TIFS.2017.2710946
Zaidi, S.S.A., Ansari, M.S., Aslam, A., Kanwal, N., Asghar, M. & Lee, B. (2022) A survey of modern deep learning based object detection models. Digital Signal Processing: A Review Journal, 126, 103514. Available from: https://doi.org/10.1016/j.dsp.2022.103514
Zhang, L., Zhang, L. & Du, B. (2016) Deep learning for remote sensing data: a technical tutorial on the state of the art. IEEE Geoscience and Remote Sensing Magazine, 4(2), 22–40. Available from: https://doi.org/10.1109/MGRS.2016.2540798
Bergqvist, E. (1990). Terrace-and-gully landscapes in southern and central Sweden (UNGI report 77). U. Universitet.

Volume 49, Issue 8, 30 June 2024, Pages 2367-2379

This article also appears in:

Remote Sensing Applications in Geomorphology

The applicability of automated marine clay gully delineation using deep learning in Norway
