International Journal of Energy Research

Volume 2025, Issue 1 8872793

Research Article

Open Access

Evaluating the Rate of Penetration With Deep-Learning Predictive Models

Cheolhwan Lee

orcid.org/0009-0009-5481-3107

Global E&P Technology Center, Korea National Oil Corporation, Ulsan, 44538, Republic of Korea

Jongkook Kim

Global E&P Technology Center, Korea National Oil Corporation, Ulsan, 44538, Republic of Korea

Namjoong Kim

Global E&P Technology Center, Korea National Oil Corporation, Ulsan, 44538, Republic of Korea

Seil Ki (Corresponding Author)

[email protected]

orcid.org/0000-0002-5341-4739

Global E&P Technology Center, Korea National Oil Corporation, Ulsan, 44538, Republic of Korea

Jeonggyu Seo

Global E&P Technology Center, Korea National Oil Corporation, Ulsan, 44538, Republic of Korea

Changhyup Park

orcid.org/0000-0001-8083-6809

Department of Energy and Resources Engineering, Kangwon National University, Chuncheon, Kangwon, 24341, Republic of Korea, kangwon.ac.kr

First published: 10 March 2025

https://doi.org/10.1155/er/8872793

Academic Editor: Kathiravan Srinivasan

Abstract

This paper presents a deep-learning framework for predicting the rate of penetration (ROP) by assimilating well-log data, litho-facies classifications, and drilling operation parameters of onshore production wells in Central Asia. Advances in bit technology and the associated drilling operations underscore the need to improve on traditional, empirically derived predictions. Our approach integrates transfer learning into a conventional deep neural network and employs two supporting techniques. The first is data quality control with a Kalman filter, which makes machine learning applicable to in situ data that contain significant noise. The second is K-means clustering, which supplies litho-facies attributes as input features of the deep-learning predictive model. The scheme was applied to in situ drilling data comprising 12 data types: measured depth; two drilling operation variables, namely weight on bit (WOB) and rotary speed (RPM [revolutions per minute]); six well-log measurements, namely density (RHOZ), neutron porosity (TNPH), resistivity (RT), sonic (DT), gamma ray (GR), and photoelectric factor (PEFZ); and three clusters delineating litho-facies. The scheme was tested by predicting the ROP of one in situ well after training and validation on the data of four other wells. All in situ data come from the 7-in. casing interval, which ranges from about 800 to 3100 m. Adding the well-log-driven litho-facies and transfer learning to the base model improves ROP prediction as follows: R² increases by 49% (from 0.49 to 0.73), the mean absolute error decreases by 23% (from 8.82 to 6.79 m/h), and the dynamic time warping distance decreases by 24% (from 473 to 361). A drilling operation strategy that allocates a WOB of 1 to 6 tons to each 100-m section so as to optimize ROP is expected to reduce drilling time by about 16.5% compared with actual drilling.
The comparison between predicted and measured ROPs in actual drilling operations shows that the developed method evaluates ROP with high reliability. As a forward model, the scheme is expected to extend to real-time ROP optimization, a kind of inverse modeling that finds the parameter conditions maximizing ROP.

1. Introduction

The rate of penetration (ROP), the distance drilled per unit of time, plays a pivotal role in determining both the time and the expense of exploring for and producing subsurface hydrocarbons. Underestimating ROP can lead to an excessive budget and wasted opportunity cost, whereas overestimating it can result in a tight operating schedule and threaten safety. ROP must therefore be predicted reliably even under the many uncertainties of a drilling campaign. The nonlinear and complex relations among drilling-related variables make the quantitative evaluation of drilling time challenging. Existing research on this issue is often restricted in its applicability by the geological characteristics, drilling operation, and equipment of the specific drilling campaign, which makes a general solution for ROP prediction difficult to obtain. Moreover, because operational safety and stability are often prioritized over efficiency and drilling operations often encounter unexpected events, the prediction becomes more difficult still.

The ROP-related parameters are divided into controllable and uncontrollable factors [1–3]. The controllable factors are the operation parameters of drilling activities, such as weight on bit (WOB), rotary speed (RPM; revolutions per minute), and pump flow rate (GPM; gallons per minute), whereas the uncontrollable factors are drilling design parameters, mud design, and environmental aspects such as formation pressure, principal stress, and compressive strength. Both kinds of factors and their combination influence the ROP prediction [4, 5].

In the past, various regression models and empirical equations have been developed to predict the ROP. Maurer [6] asserted that ROP is directly proportional to WOB and RPM and inversely proportional to bit diameter. Subsequent research has focused on refining ROP predictions by optimizing these empirical formulas, especially by emphasizing the controllable factors [7–9]. However, Soares, Daigle, and Gray [10] highlighted the limitations of these traditional empirical models in predicting ROP reliably, citing the complex nature of drilling operations and the variability of uncontrollable factors. Furthermore, these models often require empirical constants or experimental data that are customized to a specific drilling campaign, which constrains their general applicability [10].

Machine learning techniques, which have witnessed significant advancements over the past decade, have been widely applied in various disciplines of science and engineering. The ROP prediction in the petroleum industry is not an exception, with these methods proving their applicability [11–16]. Hegde et al. [17] demonstrated the reliability of data-driven models over physics-based methods for ROP prediction. Field-specific studies have also employed machine learning techniques for addressing ROP prediction challenges with various ideas and field applications [18–23].

Deep learning, a type of machine learning that utilizes artificial neural networks, can handle many kinds of input and output variables with nonlinear relations and build a model that explains those relations. This capability allows actual drilling operations to be mimicked more effectively. Bilgesu et al. [24] presented a neural network model that established the relationship between laboratory-measured mechanical data and actual ROP data. Moran et al. [25] suggested a neural network model for ROP prediction using petrophysical logging data. Ahmed et al. [26] demonstrated superior ROP prediction accuracy when compared to traditional empirical formulas.

This paper aims to develop a deep-learning-based predictive model for evaluating ROP in the complex fluvial depositional environments of an onshore field in Central Asia, which show significant lithological heterogeneity. To make the models efficient, a Kalman filter and K-means clustering are used for data preprocessing and feature extraction, respectively. The former suits data with significant fluctuation because it separates the original data into a trend and noise. The latter is effective at grouping data according to the distributions of the variables. Building on these two techniques, transfer learning is applied to the deep-learning model to achieve higher accuracy in ROP evaluation.

2. Data Description and Preprocessing

2.1. Well Information

The target field, located in Central Asia, has been producing hydrocarbon for several decades. During the Paleozoic era, it was the site of extensive deposition of deep marine sediments and platform carbonates within a passive margin setting. The commencement of the Ural orogeny toward the late Paleozoic era marked a significant shift in tectonic dynamics, transitioning the depositional environment from a deep marine setting to a more varied shallow marine and fluvial landscape during the Jurassic period. This tectonic evolution facilitated the accumulation of sand deposits with thicknesses varying between 5 and 20 m, interspersed with layers of claystone and coal within the Jurassic stratigraphic interval. These juxtaposed subformations are instrumental in forming the primary reservoir-seal rock assemblages within the exploration field. The Jurassic sand reservoirs are notably situated at true vertical depths ranging between 2400 and 3100 m. The replenishment of these sand units is predominantly influenced by the dynamics of meandering to braided channel systems, which serve as the chief conduits for sediment transport and deposition. The variability in channel directionality over temporal scales has significantly contributed to the geological complexity, with sand deposition predominantly concentrated along the routes of ancient channel pathways, thus enhancing the reservoir’s heterogeneity.

Approximately 10 vertical wells have been drilled annually in the target field, with an average drilling time of about 25 days. Before drilling begins, the conductor is installed by hammering. A typical well design is divided into three sections: Section I extends to a depth of 100 m with a 13 3/8-in. surface casing (and a hole diameter of 17 1/2 in.); Section II runs from 100 to 800 m with a 9 5/8-in. casing; and Section III runs from 800 to 3100 m with a 7-in. casing. To minimize nonproductive time, the bit is not replaced within a section barring unusual circumstances.

ROP prediction in this study concerns Section III of five wells: three drilled in 2022 (Well A, Well B, and Well C) and two drilled in 2023 (Well D and Well E) (Figure 1).

Figure 1. The locations of wells used in this work.

2.2. Preparation of Input Dataset

2.2.1. Well-Logs and Drilling-Operation Data

Six types of well-log data were available for all wells: density (RHOZ), sonic (DT), gamma ray (GR), resistivity (RT), neutron porosity (TNPH), and photoelectric factor (PEFZ) (Table 1). The well-logs span measured depths (MD) from 817 to 3030 m at a data interval of 0.2 m, identical to that of the recorded ROP. Each well provided 11,066 instances of well-log data. The drilling-operation data comprise WOB and RPM. The several thousand measurement points of mud logging and wireline logging in each well are depth-matched using the GR measurements, so mismatches between data points need not be considered.

Table 1. Well-logs and drilling operation parameters as the input dataset in the study.

Data type	Abbreviation	Unit
Well-logs
Bulk density	RHOZ	g/cm³
Sonic travel time	DT	μs/m
Gamma ray	GR	gAPI
Formation resistivity	RT	Ω·m
Neutron porosity	TNPH	Fraction
Formation photoelectric factor	PEFZ	Barns/electron
Drilling-operation parameters
Weight on bit	WOB	Ton
Rotary speed	RPM	r/min

Figure 2 illustrates the data series acquired from Well A. All parameters except the ROP, which is the output response, were normalized and then used as the input dataset. No other preprocessing is applied to the inputs, which means that data that may appear to be outliers are also regarded as characteristics of the formations.
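The text does not state which normalization was used. As a minimal sketch, assuming per-feature min-max scaling to [0, 1] (an assumption, not the authors' stated method), the step could look like the following; the function name and the sample gamma-ray values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1]; a constant series maps to all zeros.

    Assumption: min-max scaling is used here only for illustration; the
    paper does not specify its normalization scheme.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical gamma-ray readings (gAPI), for illustration only.
gr = [35.0, 80.0, 125.0, 50.0]
gr_scaled = min_max_normalize(gr)
print(gr_scaled)
```

In practice the minimum and maximum would usually be taken from the training wells and reused unchanged for the test well, so that no information from the test well enters the scaling.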

2.2.2. Denoising ROPs (Output) With Kalman Filter

The ROP data series, as illustrated in Figure 2, exhibits complex fluctuations, so a meaningful correlation between the ROPs and the input data requires extracting the overall trend rather than concentrating on individual data points. In this study, the Kalman filter is used for denoising. It yields a denoised trend by filtering out random fluctuations, which can be regarded as measurement errors, as explained below.

The Kalman filter estimates the state of a system and subsequently compares this estimate with the observed values [27, 28]. This approach acknowledges the inherent uncertainty in both the estimated and the observed values. By addressing the discrepancy between the two, the algorithm successively refines its estimate for future data while accounting for the relevant uncertainties. Through iterative state updates based on new observations, the Kalman filter reduces the estimation error as time advances (as depth increases in this study). The process is summarized as Equation (1) for estimation and Equation (2) for state update:

$$x_{t|t-1} = x_{t-1|t-1}, \qquad P_{t|t-1} = P_{t-1|t-1} + Q \tag{1}$$

$$K_t = \frac{P_{t|t-1}}{P_{t|t-1} + R}, \qquad x_{t|t} = x_{t|t-1} + K_t\,(y_t - x_{t|t-1}), \qquad P_{t|t} = (1 - K_t)\,P_{t|t-1} \tag{2}$$

In Equation (1), x, P, and Q denote the estimated state, the state variance matrix, and the process variance matrix, respectively, where the subscript t|t signifies the current time step, the subscript t − 1|t − 1 indicates the prior time step, and the subscript t|t − 1 is the intermediate (predicted) time step. In Equation (2), R represents the measurement covariance, K the Kalman gain, and y the observation variable. Equations (1) and (2) are applied sequentially along depth, with Q and R set by trial and error to improve the relation between the log and ROP data.

The results of denoising the ROPs for each well are shown in Table 2 and Figure 3. Relative to the original data, the denoised ROPs show a lower coefficient of variation and a narrower span between the maximum and minimum values, with the mean essentially unchanged. The red lines in Figure 3, depicting the denoised ROPs, follow the trend of the observed ROPs. The denoised trend fluctuates to an extent similar to that of the WOB data, which influence the ROP more strongly than the RPM data among the drilling parameters (Figure 2). A ratio of 1:200 between the process covariance and the measurement noise covariance, found by trial and error, was applied to obtain reliable R² values.
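The denoising step can be illustrated with a short sketch. It assumes a scalar random-walk state model (the state is carried over unchanged between depth steps), which is one common reading of the symbols defined above but is not confirmed by the text. The process-to-measurement variance ratio of 1:200 follows the paper; the initial state, the initial variance, and the synthetic ROP series are assumptions made for illustration.

```python
import random


def kalman_denoise(observations, q=1.0, r=200.0):
    """Scalar random-walk Kalman filter (illustrative assumption).

    q: process variance, r: measurement variance (q:r = 1:200 as in the text).
    Returns the filtered state at every depth step.
    """
    x = observations[0]   # initial state: first sample (assumption)
    p = r                 # initial state variance (assumption)
    filtered = [x]
    for y in observations[1:]:
        # Prediction: the state is carried over and its variance grows by q.
        p_pred = p + q
        # Update: blend prediction and observation through the Kalman gain.
        k = p_pred / (p_pred + r)
        x = x + k * (y - x)
        p = (1.0 - k) * p_pred
        filtered.append(x)
    return filtered


random.seed(0)
# Synthetic ROP-like series (m/h): slow decline plus noise. Not field data.
trend = [50.0 - 0.01 * i for i in range(1000)]
raw = [t + random.gauss(0.0, 10.0) for t in trend]
smooth = kalman_denoise(raw)
```

A smaller ratio of q to r makes the filter trust its running estimate more than each new observation, which gives a smoother trend at the price of a slower response to genuine changes in ROP.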

Table 2. Statistical results of denoising ROPs using Kalman filter.

Dataset	Coefficient of variation	Mean	Maximum	Minimum
Well A
Raw	0.451	41.045	251.207	0.628
Denoised	0.064	41.257	113.924	4.106
Well B
Raw	0.397	45.343	275.052	0.394
Denoised	0.053	45.279	127.309	10.021
Well C
Raw	0.427	50.759	584.887	0.798
Denoised	0.053	50.790	196.432	10.084
Well D
Raw	0.308	38.413	130.664	1.398
Denoised	0.052	38.350	96.709	7.115

Note: The coefficient of variation is defined as the ratio of the standard deviation to the mean.
Abbreviation: ROPs, rates of penetration.

2.2.3. Litho-Facies Classification Using K-Means Clustering

Although depositional features and in situ rock properties have been shown to influence the ROPs [29], litho-facies information, that is, geological groupings of rocks formed under similar sedimentary conditions, is often unavailable at specific depths. Therefore, to build a workflow that accounts for the heterogeneity of rock distribution with depth, particularly where core data are scarce, we incorporate rock characteristics that can be distinguished by unsupervised clustering of well-log data.

This work uses K-means clustering, an unsupervised method for classifying large datasets, to categorize well-log data features. Two metrics, the silhouette score (s(i)) and the Davies–Bouldin index (DBI), are employed to derive the optimal number of clusters (Equations (3) and (4)) [30]. The s(i) measures how similar an individual data point is to its own cluster compared with other clusters, by weighing intracluster cohesion against intercluster separation. Meanwhile, the DBI gauges the quality of clustering by calculating the mean similarity between each cluster and its nearest counterpart. This measure depends on the ratio of intracluster to intercluster distances, where a higher s(i) and a lower DBI signify superior clustering by minimizing distances within clusters and maximizing those between them:

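$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} \tag{3}$$

$$\mathrm{DBI} = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \frac{\sigma_i + \sigma_j}{d(c_i, c_j)} \tag{4}$$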

In Equation (3), a(i) denotes the mean distance between a sample, i, and all other points in the same cluster, whereas b(i) represents the mean distance from a sample, i, to all points in the nearest other cluster. Equation (4) defines k as the total number of clusters, σ_i as the mean intracluster distance for cluster i, and d(c_i, c_j) as the centroid distance between clusters i and j. Table 3 compares the calculated values of s(i) and DBI for several numbers of clusters. From this, the optimal number of the clusters is derived as three, evidenced by the highest s(i) and the lowest DBI. These clusters have been designated as C0, C1, and C2, respectively.

Table 3. Comparison of s(i) and DBI values for deriving the optimal number of clusters.

Number of clusters	Silhouette score	Davies–Bouldin index
2	0.4107	0.9476
3	0.4180	0.9232
4	0.3389	1.0468
5	0.3036	1.1646
6	0.2994	1.0985
7	0.2950	1.0636
8	0.2931	1.1419
9	0.2918	1.0876

Note: Three clusters, which give the highest silhouette score and the lowest Davies–Bouldin index, are selected as the optimum number.
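The selection rule behind Table 3 can be sketched with a small pure-Python example. The two functions follow the silhouette score and the Davies–Bouldin index as described above, with the cluster spread taken as the mean distance of the members to their centroid (an assumption about σ_i). The toy points and the two candidate labelings are hypothetical and stand in for K-means output on well-log features.

```python
import math


def mean(values):
    return sum(values) / len(values)


def silhouette_score(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all samples."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:                 # singleton cluster: s(i) taken as 0
            scores.append(0.0)
            continue
        a = mean(own)               # mean distance within the own cluster
        b = min(                    # mean distance to the nearest other cluster
            mean([math.dist(p, q) for j, q in enumerate(points) if labels[j] == c])
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst (spread_i + spread_j) / centroid distance."""
    clusters = sorted(set(labels))
    centroids, spreads = {}, {}
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroids[c] = tuple(mean(axis) for axis in zip(*members))
        spreads[c] = mean([math.dist(p, centroids[c]) for p in members])
    worst = [
        max((spreads[ci] + spreads[cj]) / math.dist(centroids[ci], centroids[cj])
            for cj in clusters if cj != ci)
        for ci in clusters
    ]
    return mean(worst)


# Three well-separated toy groups in a 2-D feature space (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 10), (5, 11), (6, 10), (6, 11)]
labels_k3 = [0] * 4 + [1] * 4 + [2] * 4     # one label per true group
labels_k2 = [0] * 8 + [1] * 4               # first two groups merged

sil3, sil2 = silhouette_score(points, labels_k3), silhouette_score(points, labels_k2)
dbi3, dbi2 = davies_bouldin_index(points, labels_k3), davies_bouldin_index(points, labels_k2)
print(f"k=3: s={sil3:.3f}, DBI={dbi3:.3f}; k=2: s={sil2:.3f}, DBI={dbi2:.3f}")
```

On this toy data the three-cluster labeling gives the higher silhouette score and the lower index, which mirrors the criterion used to choose three clusters in Table 3.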

Figure 4 shows the clustering results based on the principal component analysis (PCA). It validates the clustering methods by showing distinct data segmentation. Figure 5 offers box-whisker plots that explain the statistical dispersion of well-log data within each cluster. Through PCA results and box-whisker plots, extraction of three clusters from well-logs seems to be adequate qualitatively.

Cluster C0, which has the highest DT and the lowest RHOZ of the three clusters, is inferred to be porous and soft, indicating undercompaction within the shallow interval; its proportion decreases gradually with depth as compaction increases. Cluster C1, with higher RHOZ and RT than C0, has characteristics intermediate between C0 and C2 and is inferred to be moderately harder and tighter than C0. Cluster C2, with the highest RHOZ, the lowest DT, and the lowest GR among the three clusters, represents the hardest formations.

Figure 6 also validates the clustering results by showing the comparison to available core data analysis which is highlighted within a blue dotted box. Within the coring interval, the majority of the data belongs to the C1 cluster, which indicates the facies of sandstone; meanwhile, some data points indicate clusters C0 and C2 with coal and carbonate or cemented sand, respectively. From the well-log and core data, the exact alignment of the depths for C0 and C2 validates the effectiveness of data-driven clustering. Based on the core data analysis, Table 4 summarizes litho-facies corresponding to each cluster.

Table 4. Well-log/core characteristics of three clusters and the inferred litho-facies.

Cluster	Well-log/core characteristics	Inferred litho-facies
Cluster 0 (C0)	Low GR and porous log responses (low RHOZ, high DT and TNPH) imply a porous and clay-lean characteristic; corresponds to interbedded coal layers (depth > 1500 m) or shallow sand intervals (<1500 m)	Undercompacted sand, claystone, intercalated coal (soft characteristic); high ROP supports the soft characteristic
Cluster 1 (C1)	Moderate log characteristics between clusters 0 and 2; a wide range of GR values implies a relatively high variation in clay volume; corresponds to argillaceous to clean sand intervals in the Jurassic cored section	Compacted, argillaceous to clean sand (moderate characteristic)
Cluster 2 (C2)	Highest RHOZ, highest RT, lowest GR, and lowest DT imply a dense, resistive, and hard characteristic with clay-lean composition; PEF near 5 suggests limestone rather than sand; corresponds to the Cretaceous carbonate interval or calcite cementations in the Jurassic sand interval	Carbonate or calcite-cemented sand (hard characteristic); low ROP supports the hard characteristic

Abbreviations: GR, gamma ray; ROP, rate of penetration.

3. Development of the Deep-Learning Predictive Models

3.1. Data Interrelationship and Deep-Learning Architecture

It is important to examine the interrelationships among the data to test whether a deep-learning predictive model is warranted. If a particular input had a very high linear correlation with the response, the response could be estimated from that input alone without building a deep-learning model.

The Pearson correlation coefficient measures the linear correlation between two variables. Figure 7 depicts a heat map of the Pearson correlation coefficients. MD, DT, TNPH, and C0 have coefficients with absolute values greater than 0.6 with respect to ROP, indicating a stronger linear correlation than the other data. Figure 7 confirms that no single variable determines the ROP by itself, so the nonlinear relationships need to be captured with a deep-learning architecture.
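As a brief illustration of why a Pearson coefficient alone cannot rule out a dependence, the sketch below computes the coefficient for two hypothetical responses: one exactly linear in depth and one that depends on depth through a symmetric curve. All values are invented for illustration and are not field data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


depth = [800, 1300, 1800, 2300, 2800]               # hypothetical depths (m)
rop_linear = [60, 50, 40, 30, 20]                   # exactly linear in depth
rop_curved = [(d - 1800) ** 2 / 1e4 for d in depth] # symmetric parabola in depth

r_linear = pearson_r(depth, rop_linear)
r_curved = pearson_r(depth, rop_curved)
print(r_linear, r_curved)
```

The curved response is fully determined by depth, yet its Pearson coefficient is zero, which is the kind of nonlinear link a neural network can capture and a linear correlation measure cannot.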

The deep-learning architecture developed in this work is a deep neural network with fully connected layers for regression. Table 5 summarizes the input and output dataset. Ten kinds of input data are used, categorized as the MD, the well-logs, the drilling operation, and the litho-facies, while the single output is the ROP. Except for the litho-facies, all input data are normalized. The litho-facies information clustered from the well-logs is one-hot encoded as a vector, i.e., C0 = (1, 0, 0), C1 = (0, 1, 0), and C2 = (0, 0, 1). The architecture has four hidden layers of 50 neurons each. It employs a learning rate of 0.0001 and a mini-batch size of 64, selected with the Hyperband method and subsequent fine-tuning (Figure 8). The activation function, the optimizer, and the loss function are ReLU, Adam, and the mean absolute error (MAE), respectively.

Table 5. Dataset used in the deep-learning predictive model.

Input	Output
Measured depth (MD), Well-logs (RHOZ, DT, GR, RT, TNPH, PEFZ), Drilling operation (WOB, RPM), Litho-facies	ROP
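The way the inputs in Table 5 could be assembled into one feature vector per depth sample is sketched below. The column order, the function names, and the sample values are hypothetical; the one-hot vectors for C0, C1, and C2 follow the encoding given in the text.

```python
CLUSTERS = ("C0", "C1", "C2")
LOG_ORDER = ("RHOZ", "DT", "GR", "RT", "TNPH", "PEFZ")


def one_hot(cluster):
    """Encode a litho-facies cluster as C0=(1,0,0), C1=(0,1,0), C2=(0,0,1)."""
    if cluster not in CLUSTERS:
        raise ValueError(f"unknown cluster: {cluster}")
    return [1.0 if cluster == c else 0.0 for c in CLUSTERS]


def build_feature_vector(md, logs, wob, rpm, cluster):
    """Concatenate normalized MD, six logs, WOB, RPM, and the one-hot cluster.

    The ordering used here is an assumption made for illustration.
    """
    return [md] + [logs[name] for name in LOG_ORDER] + [wob, rpm] + one_hot(cluster)


# Hypothetical normalized values for a single depth sample.
sample = build_feature_vector(
    md=0.42,
    logs={"RHOZ": 0.61, "DT": 0.35, "GR": 0.50, "RT": 0.12, "TNPH": 0.28, "PEFZ": 0.33},
    wob=0.40,
    rpm=0.75,
    cluster="C1",
)
print(len(sample), sample[-3:])
```

With the litho-facies expanded into three one-hot columns, the ten kinds of input become twelve numerical columns per sample.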

3.2. ROP Predictive Models: Effects of Litho-Facies and Transfer Learning

The production performances of the wells drilled previously in the target field differ from one another. This appears to result from severe horizontal heterogeneity caused by the complex depositional environments. Such heterogeneity hinders ROP prediction based only on the well-logs of offset wells, so the effects of the input dataset and of any tuning method need to be examined. Notably, unlike a conventional deep neural network, this study examines ROP predictability by incorporating transfer learning, because the number of available wells is limited, which makes it difficult to reproduce or expand the training dataset. Transfer learning uses previously learned patterns as the starting point for a similar or related task [31].

To test the ROP prediction performance with different input datasets, this work considers three cases. The first, W4, uses only the well-logs and drilling operation data as inputs to the deep-learning model, excluding the log-based litho-facies; the name W4 refers to the four wells, Well A to Well D. The second, W4C, adds the litho-facies information to the input data used in W4. The last, W4C-T, combines transfer learning with W4C (Figure 9; Table 6). W4C-T assumes that the data series of Well E is similar to that of Well D, so the weights of the last hidden layer are tuned using Well D. The training and validation groups were randomly divided in an 8:2 ratio. The test data are those of Well E.

Table 6. Three cases of the deep-learning-based models.

Model	Training and validation dataset (Wells A, B, C, and D)
Model	MD	Well-logs	Drilling operation	Litho-facies	Transfer learning
W4 (the base model)	O	O	O	—	—
W4C	O	O	O	O	—
W4C-T	O	O	O	O	O

Note: “O” indicates that the item is included in the training and validation processes.
Abbreviation: MD, measured depth.
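The idea behind W4C-T, keeping the pretrained layers fixed and retuning only the last layer on a similar well, can be sketched with a deliberately tiny pure-Python network. This is not the authors' implementation: the network has a single frozen ReLU layer, the weights are hand-set stand-ins for a model pretrained on Wells A to D, the data are synthetic stand-ins for Well D, and a squared-error gradient is used for simplicity although the paper trains with the MAE loss.

```python
def relu(v):
    return v if v > 0.0 else 0.0


def hidden(x, w1, b1):
    """Frozen feature extractor: one ReLU layer (never updated below)."""
    return [relu(w * x + b) for w, b in zip(w1, b1)]


def predict(x, w1, b1, w2, b2):
    return sum(w * h for w, h in zip(w2, hidden(x, w1, b1))) + b2


def mae(xs, ys, params):
    return sum(abs(predict(x, *params) - y) for x, y in zip(xs, ys)) / len(xs)


def fine_tune_last_layer(xs, ys, w1, b1, w2, b2, lr=0.1, epochs=500):
    """Gradient descent on the output layer only (squared-error loss, assumption)."""
    w2, n = list(w2), len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w2), 0.0
        for x, y in zip(xs, ys):
            h = hidden(x, w1, b1)
            err = sum(w * hj for w, hj in zip(w2, h)) + b2 - y
            for j, hj in enumerate(h):
                grad_w[j] += 2.0 * err * hj / n
            grad_b += 2.0 * err / n
        w2 = [w - lr * g for w, g in zip(w2, grad_w)]
        b2 -= lr * grad_b
    return w2, b2


# Hand-set "pretrained" weights (hypothetical stand-in for the W4C model).
w1, b1 = [1.0, 1.0], [0.0, -0.5]
w2, b2 = [-20.0, 0.0], 50.0

# Synthetic "Well D" data: x is a scaled depth, y an ROP-like response (m/h).
xs = [i / 19 for i in range(20)]
ys = [45.0 - 20.0 * x - 10.0 * relu(x - 0.5) for x in xs]

before = mae(xs, ys, (w1, b1, w2, b2))
w2_new, b2_new = fine_tune_last_layer(xs, ys, w1, b1, w2, b2)
after = mae(xs, ys, (w1, b1, w2_new, b2_new))
print(f"MAE before fine-tuning: {before:.2f}, after: {after:.2f}")
```

Because only the output weights change, the features learned from the larger training set are preserved while the mapping to ROP adapts to the local well, which is the motivation given for transfer learning when few wells are available.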

3.3. Indicators to Evaluate the Prediction Performances

The dataset in the deep-learning model is used for training, validation, and prediction. In this study, datasets from previous drilling are used for training and validation while building the network of the complex nonlinear model. After the model is built, the data of the newly drilled well are used for prediction; that is, the model's ROP prediction is blind-tested against ROP data newly acquired from the new drilling campaign. Based on the comparison of the two ROP datasets, the model can be evaluated in several ways. Three indicators are used to evaluate the prediction reliability: the coefficient of determination (R²), MAE, and the dynamic time warping (DTW) distance, as delineated in Equations (5)–(7):

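$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{5}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{6}$$

$$D(y_n, \hat{y}_m) = d(y_n, \hat{y}_m) + \min\{D(y_{n-1}, \hat{y}_m),\; D(y_n, \hat{y}_{m-1}),\; D(y_{n-1}, \hat{y}_{m-1})\} \tag{7}$$

In Equations (5)–(7), y_i denotes the actual ROP at the ith depth, while ŷ_i represents the ROP estimated by the predictive model at the ith depth. ȳ is the arithmetic mean of the ROPs. d(y_n, ŷ_m) indicates the distance between y_n and ŷ_m, and D(y_n−1, ŷ_m−1) signifies the cumulative distance between y_n−1 and ŷ_m−1. R² is a statistical indicator that determines the proportion of variance in the measured ROPs (the true values) that can be explained by the predicted ones. MAE is a measure of error, while DTW measures the similarity between two time series, that is, the measured and the predicted ROPs.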
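The three indicators can be written compactly in plain Python. The sketch below uses the standard definitions of R², MAE, and the DTW distance with an absolute-difference local cost (the local cost is an assumption, since the text does not specify it); the two short ROP series are hypothetical.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def dtw_distance(a, b):
    """Dynamic time warping distance with |a_i - b_j| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # table[i][j]: cumulative cost of aligning a[:i] with b[:j]
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            table[i][j] = cost + min(table[i - 1][j],
                                     table[i][j - 1],
                                     table[i - 1][j - 1])
    return table[n][m]


# Hypothetical measured and predicted ROPs (m/h) at five depths.
measured = [40.0, 42.0, 38.0, 30.0, 25.0]
predicted = [41.0, 40.0, 37.0, 32.0, 26.0]

r2 = r2_score(measured, predicted)
mae_value = mean_absolute_error(measured, predicted)
dtw_value = dtw_distance(measured, predicted)
print(r2, mae_value, dtw_value)
```

Unlike MAE, the DTW distance allows one series to be locally stretched or shifted along depth before the point-wise costs are summed, so it rewards a prediction that follows the measured trend even when peaks are slightly offset.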

4. Results and Discussion

4.1. Evaluation of ROP Prediction Accuracy

Figure 10 depicts the ROP predictions of the three deep-learning models: W4, W4C, and W4C-T. At shallow depths, ROP shows significant variation, but it decreases as depth increases and the formation becomes more consolidated. Because the prediction model is trained on ROPs denoised with the Kalman filter (Section 2.2.2), it is limited in reproducing ROP that changes sharply with depth; however, it effectively tracks the overall trend of ROP variations. R², indicative of the regression model's characteristics, improves from 0.49 for the base model (W4) to 0.57 when litho-facies information from K-means clustering is included (W4C) and further to 0.73 when both transfer learning and litho-facies information are incorporated (W4C-T). This improvement is attributed to the expanded dataset used for training, which enhances the accuracy of the ROP predictions. As the heat map in Figure 7 suggests, the data are nonlinearly linked to the ROP, which supports the applicability of predictions through the deep-learning model. The right-hand side of Figure 10 illustrates that the error between actual and predicted ROP values is minimal at lower ROP values and increases as the ROP values grow. This result indicates the need for improved prediction performance at shallow depths, where ROP variability is high. One reason for the lower prediction performance in high-variability sections is the noise removal with the Kalman filter.

Figure 11 plots the MAEs for each 100-m segment of the entire drilling section (from 800 to 3100 m). While W4 and W4C exhibit high MAE values at shallow depths, where ROP variability is high, W4C-T shows lower and more consistent values overall. The MAE of the W4 model is 8.82 m/h, whereas that of the W4C-T model is 6.79 m/h. This result indicates that the deep-learning predictive model incorporating litho-facies information and transfer learning not only has lower errors but also predicts better in sections with high ROP variability. Table 7 summarizes the performance evaluation metrics. The DTW distance is 473 for W4 and 361 for W4C-T, which indicates closer similarity between the measured ROP trend and the predicted values.

Table 7. Summary of ROP predictability in the test set (Well E) using the deep-learning predictive models.

Indicator	W4	W4C	W4C-T
R² (unitless)	0.49	0.57	0.73
MAE (m/h)	8.82	8.14	6.79
DTW (unitless)	473	430	361

Abbreviations: DTW, dynamic time warping; MAE, mean absolute error; ROP, rate of penetration.

In summary, compared to the base model, the W4C-T model demonstrates significant improvements in prediction performance, with an increase in R² by 49% (from 0.49 to 0.73), a reduction in MAE by 23% (from 8.82 to 6.79 m/h), and a decrease in DTW by 24% (from 473 to 361). The improvement in prediction accuracy highlights the limitation of the base model and the necessity of transfer learning for coping with local heterogeneity or inconsistency in newly acquired data.
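The percentages quoted above follow directly from the W4 and W4C-T columns of Table 7. A minimal check, with a hypothetical helper name, is:

```python
def relative_change_percent(base, new):
    """Magnitude of the change from base to new, as a percentage of base."""
    return abs(new - base) / abs(base) * 100.0


# (W4, W4C-T) values taken from Table 7.
table7 = {"R2": (0.49, 0.73), "MAE": (8.82, 6.79), "DTW": (473.0, 361.0)}

improvement = {
    name: round(relative_change_percent(base, new))
    for name, (base, new) in table7.items()
}
print(improvement)
```

Rounded to whole numbers, the changes are 49% for R², 23% for MAE, and 24% for DTW, in agreement with the summary.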

4.2. Optimization of WOBs to Reduce Drilling Time

Using the most reliable ROP predictive model, that is, W4C-T, the paper examines the possibility of reducing drilling time by optimizing ROPs. The ROP prediction in this study, analyzed after drilling, can provide guidelines for subsequent drilling operations. Generally, when RPM is fixed for safety reasons, WOB can be used as the drilling operation factor to adjust. Since varying WOB over short intervals can pose practical challenges in drilling operations, this paper divides the entire drilling interval into 100-m sections. For each section, WOB values ranging from 1 to 6 tons are assigned, and the WOB schedule is optimized to obtain the maximum ROPs. Figure 12a illustrates the predicted ROP for each 100-m segment at different WOB values, with the optimized schedule indicated by a dashed line connecting the WOB values that yield the highest ROP. By scheduling WOB for each segment, the total drilling time could be reduced by ~16.5% compared to the actual drilling time of Well E (Figure 12b). In addition, these WOB guidelines for each segment allow for efficient preparation for the next drilling operation.
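The segment-wise search described above can be sketched as follows. The function `toy_rop` is a hypothetical stand-in for the trained W4C-T model, with an invented depth-dependent optimum; only the search logic (six candidate WOB values per 100-m segment, best predicted ROP kept) follows the text.

```python
def schedule_wob(segments, predict_rop, wob_options=(1, 2, 3, 4, 5, 6)):
    """For each (top, bottom) segment, keep the WOB with the highest predicted ROP."""
    plan = []
    for top, bottom in segments:
        best = max(wob_options, key=lambda w: predict_rop(top, bottom, w))
        plan.append((top, bottom, best, predict_rop(top, bottom, best)))
    return plan


def drilling_time_hours(plan):
    """Total time = sum of segment length (m) divided by ROP (m/h)."""
    return sum((bottom - top) / rop for top, bottom, _, rop in plan)


def toy_rop(top, bottom, wob):
    """Hypothetical ROP model (m/h); NOT the paper's W4C-T predictor."""
    mid = 0.5 * (top + bottom)
    best_wob = 2 + (mid - 800) / 2300 * 3      # optimum drifts from about 2 to 5 tons
    return 50.0 - 0.01 * (mid - 800) - 2.0 * (wob - best_wob) ** 2


# 100-m segments covering Section III (800 to 3100 m).
segments = [(d, d + 100) for d in range(800, 3100, 100)]
plan = schedule_wob(segments, toy_rop)

# Reference case for comparison: a constant WOB of 3 tons (assumption).
fixed = [(t, b, 3, toy_rop(t, b, 3)) for t, b in segments]

saving = 1.0 - drilling_time_hours(plan) / drilling_time_hours(fixed)
print(f"time saved versus constant WOB: {saving:.1%}")
```

Because each segment is optimized independently, the search needs only six model evaluations per segment; a schedule that also limits the change in WOB between adjacent segments would require a joint search instead.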

This study developed a reliable deep-learning model capable of predicting ROP by extracting litho-facies information from well-log data and performing transfer learning from offset wells, particularly in regions with limited core data and high heterogeneity. By fixing RPM and optimizing WOB by segment, WOB scheduling to reduce drilling time is proposed. The results of this study demonstrate that in highly heterogeneous regions, reliable ROP prediction techniques can be achieved through preprocessing and integrated analysis of well-log data.

5. Conclusions

This paper developed a new ROP prediction model that employs a deep-learning architecture and is applicable to highly heterogeneous formations arising from a complex depositional environment. Three techniques, namely the Kalman filter, K-means clustering, and transfer learning, enhanced the ROP prediction accuracy, and their usefulness was verified by comparison with actual in situ data. These methods are intended for extended application to various geological conditions: they preprocess data with large fluctuations, reflect litho-facies characteristics in complex depositional environments, and fine-tune the model for local heterogeneity, respectively.

The litho-facies information extracted from well-log data using K-means clustering was validated against actual core samples. Transfer learning was performed using information from a similar offset well, enabling reliable ROP predictions from litho-facies information, well-logs, drilling parameters (WOB and RPM), and the MD. Compared to a deep-learning prediction model using only well-log data, the integrated model with litho-facies and transfer learning showed a 49% improvement in the coefficient of determination (from 0.49 to 0.73), a 23% reduction in MAE (from 8.82 to 6.79 m/h), and a 24% improvement in DTW distance (from 473 to 361 h). To assess field applicability, WOB scheduling was evaluated and demonstrated a potential reduction in drilling time of ~16.5% compared to actual drilling.
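The reported percentages can be checked with a short calculation. The sketch below assumes each percentage is the change relative to the well-log-only baseline, and takes the larger MAE and DTW values as the baseline, since both are error measures that the integrated model lowers. The helper name `relative_change` is illustrative.

```python
def relative_change(baseline: float, improved: float) -> float:
    """Signed change of `improved` relative to `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Coefficient of determination: higher is better, so the change is positive.
r2_change = relative_change(0.49, 0.73)
# MAE and DTW distance: lower is better, so the change is negative.
mae_change = relative_change(8.82, 6.79)
dtw_change = relative_change(473.0, 361.0)

print(f"R2:  {r2_change:+.1f}%")
print(f"MAE: {mae_change:+.1f}%")
print(f"DTW: {dtw_change:+.1f}%")
```

Rounded to whole percentages, the three changes match the stated 49%, 23%, and 24%.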

The results demonstrate the applicability of the ROP prediction and the possibility of optimizing drilling parameters in real time to reduce drilling cost, which accounts for a large part of CAPEX in the oil and gas industry. This study can contribute to real-time optimization when combined with a model, yet to be developed, that estimates log data in a new well and thereby provides the input features for the ROP prediction model of this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Author Contributions

Seil Ki and Jeonggyu Seo were responsible for conceptualization. Cheolhwan Lee and Seil Ki were responsible for the methodology and formal analysis. Cheolhwan Lee, Jongkook Kim, and Namjoong Kim were responsible for validation and original draft preparation. Seil Ki was responsible for resources. Seil Ki and Changhyup Park were responsible for writing, review, and editing. Jeonggyu Seo was responsible for supervision and project administration. Changhyup Park was responsible for funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by the Korea Agency for Infrastructure Technology Advancement (KAIA) (RS-2022-00143541), the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00253751), and the Korea Institute of Energy Technology Evaluation and Planning (KETEP) (20225B10300050, 20212010200020).

Open Research

Data Availability Statement

Research data are not shared.
