Volume 9, Issue 2 pp. 77-86
ORIGINAL ARTICLE
Open Access

Fast estimation of patient-specific organ doses from abdomen and head CT examinations without segmenting internal organs using machine learning models

Wencheng Shao
Institute of Radiation Medicine, Fudan University, Shanghai, China

Liangyong Qu
Department of Radiology, Shanghai Zhongye Hospital, Shanghai, China

Xin Lin
Institute of Radiation Medicine, Fudan University, Shanghai, China

Ying Huang
Department of Nuclear Science and Technology, Institute of Modern Physics, Fudan University, Shanghai, China

Weihai Zhuo (Corresponding Author)
Institute of Radiation Medicine, Fudan University, Shanghai, China

Haikuan Liu (Corresponding Author)
Institute of Radiation Medicine, Fudan University, Shanghai, China

Correspondence
Weihai Zhuo and Haikuan Liu, Institute of Radiation Medicine, Fudan University, Shanghai, China.
Email: [email protected] and [email protected]

First published: 29 May 2025

Wencheng Shao and Liangyong Qu contributed equally to this study.

Abstract

Background

Computed Tomography (CT) imaging is essential for disease detection but carries a risk of cancer due to X-ray exposure. Typically, assessing this risk requires segmentation of the internal organ contours to predict organ doses, which hinders its clinical application. This study introduces a method that uses support vector regression (SVR) models trained on skin outline radiomic features to predict organ doses without organ segmentation, thus streamlining the process for clinical use.

Methods

CT scans of the head and abdomen were used to extract radiomic features of the skin outline. These features were used as inputs, with organ doses from Monte Carlo simulations as benchmarks to train the SVR models for predicting organ doses. The accuracy of the models was evaluated using the mean absolute percentage error (MAPE) and coefficient of determination (R2).

Results

The results showed a high precision in dose prediction for various organs, including the brain (MAPE: 1.5%, R2: 0.9), eyes (MAPE: 5%, R2: 0.84), lens (MAPE: 5%, R2: 0.82), bowel (MAPE: 6%, R2: 0.84), kidneys (MAPE: 7.5%, R2: 0.7), and liver (MAPE: 8%, R2: 0.67). Internal organ disturbances had a minimal impact on accuracy.

Conclusions

The SVR models efficiently predicted patient-specific organ doses from CT scans, offering a user-friendly tool for rapid segmentation-free dose prediction. This innovation can significantly enhance clinical efficiency and accessibility in predicting patient-specific organ doses using CT.

1 INTRODUCTION

Computed tomography (CT) has emerged as a widely applied diagnostic modality for diverse pathological conditions, providing intricate cross-sectional representations of internal anatomical structures and organs.1-4 CT holds pivotal significance in identifying various maladies, including infectious, traumatic, inflammatory, and hemorrhagic disorders.5-10 Despite its diagnostic utility, CT scans expose patients to ionizing radiation, imparting absorbed doses to crucial organs.11-13 This radiation exposure can induce DNA damage, thereby amplifying the susceptibility to the development of malignancies.14-17 Consequently, it is crucial to calculate the organ-specific doses received by patients undergoing CT examinations.

Currently, size-specific dose estimates (SSDE), neural networks (NN), and Monte Carlo simulations (MCS) are used to predict patient-specific organ doses from CT scans. NN and MCS predict more accurately than SSDE because they account for a wider range of patient factors, such as geometric size, shape, and tissue inhomogeneity. However, almost all NN- and MCS-based prediction methods require the identification of internal organ borders and their delineation as regions of interest (ROIs). This is the primary barrier standing in the way of user-friendly clinical applications with improved accessibility and efficiency. Delineating internal organs limits the practical prediction of CT-induced patient-specific organ doses because it requires either cumbersome manual segmentation or the costly purchase of dedicated autosegmentation software. Hence, to facilitate and promote clinical applications, exploring novel methods that can efficiently predict CT-induced organ doses without delineating the internal organs is essential.

Support vector regression (SVR) has a relatively simple structure and relies on a small set of key support vectors, which enhances its resilience to overfitting and supports reliable predictions with small medical datasets. SVR is also robust to outliers, a common occurrence in medical datasets owing to individual variations. Because it is designed to limit the impact of outliers, SVR provides stable and reliable predictions, making it well-suited for medical applications. SVR models trained using patient radiomics features can efficiently predict CT-induced organ doses.18 However, that study still required the segmentation of internal organ contours when the trained SVR models were applied to unseen patient samples. Despite SVR's efficient and robust prediction of CT-induced organ doses, its clinical applications are still hindered by barriers to accessibility and efficiency owing to internal organ segmentation.

Unlike a previous study using SVR, this study investigated the efficient prediction of patient-specific organ doses from CT scans using SVR models trained on the radiomics features of patients’ skin outlines to eliminate the requirement for segmenting internal organ contours in clinical applications. This study aimed to create a user-friendly, swift, and online prediction method for CT-induced organ doses by constructing SVR models that operate without internal organ segmentation, thus removing a significant barrier to accessibility and efficiency. To evaluate and verify the overall performance of the SVR models after discarding the internal organ radiomics features, regression metrics, such as mean absolute percentage error (MAPE) and coefficient of determination (R2), were compared between the SVR models trained with skin outline radiomics features and skin outline plus internal organ radiomics features.

2 MATERIALS AND METHODS

2.1 Data collection

This study analyzed a cohort of 237 head and 235 abdominal patients who underwent CT examinations at the Shanghai Zhongye Hospital. The age and sex distributions of the selected head and abdomen patients are shown in Figure 1. Using the auto-segmentation software “DeepViewer,” the CT images of the individuals in the cohort were segmented to generate regions of interest (ROIs), including the skin outline, brain, eyes, lens, kidneys, liver, and bowel. To prepare for extracting radiomics features from the CTs and ROIs, the CT images and corresponding ROIs were converted from Digital Imaging and Communications in Medicine (DICOM) to Neuroimaging Informatics Technology Initiative (NIFTI) format using the tool “dcmstruct2nii”.19 This conversion is a prerequisite for generating the CT and mask data needed for feature extraction. The scan voltage was set at 120 kV for head CT scans and 100 kV for abdominal CT scans. ROIs from the internal organs were used to investigate their impact on the predictive performance of SVR-based models for predicting organ doses from CT scans.

FIGURE 1
Illustration of general patient information: A, Age distribution for the head patients; B, Gender distribution for the head patients; C, Age distribution for the abdomen patients; D, Gender distribution for the abdomen patients.

2.2 Preparing input and reference data for SVR training

2.2.1 Input data preparation

Radiomics features were extracted from the CT images and ROIs for each patient using the Pyradiomics module.20 This process comprised image preprocessing, feature computation, and feature selection. To standardize the spatial resolution, the CT images and masks were resampled to (1, 1, 5) mm for the chest and abdomen CTs and (1, 1, 3) mm for the head CTs by setting the “resampledPixelSpacing” parameter in the Pyradiomics module. Data augmentation was applied to increase the diversity of the training data and thus the robustness of the SVR models.21 The feature computation extracted 107 radiomic features per ROI spanning seven categories without filters. The feature categories included a Gray-Level Co-occurrence Matrix, First-Order Statistics, Neighboring Gray-Tone Difference Matrix, Gray-Level Dependence Matrix, Gray-Level Run-Length Matrix, Shape-based Matrix, and Gray-Level Size-Zone Matrix. For feature selection, we employed the F-regression function in the scikit-learn library to identify relevant features based on the F-value and P-value. Features with high F-values and low P-values, indicating strong linear relationships with the target, were selected to reduce dimensionality and limit overfitting. The mean exposure (ME) of the entire scan sequence was extracted, and the water-equivalent diameter (dw) for the central slice was calculated based on each patient's DICOM CT images. These two values were then incorporated as additional inputs alongside the extracted radiomic features to train and test the SVR model. The process used the Pyradiomics module on dual AMD EPYC 7551 CPUs in the Anaconda 3 environment.22 The general workflow for training the SVR model based on skin outline radiomic features to predict CT-induced organ doses is shown in Figure 2.
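
The text does not state how dw was computed from the central slice. As a minimal sketch, assuming the widely used AAPM Report 220 definition (water-equivalent area derived from the CT numbers inside the patient outline), the calculation might look like the following. The function name and the toy slice are hypothetical and serve only as an illustration.

```python
import math

def water_equivalent_diameter(hu_values, pixel_area_mm2):
    """Return the water-equivalent diameter (mm) of one axial slice.

    hu_values: CT numbers (HU) of the pixels inside the patient outline.
    pixel_area_mm2: area of a single pixel in mm^2.
    Assumption: AAPM Report 220 definition; not confirmed by the article.
    """
    # Each pixel contributes its area scaled by (HU / 1000 + 1),
    # i.e. its attenuation relative to water.
    water_area = sum(hu / 1000.0 + 1.0 for hu in hu_values) * pixel_area_mm2
    # Diameter of the circle of water with the same area.
    return 2.0 * math.sqrt(water_area / math.pi)

# Toy check: a region of pure water (0 HU), 10,000 pixels of 1 mm^2 each.
dw = water_equivalent_diameter([0.0] * 10000, 1.0)
print(f"dw = {dw:.1f} mm")
```

For a region of pure water the result reduces to the diameter of a circle with the same area, which is a convenient sanity check.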

FIGURE 2
General workflow for training the SVR model based on skin outline radiomics features to predict patient-specific organ doses for head and abdomen patients.

2.2.2 Input data classification

This study classified the input features into two categories: ME + dw + skin outline radiomics features (class_1) and ME + dw + skin outline radiomics features + internal organ radiomics features (class_2). Class_1 was adopted to explore whether combining skin outline radiomics features and raw CT-based information (i.e., ME and dw) could train efficient and robust SVR models for predicting patient-specific organ doses. Class_2 aimed to determine whether the radiomic features of internal organs were necessary to improve the performance of the SVR models. Within class_2, the internal organs included the brain, left eye, right eye, left lens, and right lens for head CT scans, and the bowel, liver, left kidney, and right kidney for abdominal CT scans. The SVR models were trained to predict patient-specific organ doses for each organ using feature class_1 and class_2 as input data.

Segmentation-free patient-specific organ dose prediction was achieved based on the following outcomes:
  1. SVR models trained with feature class_1 achieved satisfactory overall performance by incorporating skin outline radiomic features and raw CT-based information (i.e., ME and dw).

  2. SVR models trained with feature class_2 showed no apparent enhancement in overall performance, indicating that the radiomic features of internal organs did not contribute significantly to improving the SVR model performance.

2.2.3 Organ dose

This study employed a GPU Geant4-based Monte Carlo Simulation (GGEMS) to compute the reference organ doses from each patient's CT images and masks. GGEMS handles complex geometries, diverse materials, and multiple radiation sources (photons and electrons).23 It is faster than CPU-based MC simulation codes while maintaining precision in organ dose calculations. The GPU-calculated organ doses served as references for training the SVR models to predict patient-specific doses for nine organs: the brain, left eye, right eye, left lens, right lens, liver, bowel, left kidney, and right kidney. Two Nvidia RTX4090 graphics cards were used for the GPU-based MC simulations, which produced slice-wise dose distributions for each patient with less than 2% uncertainty per voxel. Automatic tube current modulation was taken into account: the variations in the tube current along the scan range were modeled to ensure accurate MC dose calculations. GGEMS thus provided reliable reference organ doses for SVR model training.
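
The article does not describe how the voxel-wise dose distribution was reduced to a single value per organ. A minimal sketch, assuming the organ dose is taken as the unweighted mean of the voxel doses inside the organ mask (density weighting is ignored here), could be written as follows. The function name and the toy arrays are hypothetical.

```python
import numpy as np

def mean_organ_dose(dose_map, organ_mask):
    """Average the voxel doses that fall inside a binary organ mask.

    Assumption: unweighted mean over mask voxels; the article does not
    specify the aggregation used.
    """
    mask = np.asarray(organ_mask, dtype=bool)
    if not mask.any():
        raise ValueError("organ mask is empty")
    return float(np.asarray(dose_map, dtype=float)[mask].mean())

# Toy 2x2 "dose map" and a mask that selects the right-hand column.
dose = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[0, 1], [0, 1]])
organ_dose = mean_organ_dose(dose, mask)  # averages 2.0 and 4.0
```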

2.3 SVR prediction models

SVR is a machine learning algorithm that extends the principles of support vector machines (SVM) to regression problems. SVR has gained prominence for its effectiveness in handling nonlinear relationships between input variables and target outputs. At its core, SVR seeks a regression function that is as flat as possible while keeping most training points within a specified epsilon-insensitive tube around it. Unlike traditional regression algorithms, SVR is particularly well-suited for datasets with complex structures and nonlinear patterns because of its ability to transform input data into higher-dimensional feature spaces using kernel functions.

The fundamental concept behind SVR lies in minimizing a regularized risk, which is the sum of the training errors that fall outside the epsilon-insensitive tube and a regularization term that penalizes model complexity. SVR offers several advantages, such as robustness to outliers and the ability to capture intricate relationships in data. Its adaptability to different kernel functions, such as linear, polynomial, and radial basis function (RBF), enhances its versatility in modeling diverse datasets.

To train the SVR models to predict patient-specific organ doses from CT scans, we used the radiomic feature sets (class_1 and class_2) as input variables. The organ doses for the investigated organs, namely the brain, eyes, lens, bowel, liver, and kidneys, computed using the GPU-based MC simulations, were designated as the ‘y values’ during the training process. For each organ's SVR training, the patient samples were divided in a ratio of 0.8 to 0.2 between the training set and the testing set. During the configuration phase, we applied the linear kernel and set the SVR regularization parameter (C) to 5 to strike a balance between model complexity and predictive performance.
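
The configuration described above (linear kernel, C = 5, 0.8/0.2 split) can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the authors' code: the feature matrix, the dose values, the random seeds, and the feature-scaling step are all assumptions, since the article does not state whether inputs were standardized.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Synthetic stand-in: 200 "patients", 20 input features, and a dose that
# depends linearly on the first three features plus small noise.
X = rng.normal(size=(200, 20))
y = 30.0 + X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

# 0.8 / 0.2 split between training and testing sets, as in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Linear kernel and C = 5 follow the text; StandardScaler is an assumption.
model = make_pipeline(StandardScaler(), SVR(kernel="linear", C=5))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
```

Wrapping the scaler and the regressor in one pipeline keeps the scaling statistics confined to the training set, which avoids leaking test information into the fit.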

2.4 Regression metrics

To evaluate the precision of the organ dose predictions, we integrated multiple regression metrics, including the MAPE and R2. The MAPE calculates the mean value of the absolute percentage differences between the actual and predicted values, measuring how closely the predictive outputs align with the actual values in percentage terms without considering the direction of errors. The expression for MAPE is as follows:
$$\begin{equation}\mathrm{MAPE} = \frac{1}{n}\sum_{i = 1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%\end{equation}$$
where n represents the number of head, chest, or abdomen patients; $y_i$ represents the actual reference organ dose; and $\hat{y}_i$ represents the predicted organ dose. The R-squared, represented as R2, gives the proportion of the variability in the dependent variable that can be explained by the independent variables. It measures how well the regression model aligns with the dataset, providing a numerical assessment of the goodness-of-fit. A greater R2 value indicates a stronger fit of the model. R2 is defined as follows:
$$\begin{equation}R^2 = 1 - \frac{\sum_{i = 1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i = 1}^{n} (y_i - \bar{y})^2}\end{equation}$$
where $\bar{y}$ is the mean of the reference organ doses.
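
Both metrics can be computed directly from their definitions. The short sketch below uses plain Python; the function names and the three example dose values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

reference = [10.0, 20.0, 40.0]   # hypothetical reference doses
predicted = [11.0, 18.0, 40.0]   # hypothetical predicted doses

print(f"MAPE = {mape(reference, predicted):.2f} %")
print(f"R2   = {r_squared(reference, predicted):.4f}")
```

Here the absolute percentage errors are 10%, 10%, and 0%, so the MAPE is their mean, about 6.67%. Note that MAPE is undefined when any reference dose is zero.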

2.5 Examining input radiomics feature quantities

We utilized radiomics feature selection techniques to address the concerns of underfitting and overfitting during the training of the SVR models for predicting patient-specific organ doses from CT scans. Selecting appropriate radiomics features is crucial for establishing a reliable predictive model. We systematically examined 11 types of feature quantities (3, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100) for input feature class 1 and 15 types of feature quantities (3, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, and 200) for input feature class 2 using the F-regression function in the scikit-learn library. For each considered number of input features, the features were ranked by the F-score (from F-regression) and selected accordingly to investigate the impact of input complexity on model performance.
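
Ranking features by F-score and keeping the top k can be sketched with scikit-learn's `SelectKBest` and `f_regression`. The data below are synthetic, and the use of `SelectKBest` is an assumption: the article names the F-regression function but not the exact selection wrapper.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
# Synthetic stand-in for 107 radiomic features from 150 patients; only
# columns 5 and 42 actually drive the target.
X = rng.normal(size=(150, 107))
y = 3.0 * X[:, 5] - 2.0 * X[:, 42] + rng.normal(scale=0.1, size=150)

selected = {}
for k in (3, 10, 20):  # a few of the feature quantities examined in the text
    selector = SelectKBest(score_func=f_regression, k=k).fit(X, y)
    # Column indices of the k features with the highest F-scores.
    selected[k] = selector.get_support(indices=True)

print(selected[3])
```

In a real evaluation, the selector would normally be fitted on the training split only, so that the test patients do not influence which features are kept.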

We employed regression metrics like MAPE and R2, which are crucial for assessing predictive precision and model generalizability. The SVR models were trained and assessed for each type of feature quantity, enabling a comparative analysis of the regression metrics between the input feature classes. This allowed us to identify the appropriate feature quantity for each input feature group, a critical step in enhancing the SVR performance in predicting patient-specific organ doses from CT scans. Feature selection, with the systematic exploration of various types of input feature quantities and the use of the F-regression function, aims to strike a balance between overfitting and underfitting, ultimately enhancing the accuracy and effectiveness of SVR models.

3 RESULTS

In this section, we report how the number of input features affected the SVR-based prediction models for the two feature classifications (i.e., class_1 and class_2). The established regression metrics, MAPE and R2, were employed to gauge predictive effectiveness. To ensure the statistical reliability of the results, 100 random patient sample splits were performed for each combination of feature classification and input feature quantity, maintaining a 0.8 to 0.2 ratio between the training and testing sets. The mean regression metrics and standard deviations were calculated across these splits. Section 3.1 details the regression metrics for head organs and Section 3.2 for abdomen organs.
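
The repeated-split evaluation can be sketched as below. This is a hedged illustration on synthetic data: `ShuffleSplit`, the random seeds, and the data-generating model are assumptions, while the 100 repetitions, the 0.8/0.2 ratio, the linear kernel, and C = 5 follow the text.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import ShuffleSplit
from sklearn.svm import SVR

rng = np.random.default_rng(1)
# Synthetic stand-in: 120 "patients", 10 selected features, positive doses.
X = rng.normal(size=(120, 10))
y = 50.0 + X @ rng.normal(size=10) + rng.normal(scale=0.2, size=120)

# 100 random 0.8 / 0.2 splits of the patient samples.
splitter = ShuffleSplit(n_splits=100, test_size=0.2, random_state=0)
mapes, r2s = [], []
for train_idx, test_idx in splitter.split(X):
    model = SVR(kernel="linear", C=5).fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    # MAPE in percent, computed from its definition.
    mapes.append(100.0 * np.mean(np.abs((y[test_idx] - pred) / y[test_idx])))
    r2s.append(r2_score(y[test_idx], pred))

# Mean and standard deviation across splits, as plotted in the figures.
print(f"MAPE: {np.mean(mapes):.2f} +/- {np.std(mapes):.2f} %")
print(f"R2:   {np.mean(r2s):.3f} +/- {np.std(r2s):.3f}")
```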

3.1 Regression metrics for the head organs

The regression metrics, MAPE and R2, for the head organs (brain, eyes, and lenses) are shown in Figure 3. Figure 3A shows that the brain MAPE varied similarly for both input feature classes (class_1: skin outline only; class_2: skin outline plus internal organs). The mean MAPE values decreased as the number of input features increased, reaching a minimum of approximately 1.5% at 50 features, and then rose slowly to a maximum of nearly 1.8% at 200 features for class_2. Figure 3B shows that the R2 values for both classes increased rapidly from 3 to 20 input features, reaching a plateau with a maximum R2 of about 0.9 at 20 features. For class_2, the SVR models had low mean R2 values with large standard deviations from 100 input features onwards, indicating poor prediction stability and accuracy, possibly due to overfitting caused by adding internal organ radiomic features as inputs.

FIGURE 3
Variation of the regression metrics as feature quantity increases for the brain (A), left eye (B), right eye (C), left lens (D), and right lens (E). The solid data points represent the mean values of the regression metrics (i.e., MAPE or R2), and the error bars represent the standard deviations across 100 random patient sample data splits.

Figures 3C and 3E show that the MAPE values of both eyes decreased as the number of input features increased from 3 to 50 and then leveled off, with a minimum MAPE of approximately 5% for class_1 and 5.5% for class_2. The difference in MAPE between the two classes was less than 1%. For the eyes, the MAPE standard deviation across different patient sample splits increased sharply as the number of input features increased from 100 to 200, indicating that adding radiomic features from internal organs reduced the stability of the prediction accuracy. Figures 3D and 3F illustrate that the R2 values of both classes increased quickly from 3 to 30 input features and reached a stable state with a maximum R2 of approximately 0.8 from 30 to 100 input features. The R2 difference between the two classes was lower than 0.5, which indicates that the radiomic features from the internal organs do not significantly affect the prediction generality of the SVR-based models. However, as the number of input features increased from 100 to 200, the R2 values of class_2 decreased, and the standard deviation increased. This implies that radiomic features from internal organs harm the stability of the prediction generality of the SVR-based models.

As seen in Figures 3G–3J, as the number of input features increased from 3 to 50, the MAPE values decreased, stabilizing at approximately 6% for the right lens and 5.5% for the left lens, with a <1% difference. However, expanding the number of features from 100 to 200 led to a sharp increase in the MAPE standard deviation, indicating reduced prediction stability. The R2 values rose rapidly from 3 to 30 features, stabilizing at 0.8 from 30 to 100 features. However, a subsequent drop in R2 values for the left lens suggests compromised prediction generality due to feature expansion.

3.2 Regression metrics for the abdomen organs

As illustrated in Figures 4A and 4B for the bowel, as the number of input features increased from 3 to 20, the MAPE values decreased, stabilizing between 20 and 50 features at approximately 5.5% for class_1 and 6% for class_2, and then increased with further feature increases. For class_2, expanding from 100 to 200 features resulted in minimal MAPE improvement. The R2 values increased from 3 to 10 features, plateauing near 0.85 from 30 to 60 features. However, as the number of input features increased from 100 to 200 for class_2, the R2 values gradually declined, suggesting that internal organ radiomics features neither significantly improve nor worsen predictive generality.

FIGURE 4
Variation of the regression metrics as feature quantity increases for the bowel (A), left kidney (B), right kidney (C), and liver (D). The solid data points represent the mean values of the regression metrics (i.e., MAPE or R2), and the error bars represent the standard deviations across 100 random patient sample data splits.

As illustrated in Figures 4C and 4E, increasing the number of input features from 3 to 20 led to a reduction in the MAPE values, stabilizing between 20 and 40 features at approximately 7.5% for both the right and left kidneys. Expanding the input features from 100 to 200 for both kidneys barely improves the MAPE and may even exacerbate it. In Figures 4D and 4F, R2 values rise from 3 to 20 input features, reaching a plateau near 0.85 from 30 to 60 features for both kidneys. However, as the number of input features increased from 100 to 200, R2 values experienced a gradual decline, suggesting that incorporating internal organ radiomics features does not significantly alter the predictive performance for either kidney. Overall, the disturbances induced by the head organs resulted in less than 1% change in MAPE and less than 0.05 alteration in R2.

As illustrated in Figure 4G, increasing the number of input features from 3 to 30 triggers a decline in MAPE values, stabilizing between 20 and 100 features at approximately 8.2% for both class_1 and class_2. Expanding from 100 to 200 input features for class_2 in the liver analysis barely improves the MAPE and may even exacerbate it. In Figure 4H, R2 values show an upward trajectory from 3 to 30 input features, peaking near 0.7 from 30 to 80 features. As input features increased from 100 to 200, R2 values gradually decreased, indicating that incorporating internal organ radiomics features negatively impacts the predictive generality of the liver. Overall, the disturbances induced by abdominal organs resulted in less than 1% change in MAPE and less than 0.05 alteration in R2.

4 DISCUSSION

Patient organs are exposed to energy deposition from ionizing radiation emitted during CT scans, which could potentially increase the risk of developing solid cancers in individuals who undergo such examinations. An efficient approach for swiftly predicting patient-specific organ doses from CT scans is the application of SVR models, which are trained on radiomic features extracted from images. However, a significant barrier to user-friendly clinical applications is the need to segment organ contours before predicting doses, which is cumbersome and time-consuming. Therefore, there is a crucial need for innovative methods that can efficiently predict organ doses from CT scans without requiring organ contour segmentation. Consequently, this study explores how to train SVR models with only radiomic features from the skin outline, which does not depend on the internal organs and still achieves high accuracy and robustness in predicting organ doses from CT scans.

Many studies have used neural networks or Monte Carlo simulations to estimate organ doses for each patient from CT scans; however, these methods have two key limitations. First, the robustness of neural networks in predicting patient-specific organ doses was not verified when the patient samples in the training and testing sets differed. Second, both neural networks and Monte Carlo simulations require the segmentation of internal organs, which is a tedious and time-consuming process. To overcome these limitations, this study employed the SVR algorithm to efficiently predict organ doses from CT scans. The findings indicate that the influence of radiomic features from internal organs on the SVR-based performance varied across different types of CT scans. In head CT scans, these features had a minimal impact on improving the predictive performance and may have even hindered it. In abdominal CT scans, radiomic features of internal organs marginally enhanced the SVR-based model performance. Overall, the SVR-based models trained solely on radiomic features extracted from the skin outline, independent of the internal organs, demonstrated adequate accuracy and robustness in predicting organ doses from both head and abdominal CT scans. This suggests the feasibility of developing a user-friendly online tool capable of assessing radiation risks and organ doses without requiring internal organ segmentation.

This study has some limitations. First, skin outline-based SVR models can efficiently predict organ doses for head and abdomen cases. However, they may not directly apply to other CT-scanned regions, such as the pelvis and chest, with different radiomic features and voxel dose distributions. Second, the skin outline-based SVR model was trained and tested on patients who underwent brain and abdominal CT scans at our institution, which may have CT scanning parameters and protocols different from those of other institutions. Different institutions must train and optimize their own SVR models to implement the proposed method in other settings, considering their unique CT scanning parameters and protocols. Third, our study mainly focused on adult brain and abdominal patients, who may have organ sizes and shapes different from those of pediatric patients. Therefore, a specialized prediction model for patient-specific organ doses in pediatric patients must be developed when applying the proposed method to children. Additionally, our study only covered the head and abdominal organs, such as the eyes, lenses, brain, kidneys, liver, and bowel. In the future, it will be necessary to train SVR-based models for other organs.

5 CONCLUSIONS

In this study, we attempted to efficiently predict organ doses from CT scans by training SVR models with only radiomic features from the skin outline, which does not depend on the internal organs. The results show that the radiomic features of the internal organs can be neglected in both the training and practical application of SVR-based prediction models without severely compromising the accuracy and robustness of the prediction. The skin outline radiomic features are adequate for training efficient and robust SVR models to predict CT scan-induced organ doses. This implies the potential to achieve a satisfactory overall performance for predicting organ doses and radiation risks from CT scans without requiring segmentation of the internal organs, which is a tedious and time-consuming process.

ACKNOWLEDGMENTS

Funding for this study was provided by the National Natural Science Foundation of China (Grant No. 12075064) and the National Key R&D Program of China (Grant No. 2019YFC0117304).

CONFLICTS OF INTEREST STATEMENT

The authors declare no conflict of interest.

ETHICS STATEMENT

The study adhered to the guidelines of the Declaration of Helsinki. Using the available CT data from our hospital, we conducted a retrospective study approved by the hospital's administration for research purposes. The study strictly ensured the anonymization of CT data and safeguarded personal information.
