Volume 1, Issue 2 pp. 167-189
REVIEW
Open Access

Radiobioinformatics: A novel bridge between basic research and clinical practice for clinical decision support in diffuse liver diseases

Pinggui Lei

Corresponding Author

Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China

Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou, China

School of Public Health, Guizhou Medical University, Guiyang, Guizhou, China

Correspondence

Pinggui Lei, Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Kowloon, HKSAR, China; Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guiyang, GZ, China; and School of Public Health, Guizhou Medical University, Guiyang, GZ, China.

Email: [email protected]

Lawrence Wing-Chi Chan, Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Kowloon, HKSAR, China.

Email: [email protected]

Contribution: Conceptualization (lead), Data curation (lead), Formal analysis (equal), Funding acquisition (lead), Investigation (equal), Methodology (lead), Project administration (lead), Resources (equal), Software (equal), Supervision (equal), Writing - original draft (equal), Writing - review & editing (equal)

Na Hu

Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou, China

Contribution: Data curation (equal), Writing - original draft (equal), Writing - review & editing (equal)

Yuhui Wu

Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou, China

Contribution: Data curation (equal), Methodology (supporting), Visualization (supporting), Writing - original draft (equal), Writing - review & editing (equal)

Maowen Tang

Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou, China

Contribution: Visualization (supporting), Writing - original draft (equal), Writing - review & editing (supporting)

Chong Lin

Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou, China

Contribution: Data curation (equal), Software (supporting), Writing - original draft (equal)

Luoyi Kong

Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China

Contribution: Data curation (supporting), Writing - original draft (equal), Writing - review & editing (supporting)

Lingfeng Zhang

Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China

Contribution: Data curation (supporting), Writing - original draft (equal), Writing - review & editing (supporting)

Peng Luo

School of Public Health, Guizhou Medical University, Guiyang, Guizhou, China

Lawrence Wing-Chi Chan

Corresponding Author

Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China

Contribution: Conceptualization (lead), Formal analysis (equal), Methodology (equal), Project administration (equal), Resources (equal), Software (equal), Supervision (equal), Writing - review & editing (equal)

First published: 18 June 2023

Abstract

The liver is a multifaceted organ that is responsible for many critical functions encompassing amino acid, carbohydrate, and lipid metabolism, all of which make a healthy liver essential for the human body. Contemporary imaging methodologies have remarkable diagnostic accuracy in discerning focal liver lesions; however, a comprehensive understanding of diffuse liver diseases is a requisite for radiologists to accurately diagnose or predict the progression of such lesions within clinical contexts. Nonetheless, the conventional attributes of radiological features, including morphology, size, margin, density, signal intensity, and echoes, limit their clinical utility. Radiomics is a widely used approach that is characterized by the extraction of copious image features from radiographic depictions, which gives it considerable potential in addressing this limitation. It is worth noting that functional or molecular alterations occur significantly prior to the morphological shifts discernible by imaging modalities. Consequently, the explication of potential mechanisms by multiomics analyses (encompassing genomics, epigenomics, transcriptomics, proteomics, and metabolomics) is essential for investigating putative signal pathway regulations from a radiological viewpoint. In this review, we elaborate on the principal pathological categorizations of diffuse liver diseases, the evaluation of multiomics approaches pertaining to diffuse liver diseases, and the prospective value of predictive models. Accordingly, the overarching objective of this review is to scrutinize the interrelations between radiological features and bioinformatics as well as to consider the development of prediction models predicated on radiobioinformatics as integral components of clinical decision support systems for diffuse liver diseases.

Abbreviations

  • AI: artificial intelligence
  • ALD: alcoholic liver disease
  • BCS: Budd–Chiari syndrome
  • CT: computed tomography
  • CNN: convolutional neural network
  • GWAS: genome-wide association studies
  • HCC: hepatocellular carcinoma
  • MRI: magnetic resonance imaging
  • NAFLD: nonalcoholic fatty liver disease
  • NASH: nonalcoholic steatohepatitis
  • SVM: support vector machine
  • SWE: shear wave elastography

    1 BACKGROUND

    Diffuse liver diseases, including hepatic steatosis, fibrosis, metabolic disease, and hepatitis, can cause chronic liver damage and eventually lead to hepatocellular carcinoma (HCC). Chronic liver disease affects 1.5 billion people worldwide, making it a major global health problem [1]. Early diagnosis and treatment of diffuse liver diseases affect their progression and outcome, and therefore the need to assess hepatic parenchyma has increased. Liver biopsy is currently the “gold standard” for diagnosing diffuse liver diseases, but there are limitations, such as invasiveness, sampling error, and interobserver variability [2, 3].

    Imaging plays a key role in the evaluation of diffuse liver diseases. Conventional imaging features, such as morphology, size, margin, density, signal intensity, and echoes, are insufficient to accurately diagnose diffuse liver diseases [1]. Recent developments in computer science have made computer-assisted analysis of imaging examinations possible, and radiomics and deep learning are among the most widely investigated approaches [4]. However, major functional or molecular changes can occur in the liver before any morphological changes are detectable by imaging modalities. Hence, knowledge of the possible mechanisms obtained from multiomics, such as genomics, epigenomics, transcriptomics, proteomics, and metabolomics, is needed to explore possible signaling pathway regulation from a radiological perspective. The application of multiomics in diffuse liver diseases is still in its infancy, and more work is needed before multiomics can be applied to precision medicine. High-dimensional data from various sources should be integrated to develop computational methods for large datasets while reducing the cost of omics analysis. Bioinformatics analysis is increasingly applied to diffuse liver diseases, and such studies may provide useful information for identifying potential diagnostic, prognostic, and candidate drug target biomarkers of these diseases [5, 6].

    Radiobioinformatics is a promising approach to precisely evaluate the diagnosis or prognosis for diffuse liver diseases. In this review, we discuss the potential application of radiobioinformatics as a clinical decision support method for diffuse liver diseases and focus mainly on a prediction model for diagnosis and prognosis in clinical practice.

    2 DIFFUSE LIVER DISEASES

    Diffuse liver diseases comprise diffuse pathological changes in liver parenchyma caused primarily by metabolic abnormalities, inflammation, parasites, hemorrhagic diseases, cirrhosis, metastatic diseases, drug-induced hepatitis, toxic hepatitis, schistosomiasis, and hereditary diseases. The most common diffuse liver diseases were reported to be chronic diseases: nonalcoholic fatty liver disease (NAFLD; 60%), hepatitis B virus (29%), hepatitis C virus (9%), and alcoholic liver disease (ALD; 2%) [7].

    2.1 Nonneoplastic diffuse liver diseases

    Hepatic steatosis is caused by excessive accumulation of triglycerides in liver cells in people with, for example, NAFLD and ALD. Obesity is associated with hepatic steatosis, but the deleterious health implications of obesity extend beyond hepatic steatosis to encompass more severe conditions, namely nonalcoholic steatohepatitis (NASH), NASH-related cirrhosis, and HCC [8]. Therefore, early diagnosis and quantification of hepatic steatosis are of great importance.

    NAFLD, also known as metabolic-associated fatty liver disease, is characterized by fat accumulation in the liver accompanied by underlying metabolic disorders [9]. People with advanced stages of NAFLD, such as NASH or advanced fibrosis, and people who have both NAFLD and type 2 diabetes are at higher risk of developing cardiovascular disease. NAFLD is also an independent risk factor for heart conditions such as myocardial infarction, coronary heart disease, and atrial fibrillation [10-12].

    Hepatitis is a common type of chronic liver disease. Its causes include NAFLD, ALD, virus infection, chemotherapy, and metabolic or toxic impairment, all of which lead to liver cell damage and liver function impairment. Viral hepatitis, caused by hepatitis A, B, C, D, and E viruses, is a major global health concern that affects millions of individuals worldwide. The disease can present with clinical manifestations that range from asymptomatic to acute with rapid onset or to chronic disease. Unfortunately, a significant proportion of infected individuals remain asymptomatic and may develop liver cirrhosis or liver cancer, which can be fatal. Although all types of hepatitis are associated with morbidity, hepatitis B and C cause most of the deaths associated with this disease. The development of novel diagnostic tools and highly effective direct-acting antivirals has opened up new opportunities for global eradication of chronic hepatitis C. Recognizing the epidemiological and clinical characteristics of hepatitis A, B, C, D, and E is essential to facilitate the accurate diagnosis of these diseases [13, 14]. Therefore, it is crucial to explore how to use imaging and bioinformatic technologies to study these diseases. Further research and investigation are required in this field, and radiologists will be required to play an increasingly critical role in these studies.

    Liver fibrosis is a dynamic process in which various pathogenic factors of chronic liver disease activate hepatic stellate cells and turn them into myofibroblasts, which produce large amounts of extracellular matrix proteins that are deposited in excess. Common causes of liver fibrosis include NAFLD, ALD, chronic viral hepatitis, drug-induced liver injury, autoimmune diseases, and metabolic disorders. If the primary cause is not treated promptly, liver fibrosis may progress to cirrhosis, which can lead to portal hypertension and related complications, such as variceal bleeding, hepatic encephalopathy, ascites, hepatorenal syndrome, and even hepatocellular carcinoma [15-18]. Early treatment of the underlying factors of liver fibrosis can stabilize or even reverse the disease [19]. Accurate diagnosis and staging of hepatic fibrosis are key to improving prognosis, determining a treatment plan, and monitoring the progress of the disease. Radiological features and bioinformatics could have a role in assessing the severity of fibrosis. IDEAL-IQ magnetic resonance imaging (MRI) and computed tomography (CT) images of liver fibrosis are shown in Figure 1.

    FIGURE 1. IDEAL-IQ magnetic resonance imaging and computed tomography (CT) techniques are used to assess hepatic fibrosis. (a) In phase, (b) Out of phase, (c) Water mapping, (d) Fat mapping, (e) R2* mapping, (f) Fat fraction mapping, and (g) CT image.

    Budd–Chiari syndrome (BCS) is caused by thrombosis of the hepatic veins, inferior vena cava, or both, resulting in impaired hepatic venous outflow [20]. One or more prothrombotic conditions, for instance, myeloproliferative disorders, are often found. The main complications of BCS are portal hypertension and, less commonly, the development of HCC [20]. In about 10% of cases, the underlying disease, as with prothrombotic factors, may take an unfavorable course and thus affect the prognosis of BCS. Additional concerns are bleeding and thrombosis of other organs [21]. Although commendable efforts have been made to establish diagnostic methods for BCS, there are currently no universal diagnostic criteria, and further effort is needed to develop satisfactory diagnostic guidelines.

    Hepatic iron deposition is another common liver pathology, the etiology of which includes primary and secondary hemochromatosis. Hereditary hemochromatosis, in which changes in iron absorption lead to iron overload, is the most common primary hemochromatosis [22]. Secondary causes of hemochromatosis include cirrhosis, thalassemia, myelodysplastic syndrome, and multiple blood transfusions [23, 24]. Advanced hemochromatosis may progress to arthritis and involve multiple organs and systems, such as the liver, heart, and endocrine system. Iron accumulation can lead to tissue damage and, if left untreated, can cause cirrhosis, HCC, and cardiovascular and endocrine disorders [25]. Early detection and treatment of hemochromatosis were shown to be crucial for completely preventing excess mortality caused by iron overload and for largely preventing adverse consequences such as liver cancer, depending on the amount and duration of iron excess [26, 27]. Thus, further investigation of the mechanism of iron deposition and follow-up monitoring of the disease course would provide a valuable new basis for clinical practice.

    Hepatic amyloidosis is characterized by the deposition of amyloid fibrils in the space of Disse, usually starting in the area around the portal vein, but occasionally starting in the center of lobules or in the hepatic vasculature [28]. In severe cases, amyloid deposition causes pressure atrophy of liver cells and either interferes with bile passage, leading to cholestasis, or blocks the sinusoids, leading to portal hypertension [29]. Hepatic amyloidosis is usually clinically silent, with hepatomegaly and mildly elevated serum alkaline phosphatase being the most common clinical manifestations. Severe cases may be associated with jaundice, portal hypertension, liver failure, and in rare cases, spontaneous rupture of the liver [30]. Although hepatic amyloidosis rarely presents clinically, it is associated with poor prognosis and reduced survival in untreated patients. Therefore, early diagnosis is the key to effective clinical management.

    2.2 Neoplastic diffuse liver diseases

    HCC is the fourth leading cause of cancer-related mortality globally. Chronic hepatitis B and C, alcohol addiction, metabolic liver diseases (notably nonalcoholic fatty liver disease), as well as exposure to dietary toxins such as aflatoxins and aristolochic acid, are established risk factors for HCC [31, 32]. Some liver diseases can imitate the invasive presentation of HCC, such as focal confluent fibrosis, hepatic steatosis, hepatic microabscesses, intrahepatic cholangiocarcinoma, and widespread metastatic disease (pseudocirrhosis) [33]. Thus, the diagnosis of invasive HCC is greatly important for treating patients with this disease.

    Hepatic metastasis is a frequent manifestation in the metastatic cascade of primary neoplasms and indicates an advanced stage of disease progression [34]. The liver has a dual blood supply from the portal vein and the hepatic artery and is therefore prone to metastases from extrahepatic malignancies, most often from tumors of the digestive tract. Colorectal cancer is one of the most common malignant tumors of the alimentary canal worldwide, and approximately half of the patients either present with metastasis at the time of initial diagnosis or subsequently develop metastasis during the course of their illness [35]. Early detection and accurate localization of liver metastasis are highly valuable for selecting appropriate treatments and improving patient survival.

    Hepatic lymphomas (HLs), such as non-Hodgkin's lymphoma, affect the liver in the advanced stages of systemic disease. Clinical presentation varies widely from asymptomatic to fulminant liver failure with rapid progression to coma and death. A frequently encountered manifestation is abdominal discomfort or pain observed in 39%–70% of patients; other symptoms include weight loss, night sweats, fever, fatigue, and anorexia [36]. The significance of texture analysis was corroborated based on positron emission tomography/computed tomography (PET/CT) images for the detection of distinct pathological forms of cancer. As a consequence, PET/CT has been posited as a novel approach to discriminate between HCC and HL. Although both image-based and textural parameters have the capacity to distinguish HCC from HL, textural parameters are more efficacious. Furthermore, synergistic integration of the two parameters enhanced the diagnostic accuracy of HCC and HL [37]. Therefore, radiomics has potential value to distinguish these two diseases.

    Liver biopsy is the current “gold standard” for diagnosis of diffuse liver diseases, but its use is limited by drawbacks such as its invasive nature, high cost, and risk of complications [38, 39]. Serum markers are currently not specific enough for diagnosing diffuse liver diseases. Traditional imaging modalities such as ultrasound, CT, MRI, and PET/CT provide noninvasive options to evaluate liver disease. Ultrasound is considered a first-line screening tool for identifying liver disease, but it relies heavily on the subjective judgment of the examiner [40]. CT is frequently used for the diagnosis of liver disease; however, the diagnostic efficacy of CT for quantitatively assessing macrovesicular steatosis has been deemed clinically inadequate. Nevertheless, unenhanced CT has shown substantial diagnostic accuracy in qualitatively identifying macrovesicular steatosis that exceeds the threshold of 30% [41]. MRI is not widely used because of its high cost and its contraindication in patients with metal implants and devices [42]. Artificial intelligence (AI)-based radiomics and deep learning methods are increasingly applied in medical studies and have shown excellent performance in the recognition, classification, and segmentation of diffuse liver diseases [43-45]. Nonetheless, both radiomics models and deep learning algorithms are prone to overfitting because they are based on large numbers of image-derived parameters, and therefore all radiomics and deep learning results require rigorous clinical validation.
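
    One common first guard against the overfitting mentioned above is k-fold cross-validation, in which every sample is held out exactly once. The following is a minimal sketch only, not a method taken from the cited studies; the function name `k_fold_indices` and the toy sample count are illustrative assumptions. In practice, the split would be made at the patient level so that images from one patient never appear in both the training and the test fold, and internal cross-validation would not replace validation on an independent clinical cohort.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield sorted(train_idx), sorted(test_idx)


# Hypothetical toy cohort of 10 patients split into 5 folds.
splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

    A model would be fitted on each training fold and scored on the matching test fold, and the spread of the fold scores gives a rough indication of how stable the reported performance is.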

    3 RADIOBIOINFORMATICS

    Radiobioinformatics encompasses both radiological features and bioinformatics and can be used to support clinical decisions by constructing predictive models. Image features can reflect possible signal pathways or biomarkers and together with bioinformatics can provide a bridge between basic research and clinical applications.

    In the last decade, the role of radiologists in evaluating patients with diffuse liver diseases has expanded and, in many cases, pathologists have significantly expanded their management options in imaging workups of patients with suspected diffuse liver diseases. Sometimes, imaging may point directly to a diagnosis, but more often imaging helps narrow differential diagnoses or is crucial in patient follow-up. The radiobioinformatics workflow is shown in Figure 2. Radiobioinformatics is a promising approach that can be used to construct multiomics prediction models based on different omics datasets (genomics/transcriptomics/proteomics/metabolomics/radiomics). Such models can be applied to establish new methods for noninvasive assessment of diffuse liver diseases.

    FIGURE 2. Workflow for radiobioinformatics. CNN, convolutional neural network; RF, random forest; SVM, support vector machine.
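
    The integration step of such a workflow can be illustrated with a simple "early fusion" scheme, in which each feature block is standardized and the blocks are concatenated per patient before modeling. This is a hedged sketch under stated assumptions, not the pipeline of Figure 2: the function names, the choice of z-score scaling, and the toy values are all hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a row-major table to zero mean, unit variance."""
    scaled_columns = []
    for col in zip(*rows):
        mu, sigma = mean(col), pstdev(col)
        # A constant column carries no information; map it to zeros.
        scaled_columns.append([0.0 if sigma == 0 else (v - mu) / sigma for v in col])
    return [list(row) for row in zip(*scaled_columns)]


def early_fusion(radiomics_rows, omics_rows):
    """Concatenate standardized radiomic and omics features patient by patient."""
    if len(radiomics_rows) != len(omics_rows):
        raise ValueError("both blocks must describe the same patients")
    scaled_radiomics = zscore_columns(radiomics_rows)
    scaled_omics = zscore_columns(omics_rows)
    return [r + o for r, o in zip(scaled_radiomics, scaled_omics)]


# Hypothetical toy data: 3 patients, 2 radiomic features, 1 transcript level.
radiomics = [[10.0, 0.2], [12.0, 0.4], [14.0, 0.6]]
omics = [[5.0], [5.0], [8.0]]
fused = early_fusion(radiomics, omics)
for row in fused:
    print([round(v, 3) for v in row])
```

    The fused rows could then be passed to a classifier such as a random forest or SVM. Scaling each block separately keeps a block with large raw values from dominating the others.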

    4 ARTIFICIAL INTELLIGENCE BASED ON IMAGING FOR DIFFUSE LIVER DISEASES

    AI methods have been used to extract high-throughput quantitative imaging features beyond what can be seen with the naked eye and to encode medical images as numerical datasets that can be mined [46]. Using these extracted imaging features, AI can exploit medical image information in addition to traditional clinical factors and biomarkers to improve prognostic and diagnostic accuracy [47, 48]. The radiomics workflow is shown in Figure 3. Deep learning encompasses sophisticated algorithms that use multilayer neural networks, loosely modeled on the working mechanisms of the human brain, to analyze large datasets [49-51].

    FIGURE 3. Workflow for radiomics analysis. ROI, region of interest.
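
    The feature extraction step of a radiomics workflow can be illustrated with a few first-order (histogram-based) features computed inside an ROI. This is a minimal sketch, assuming a 2D image stored as nested lists and a binary mask of the same shape; the function name, the bin count, and the toy pixel values are hypothetical, and real studies typically rely on dedicated radiomics software that also computes shape and texture features.

```python
import math
from collections import Counter


def first_order_features(image, mask, n_bins=8):
    """Compute simple first-order features from pixels where the mask is nonzero."""
    values = [px for img_row, mask_row in zip(image, mask)
              for px, m in zip(img_row, mask_row) if m]
    if not values:
        raise ValueError("ROI mask selects no pixels")
    n = len(values)
    mu = sum(values) / n
    variance = sum((v - mu) ** 2 for v in values) / n
    lo, hi = min(values), max(values)
    # Fixed-count binning; fall back to width 1 when all values are equal.
    width = (hi - lo) / n_bins if hi > lo else 1.0
    bins = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    # Shannon entropy (in bits) of the binned intensity histogram.
    entropy = sum(-(c / n) * math.log2(c / n) for c in bins.values())
    return {"mean": mu, "variance": variance, "min": lo, "max": hi, "entropy": entropy}


# Hypothetical 3x3 image; the mask selects the top two rows as the ROI.
image = [[10, 10, 50],
         [10, 50, 50],
         [90, 90, 90]]
mask = [[1, 1, 1],
        [1, 1, 1],
        [0, 0, 0]]
features = first_order_features(image, mask)
print(features)
```

    Each ROI is thereby reduced to a row of numbers, which is the form required by the feature selection and modeling steps that follow.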

    Recently, AI techniques have been applied to large-scale medical imaging to manage liver disease [52]. In liver cancer, quantitative imaging traits were linked with global gene expression profiles and used to reconstruct 78% of these profiles [53]. Besides setting a foundation for future research, this landmark result greatly encouraged researchers to examine the potential of quantitative imaging tools for preoperative prediction of genetic and pathological outcomes. Consequently, many AI studies have used multiparametric and multimodality imaging to diagnose liver diseases and make treatment decisions [54]. AI has also been used to evaluate the severity of diffuse liver diseases, including liver fibrosis, fatty liver disease, and hepatitis. A summary of the methodologies and results of representative studies is provided in Table 1.

    TABLE 1. Artificial intelligence (AI) studies of diffuse liver diseases.
    Diffuse liver disease | Authors | Participants (n) | Imaging modality | AI algorithm | Performance or main results
    Liver steatosis | Byra et al. [55] | 55 | B-mode US | CNN | Sensitivity: 100%, specificity: 88%, accuracy: 96%, AUC: 0.98
    Liver steatosis | Biswas et al. [56] | 101 | US | Deep learning | Accuracy: 100%, AUC: 1.0
    Liver steatosis | Han et al. [57] | 204 | US | CNN | Sensitivity: 97%, specificity: 94%
    Liver steatosis | Chou et al. [58] | 2070 | US | Deep learning | AUCs were 0.971, 0.985, and 0.996, respectively
    Liver steatosis | Zamanian et al. [59] | 55 | US | Deep learning | AUC: 0.9999, accuracy: 0.9864
    Liver steatosis | Nguyen et al. [60] | Rabbits | US | CNN | Accuracy: 74%
    Liver steatosis | Cha et al. [61] | 294 | US | DCNN | Average values were 0.919 (95% CI, 0.899 to 0.935), 0.916 (95% CI, 0.895 to 0.932), and 0.734 (95% CI, 0.676 to 0.782), respectively
    Liver steatosis | Kullberg et al. [62] | 107 | CT | Filtering approach (ILT) | A strong correlation was found between automated measurements and manual reference measurements.
    Liver steatosis | Graffy et al. [63] | 9552 | CT | Deep learning | With this CT-based liver fat quantification tool, hepatic steatosis and nonalcoholic fatty liver disease can be evaluated at the population level.
    Liver steatosis | Pickhardt et al. [64] | 1204 | Contrast-enhanced CT | CNN | Contrast-enhanced volumetric hepatosplenic attenuation obtained using a fully automated deep learning CT tool might allow objective categorical assessment of steatosis in the liver.
    Liver steatosis | Huo et al. [65] | 240 | CT | DCNN | Combining DCNN and morphological operations to estimate liver attenuation, ALARM achieves “excellent” agreement with manual estimation for fatty livers.
    Liver steatosis | Ding et al. [66] | 150 | CT | Radiomics | The AUC of the training cohort was 0.907, and the AUC of the testing cohort was 0.906.
    Liver steatosis | Jimenez-Pastor et al. [67] | 183 | MECSE MR | CNNs | An accurate segmentation of liver parenchyma can be achieved by a CNN in a vendor-neutral manner.
    Cirrhosis | Yeh et al. [68] | 20 fresh postsurgical human liver samples | US | SVM | The best classification accuracies for two, three, four, and six classes were 91%, 85%, 81%, and 72%, respectively.
    Cirrhosis | Zhang et al. [69] | 239 | Doppler US | ANNs | Sensitivity: 95%, specificity: 85%, accuracy: 88.3%, YI: 0.80
    Cirrhosis | Cao et al. [70] | 37 | US | ANN | The classification accuracies of S0-S4 were 100%, 90%, 70%, 90%, and 100%, respectively.
    Cirrhosis | Chen et al. [71] | 513 | Elastography | Machine learning | Adopted classifiers significantly outperformed the LFI method.
    Cirrhosis | Brattain et al. [72] | 328 | Elastography | CNN | AUROC: 0.89
    Cirrhosis | Gatos et al. [73] | 200 | SWE | CNNs | SWE images showed improved accuracy (ranging from 82.5% to 95.5%) compared to those of the unmasked ones (ranging from 79.5% to 93.2%).
    Cirrhosis | Li et al. [44] | 144 | SWE | Machine learning | There was a better prediction capability for liver fibrosis stage using ORF and CEMF features than conventional radiomics (both p < 0.01).
    Cirrhosis | Gatos et al. [74] | 85 | SWE | Textural features | The accuracy of the SVM model was 87.7%; sensitivity: 83.3%, specificity: 89.1%
    Cirrhosis | Gatos et al. [75] | 126 | SWE | Machine learning | Accuracy: 87.3%, sensitivity: 93.5%, specificity: 81.2%
    Cirrhosis | Wang et al. [76] | 654 | Elastography | Radiomics | AUCs were 0.97 for F4 (95% CI 0.94 to 0.99), 0.98 for ≥F3 (95% CI 0.96 to 1.00), and 0.85 for ≥F2 (95% CI 0.81 to 0.89).
    Cirrhosis | Xue et al. [43] | 466 | Elastography | TL radiomics | AUCs were 0.950, 0.932, and 0.930 for classifying S4, ≥S3, and ≥S2, respectively.
    Cirrhosis | Lee et al. [77] | 3,446 | US | DCNN | The DCNN could accurately determine METAVIR scores and could diagnose cirrhosis more accurately than radiologists.
    Cirrhosis | Park et al. [78] | 167 | US | Texture analysis | Accuracy: 64%, sensitivity: 87%, specificity: 48%
    Cirrhosis | Zhou et al. [79] | 237 | US | Texture analysis | ROC was 0.88, 0.85 (group I), and 0.85, 0.86 (group II).
    Cirrhosis | AI-Hasani et al. [80] | 233 rats | US | Texture analysis | MLP and nB combined had high diagnostic performances, with AUCs approaching 0.95–0.96.
    Cirrhosis | Daginawala et al. [81] | 83 | Contrast-enhanced CT | Texture analysis | The results suggest that texture-based analyses of contrast-enhanced CT images could be used to diagnose liver fibrosis without invasive methods.
    Cirrhosis | Lubner et al. [82] | 289 | CT | Texture analysis | At a cutoff of 0.18, the ROC AUC for F0 versus F1-4 was 0.78, with 74% sensitivity and 74% specificity.
    Cirrhosis | Hirano et al. [83] | 75 | CT | Texture analysis | The combined feature had specificity, sensitivity, and accuracy of 0.66, 0.88, and 0.76, respectively.
    Cirrhosis | Choi et al. [84] | 1321 | Contrast-enhanced CT | Deep learning | A simple DLS (Obuchowski index, 0.94) was found to be superior to the radiologist's interpretation, APRI, and fibrosis-4 index (Obuchowski index range, 0.71–0.81) for liver fibrosis staging.
    Cirrhosis | Yasaka et al. [85] | 286 | CT | DCNN | Staging of liver fibrosis can be determined based on CT images using deep learning models.
    Cirrhosis | Yin et al. [86] | 252 | Contrast-enhanced CT | Deep learning | AUCs of the LFS network for F2-F4, F3-F4, and F3 were 0.92, 0.89, and 0.88, respectively.
    Cirrhosis | Homayounieh et al. [87] | 300 | CT | Radiomics | For differentiating between healthy liver and amiodarone (AUC: 0.93), radiomics were more accurate than iron overload (AUC: 0.79).
    Cirrhosis | Doda Khera et al. [88] | 75 | DECT | Radiomics | Radiomics and iodine quantification by DECT accurately distinguish a normal liver from steatosis and cirrhosis.
    Cirrhosis | Pollack et al. [93] | 149 | Traditional MR | Machine learning | Machine learning can generate virtual elastography images from conventional MRI data and clinical data.
    Cirrhosis | Yasaka et al. [89] | 634 | MR | DCNN | Discriminating between these three stages of liver fibrosis, achieving AUCs of 0.85, 0.84, 0.85, and 0.84, respectively
    Cirrhosis | Park et al. [90] | 436 | MR | Radiomics | AUCs of 0.90, 0.89, and 0.91 were achieved in detecting significant fibrosis, advanced fibrosis, and cirrhosis, respectively, which are higher than the 0.66, 0.62, and 0.60 of APRI.
    Cirrhosis | Zheng et al. [91] | 102 | MR | Radiomics | As compared with model 1, model 2 incorporating radiomics signatures performed satisfactorily (0.874 vs. 0.757, p = 0.010).
    Cirrhosis | Elkilany et al. [92] | 248 | MR | Radiomics | Radiomics-based MRI analysis of gadoxetic acid-enhanced images offers a promising method for identifying liver cirrhosis.
    Hepatitis | Tana et al. [45] | 70 | CT | CNN | In the test set, deep learning CNNs demonstrated an accuracy rate of 70%.
    • Abbreviations: ANN, artificial neural network; AUROC, area under the receiver operating characteristic curve; CEMF, contrast-enhanced micro-flow; DCNN, deep convolutional neural network; DECT, dual-energy computed tomography; ILT, filtering approach; LFI, liver fibrosis index; MECSE, multi-echo chemical shift encoded; MLP, multilayer perceptron; nB, naïve Bayes; ORF, original radiofrequency; SWE, shear wave elastography; TL, transfer learning; YI, Youden's index.
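
    The performance measures reported in Table 1 are all derived from a 2×2 confusion matrix, and Youden's index is defined as sensitivity + specificity − 1. The sketch below shows the arithmetic; the function name and the counts are hypothetical, chosen only so that the resulting rates match the sensitivity (95%) and specificity (85%) reported by Zhang et al. [69], for which the table gives a YI of 0.80. The counts are not the study's data, so the accuracy computed here differs from the reported 88.3% because accuracy also depends on the proportion of diseased cases.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Derive common diagnostic performance measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    youden_index = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "youden_index": youden_index,
    }


# Hypothetical counts: 100 diseased and 100 non-diseased cases.
metrics = diagnostic_metrics(tp=95, fp=15, tn=85, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```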

    4.1 Liver steatosis

    4.1.1 Application of AI based on ultrasound

    Ultrasound-based AI has been reported to be highly accurate, reproducible, and reliable for the detection of steatosis. Byra et al. [55] created a convolutional neural network (CNN) model to extract features from B-mode ultrasound images, and their results suggested that such a model could help sonographers assess liver fat automatically. Biswas et al. [56] demonstrated that a deep learning-based paradigm was superior to traditional machine learning systems for fatty liver detection and risk stratification. Han et al. [57] used radiofrequency data to develop a deep learning algorithm for the quantitative assessment of NAFLD. Chou et al. [58] developed a neural network-based model for assessing fatty liver and categorizing its severity using B-mode ultrasound images. In the study of Zamanian et al. [59], a combined deep learning algorithm based on B-mode images achieved an AUC of 0.9999 and an accuracy of 0.9864. Similarly, in an animal experiment, CNNs based on radiofrequency signals classified steatosis better than conventional quantitative ultrasound [60]. A deep CNN (DCNN)-based hepatorenal index model was tested for the evaluation of NAFLD [61]. The results showed that the DCNN was able to quantify the hepatorenal index in patients with normal or mild fatty liver and produced results comparable to those obtained by experienced radiologists.

    4.1.2 Application of AI based on CT

    The performance of CT-based AI in hepatic steatosis has been evaluated in several studies. A technique for automating body composition analysis, including liver fat, was developed and validated by Kullberg et al. [62] by applying deep learning based on CT. Graffy et al. [63] validated an automated liver segmentation tool based on deep learning to determine the amount of liver fat in 9,552 abdominal CT scans of consecutive patients. CT-based AI has also been used to segment the liver with deep learning volumetric algorithms on contrast-enhanced CT scans, which allowed an objective categorical assessment of hepatic steatosis [64]. Huo et al. [65] proposed a method based on deep learning and morphological operations to estimate liver attenuation in peripheral regions of interest and thereby further reduce vessel effects. Ding et al. [66] showed that CT-based radiomic models were more accurate than conventional clinical models and served as a valuable reference for macrosteatosis (MaS) grading in cadaveric liver donors.

    4.1.3 Application of AI based on MRI

    AI-based MRI has been shown to be a useful method for comprehensive and objective analysis of liver steatosis. Jimenez-Pastor et al. [67] used CNNs to automate the segmentation of whole liver parenchyma in multiecho chemical shift encoded MR examinations.

    4.2 Liver fibrosis

    4.2.1 Application of AI based on ultrasound

    The accuracy of liver fibrosis diagnosis and grading has been improved by applying AI-based ultrasound. Yeh et al. [68] developed a support vector machine (SVM) model based on US to analyze liver fibrosis. The results indicated that SVM models can be recommended for evaluating liver fibrosis stages. Zhang et al. [69] showed that artificial neural network models have high sensitivity and specificity in quantitatively diagnosing liver cirrhosis and hepatic fibrosis. Radiomics has also been shown to be highly effective in grading liver fibrosis using texture analysis on ultrasound liver images [70]. A study was conducted to quantify the pattern of real-time tissue elastography (RTE) images using 11 image features that were extracted directly by the ultrasound system's RTE software [71].

    After processing the data, four classical classifiers were chosen to classify fibrosis stages. The results demonstrated that the new classification approach clearly outperformed the previously used liver fibrosis index method. Brattain et al. [72] showed that by combining image quality assessment, selection of regions of interest, and CNN classification based on shear wave elastography (SWE), F2 fibrosis could be detected with high accuracy. AI based on SWE has proven to be useful for detecting and staging liver fibrosis [44, 73-75]. Additionally, deep learning radiomics performed better than SWE in assessing liver fibrosis in patients with chronic hepatitis B [76]. A CNN model developed by transfer learning radiomics was used to assess gray-scale and elastogram ultrasound images to grade liver fibrosis with high accuracy [43]. Lee et al. [77] trained a four-class model (F0 vs. F1 vs. F23 vs. F4) that was based on deep CNNs to predict fibrosis according to the METAVIR system. Furthermore, texture analysis of gray-scale ultrasound images has been shown to be useful for discriminating between early and advanced liver fibrosis [78-80].

    4.2.2 Application of AI based on CT

    CT texture analysis has been applied to assess hepatic fibrosis on CT images [81, 82]. These studies suggest that CT texture analysis can detect useful biomarkers of liver fibrosis. CT texture analysis metrics and software platforms differ widely, and therefore further study and standardization of CT methodology are needed. The newest research by Hirano et al. [83] showed that CT image texture features combined with logistic regression with L2-norm regularization could be used to detect liver fibrosis. Deep learning methods based on CT have been shown to be effective in staging liver fibrosis [84-87]. Furthermore, Doda et al. [88] demonstrated that combining dual-energy computed tomography radiomics and iodine quantification accurately distinguished normal liver, steatosis, and cirrhosis. AI-based CT studies have shown promising results, which strongly suggests that AI methods could be used for evaluating and staging liver fibrosis.

    4.2.3 Application of AI based on MR

    Recently, AI based on MR has been applied to noninvasively evaluate liver fibrosis. Yasaka et al. [89] used gadoxetic acid-enhanced hepatobiliary phase MR images and developed a CNN algorithm that discriminated stages of liver fibrosis, achieving AUCs of 0.85, 0.84, 0.85, and 0.84. Gadoxetic acid-enhanced hepatobiliary phase MRI has also been used to develop and validate radiomics-based models for staging liver fibrosis [90-92]. Pollack et al. [93] successfully used conventional MRI and clinical data with machine learning algorithms to reconstruct virtual MR elastography images for assessing liver stiffness and the fibrosis category. Sack et al. [94] found that MR-based liver and spleen radiomic features were highly accurate in identifying cirrhosis.

    4.3 Hepatitis

    Imaging manifestations of hepatitis are not specific, and therefore it is impossible for the radiologist to identify hepatitis with the naked eye. AI has been applied to identify hepatitis. Tana et al. [45] used texture analysis to build measurable CT-based imaging features to predict clinical severity in hepatitis. Naganawa et al. [95] were the first to report that texture analysis of non-contrast-enhanced CT images effectively predicted NASH. Subsequent efforts to apply machine learning algorithms to NASH histology have demonstrated the feasibility of machine learning approaches in small cohorts [96-98]. Taylor-Weiner et al. [99] described a machine learning approach to assess liver histology that accurately characterized disease severity as well as heterogeneity and sensitively quantified the treatment response in NASH. These findings demonstrated the power of machine learning in advancing the understanding of NASH disease heterogeneity in affected patients by risk stratification and in facilitating treatment development.

    In general, AI techniques have shown promising results in diffuse liver diseases.

    5 OMICS TECHNOLOGIES (EXCEPT RADIOMICS) FOR DIFFUSE LIVER DISEASES

    The application of omics to noninvasive biomarker discovery is a revolutionary breakthrough that allowed the large-scale analysis of patient samples [100]. Advancements in omics technologies have allowed novel biomarkers to be identified in a hypothesis-free manner [101]. The publicly available genomic, transcriptomic, proteomic, metabolomic, and lipidomic datasets can be used to help systematically understand the state of health and disease and to potentially identify new treatments for many diseases. Indeed, omics technologies have been widely applied to NAFLD.

    5.1 Genomics

    After the publication of the human genome sequence in 2003, genomics has become one of the most developed omics fields. Growing numbers of genome-wide association studies (GWAS) have identified specific genes involved in the development of many complex diseases and millions of genetic variations have been discovered [102]. The current National Human Genome Research Institute (NHGRI)-GWAS catalog includes 24 GWAS studies that have reported >100 gene variants relevant to NAFLD [103].

    5.2 Metagenomics

    Metagenomics focuses on the genome sequences of organisms that live in a common environment, whereas genomics focuses on the entire genetic makeup of a single organism [104]. Metagenomic methods analyze DNA recovered from whole communities and have so far been used mainly to study microbial communities [104]. The human microbiome and its metabolites have been suggested to play a role in NAFLD pathogenesis [105]. A large metabolomic and metagenomic study showed that 6 out of 56 altered metabolites were gut derived and demonstrated sharing of gene effects with liver steatosis and fibrosis in NAFLD [106]. Recent metagenomic studies suggested that some microbiota members may serve as biomarkers for NAFLD diagnosis and prognosis [107-109].

    5.3 Pharmacogenomics

    Pharmacogenomics uses genetic variants to predict treatment responses, which can lead to the development of individualized treatments based on genetic characteristics. Besides extrinsic causes such as environmental chemicals, diet, drug–drug interactions, and alcohol, genetic factors also play a significant role in NAFLD [110]. Indeed, pharmacogenomic studies are underway to identify potential drugs for treating NAFLD and metabolic syndrome on a more individual level.

    5.4 Transcriptomics

    Besides environmental factors, other factors can alter gene expression, which can affect an individual's phenotype and their risk of developing a metabolic disease. Most transcriptome studies of metabolic diseases have focused on islets and peripheral tissues such as the liver, skeletal muscle, and adipose tissue. Baselli et al.[111] identified 320 genes that were differentially expressed in severe NAFLD compared with their expression in nonsevere NAFLD and the normal liver, and 16 of these genes were deregulated in carriers of the PNPLA3 rs738409 variant. Liver transcriptome profiles were found to be similar in healthy individuals but were clearly different from the profiles of NAFLD or NASH patients who were obese [112]. Despite the observational nature of these transcriptomic studies of NAFLD, the findings have provided insight into the treatment of this disease with many benefits for precision medicine.

    5.5 Epigenomics

    Epigenetics is the study of changes in gene expression and phenotypic variations that are not the result of changes in DNA sequences [113]. Methylation of DNA and modification of histones are among the most studied epigenetic modifications [114]. DNA methylation is a significant epigenetic modification of cytosine bases in cytosine-phospho-guanine (CpG) dinucleotides, which cluster in regions called CpG islands [115]. There is a link between the methylation status of CpG islands and gene suppression, especially when methylation is in the regulatory and promoter regions of genes [115]. Some genes of patients with NAFLD were found to have different methylation levels than the same genes of healthy individuals [114], and a next-generation sequencing analysis of liver samples showed that patients with NAFLD had significantly lower levels of global DNA methylation than healthy controls [115]. In addition, changes in DNA methylation levels were found to be associated with NAFLD progression from mild to advanced and may be highly predictive of fibrosis [116]. Histone modifications are crucial posttranslational epigenetic changes that include acetylation, methylation, phosphorylation, and ubiquitylation. Chromatin structure can be altered by these modifications to regulate transcriptional activity [117]. The altered acetylation profile of histones that has been observed in patients with NAFLD has received more attention in recent years. For example, acetylation of the carbohydrate-responsive element-binding protein (ChREBP) by a histone acetyltransferase was found to be involved in NAFLD modulation [118].

    5.6 Proteomics

    Proteomic analysis includes the identification and quantification of all the proteins present in a cell, tissue, or organism [119]. Proteomics has made remarkable progress owing to technological advances and improved tools for bioinformatics analysis. Protein signatures related to metabolic-associated fatty liver disease have been identified in several studies, which could aid in the development of noninvasive biomarkers [120-122].

    5.7 Metabolomics

    Metabolomics is the study of the metabolome in cells, tissues, or fluids of an organism [123]. Metabolomics provides an integrated view of pathophysiology and complements other omics studies. High-throughput analytical technologies have led to a rise in metabolomics studies to identify biomarkers for NAFLD and other liver diseases [123]. Metabolomics includes lipidomics and glycomics as major subfields. Lipidomics has made substantial progress in understanding metabolic pathways involved in NAFLD and NASH [124]. Glycomics involves the analysis of glycan structures both on their own and when they are bound to proteins [118]. Alterations in protein glycosylation were found to be associated with NAFLD [125]. In addition, fucosylation of proteins, such as haptoglobin, transferrin, alpha-1 antitrypsin, and ceruloplasmin, was found to contribute to the progression of NAFLD to NASH, and during the progression of NAFLD to liver fibrosis, high concentrations of fucosylated proteins have been observed [126, 127].

    5.8 Integrating multiomics

    Combining single-omics datasets can increase their power and provide a more comprehensive and thorough picture of individuals. The rapid evolution of sequencing technologies has led to an increase in the role of multiomics data in biology. Multiomics approaches have been used to comprehensively explore the biological mechanisms behind complex traits at the levels of DNA, RNA, proteins, and metabolites. Indeed, medicine and biology have been transformed by the availability of multiomics data, which has led to the development of system-level approaches. Advanced technologies, such as mass spectrometry and high-throughput sequencing, have dramatically reduced the cost of generating biological data. Consequently, multiomics profiling is increasingly being used in research related to health and disease [128-130]. Interpretation of multiomics data is crucial for the prediction, diagnosis, and treatment of NAFLD. Random forest models were built to predict NAFLD progression and early diagnosis based on multiomics data [131, 132]. Furthermore, Perakakis et al. [133] developed a noninvasive supervised learning model consisting of lipids, glycans, and hormones that diagnosed NAFLD with >90% accuracy. Pirola et al. [134] selected a list of genes linked to NAFLD and metabolites that were altered in NAFLD and NASH as part of an integrative research method.

    Despite these advances, the use of multiomics in diffuse liver diseases is still in its infancy, and further work is needed before multiomics can be applied to precision medicine. Establishing a solid evidence base is essential for multiomics integration and precision medicine in diffuse liver diseases. To achieve this, research designs should include integrated rigorous high-dimensional data from multiple sources and computational approaches that have been developed for large datasets. Reduction in the cost of meta-analysis is also needed. The workflow of multiomics is shown in Figure 4.

    Workflow for multiomics analysis. (The figure was created using FigDraw provided by HOME for Researchers).

    6 PREDICTIVE MODELS OR MACHINE LEARNING ALGORITHMS FOR CLASSIFICATION

    6.1 Logistic regression

    Many medical research questions can be framed as dichotomous outcomes, such as whether hepatic steatosis occurs, whether hepatitis is present, or whether hepatic fibrosis develops. Linear regression models describe continuous outcomes, whereas logistic regression models relate a binary outcome to the predictors through the log odds (logit) of the event, a continuous quantity that is modeled as a linear function of the predictors [135].
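
    The link between a binary outcome and the log odds can be made concrete with a short Python sketch. The intercept, coefficients, and the two binary predictors below are hypothetical values chosen only for illustration; they are not estimates from any cited study.

```python
import math


def logit(p):
    """Log odds of an event that has probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of the logit: maps a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(intercept, coefficients, features):
    """Logistic regression: the log odds are linear in the predictors."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return sigmoid(z)


# Hypothetical model with two binary predictors (values are assumptions).
intercept = -2.0
coefficients = [0.8, 1.5]
patient = [1.0, 1.0]

p = predict_probability(intercept, coefficients, patient)
# exp(coefficient) is the odds ratio associated with each predictor.
odds_ratios = [math.exp(b) for b in coefficients]

print("Predicted probability:", round(p, 4))
print("Log odds recovered from p:", round(logit(p), 4))
print("Odds ratios:", [round(o, 3) for o in odds_ratios])
```

    Exponentiating a coefficient gives the odds ratio for that predictor, which is the quantity usually reported when logistic regression is used to evaluate predictors.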

    Logistic regression models are versatile and have been applied in different fields of medical research. Such models have often been used to evaluate predictors and to adjust for confounding factors and/or interactions for diffuse liver diseases in clinical practice [136-138]. A confounder is a variable that is correlated with both the outcome and the independent variable of interest. When a confounder is omitted from a model, results will be biased, and therefore it is crucial to consider all confounders. Logistic regression models have also been used to create predictive algorithms that describe the probability of an event using nomograms or online calculators, such as assessment of fibrosis and prediction of survival of patients with HCC [139-141]. Alternatively, logistic regression models have been applied to calculate propensity scores, which can be used to adjust for the imbalance of baseline confounders between two cohorts when the goal is to compare the same outcome in both [142, 143]. Furthermore, because the outcome variables are binary, logistic regression can be used for binary classification tasks. Therefore, logistic regression models can evaluate tasks such as whether there is significant hepatic inflammation [144], whether hepatic steatosis is present [136], or whether cirrhosis is absent [137].

    However, logistic regression has a linear decision surface and therefore cannot capture nonlinear problems, and it requires the independent variables to be free of multicollinearity. More importantly, standard binary logistic regression cannot directly handle multiclass classification tasks. Therefore, it is difficult to capture complex relationships using logistic regression, and more powerful and compact algorithms, such as deep learning, are needed [145]. Logistic regression is a parametric method for building predictive classification models, whereas the semiparametric Cox regression can be considered for survival analysis.

    6.2 Nonparametric statistics

    SVM, random forest, and decision tree are supervised machine learning models that can be used for classification and regression. A schematic diagram of the three machine learning algorithms is given in Figure 5.


    Schematic diagrams of three machine learning algorithms.

    The classification principle of SVMs is as follows. Suppose a dataset contains M objects described by N variables, and the objects belong to two categories. Each object can be represented as a point in an N-dimensional variable space. SVM aims to find the hyperplane in this space that separates the two categories while making the margin to the closest points of each category as large as possible. These closest points, which lie on the margin boundaries, are called support vectors; they determine the margin as well as the separating hyperplane in the middle of it. The optimal hyperplane therefore has N−1 dimensions in the N-dimensional space (corresponding to N variables) [146].
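
    The notions of margin and support vectors can be illustrated with a small Python sketch. It does not train an SVM; it only compares two hand-picked candidate hyperplanes on a hypothetical two-variable dataset and reports which one has the larger margin. All data points and hyperplane parameters are assumptions made for illustration.

```python
import math


def signed_distance(point, w, b):
    """Signed distance from a point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, point)) + b) / norm


def margin(points, labels, w, b):
    """Smallest label-signed distance; negative if a point is misclassified."""
    return min(y * signed_distance(x, w, b) for x, y in zip(points, labels))


def support_vectors(points, labels, w, b, tol=1e-9):
    """Points that lie closest to the hyperplane (on the margin boundary)."""
    m = margin(points, labels, w, b)
    return [x for x, y in zip(points, labels)
            if abs(y * signed_distance(x, w, b) - m) <= tol]


# Hypothetical dataset: M = 4 objects, N = 2 variables, labels -1 / +1.
points = [(1.0, 1.0), (2.0, 0.0), (4.0, 4.0), (5.0, 3.0)]
labels = [-1, -1, 1, 1]

# Two candidate separating hyperplanes (w, b), chosen by hand.
candidates = {"A": ((1.0, 1.0), -5.0), "B": ((1.0, 0.0), -3.0)}

margins = {name: margin(points, labels, w, b)
           for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)

print("Margins:", margins)
print("Larger-margin candidate:", best)
print("Its support vectors:", support_vectors(points, labels, *candidates[best]))
```

    Both candidates separate the two categories, but an SVM would prefer the one with the larger margin; a real SVM solver searches over all hyperplanes rather than a fixed list.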

    Random forest is a classifier that contains multiple decision trees, and its output category is the mode of the categories predicted by the individual trees [147]. The classification process of random forest is as follows. Let N be the number of training samples and M the number of features. A number of input features, m, which is much lower than M, is used to determine the decision at each node of a decision tree. For each tree, a training set (bootstrap sample) is formed by drawing N samples from the training data with replacement. The samples that were not drawn are used for prediction and error assessment. At each node, m features are selected randomly, and the optimal split is calculated using these m features. Each tree is grown fully without pruning, in contrast to a conventional single-tree classifier, to which pruning is often applied after the tree is built.
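
    The three ingredients described above (bootstrap sampling with replacement, a random subset of m features, and a vote across trees) can be sketched in plain Python as follows. The tree-growing step is omitted, and the sample counts, feature counts, random seed, and class labels are illustrative assumptions.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement; the indices that were never
    drawn form the out-of-bag set used for error assessment."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


def random_feature_subset(n_features, m, rng):
    """Pick m of the M feature indices at random for one node split."""
    return sorted(rng.sample(range(n_features), m))


def majority_vote(tree_predictions):
    """Forest output: the mode of the categories predicted by the trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)   # fixed seed so that the sketch is repeatable
N, M, m = 10, 9, 3       # assumed sizes: samples, features, features per split

in_bag, out_of_bag = bootstrap_indices(N, rng)
features = random_feature_subset(M, m, rng)

print("In-bag sample indices:", in_bag)
print("Out-of-bag sample indices:", out_of_bag)
print("Features considered at one node:", features)
print("Forest prediction:", majority_vote(["F2", "F4", "F2"]))
```

    Because each tree sees a different bootstrap sample and different feature subsets, the trees make partly independent errors, which is what allows the vote to be more stable than any single tree.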

    Decision tree is a commonly used classification method. It has a tree-like structure in which each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a final classification result. A decision tree requires supervised learning: it is trained on a group of samples, each of which has a set of attributes and a known classification result [148]. Because the classification results are known, a decision tree can be learned from these samples and then applied to classify new data.
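
    The node, branch, and leaf structure can be shown with a hand-written tree in Python. The attribute names, thresholds, and class labels below are invented for illustration only; they are not learned from data and are not clinical cut-offs.

```python
# Each internal node tests one attribute against a threshold; each branch is
# one outcome of the test; each leaf stores a final class.
# All names and numbers here are hypothetical.
tree = {
    "attribute": "feature_a",
    "threshold": 7.0,
    "below": {"leaf": "class_low"},
    "at_or_above": {
        "attribute": "feature_b",
        "threshold": 12.0,
        "below": {"leaf": "class_intermediate"},
        "at_or_above": {"leaf": "class_high"},
    },
}


def classify(node, sample):
    """Walk from the root to a leaf, following the branch chosen by each test."""
    while "leaf" not in node:
        value = sample[node["attribute"]]
        branch = "below" if value < node["threshold"] else "at_or_above"
        node = node[branch]
    return node["leaf"]


samples = [
    {"feature_a": 5.2, "feature_b": 10.0},
    {"feature_a": 9.1, "feature_b": 11.0},
    {"feature_a": 9.1, "feature_b": 14.5},
]
for s in samples:
    print(s, "->", classify(tree, s))
```

    In practice the tests and thresholds are not written by hand but chosen during training so that each split separates the known classes as cleanly as possible.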

    Machine learning algorithms are powerful tools for processing and modeling large amounts of omics data. Indeed, the application of machine learning to radiomics has produced gratifying results for diffuse liver diseases, such as diagnosis of liver fibrosis staging [149], inflammatory activity grading [150], and differentiation of a healthy liver from hepatic steatosis, cirrhosis, amiodarone deposition, and iron overload [87]. Recent studies have proposed targeted metabolomics with machine learning to identify metabolites that distinguish the progression of NAFLD [151, 152]. Findings of such studies can be used to assess metabolic changes in NASH and will contribute to an understanding of the pathophysiological significance of metabolite profiles in relation to NAFLD progression. Bioinformatics analysis has been increasingly used in studies of NAFLD and has provided useful information for exploring potential candidate biomarkers for the diagnosis, prognosis, and drug targets of NAFLD [5, 6, 153]. However, studies that combine bioinformatics, machine learning, and radiomics data have not yet been reported. We believe that radiobioinformatics will provide new insight into diffuse liver diseases.

    6.3 Potential value of CoAtNet in diffuse liver diseases

    CoAtNet is a new family of architectures that combines depthwise convolution with self-attention by vertically stacking convolution layers and attention layers, and it has high generalization ability, capacity, and efficiency [154]. CoAtNet combines translation invariance, input-adaptive weighting, and a global receptive field by using mobile inverted bottleneck convolution (MBConv) blocks together with a relative self-attention mechanism, which markedly improves the performance of the network. The attention mechanism is a method of resource allocation that was first used in natural language processing [155] and was soon found to be applicable to the computer vision field. Vision Transformer (ViT) [156] was released in 2020 and quickly became the cornerstone for the development of subsequent transformer models in the computer vision field. ViT was found to generalize poorly when the training dataset is insufficient. In trying to solve this problem, Dai et al. [154] devised CoAtNet. The network illustrating the principle of CoAtNet is displayed in Figure 6. To incorporate the translation invariance of the convolution with the input-adaptive weights and the global receptive field of the self-attention mechanism, CoAtNet sums the global static convolution kernel with the adaptive attention matrix, either after or before softmax normalization. CoAtNet exhibited better classification performance on an ImageNet [157] classification task than other architectures (e.g., ResNet [158], ViT, or NFNet [159]).


    Network illustrating the principle of the CoAtNet algorithm.
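
    The idea of adding a static, position-dependent kernel to the input-adaptive attention scores before the softmax can be sketched in a few lines of Python. This is a deliberately simplified one-dimensional illustration, assuming no query, key, or value projections and a single attention head; the toy inputs and bias values are hypothetical and this is not the published CoAtNet implementation.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of numbers."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def relative_attention_weights(x, rel_bias):
    """Attention weights with a static relative term added before softmax.

    x        : list of feature vectors, one per position
    rel_bias : dict mapping the offset (i - j) to a scalar w_{i-j}; this
               plays the role of the translation-invariant convolution kernel
    """
    n = len(x)
    weights = []
    for i in range(n):
        logits = [sum(a * b for a, b in zip(x[i], x[j])) + rel_bias[i - j]
                  for j in range(n)]
        weights.append(softmax(logits))
    return weights


def relative_attention(x, rel_bias):
    """Output at each position: attention-weighted sum of all positions."""
    weights = relative_attention_weights(x, rel_bias)
    dim = len(x[0])
    return [[sum(weights[i][j] * x[j][d] for j in range(len(x)))
             for d in range(dim)] for i in range(len(x))]


# Toy input with three positions and two channels (assumed values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
zero_bias = {offset: 0.0 for offset in range(-2, 3)}    # plain self-attention
local_bias = dict(zero_bias)
local_bias[0] = 10.0                                     # favors offset 0

plain = relative_attention_weights(x, zero_bias)
local = relative_attention_weights(x, local_bias)
print("Plain attention weights:", plain)
print("With relative bias:", local)
print("Output with relative bias:", relative_attention(x, local_bias))
```

    With all relative terms set to zero the sketch reduces to ordinary dot-product self-attention, whereas a large term at offset 0 makes each position attend mostly to itself, much like a very local convolution kernel.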

    CoAtNet is one of the state-of-the-art frameworks released in September 2021, and therefore not many applications of CoAtNet have been reported so far. However, because of its superior performance, CoAtNet has been applied in the medical domain. Wang et al. [160] demonstrated that CoAtNet was effective in classifying renal parenchymal tumor subtypes and performed better than ViT. Kvak [161] used CoAtNet to classify malignant and benign melanoma and obtained an accuracy of 0.901, which is better than the accuracies obtained with other state-of-the-art algorithms. Tripathi et al. [162] used CoAtNet for the cytomorphological classification of bone marrow cells and found that the CoAtNet model outperformed the EfficientNetV2 and ResNext50 models. Diffuse liver diseases are difficult to distinguish in the early stages by medical imaging and naked eye classification [163]. Early identification, diagnosis, and treatment of these diseases can significantly improve the prognosis and reduce the risk of progression to liver cirrhosis and liver cancer. Medical image classification is commonly performed using deep learning models such as ResNet and EfficientNet. Although there are few examples of CoAtNet used for diffuse liver disease discrimination, related studies have shown that the CoAtNet model outperforms other state-of-the-art algorithms for both general and medical image classification tasks. We believe that CoAtNet will also perform well for classifying diffuse liver diseases.

    6.4 Deep learning for survival analysis (DeepSurv)

    Cox proportional hazards models are widely used in survival analysis to investigate the effects of different prognostic factors on tumor death and recurrence [164, 165]. Cox regression assumes that the log hazard is a linear combination of the covariates, and therefore fitting Cox regression is challenging when the relationship between the covariates and survival is nonlinear. Cox regression models also require in-depth feature engineering and prior medical knowledge for the deployment of individual treatment plan recommendation systems. Survival models built on deep learning can more effectively fit nonlinear survival data, and survival models based on deep learning and machine learning may be well suited for high-dimensional interaction terms in survival data. DeepSurv is a deep feed-forward neural network that predicts the effects of patient covariates on their hazard rate, parameterized by the network weights θ.
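
    Both Cox regression and DeepSurv are fitted by maximizing the Cox partial likelihood; they differ in whether the risk score is a linear function of the covariates or the output of a neural network. The following Python sketch computes the average negative log partial likelihood for a given set of risk scores. The regularization term used when training a network is omitted, and the follow-up times, event indicators, and risk scores are hypothetical.

```python
import math


def negative_log_partial_likelihood(risk_scores, times, events):
    """Average negative Cox log partial likelihood.

    risk_scores : predicted log-risk per patient (linear predictor in Cox
                  regression, network output in a DeepSurv-style model)
    times       : follow-up time per patient
    events      : 1 if the event was observed, 0 if the patient was censored
    """
    n_events = sum(events)
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        risk_set = [math.exp(h) for h, t in zip(risk_scores, times) if t >= t_i]
        total += risk_scores[i] - math.log(sum(risk_set))
    return -total / n_events


# Hypothetical cohort of three patients, all with observed events.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]

concordant = [2.0, 1.0, 0.0]   # higher risk score for earlier events
discordant = [0.0, 1.0, 2.0]   # higher risk score for later events

print("Loss, concordant scores:",
      negative_log_partial_likelihood(concordant, times, events))
print("Loss, discordant scores:",
      negative_log_partial_likelihood(discordant, times, events))
```

    Risk scores that rank patients in the order in which their events occur give a lower loss, which is the signal that drives the fitting of either model.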

    DeepSurv has outperformed more conventional linear prediction models in numerous biomedical studies. It has been used to predict the efficacy of chemotherapy, to guide individualized treatment, and to build clinical decision support systems. She et al. [166] suggested that the DeepSurv algorithm may have potential benefits for prognosis evaluation and treatment recommendation in nonsmall cell lung cancer. Kim et al. [167] showed that DeepSurv outperformed random survival forest and Cox proportional hazards models in survival prediction of patients with oral cancer. In a study of cardiovascular risk prediction using DeepSurv, Hathaway et al. [168] obtained independent accurate predictions of atherosclerotic cardiovascular disease risk and cardiovascular outcomes using clinical characteristics without imaging information and inflammatory factors. These findings demonstrated that DeepSurv had superior capabilities in guiding individualized therapy and disease risk monitoring when compared with conventional linear prediction models. The network illustrating the principle of DeepSurv is displayed in Figure 7. Although DeepSurv has not yet been applied to diffuse liver diseases, its good predictive performance elsewhere suggests that similar results may be obtained in diffuse liver diseases; however, further studies are required to confirm this.


    Network illustrating the principle of the DeepSurv algorithm.

    7 CHALLENGES IN RADIOBIOINFORMATICS

    Radiomics research is still in its infancy, and the selection of regions of interest is neither standardized nor uniform. Regions of interest are usually delineated by radiologists, which increases the amount of preparatory work, and delineation by different individuals affects the subsequent model building, resulting in limited reproducibility of results and comparability between studies. In addition, both traditional machine learning algorithms, such as random forests, and deep learning algorithms, such as neural networks, have been used to build radiomics models, and which classifiers and machine learning methods are better at dimension reduction remains an open question. Radiomics and deep learning algorithms are prone to overfitting because of the large number of parameters derived from images. Overfitted models perform well on training data but poorly on testing data, which reduces the generalization capability of the model. Therefore, all radiomics and deep learning algorithms require rigorous clinical validation.

    With the exponential increase of international open data, the resources for multiomics research in diffuse liver diseases will increase and the cost of research will be significantly reduced. However, the costs of long-term follow-up and laboratory testing are considerable, and clinical pathology samples are difficult to obtain, which pose challenges for molecular studies of some diseases. In addition, the combination of multiple factors and the high variability of individual datasets can lead to spurious findings, which makes it difficult to interpret the results of multiomics analyses, especially the ability to identify biologically relevant molecules. Despite these drawbacks, multiomics opens opportunities for transnational and interdisciplinary collaboration, which is particularly important for research in low- and middle-income countries, especially for research design and testing techniques. However, ethical and data sharing issues involved in these studies deserve more attention.

    To bridge the gap between fundamental research and clinical applications for exploring correlations between imaging features and potential signaling pathways or biomarkers by radiobioinformatics, stringent research methodologies, integration of high-dimensional data from diverse sources, and development of advanced computational techniques for handling large datasets need to be implemented. Addressing these aspects will clarify the current research status and future development directions of radiobioinformatics and pave the way for a deeper understanding of its application prospects and challenges. For radiobioinformatics to advance, fostering interdisciplinary collaboration among experts in radiology, bioinformatics, machine learning, and clinical practice is crucial. Such collaboration will not only facilitate the identification of novel radiological features and their associations with underlying biological mechanisms but also contribute to the development of more robust and accurate predictive models that can be seamlessly integrated into clinical workflows. With the emergence of new imaging modalities and continuous refinement of existing technologies, there is an increasing need for standardized protocols for image acquisition, preprocessing, and feature extraction. Such standardization will enable the comparability and reproducibility of research findings across different studies, thereby enhancing the generalizability and clinical applicability of radiobioinformatics-based models. Furthermore, addressing ethical and legal considerations surrounding data privacy and security is paramount for the successful implementation of radiobioinformatics in clinical settings, and establishing guidelines and best practices for data sharing while protecting patient confidentiality is crucial for responsible development and deployment of these cutting-edge tools. 

    Radiobioinformatics advocates should prioritize the education and training of the next generation of researchers and clinicians by developing targeted curricula and interdisciplinary training programs that foster a deep understanding of the complexities and nuances of both radiology and bioinformatics.

    8 CONCLUSIONS

    In this review, we introduced the concept of radiobioinformatics as an interdisciplinary approach that integrates radiological characteristics and bioinformatics. The aim is to develop clinical decision support by constructing predictive models using various approaches, including logistic regression, supervised learning (e.g., SVM, random forest, and decision tree), and deep learning (e.g., CoAtNet and DeepSurv) algorithms. Radiobioinformatics has the potential to bridge the gap between fundamental research and clinical applications by exploring the correlations between imaging features and potential signaling pathways or biomarkers. Radiobioinformatics can be applied not only for the diagnosis and management of diffuse liver diseases but also for exploring other organs and tissues. The future of radiobioinformatics promises significant advancements in precision medicine and improved patient care. By addressing the challenges outlined and capitalizing on interdisciplinary collaboration and emerging technologies, radiobioinformatics is poised to contribute substantially to an understanding of the complex interplay between radiological features and underlying biological processes, ultimately leading to more effective diagnosis and management of a wide array of diseases. We believe that radiobioinformatics can be applied not only for diffuse liver diseases but also for other human diseases in the future.

    AUTHOR CONTRIBUTIONS

    Pinggui Lei, Na Hu, Yuhui Wu, Maowen Tang, Chong Lin, Luoyi Kong, and Lingfeng Zhang wrote the manuscript; Pinggui Lei, Na Hu, Yuhui Wu, and Maowen Tang revised the manuscript; Pinggui Lei, Na Hu, Luoyi Kong, and Lingfeng Zhang reviewed and corrected the manuscript; Pinggui Lei and Lawrence Wing-Chi Chan critically reviewed the manuscript; and Pinggui Lei, Peng Luo, and Lawrence Wing-Chi Chan conceived the manuscript. All authors have read and agreed to the published version of the manuscript.

    ACKNOWLEDGMENTS

    The authors gratefully thank all the participants in this study; we are also thankful to the members of StudyForBetter Team who contributed their best research spirits to the area of radiobioinformatics.

      CONFLICT OF INTEREST STATEMENT

      The authors declare no conflict of interest.

      ETHICS STATEMENT

      None.

      INFORMED CONSENT

      None.

      DATA AVAILABILITY STATEMENT

      The data sets used or analyzed during the current study are available from the corresponding author on reasonable request.
