Volume 45, Issue 4 e16153
EDITORIAL

AI-Based Prognostic Stratification in HCC, Towards a Personalised Treatment Approach

Olivier Sutter

Corresponding Author

Interventional Radiology Unit, Hôpital Avicenne, Hôpitaux Universitaires Paris Seine-Saint-Denis, APHP, Bobigny, France

Team MONC, Inria, CNRS UMR 5251, Bordeaux INP, Université de Bordeaux, Bordeaux, France

Correspondence:

Olivier Sutter ([email protected])

Lorenzo-Carlo Pescatori

Interventional Radiology Unit, Hôpital Avicenne, Hôpitaux Universitaires Paris Seine-Saint-Denis, APHP, Bobigny, France

Unité de Formation et de Recherche Santé Médecine et Biologie Humaine, Université Sorbonne Paris Nord, Bobigny, France

First published: 14 March 2025

Early-stage hepatocellular carcinoma (HCC), defined by the Barcelona Clinic Liver Cancer (BCLC) classification as a single tumour or up to three lesions < 3 cm, is eligible for curative treatment [1]. Curative options include liver transplantation, surgical resection and percutaneous ablation. Ablation can be selected over surgery as first-line therapy for tumour(s) < 3 cm because of its lower complication rates, reduced mortality and minimal invasiveness, which make it feasible even when liver function is slightly impaired or portal hypertension is present. Moreover, for transplant-eligible patients, the strategy of first-line percutaneous ablation followed by salvage transplantation in case of recurrence is gaining traction worldwide in the context of graft shortages [2]. However, unlike surgery, ablation does not allow a full histopathological analysis of a resected specimen, so tumour tissue analysis is limited to samples obtained by micro-biopsy before the ablation. This limits prognostic stratification after a first curative treatment, especially for transplantable patients, because recognised histological markers of tumour aggressiveness, such as microvascular invasion and satellite nodules, cannot be captured by biopsies. Intratumoral heterogeneity (ITH) is another promising marker that could provide valuable information on prognosis and the risk of early recurrence. As ITH can only be fully assessed through histological and genomic evaluation of the entire lesion, imaging features of ITH, extracted with methods such as radiomics and/or deep learning, could serve as surrogate markers in advanced HCC or in early-stage HCCs treated with ablation where, by definition, no surgical specimen is available.

In this issue of Liver International, Zhang et al. describe a transformer-based quantitative ITH model that integrates signatures extracted from ultrasound (US), contrast-enhanced US (CEUS) and magnetic resonance imaging (MRI), acquired before ablation, along with demographic, clinicopathological and laboratory variables to predict individual early recurrence risk [3]. The model was tested on cohorts of patients treated with radiofrequency ablation (RFA) and microwave ablation (MWA), and then validated on external cohorts undergoing RFA, laser ablation (LA) and irreversible electroporation (IRE) treatments. The core network used to extract ITH-related features from imaging is referred to as the vision-transformer-based quantitative intratumoral heterogeneity (ViT-Q-ITH) model. This deep learning model divides images into patches and uses a self-attention mechanism to analyse correlations between these patches, capturing both global and local relationships to better represent complex visual structures. A combined model was then developed by integrating the ViT-Q-ITH score with clinical factors. The combined model achieved high performance in both the internal validation cohort (AUC 0.86) and the external test cohort (AUC 0.83), with sensitivities of 76% and 74% and specificities of 88% and 84%, respectively, outperforming both the traditional clinical model and the standalone ViT-Q-ITH model. Recurrence-free survival analyses further confirmed the superior stratification capability of the combined model, which distinguished high-risk from low-risk patients more accurately than clinical models alone. In all test cohorts, high-risk patients identified by the combined model had a significantly higher probability of early recurrence (local or distant) than low-risk patients.
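To make the patch-and-attention idea concrete, the following is a minimal NumPy sketch of the generic mechanism behind vision transformers. It is not the ViT-Q-ITH implementation of Zhang et al.: the image is random noise standing in for one imaging slice, the weights are random rather than trained, and all names (`split_into_patches`, `self_attention`, `d_model`, the 32 x 32 image and 8 x 8 patch sizes) are illustrative assumptions.

```python
import numpy as np

def split_into_patches(image, patch_size):
    """Cut a 2D image (H, W) into non-overlapping patches, each flattened to a vector."""
    h, w = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    rows, cols = h // patch_size, w // patch_size
    patches = image.reshape(rows, patch_size, cols, patch_size)
    patches = patches.transpose(0, 2, 1, 3)  # (rows, cols, patch_size, patch_size)
    return patches.reshape(rows * cols, patch_size * patch_size)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over patch tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # similarity of every patch to every other patch
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
image = rng.random((32, 32))                       # hypothetical placeholder for one slice
patches = split_into_patches(image, patch_size=8)  # 16 patches of 64 pixels each

d_model = 16                                       # assumed embedding width
w_embed = rng.normal(scale=0.1, size=(64, d_model))
tokens = patches @ w_embed                         # linear patch embedding

# Untrained, random projection matrices; a real model would learn these.
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(3))
attended, weights = self_attention(tokens, w_q, w_k, w_v)
```

Each row of `weights` describes how strongly one patch attends to every other patch, which is how such a network can relate distant regions of a lesion as well as neighbouring ones. A full vision transformer would add positional embeddings, multiple heads and stacked layers, none of which are shown here.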

One of the study's strengths lies in the generalisability of the combined model when applied to ‘real-world’ data from external cohorts. These external cohorts differed from the training cohort not only in the therapeutic modalities employed (some patients were treated with IRE or LA instead of traditional RFA/MWA) but also in baseline clinical and biological characteristics (e.g., much smaller tumours and less advanced liver disease). Despite these conditions, which were unfavourable on paper, the combined model showed excellent performance, achieving AUCs over 0.8. This suggests that the model may be applicable across different clinical settings and ablation modalities, broadening its potential for clinical use. An innovative aspect of this study is the development of a model that integrates traditional clinical data, known to be linked to HCC recurrence, with readily available images such as single pictures from US and CEUS. It is also worth noting that the two MRI sequences used were unenhanced acquisitions (T2- and diffusion-weighted imaging), unlike many studies of tumour heterogeneity on imaging, which generally analyse vascularity/texture on contrast-enhanced volumes. This choice was likely made to limit variations related to image contrast, injection type and timing, thereby improving the model's generalisability across centres.
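For readers less familiar with the discrimination metrics quoted for the combined model, the sketch below shows how AUC, sensitivity and specificity are computed from predicted risk scores. The labels, scores and the 0.5 threshold are made-up toy values, not data from the study, and the function names are illustrative.

```python
import numpy as np

def sensitivity_specificity(y_true, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called high-risk."""
    y_true = np.asarray(y_true, dtype=bool)
    pred = np.asarray(scores, dtype=float) >= threshold
    tp = np.sum(pred & y_true)    # recurrences correctly flagged
    fn = np.sum(~pred & y_true)   # recurrences missed
    tn = np.sum(~pred & ~y_true)  # non-recurrences correctly cleared
    fp = np.sum(pred & ~y_true)   # non-recurrences wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative
    (ties count as one half)."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]  # every positive-negative pair
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

# Toy example: 1 = early recurrence, 0 = no early recurrence (hypothetical values).
y = [1, 1, 1, 0, 0, 0, 0, 1]
s = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print(auc(y, s))                           # 0.9375 (15 of 16 pairs ranked correctly)
print(sensitivity_specificity(y, s, 0.5))  # sensitivity 0.75, specificity 0.75
```

The AUC does not depend on any threshold, whereas sensitivity and specificity change with the cut-off chosen to separate high-risk from low-risk patients, which is why both kinds of figure are usually reported together.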

However, one recurrent challenge for the widespread adoption of radiomics or AI-based models lies in their interpretability. Although the vision transformer offers superior performance, its neural-network architecture makes it harder to interpret than traditional approaches based on simple clinical variables or static images. This is critical because predictive models must be understandable not only to AI specialists but also to the clinicians who want to use them in daily practice. Despite these challenges, this model appears robust, and there are several areas for potential development to improve its clinical applicability, particularly by extending validation to multi-institutional and international datasets to test the model's robustness in diverse populations. A potential target population for AI-based imaging biomarkers could be transplant-eligible patients treated with first-line ablation, to decide, based on their predicted early recurrence risk, whether they should be fast-tracked for transplantation rather than following an ‘ablate and wait’ strategy. Additionally, integrating other data sources, such as genetic sequencing or circulating biomarkers, could further enhance the model's predictive power. Combining genetic data with imaging and clinical data could provide a more complete picture of tumour biology, improving prediction accuracy and enabling even more personalised risk stratification. Lastly, the challenge of interpretability must be addressed more thoroughly. Clinician involvement in model training and validation could help develop more transparent tools that can be used in daily practice, reducing the common feeling among physicians that AI-based tools are a ‘black box’. In this context, explainable AI (XAI) techniques will play a crucial role, enabling deep learning models to maintain high accuracy without sacrificing an understanding of their internal mechanisms [4].

In conclusion, the study by Zhang et al. contributes to the growing body of evidence supporting the use of AI-based imaging biomarkers in the prognostic stratification of HCC [5]. While challenges remain, particularly in terms of interpretability and large-scale validation, these approaches have the potential to improve clinicians' ability to predict recurrence and tailor patient management accordingly.

Conflicts of Interest

The authors declare no conflicts of interest.

Data Availability Statement

Data sharing is not applicable to this article, as no new data were created or analysed in this study.
