Early View
Original Article
Open Access

A deep active learning framework for mitotic figure detection with minimal manual annotation and labelling

Eric Liu
Department of Computer Science, Western University, London, Ontario, Canada

August Lin
Department of Pathology and Lab Medicine, Western University, London, Ontario, Canada

Pramath Kakodkar
Department of Pathology and Lab Medicine, University of Saskatchewan, Saskatoon, Saskatchewan, Canada

Yayuan Zhao
Department of Pathology and Lab Medicine, University of Saskatchewan, Saskatoon, Saskatchewan, Canada

Boyu Wang
Department of Computer Science, Western University, London, Ontario, Canada

Charles Ling
Department of Computer Science, Western University, London, Ontario, Canada

Qi Zhang (corresponding author)
Department of Pathology and Lab Medicine, Western University, London, Ontario, Canada

Address for correspondence: Qi Zhang, Department of Pathology and Lab Medicine, Western University, London N6A 3K7, ON, Canada. e-mail: [email protected]
First published: 03 July 2025

Abstract

Aims

Accurately and efficiently identifying mitotic figures (MFs) is crucial for diagnosing and grading various cancers, including glioblastoma (GBM), a highly aggressive brain tumour requiring precise and timely intervention. Traditional manual counting of MFs in whole slide images (WSIs) is labour-intensive and prone to interobserver variability. Our study introduces a deep active learning framework that addresses these challenges with minimal human intervention.

Methods and results

We utilized a dataset of GBM WSIs from The Cancer Genome Atlas (TCGA). Our framework integrates convolutional neural networks (CNNs) with an active learning strategy. Initially, a CNN is trained on a small, annotated dataset. The framework then identifies uncertain samples from the unlabelled data pool, which are subsequently reviewed by experts. These ambiguous cases are verified and used for model retraining. This iterative process continues until the model achieves satisfactory performance. Our approach achieved 81.75% precision and 82.48% recall for MF detection. For MF subclass classification, it attained an accuracy of 84.1%. Furthermore, this approach significantly reduced annotation time, saving approximately 900 min across 66 WSIs and cutting the effort nearly in half compared with traditional methods.

Conclusions

Our deep active learning framework demonstrates a substantial improvement in both efficiency and accuracy for MF detection and classification in GBM WSIs. By reducing reliance on large annotated datasets, it minimizes manual effort while maintaining high performance. This methodology can be generalized to other medical imaging tasks, supporting broader applications in the healthcare domain.

Graphical Abstract

We propose a deep active learning framework for mitotic figure detection and classification in glioblastoma whole slide images. By iteratively selecting uncertain samples for expert review and model retraining, our approach achieves high accuracy while reducing annotation time by nearly 50%, demonstrating its potential for efficient cancer diagnosis.

Abbreviations

  • CNN, convolutional neural network
  • CNNs, convolutional neural networks
  • COCO, common objects in context
  • GBM, glioblastoma
  • GM, granular mitotic figure
  • H&E, haematoxylin and eosin
  • mAP50-95, mean average precision at intersection over union thresholds from 0.5 to 0.95
  • MF, mitotic figure
  • MFs, mitotic figures
  • SGD, stochastic gradient descent
  • TCGA, The Cancer Genome Atlas
  • UIP, unlabelled image pool
  • WSI, whole slide image
  • WSIs, whole slide images
  • yolov8-cls-m, YOLOv8 classification medium
  • yolov8-m, YOLOv8 medium

Introduction

The quantification and categorization of mitotic figures (MFs) in histology slides are essential for the diagnosis and grading of cancers such as glioblastoma (GBM).1 They are not only important for grading, but can also help guide treatment decisions.2 In general, MFs can exhibit either typical or atypical morphologies.2, 3 In particular, one subcategory of atypical MFs unique to GBMs is the granular mitotic figure (GM). A high mitotic index (i.e. the proportion of cells undergoing mitosis) is often associated with a more aggressive phenotype and, subsequently, a poorer prognosis in many neoplasms.3

However, manually counting MFs in haematoxylin and eosin (H&E) stained whole slide images (WSIs) requires expertise to differentiate MFs from morphological mimickers, such as lymphocytes, Creutzfeldt-Peters cells and apoptotic bodies. The task is further complicated by tumour tissue heterogeneity, processing artefacts and visual fatigue. In a traditional anatomic pathology service, a mitotic hotspot is visually estimated and at least 10 adjacent non-overlapping high-power fields are analysed for mitosis by a pathologist.4 This method of estimation, or ‘eyeballing’, is efficient but overlooks tissue heterogeneity and can miss MFs outside the mitotic hotspots.

    Advanced computational tools and deep learning models offer potential solutions by automating MF detection. However, they often require large labelled datasets to perform well. Creating sufficient and accurately labelled MF datasets is highly laborious.

To address these challenges, we developed an active learning framework that integrates two convolutional neural networks (CNNs) based on the YOLOv8 architecture.5, 6 We selected 66 H&E-stained GBM WSIs from The Cancer Genome Atlas (TCGA) as our unlabelled data pool. To begin, we manually labelled a small dataset of 379 MFs to train the initial models. These models were then used to screen the remaining unlabelled WSIs, providing preliminary annotations. A tri-class thresholding strategy was applied to select MFs with ambiguous morphology from model-generated annotations for further review. Only the most challenging cases, flagged as marginal cases, were reviewed by pathologists and trained research assistants. This approach significantly reduced human effort. Selected marginal cases were prioritized in the model retraining process. The updated models were then used to provide initial MF annotations in the subsequent iteration, and this process continued iteratively.

    Our annotation and detection pipeline, incorporated into the AI4Path web platform,7 requires no specialized training and can be accessed through any modern web browser. AI4Path further offers confidence threshold adjustments within each of the MF subclasses, allowing users to adjust model sensitivity for marginal MF cases. It also features precision-recall adjustments, enhancing its usability for pathologists.

With this active learning framework, fewer than 30% of the total MFs required manual review, as these were primarily ambiguous edge cases that were more time-consuming to annotate. Reviewing this smaller subset still took significant effort, but the framework ultimately reduced the total annotation time by half. Additionally, our models achieved near-optimal performance (within 15% of the best results) using only about 1000 marginal cases for training. In comparison, achieving similar performance with a non-active learning framework required approximately 2500 randomly selected training samples. The pipeline not only improves annotation efficiency but also reduces the number of challenging cases requiring review. Prioritizing these marginal cases in model training also significantly enhances performance with far fewer samples.

    This study presents a significant advancement in MF quantification for GBM WSIs, showcasing the potential of deep learning models integrated with active learning to improve diagnostic accuracy with minimal data and manual effort. Our method can be applied to other medical imaging tasks, making it useful in different areas of healthcare.

    Methods

    Dataset

In this study, 66 WSIs were retrieved from the TCGA GBM database, selected for having the most comprehensive data profiles (mRNA expression, methylation, etc.). Annotation data and genomic data were analysed in another project. Initially, two WSIs were selected based on typical morphological features observed in GBM and were manually annotated by pathologists, yielding 379 MFs for the initial training dataset. The remaining 64 WSIs were annotated using our active learning framework. During CNN training and inference, WSIs were processed into 640 by 640 pixel patch images, discarding patches without MFs. Additionally, MFs were morphologically classified into normal, atypical and GM classes. These categories were unbalanced due to the natural occurrence rates of each MF subtype within GBMs.

    Active Learning Framework

An overview of the active learning framework is illustrated in Figure 1. To start, 66 GBM WSIs were retrieved from the TCGA database, referred to as the Unlabelled Image Pool (UIP) for this study. Two WSIs were then selected from the UIP and manually annotated by three pathologists each, yielding 379 MF training samples used as the initial training dataset (L0). With samples from L0, we trained the first version of our CNNs, which included one object detection model and one classification model.

Figure 1. An overview of our proposed active learning framework: each image from the unlabelled pool is processed through the pipeline and analysed by the CNN inference pipeline. Actively selected samples are then verified by pathologists and trained research assistants, and these verified samples are subsequently used to retrain the deep learning model.

With the first CNNs ready, we began processing the remaining GBM WSIs in the UIP. Each WSI underwent our inference pipeline, and the results were filtered using a user-specified confidence threshold for each MF subclass (the tri-class threshold strategy). Cases with confidence scores falling within the specified range (0.500 to 0.725), likely indicating ambiguous morphology, were flagged as marginal cases. These marginal cases were then reviewed by pathologists or trained research assistants, while cases with higher confidence were added directly to the labelled training dataset (L1) by default. Once the marginal cases were reviewed and categorized, they too were added to L1 with a designated tag.
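To make the routing concrete, the tri-class threshold reduces to a simple per-detection rule, sketched below in Python. The function name and pool labels are ours, and the handling of detections below the marginal band is an assumption, as the text does not specify how sub-band detections are treated:

```python
# Minimal sketch of the tri-class threshold routing described above.
MARGINAL_BAND = (0.500, 0.725)  # confidence range flagged for expert review

def route_detection(confidence: float, band=MARGINAL_BAND) -> str:
    """Assign one model detection to a pool (names are illustrative)."""
    low, high = band
    if confidence < low:
        return "rejected"      # assumption: below-band detections are dropped
    if confidence <= high:
        return "marginal"      # ambiguous morphology, sent for expert review
    return "auto_label"        # high confidence, added to L1 directly

# Example: split a batch of (patch_id, confidence) detections into pools
detections = [("p1", 0.91), ("p2", 0.63), ("p3", 0.42)]
pools = {"rejected": [], "marginal": [], "auto_label": []}
for patch_id, conf in detections:
    pools[route_detection(conf)].append(patch_id)
print(pools)  # {'rejected': ['p3'], 'marginal': ['p2'], 'auto_label': ['p1']}
```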

    During each iteration of the CNN's retraining process, flagged marginal cases were prioritized over randomly selected training samples. Over time, this approach improved the pipeline's performance in MF detection and categorization, reducing the manual annotation workload and resulting in more reliable outcomes.
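Putting the pieces together, one round of the loop might look like the following sketch. Every function here is a stub standing in for a real pipeline stage (training, inference, expert review, evaluation), and the stopping criterion is illustrative rather than the paper's actual convergence test:

```python
import random
from dataclasses import dataclass

@dataclass
class Detection:
    patch_id: int
    conf: float

# --- Stubs standing in for the real pipeline stages (assumptions) ---
def train(samples):                 # returns a "model"; here, its data size
    return len(samples)

def infer(model, pool):             # screen the unlabelled image pool
    return [Detection(i, random.random()) for i in range(len(pool))]

def expert_review(cases):           # pathologists verify marginal cases
    return cases

def map50_95(model):                # evaluation placeholder
    return min(0.60 + 0.05 * model / 1000, 0.85)

# --- The iterative loop described above ---
labelled = list(range(379))         # L0: the initial annotated MFs
pool = list(range(5000))            # UIP patches awaiting screening
model = train(labelled)
for iteration in range(10):
    dets = infer(model, pool)
    marginal = [d for d in dets if 0.500 <= d.conf <= 0.725]
    confident = [d for d in dets if d.conf > 0.725]
    # marginal cases are reviewed and placed ahead of auto-labelled samples
    labelled = expert_review(marginal) + confident + labelled
    model = train(labelled)
    if map50_95(model) >= 0.80:     # stop at satisfactory performance
        break
```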

    Inference Pipeline

Due to the large size of WSIs (averaging 100,000 by 100,000 pixels), direct detection is computationally prohibitive, making preprocessing mandatory. Our inference pipeline splits WSI processing into two steps, as shown in Figure 2. First, the WSI was compressed to a manageable size of around 1000 by 1000 pixels. An adaptive thresholding and morphological refinement algorithm then generated a binary mask to differentiate tissue from the background.8 This created a 2D map in which each element, represented by a binary value of 0 or 1 (0 indicating minimal or no valuable information and 1 indicating the opposite), corresponded to a patch image to be processed in the following step.
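A sketch of this first step is shown below using OpenCV. The exact thresholding and refinement follow reference 8, so the adaptive threshold and morphological opening/closing here are stand-ins, and all parameter values are illustrative:

```python
import cv2
import numpy as np

def patch_map(thumbnail_bgr: np.ndarray, wsi_size=(100_000, 100_000),
              patch=640, tissue_frac=0.05) -> np.ndarray:
    """Return a 2D binary map with one entry per 640x640 full-resolution
    patch (1 = contains tissue, 0 = skip). A sketch: adaptive thresholding
    plus morphological opening and closing stand in for the refinement
    algorithm cited in the text, with illustrative parameters."""
    grey = cv2.cvtColor(thumbnail_bgr, cv2.COLOR_BGR2GRAY)
    mask = cv2.adaptiveThreshold(grey, 1, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY_INV, 51, 10)
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove specks
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill small holes
    # Collapse the pixel mask to one cell per full-resolution patch; the
    # mean tissue fraction per cell decides whether the patch is kept.
    n_cols, n_rows = wsi_size[0] // patch, wsi_size[1] // patch
    cell_means = cv2.resize(mask.astype(np.float32), (n_cols, n_rows),
                            interpolation=cv2.INTER_AREA)
    return (cell_means >= tissue_frac).astype(np.uint8)
```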

Figure 2. An illustration of our inference pipeline. Each WSI is processed at both low and high zoom levels. Initially, the low zoom level image is used to generate a binary tissue map, which guides the selection of high zoom level patches, ensuring only relevant areas are analysed. These selected patches then undergo H&E separation to isolate the haematoxylin-stained nuclei. Finally, the processed patches are analysed by our model to produce the final detection results.

In the second step, the original WSI was processed into 640 by 640 pixel patches. Each patch image was then cross-checked against the 2D map generated in the first step to filter out those with minimal or no useful information. To address varying stain intensities and shades, the stain vector and intensity variation and correction method was applied, which adjusts the colour vectors and intensities of stained tissue images to standardize and normalize their visual representation for consistent and accurate analysis.9 Furthermore, given the focus on nuclear material and the H&E staining of WSIs in our dataset (haematoxylin primarily stains nucleic acid in the nucleus blue, and eosin stains cytoplasmic proteins pink), we performed H&E separation (as shown in Figure 3) to isolate and examine only haematoxylin-stained nuclei.10 The processed patch images were then sent to our object detection and classification CNNs to obtain the final results.
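The stain separation step can be approximated with scikit-image's Ruifrok-Johnston colour deconvolution, sketched below as a stand-in for the methods cited above (references 9 and 10); it keeps only the haematoxylin component of each patch:

```python
import numpy as np
from skimage.color import rgb2hed, hed2rgb

def haematoxylin_only(patch_rgb: np.ndarray) -> np.ndarray:
    """Isolate the haematoxylin (nuclear) component of an H&E patch.
    A sketch using colour deconvolution as a stand-in for the stain
    separation methods cited in the text."""
    hed = rgb2hed(patch_rgb)          # project into H, E, DAB stain space
    h_channel = np.zeros_like(hed)
    h_channel[..., 0] = hed[..., 0]   # zero out eosin and residual channels
    return hed2rgb(h_channel)         # back to RGB for the downstream CNNs
```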

Figure 3. Normalization is applied to each patched image undergoing analysis. Each retrieved patch image is processed using an H&E separation algorithm. Since mitotic activity predominantly occurs in the nuclei, which are stained by haematoxylin, only the haematoxylin map is necessary for studying mitosis.

    Mitotic Figure Detection & Subtype Classification

As shown in Figure 4, a two-stage object detection and classification pipeline was implemented. Two CNNs with the YOLOv8 architecture were trained, one for detecting MF objects (Model A) and the other for classifying subtypes (Model B).6 The YOLOv8 medium (yolov8-m) model was used for object detection, while the YOLOv8 classification medium (yolov8-cls-m) model was used for subtype classification. Both models were initialized with pre-trained weights from the COCO dataset. To better address class imbalance in the classification task, we replaced the default cross-entropy loss function with focal loss.11 No other structural modifications were made to the Ultralytics base models. Both models were trained for 50 epochs during initial training and in each retraining iteration, using the Stochastic Gradient Descent (SGD) optimizer for yolov8-m and the Adam optimizer for yolov8-cls-m.12, 13 To prevent overfitting, an early stopping mechanism was implemented with a patience of 25 epochs.14 A batch size of 32 was chosen to balance training speed against the GPU memory of the single RTX 4090 used for training, while maintaining convergence speed and generalization performance.15 To improve model robustness, training included standard data augmentation techniques such as random flips (horizontal/vertical), ±10° rotations, brightness/contrast adjustments and scale jittering.16 All other settings were kept as defaults from the Ultralytics YOLO library.6
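For reference, this configuration maps onto the Ultralytics training API roughly as follows. This is a sketch: the dataset paths are placeholders, and the focal loss substitution is noted only as a comment, because it requires patching the library's loss module rather than passing a training argument:

```python
from ultralytics import YOLO

# Model A: MF object detection (COCO-pretrained yolov8-m)
det = YOLO("yolov8m.pt")
det.train(
    data="mf_detection.yaml",          # placeholder dataset config
    epochs=50, batch=32, patience=25,  # early stopping patience
    optimizer="SGD", imgsz=640,
    flipud=0.5, fliplr=0.5,            # random vertical/horizontal flips
    degrees=10,                        # +/-10 degree rotations
    scale=0.1,                         # scale jittering
)

# Model B: MF subtype classification (yolov8-cls-m)
cls = YOLO("yolov8m-cls.pt")
cls.train(
    data="mf_subtypes/",               # placeholder: folder-per-class layout
    epochs=50, batch=32, patience=25, optimizer="Adam", imgsz=224,
)
# NOTE: substituting focal loss for the default cross-entropy criterion
# is not a train() argument; it involves replacing the criterion in the
# Ultralytics loss module (omitted here).
```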

Figure 4. An illustration of the two-stage object detection and subtype classification system. The first stage (Model A) detects candidate MFs from input WSI patches. Snapshots of each detected bounding box are then passed to the second stage (Model B), which classifies each MF into one of three subtypes: normal, atypical or granular. For illustrative purposes, only granular MFs are shown in this figure.

First, patch images were fed into Model A to perform object detection, identifying MFs in the given images. Once all MFs were located, snapshots of each detected MF were taken and sent to the classification model, Model B, which morphologically classified them into three subtypes: normal, atypical and granular. The different phases of normal MFs were not further delineated, owing to their lack of clinical importance. Granular mitosis is a type of atypical MF but was designated as a separate group in this study because of its unique appearance and abundance in GBM samples. Detections were represented as bounding boxes. Using the coordinates of these bounding boxes and the patched images, we mapped the results back onto the original WSIs, providing the final detection and classification outcomes.
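A condensed sketch of the two-stage hand-off, again using the Ultralytics API; the weight paths, subtype ordering and coordinate bookkeeping are our assumptions:

```python
from ultralytics import YOLO

det = YOLO("runs/detect/best.pt")     # Model A weights (placeholder path)
cls = YOLO("runs/classify/best.pt")   # Model B weights (placeholder path)
SUBTYPES = ["normal", "atypical", "granular"]  # assumed class order

def detect_and_classify(patch_bgr, px0, py0, conf_thr=0.5):
    """Stage 1 detects candidate MFs in a 640x640 patch; stage 2
    classifies a snapshot of each box. (px0, py0) is the patch's
    top-left corner in WSI coordinates, used to map boxes back."""
    outputs = []
    for box in det.predict(patch_bgr, conf=conf_thr, verbose=False)[0].boxes:
        x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
        snapshot = patch_bgr[y1:y2, x1:x2]           # crop the detected MF
        probs = cls.predict(snapshot, verbose=False)[0].probs
        outputs.append({
            "wsi_box": (px0 + x1, py0 + y1, px0 + x2, py0 + y2),
            "det_conf": float(box.conf[0]),
            "subtype": SUBTYPES[int(probs.top1)],
            "subtype_conf": float(probs.top1conf),
        })
    return outputs
```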

    Model Selection and Benchmarking

    To contextualize our selection of YOLOv8, we conducted a pilot benchmark comparing its performance against three widely used object detection models in MF detection as shown in Table 1:
    1. Faster R-CNN – A widely used two-stage detector noted for strong accuracy.17
    2. RetinaNet – Known for its focal loss-based robustness in imbalanced object detection tasks.11
    3. YOLOv5 – A prior YOLO version frequently used in fast inference applications.18
Table 1. Comparative benchmarking of object detection models for MF detection

Model        | mAP50-95 | Inference time (per WSI) | Remarks
Faster R-CNN | 0.783    | 65.2 s                   | Strong accuracy but slow inference
RetinaNet    | 0.799    | 59.8 s                   | Good balance, slower than YOLOv8
YOLOv5       | 0.801    | 28.6 s                   | Fast inference, decent accuracy
YOLOv8       | 0.823    | 30.1 s                   | Best trade-off between accuracy and speed

Each model was trained and evaluated on a subset of our pathologist-verified dataset using standardized training parameters: 50 epochs, batch size 32 and early stopping with a patience of 25 epochs. We evaluated mean average precision (mAP50-95) and inference speed per WSI.19
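For the YOLO entries, this evaluation reduces to a few Ultralytics calls, sketched below; the dataset config and test image are placeholders, and the per-WSI times in Table 1 aggregate per-patch latencies over all tissue patches of a slide:

```python
import time
from ultralytics import YOLO

model = YOLO("yolov8m.pt")
model.train(data="mf_detection.yaml",   # placeholder dataset config
            epochs=50, batch=32, patience=25)

metrics = model.val()                      # evaluate on the validation split
print(f"mAP50-95: {metrics.box.map:.3f}")  # the accuracy column of Table 1

t0 = time.perf_counter()
model.predict("example_patch.png", verbose=False)  # placeholder patch image
print(f"per-patch latency: {time.perf_counter() - t0:.3f} s")
```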

    YOLOv8 demonstrated the best overall trade-off between detection accuracy and inference efficiency, making it well-suited for integration into real-time or large-scale annotation platforms like AI4Path.

    AI4Path Web Portal

To enhance accessibility and collaboration, we have integrated our annotation and detection pipeline into a web-based platform.7 This platform is designed for various clinical applications, including colon polyp true/pseudo-invasion classification and Ki-67 indexing, among others.20 In this study, we extended the platform's capabilities by incorporating the proposed active learning framework for GBM-related tasks, as shown in Figure 5. This integration enables easy access for pathologists and researchers, facilitating the use of our active learning framework in real-world diagnostic settings.

Figure 5. A screenshot of the AI4Path web portal, showcasing the integrated tri-class threshold adjustment feature and the ability to fine-tune precision and recall based on specific use cases.

In object detection tasks, precision and recall are crucial metrics for evaluating a model's performance. Precision measures the accuracy of the positive predictions, while recall measures the model's ability to identify all relevant objects. Balancing these metrics ensures that the model accurately and comprehensively detects objects with minimal errors. To achieve this balance, a precision and recall adjustment feature was also incorporated into the annotation tool in the AI4Path web portal. This feature allows pathologists to manually balance these two metrics to obtain optimal results for specific use cases.
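The effect of the slider amounts to sweeping the detection confidence threshold, as the toy example below illustrates; all values are made up for illustration:

```python
# Sketch: raising the confidence threshold keeps fewer, surer detections
# (higher precision); lowering it keeps more detections (higher recall).
def precision_recall(detections, threshold, n_ground_truth):
    kept = [d for d in detections if d["conf"] >= threshold]
    tp = sum(d["is_true_mf"] for d in kept)
    precision = tp / len(kept) if kept else 1.0
    recall = tp / n_ground_truth
    return precision, recall

dets = [{"conf": 0.90, "is_true_mf": True},
        {"conf": 0.70, "is_true_mf": True},
        {"conf": 0.55, "is_true_mf": False}]
for thr in (0.5, 0.6, 0.8):
    p, r = precision_recall(dets, thr, n_ground_truth=2)
    print(f"threshold={thr:.2f}  precision={p:.2f}  recall={r:.2f}")
```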

    Results

    TCGA Data Annotation

Our initial dataset comprised 66 GBM WSIs. Using the proposed deep active learning framework, the initial screening and annotations were performed. Pathologists and trained research assistants were only required to review marginal cases with uncertain morphology, which allowed them to efficiently annotate the entire set of 66 WSIs. Of these, 50 WSIs received further verification from at least one pathologist. As a result, 804 normal MFs, 728 atypical MFs and 1,955 granular MFs were annotated. Selected mitotic figures from each group are shown in Figure 6. The full annotation dataset is available on GitHub (https://github.com/h26liu/gbm-db). The ambiguous cases are also included in the same repository.

Figure 6. Representative images of mitotic figures. (A–C) Mitotic figures labelled with confidence. (A) A normal mitotic figure representing metaphase; (B) an atypical tripolar mitotic figure; (C) a granular mitotic figure with minute chromosome bodies. (D–F) Mitotic figures labelled as ‘ambiguous’ with confidence scores below the threshold. (D) a normal mitotic figure at metaphase, likely marked as ambiguous because of the three nuclei at the periphery; (E) an atypical mitotic figure with condensed asymmetrical chromatin; (F) a mitotic figure with low confidence but re-labelled as a granular mitotic figure by pathologists.

    Active Learning Performance

    The focus of this study was to implement a deep active learning framework to accelerate the iteration of our CNNs and enhance the accuracy and efficiency of MF object detection and classification tasks.

As a result, our active learning approach offers several benefits. First, it reduces the amount of manual annotation required by pathologists by about 70%, as the model becomes more adept at handling common cases autonomously; only the most marginal cases require expert review. In our experiments, the time required to annotate all 66 WSIs was cut in half, saving an estimated 900+ min compared with the fully manual approach. Second, it accelerates the model iteration process, enabling quicker refinements and improvements. This results in consistently accurate detection and classification over time, even with significantly less training data.

We compared active learning to traditional passive learning using the same dataset, evaluating our MF object detection model with mAP50-95 and our sub-type classification model with accuracy.19 As shown in Figure 7a, the object detection model reached near-optimal mAP50-95 scores much faster with active learning (after 1000 training samples) than with passive learning (after 2750 training samples). Similarly, as shown in Figure 7b, the accuracy of the sub-type classification model reached approximately 80% more quickly using the active learning method (in fewer than 20 training iterations) compared to the passive learning strategy (around the 34th training iteration). The framework ensures the model is trained on the most challenging and informative cases, maximizing the effectiveness of manually labelled training data while maintaining nearly identical performance to the non-active, exhaustive labelling method.

Figure 7. Comparison of active learning (red) and passive learning (blue) performance for the MF detection and subtype classification models. (A) The progression of the MF detection model shows that the active learning method reaches near-optimal mAP50-95 scores with fewer training samples than passive learning. (B) Similarly, the progression of the subtype classification model highlights that active learning achieves 80% accuracy more efficiently, with fewer training iterations compared to the passive learning approach.

These results demonstrate that the active learning framework significantly enhances the efficiency of the annotation process and makes annotating large datasets of high-resolution WSIs feasible.

    Model Accuracy

As mentioned earlier, we implemented a two-stage object detection and classification pipeline. To verify our method, we created a validation set using WSIs from the pathologist-verified cohort that were entirely separate from those used for training, ensuring no WSI-level overlap. Subtype proportions (normal, atypical, granular) were maintained. Training and internal testing were split at the patch level, which may result in patches from the same WSI appearing in both sets. Given the limited number of annotated WSIs, this was a necessary trade-off. We performed five-fold cross-validation on the independent validation set, and all reported metrics are averaged across folds. The final assessment of our MF object detection model showed strong performance, with an average precision of 81.75% (±2.3%) and recall of 82.48% (±2.5%), demonstrating the method's effectiveness in accurately and comprehensively identifying MFs.
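One way to enforce the WSI-level separation used for the validation cohort is to split patches grouped by their parent slide, e.g. with scikit-learn's GroupKFold. The sketch below uses synthetic patch and slide IDs; it complements, rather than reproduces, the patch-level split used for training and internal testing:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Each patch carries the ID of its parent WSI; grouping on that ID
# guarantees no slide contributes patches to both sides of a split.
patches = np.arange(1000)                  # synthetic patch indices
wsi_ids = np.repeat(np.arange(50), 20)     # 50 verified WSIs, 20 patches each

for fold, (train_idx, val_idx) in enumerate(
        GroupKFold(n_splits=5).split(patches, groups=wsi_ids)):
    train_wsis = set(wsi_ids[train_idx])
    val_wsis = set(wsi_ids[val_idx])
    assert train_wsis.isdisjoint(val_wsis)  # no WSI-level overlap
    print(f"fold {fold}: {len(val_wsis)} held-out WSIs")
```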

    For the subtype classification model, we followed the same approach to create the validation dataset, ensuring it included only pathologist-verified samples, comprising approximately 10% of all annotated samples. Classification accuracy, calculated as the number of correctly classified images divided by the total number of images, was used as the evaluation metric. Results are reported as means ± standard deviations from five-fold cross-validation. The model achieved an average accuracy of 84.1% (±1.9%).

    To further evaluate classification performance, we conducted a detailed error analysis. As illustrated in Figure 8, granular MFs were the most accurately classified, exhibiting minimal confusion with other subtypes. In contrast, atypical MFs demonstrated the highest misclassification rates, most frequently labelled as normal. This confusion likely stems from subtle morphological overlaps between atypical and normal MFs, underscoring the inherent challenge in distinguishing these two classes.

Figure 8. Normalized confusion matrix for MF subtype classification.

    System Efficiency

Efficiency is a key factor when evaluating how practical an application is for real-world use. To determine the efficiency of our system, we tracked how long it took to perform object detection and subclass classification on our dataset of 66 WSIs. On average, our system processes a WSI in just 378 s (about 6.3 min). This is much quicker than the approximately 30–60 min typically needed by a pathologist to examine an entire WSI. As illustrated in Figure 9, we use the time spent detecting MFs in two WSIs as an example. It took pathologists 45 min to manually count 162 MFs in TCGA-08-0386 (DX1) and 60 min to count 220 MFs in TCGA-02-0033 (DX1). In comparison, our system completed the detection in just 0.8 and 5.7 min, respectively.

Figure 9. The time spent detecting MFs in two example WSIs: the deep active learning model is significantly faster than manual counting by pathologists.

    Moreover, the time required for a pathologist to review WSIs can vary widely due to factors such as slide complexity, experience level, visual fatigue from large caseloads and available resources. Ambiguous slides, in particular, may require several hours of review and consultation with colleagues. In contrast, our system's processing time depends only on the size of the tissue areas within a WSI and remains consistent regardless of other factors. This reliability ensures that our system streamlines the annotation process efficiently.

    Discussion

Accurately counting MFs is crucial for diagnosing and grading cancers, particularly GBM, one of the most challenging brain tumours, with dismal progression-free and overall survival rates. MF indexing in GBM provides valuable insights into tumour growth, guiding the selection of the best treatment plan. However, manually counting MFs in WSIs is labour-intensive and prone to inconsistencies due to visual fatigue and subjectivity.

With the advancement of deep learning, various computational tools have become available to assist in the MF indexing process. However, achieving satisfactory results typically requires a large amount of annotated data, necessitating extensive human effort. To address this, we designed and implemented an active learning framework powered by two CNNs: one for MF detection in WSIs and the other for MF subtype classification. We used H&E-stained GBM WSIs from TCGA. After a small dataset was manually annotated to train the initial models, these models screened the remaining unlabelled WSIs. Our tri-class thresholding algorithm flagged only the most ambiguous MFs for review, while the rest were automatically added to the training set. Flagged cases were prioritized for model retraining, and the updated models provided annotations for the next iteration. This iterative process continued until satisfactory model performance was achieved, significantly reducing the manual review workload while maintaining model accuracy and iterative speed.

    While our framework prioritizes manual review of marginal cases, a subset of high-confidence predictions was also manually verified in fully annotated WSIs by pathologists. However, due to time constraints, this verification was not performed across all WSIs. As a result, some high-confidence errors may have gone undetected. The current threshold range was selected empirically without formal calibration. In future work, we plan to explore confidence calibration and more dynamic thresholding methods to improve model certainty estimates and reduce potential error leakage.

To enhance accessibility, we integrated the proposed active learning framework into AI4Path. AI4Path requires no expertise to use and does not need heavy computational resources. It also includes handy features such as subclass threshold adjustments and precision-recall adjustments, making it user-friendly and efficient. Our research collaborators from institutions across Canada have seamlessly utilized AI4Path to manually detect mitotic figures in our training set and have verified ambiguous mitotic figures in over 80 WSIs. AI4Path provides customizable features such as MF hotspot analysis for pathologists who still wish to use synoptic reporting of MF counts in accordance with the College of American Pathologists electronic cancer protocols.

With the aid of the proposed framework, pathologists and research assistants only needed to review fewer than 30% of the total MF samples within select WSIs. This saved almost half of the total annotation time (approximately 900 min) across the 66 WSIs in this study compared with the traditional approach. Additionally, our models achieved nearly identical results with less than half of the training samples (1000 out of 2500) required for non-active learning methods. This demonstrates that our framework significantly reduces the number of samples needed while maintaining excellent performance.

So far, 66 GBM WSIs have been annotated using our framework, with 50 verified by pathologists. In the MF detection task, our models achieved an average precision of 81.75% and a recall of 82.48% at a confidence threshold of 0.75, and a classification accuracy of 84.1% for normal, granular and atypical MFs in the subtype classification task. These results demonstrate the effectiveness of our approach in balancing precision and recall, thereby supporting the diagnostic process.

    As we continue to collect additional WSIs and annotations, we aim to refine our models and extend the application to other cancer types. Our goal is to enhance diagnostic accuracy with less data and minimal manual effort, ultimately improving pathology workflows and patient outcomes.

    In summary, our study presents a significant advancement in MF quantification for GBM WSIs, demonstrating the potential of deep learning integrated with active learning to improve diagnostic efficiency. Future work will focus on broadening the applicability of our approach and further optimizing its performance across various datasets.

    Author contributions

    E.L.: development of methodology, data analysis, original draft preparation; A.L.: data curation and analysis, original draft preparation; P.K. and Y.Z.: data curation, manuscript review and editing; B.W.: manuscript review and editing; C.L. and Q.Z.: project conceptualization, supervision, funding acquisition, manuscript review and editing.

    Funding information

    Pathology Internal Funds for Academic Development (PIFAD) of Western University (QZ); Western Dean's Office Research Opportunities Fellowship (DUROP) (AL).

    Conflict of interest

    The authors declare no conflicts of interest.

    Data availability statement

    The images used in this study are publicly available from The Cancer Genome Atlas (TCGA) database. The annotations and corresponding metadata generated for this study are openly accessible at our GitHub repository: https://github.com/h26liu/gbm-db.
