Volume 149, Issue 3 pp. 728-740
Tumor Markers and Signatures

Feasibility of deep learning-based fully automated classification of microsatellite instability in tissue slides of colorectal cancer

Sung Hak Lee

Department of Hospital Pathology, Seoul St. Mary's Hospital, Seoul, South Korea

In Hye Song

Department of Hospital Pathology, Seoul St. Mary's Hospital, Seoul, South Korea

Hyun-Jong Jang

Corresponding Author

Catholic Big Data Integration Center, Department of Physiology, College of Medicine, The Catholic University of Korea, Seoul, South Korea

Correspondence

Hyun-Jong Jang, Catholic Big Data Integration Center, Department of Physiology, College of Medicine, The Catholic University of Korea, Seoul 06591, South Korea.

Email: [email protected]

First published: 13 April 2021
Citations: 31

Funding information: National Research Foundation of Korea (NRF) grants funded by the Korean government, Grant/Award Number: NRF-2017R1D1A1B03030998

Abstract

High levels of microsatellite instability (MSI-H) occur in about 15% of sporadic colorectal cancer (CRC) and are an important predictive marker for response to immune checkpoint inhibitors. To test the feasibility of a deep learning (DL)-based classifier as a screening tool for MSI status, we built a fully automated DL-based MSI classifier using pathology whole-slide images (WSIs) of CRCs. On small image patches of The Cancer Genome Atlas (TCGA) CRC WSI dataset, tissue/non-tissue, normal/tumor and MSS/MSI-H classifiers were applied sequentially for the fully automated prediction of the MSI status. The classifiers were also tested on an independent cohort. Furthermore, to test how the expansion of the training data affects the performance of the DL-based classifier, an additional classifier trained on both the TCGA and external datasets was tested. The areas under the receiver operating characteristic curves were 0.892 and 0.972 for the TCGA and external datasets, respectively, by a classifier trained on both datasets. The performance of the DL-based classifier was much better than that of previously reported histomorphology-based methods. We speculated that about 40% of CRC slides could be screened for MSI status without molecular testing by the DL-based classifier. These results demonstrated that the DL-based method has potential as a screening tool to discriminate molecular alteration in tissue slides.
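
The sequential application of the three patch-level classifiers can be illustrated with a minimal Python sketch. This is not the authors' implementation (their code is in the GitHub repository listed under the data availability statement). The function name `predict_slide_msi`, the cutoff values, the representation of a patch, and the use of a simple mean to aggregate patch scores into a slide-level score are all assumptions made for illustration only.

```python
from statistics import mean
from typing import Callable, Optional, Sequence

# Hypothetical stand-ins: a "patch" is any object the classifiers accept,
# and each classifier returns a probability in [0, 1].
Patch = Sequence[float]
PatchClassifier = Callable[[Patch], float]


def predict_slide_msi(
    patches: Sequence[Patch],
    tissue_prob: PatchClassifier,
    tumor_prob: PatchClassifier,
    msi_prob: PatchClassifier,
    tissue_cutoff: float = 0.5,  # assumed value, not from the paper
    tumor_cutoff: float = 0.9,   # assumed value, not from the paper
) -> Optional[float]:
    """Return a slide-level MSI-H score, or None if no patch survives filtering."""
    scores = []
    for patch in patches:
        # Step 1: discard patches that are not tissue (background, artefacts).
        if tissue_prob(patch) < tissue_cutoff:
            continue
        # Step 2: keep only patches with a high tumor probability.
        if tumor_prob(patch) < tumor_cutoff:
            continue
        # Step 3: score the remaining tumor patches as MSS vs. MSI-H.
        scores.append(msi_prob(patch))
    # Assumed aggregation rule: average the patch-level MSI-H probabilities.
    return mean(scores) if scores else None
```

In such a cascade, each stage only sees patches that passed the previous one, so the MSS/MSI-H classifier is never asked to score background or normal tissue.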

What's new?

Microsatellite instability (MSI) levels are an important predictive biomarker for response to immune checkpoint inhibitors in colorectal cancer. To test the feasibility of a deep learning (DL)-based classifier as a screening tool for MSI status, here the authors built a fully automated DL-based MSI classifier using pathology whole-slide images of hematoxylin and eosin-stained tissue slides of colorectal cancer. By automatically removing artefacts and selecting tumour patches with high tumour probability, the DL-based system could screen out a considerable number of tissue slides for their MSI status, demonstrating its potential as a screening tool for molecular alterations in tissue slides.

CONFLICT OF INTEREST

The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT

The slide images of TCGA datasets were downloaded from the Genomic Data Commons portal (https://portal.gdc.cancer.gov/). The SMH datasets are available from the corresponding author (HJ.J.) upon reasonable request and through collaborative investigations. The source codes for the classifiers are available as open-source Python code on GitHub: https://github.com/jajman/ColonMSI/.
