Volume 2022, Issue 1 7511905

Research Article

Open Access

GCP-Net: A Gating Context-Aware Pooling Network for Cervical Cell Nuclei Segmentation

Guihua Yang
orcid.org/0000-0002-5288-6363
The School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China (hrbust.edu.cn)
The School of Electrical and Mechanical Engineering, Daqing Normal University, Daqing, China (dqsy.net)

Jinjie Huang (Corresponding Author)
orcid.org/0000-0001-9243-3011
The School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China (hrbust.edu.cn)
The School of Automation, Harbin University of Science and Technology, Harbin, China (hrbust.edu.cn)

Yongjun He
orcid.org/0000-0002-5156-651X
The School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China (hrbust.edu.cn)

Yuanjian Chen
orcid.org/0000-0003-0718-7405
The School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China (hrbust.edu.cn)

Tao Wang
orcid.org/0000-0003-3229-208X
Network and Education Technology Center, Harbin University of Commerce, Harbin, China (hrbcu.edu.cn)

Cong Jin
orcid.org/0000-0002-1153-2964
The School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China (hrbust.edu.cn)

Phienphommalinh Sengphachanh
orcid.org/0000-0002-9952-8213
The School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China (hrbust.edu.cn)

First published: 09 April 2022

https://doi.org/10.1155/2022/7511905

Academic Editor: Jinan Fiaidhi

Abstract

Accurate segmentation of cervical nuclei is an essential step in the early diagnosis of cervical cancer. Still, there are few studies on the segmentation of clustered nuclei in clusters of cells. Because of the complexities of high cell overlap, blurred nuclei boundaries, and clustered cells, the accurate segmentation of clustered nuclei remains a pressing challenge. In this paper, we propose GCP-Net, a deep learning network designed to handle challenging cervical cluster cell images. The proposed U-Net-based GCP-Net consists of a pretrained ResNet-34 model as the encoder, a Gating Context-aware Pooling (GCP) module, and a modified decoder. The GCP module is the primary building block of the network and improves the quality of feature learning. It allows GCP-Net to refine the details of feature maps by leveraging multiscale context gating and Global Context Attention to capture spatial and texture dependencies. The decoder block, which includes a Global Context Attention- (GCA-) Residual Block, helps build long-range dependencies and global context interaction in the decoder to refine the predicted masks. We conducted extensive comparative experiments with seven existing models on our ClusteredCell dataset and three typical medical image datasets. The experimental results showed that GCP-Net obtained promising results on three evaluation metrics (AJI, Dice, and PQ), demonstrating the superiority and generalizability of GCP-Net for automatic medical image segmentation in comparison with several state-of-the-art (SOTA) baselines.

1. Introduction

Cervical cancer is the fourth most common cancer among women worldwide [1]. According to data from the Global Cancer Observatory (GCO), there were an estimated 570,000 new cases and 311,000 deaths due to cervical cancer in 2018 [2]. The latest GCO data estimate 604,127 new cases of cervical cancer in 2020. Therefore, early detection of cervical lesions is of great significance in reducing cervical cancer mortality. The routine cervical Pap smear or liquid-based cytology (LBC) [3] is the most popular screening method for the prevention and early detection of cervical cancer. It has been widely used and has dramatically reduced the incidence of and deaths from cervical cancer [4]. However, smear screening in most countries still relies on manual reading, which is laborious and prone to human error [5]. Therefore, in the past few decades, much research has been devoted to creating computer-aided reading systems based on automatic image analysis [6]. Such a system automatically selects potentially abnormal cells in a given cervical cytology specimen, and the cytopathologist then completes the classification. This task includes three steps: cell (cytoplasm and nucleus) segmentation, feature extraction/selection, and cell classification. Precise cell nucleus segmentation is a prerequisite for, and an indispensable part of, the computer-assisted analysis of cervical cells and diagnostic decisions.

Some previous conventional methods [7–10] focused on segmenting overlapping nuclei, but they generally relied on indirect processing. In addition, some methods [11, 12] use shape priors to segment cells in overlapping clumps, but because of the various complex and challenging situations in overlapping clusters (as shown in Figure 1), the shape priors and standard sets of boundary patterns imposed in the literature do not provide sufficient shape detail for segmenting the overlapping parts. Therefore, these traditional methods cannot solve the challenging overlapping cluster segmentation problem.

Figure 1. Examples of the challenges in accurate cervical cell nuclei segmentation from the clustered cell dataset. (a) Original image.

When deep learning is used to detect possible abnormalities in cervical cells, any model is limited by the number and quality of the cell samples in the dataset used. However, the primary publicly accessible datasets currently used in many studies of cervical cytology are oversimplified and contain a lot of artificially preprocessed data. For example, the nuclei in those datasets are almost all separate, their shapes are mostly uniform with precise contours, and the color difference between the nucleus and the cytoplasm is noticeable [13]. On such datasets, segmenting the nucleus is relatively easy. Most papers [14–18] are dedicated to the segmentation of overlapping cytoplasm, and only a small body of work [19–21] focuses on the segmentation of cell nuclei. However, actual clinical data are much more complicated than the above datasets, as shown in Figures 1(b)–1(g). There are often overlapping or deformed nuclei in smears, and the color of some nuclei is similar to the color of the overlapping cytoplasm. These characteristics all make cell nucleus segmentation in cervical cell images difficult.

For this paper, we obtained a set of cervical cell images based on an LBC test from a hospital and a biomedical testing company. We randomly selected 265 original images of size 2048 × 2048 from different slides, as shown in Figure 1(a). Cell clusters, intercellular overlapping and self-folding, diverse nucleus shapes and sizes, blurred nucleus contours, and similar colors of nucleus and cytoplasm remain significant obstacles to the accurate segmentation of individual nuclei. We crop clustered units of different sizes from the original images to form the dataset used in this paper. Each image in the dataset has a segmentation ground truth marked by professional pathologists. As shown in Figures 1(b)–1(g), several cases are challenging to handle. Therefore, we propose GCP-Net, a deep learning network for processing these challenging cervical clustered cell images. The proposed U-Net-based GCP-Net incorporates multiscale context gating information, context-aware attention features, and decoder features into the final feature map to correctly classify each pixel as background or nucleus.

The main innovations of this work are summarized as follows:

(1) A Gating Context-aware Pooling (GCP) module refines the details of feature maps by leveraging multiscale context gating and Global Context Attention for spatial and texture dependencies, improving the quality of feature learning
(2) A decoder block including a Global Context Attention- (GCA-) Residual Block helps build long-range dependencies and global context interaction in the decoder to refine the predicted masks
(3) Extensive experimental results on our complex ClusteredCell dataset and three typical medical image datasets demonstrate the superiority and generalizability of GCP-Net for automatic medical image segmentation in comparison with some state-of-the-art baselines

2. Related Work

Cell nucleus segmentation is widely studied because its results support pathological assessment and medical diagnosis. This section reviews segmentation methods for cervical cell nuclei and for nuclei in other typical medical images.

In the past few decades, many cervical cell nuclei segmentation methods have been proposed, most of which are based on traditional algorithms, such as the watershed algorithm [22], edge enhancement [23, 24], level sets [25], clustering [26, 27], and thresholding [28]. For example, [23] uses a segmentation method based on a series of edge enhancement techniques, which performs poorly on blurred nucleus contours. [27] uses a contrast-based adaptive version of the mean-shift and SLIC algorithms together with an intensity-weighted adaptive threshold to segment cell nuclei in Pap smear images. In many cases, these traditional methods cannot cope with cervical cells of irregular shapes and sizes. With the prevalence of machine learning, the performance of deep learning networks has also improved. On the most challenging cervical cell nuclei segmentation problems, deep learning networks perform better than traditional algorithms [19–21, 29]. [19] used the Herlev dataset and combined Mask-RCNN for rough segmentation with LFCCRF to refine the nuclei boundaries. In [20], both the cytoplasm and the nucleus were segmented on a private dataset using a combination of superpixels and a CNN-based network. The authors of [21] developed a deep learning method that uses a multiscale CNN for feature extraction and graph partitioning for cell nucleus segmentation, also on a privately collected dataset. The authors of [29] use a CNN bi-path architecture to segment Pap smear images and classify cervical cancer. The first path performs segmentation with a CNN architecture. The second path is a classification process that tests the segmentation results by applying the KNN and ANN methods. This method integrates segmentation and classification, but it is not suitable for cervical cell images with high overlap.

To detect nuclei in multiorgan nuclei segmentation datasets (MoNuSeg [30], CoNSeP [31], and CPM-17 [32]), several methods have been used, such as U-Net [33], CE-Net [34], Triple U-Net [35], CIA-Net [36], SRPN [37], and Hover-Net [31]. U-Net [33] has an encoder-decoder design with skip connections to incorporate low-level information, applied to many segmentation tasks in medical image analysis. The recently proposed CE-Net method [34] extends U-Net by using an enhanced network structure with DAC and RMP Block for medical image segmentation. Triple U-Net [35] leveraged the optical characteristics of hematoxylin and eosin (H&E) staining and proposed a hematoxylin-aware Triplet U-Net, which makes predictions concerning the extracted hematoxylin component in the image. By subtracting instance boundaries from the segmentation maps, overlapped nuclei can be separated; the downside is that such a subtraction operation may lose segmentation accuracy [38]. CIA-Net [36] is a contour-aware CNN model for predicting more precise nucleus boundaries. Hover-Net [31] represents nucleus instances using pixel-to-centroid distance maps in horizontal and vertical directions. SRPN [37] uses the similarity region proposal network to detect nuclei and cells in histological images. The embedding layer proposed here can help the network learn discriminating characteristics based on learning similarity. The performance of these approaches is affected by the empirically designed postprocessing strategies.

Although the methods described above have helped make significant progress in cytology nucleus segmentation, there is still a need to develop more practical and effective strategies.

3. Methods

In this section, we describe the architecture of GCP-Net and the details of its constituent modules.

3.1. Overall Architecture

We design GCP-Net based on the overall architecture of CE-Net [34], which is a modified version of U-Net [33]. As shown in Figure 2, we use ResNet-34 blocks pretrained on ImageNet to replace the encoder blocks in the original U-Net. We retain only the first four feature extraction blocks of ResNet-34. The GCP module proposed in this paper generates more high-level semantic feature maps; its essential components are introduced in Section 3.2. In addition, this paper presents a feature decoder block consisting of GCA-Residual Blocks, a concatenation operation, and a transposed convolution. The details of the decoder block are given in Section 3.3.

3.2. GCP Module

The GCP module is a newly proposed module, as shown in Figure 3(c). This module extracts semantic information and generates more high-level feature maps.

3.2.1. Multiscale CG Residual (MCGR) Block

Context Gating (CG) [39] module is an efficient nonlinear unit for modeling interdependencies among network activations. Its structure is shown in Figure 3(a). The formula of CG is as follows:

$$X' = \sigma(W \ast X + b) \odot X \tag{1}$$

where X ∈ Rⁿ is the input feature vector, σ is the element-wise Sigmoid activation, and ⨀ is the element-wise multiplication. W ∈ R^n×n and b ∈ Rⁿ are trainable parameters. The vector of weights σ(W∗X + b) ∈ [0, 1] represents a set of learned gates applied to the individual dimensions of the input feature X. Through the element-wise multiplication and training between the weight vector and X, the input feature representation X is transformed into a new representation X^′ which has the more powerful discriminant capability.
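As a rough illustration of the gating in Equation (1), the operation can be written out for a single feature vector in plain Python. This is a minimal sketch and not the authors' implementation: the function names `sigmoid` and `context_gating` and the toy values are assumptions, and W is treated as a dense n × n matrix.

```python
import math

def sigmoid(z):
    """Element-wise logistic function used to produce gates in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def context_gating(x, W, b):
    """Return sigmoid(W x + b) multiplied element-wise with x."""
    n = len(x)
    gates = [sigmoid(sum(W[i][j] * x[j] for j in range(n)) + b[i]) for i in range(n)]
    return [g * xi for g, xi in zip(gates, x)]

# Toy input (illustrative values only): zero weights and biases give gates of 0.5.
x = [1.0, -2.0, 0.5]
W = [[0.0] * 3 for _ in range(3)]
b = [0.0, 0.0, 0.0]
gated = context_gating(x, W, b)
print(gated)
```

With zero weights every gate is exactly 0.5, so each input dimension is halved; trained weights would instead suppress or pass individual dimensions depending on the whole input vector.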

To mitigate the limited receptive field of the purely local operators in the Context Gating module, we propose a Multiscale CG Residual (MCGR) Block, as shown in Figure 3(b). The MCGR Block consists of three parallel branches with depth-wise separable convolution [40] and a residual branch. Each depth-wise separable convolution branch has a different convolutional kernel size to provide a different receptive field. Here, we set the kernel sizes to 3, 5, and 7 for the three branches. Each branch then produces attention weights at a specific scale. After that, the attention weights are element-wise multiplied with the feature maps to obtain weighted feature maps at different scales. Finally, the MCGR Block fuses the weighted feature maps and the input feature map of the residual branch by element-wise addition to integrate multiscale information.

In the MCGR Block, we use depth-wise separable convolution to replace standard convolution. With depth-wise separable convolution, the MCGR Block can avoid extracting redundant features, reuse the input feature maps of cell images, and reduce the number of training parameters. Compared with a version built on standard convolution, the MCGR Block is more lightweight, and its number of training parameters decreases significantly. The formula of the MCGR Block is as follows:

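$$X_1' = X_1 + \sum_{m \in \{3, 5, 7\}} \sigma\left(W^{p} \ast \left(W^{d}_{m} \circ X_1 + b^{d}_{m}\right) + b^{p}\right) \odot X_1 \tag{2}$$

where X1 is the input feature, X1^′ is the output feature, and ∗ and ∘ represent the point-wise convolution and depth-wise convolution, respectively. W^p and b^p are the point-wise convolution parameters. W^d_m and b^d_m are the depth-wise convolution parameters. m ∈ {3, 5, 7} represents the three different sizes of the convolutional kernel.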
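To make the parameter saving from depth-wise separable convolution concrete, the weight counts of the two layer types can be compared directly. The sketch below is illustrative only: the helper names are hypothetical, and the channel width of 512 is an assumption (the output width of the fourth ResNet-34 stage), not a figure reported in the text.

```python
def standard_conv_params(c_in, c_out, k, bias=True):
    """Weights of a k x k standard convolution: every output channel sees every input channel."""
    return k * k * c_in * c_out + (c_out if bias else 0)

def depthwise_separable_params(c_in, c_out, k, bias=True):
    """Weights of a k x k depth-wise convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

channels = 512  # assumed channel width
for k in (3, 5, 7):  # the three MCGR kernel sizes
    std = standard_conv_params(channels, channels, k)
    sep = depthwise_separable_params(channels, channels, k)
    print(f"k={k}: standard={std}, separable={sep}, ratio={std / sep:.1f}")
```

Under these assumptions the gap widens with kernel size, because the k × k term is multiplied by the channel count once in the separable case rather than by the product of input and output channels.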

3.2.2. Global Context Attention (GCA) Block

Recent works have shown that contextual information is helpful for models to predict high-quality segmentation results. Modules that could enlarge the receptive field, such as ASPP [41], DenseASPP [42], and CRFasRNN [43], have been proposed in the past years. Furthermore, the attention mechanism has been widely used for increasing model capability. Therefore, we add a GCA Block [44] after the convolutional operation of the multiscale fusion information. It reweights every feature accordingly to create a more accurate feature map. In this way, the network becomes more sensitive to essential elements that significantly improve network performance.

Figure 3(d) shows the detail of the GCA Block. Given an input feature map X2 ∈ R^C×H×W, the calculation details are summarized as follows:

① The first branch applies a 1 × 1 convolution to X2 to generate a feature map of size R^1×H×W, reshapes it to R^HW×1×1, and then applies the softmax function. The second branch reshapes X2 to R^C×HW. The results of the two branches are then multiplied to obtain the feature Xt ∈ R^C×1×1. F(∙) denotes the convolution operation, α(∙) denotes the softmax function, f_r(∙) denotes the reshape operation, and ⊗ denotes matrix multiplication.

$$X_t = f_r(X_2) \otimes \alpha\left(f_r\left(F(X_2)\right)\right) \tag{3}$$

② To reduce the number of parameters, a 1 × 1 convolution turns feature Xt into the size of R^C/r×1×1, where r is the bottleneck ratio, usually set to 16. Then, layer normalization (LN) and the ReLU activation function are applied to improve the network’s generalization ability. After that, the feature is restored to the size of R^C×1×1 and added to X2, giving the final output X2^′ ∈ R^C×H×W. ⊕ (shown in red) denotes the channel-wise summation operation, and f_ln&relu(∙) denotes LN followed by ReLU.

$$X_2' = X_2 \oplus F\left(f_{ln\&relu}\left(F(X_t)\right)\right) \tag{4}$$
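The two steps of the GCA Block can be traced on a tiny feature map in plain Python. This is a hedged sketch, not the authors' code: the feature map is stored as C lists of H·W values, the 1 × 1 convolutions are written as bias-free matrix products, layer normalization has no learnable scale or shift, and the toy example uses r = 2 rather than 16. All names and values are assumptions.

```python
import math

def softmax(v):
    """Numerically stable softmax over a list of scores."""
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def matvec(M, v):
    """Matrix-vector product; stands in for a 1 x 1 convolution on a C x 1 x 1 feature."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def layer_norm(v, eps=1e-5):
    """Layer normalization without learnable affine parameters (simplification)."""
    mean = sum(v) / len(v)
    var = sum((a - mean) ** 2 for a in v) / len(v)
    return [(a - mean) / math.sqrt(var + eps) for a in v]

def gca_block(X, w_k, W_down, W_up):
    """X holds C channels of H*W values; the output has the same shape."""
    C, HW = len(X), len(X[0])
    # Step 1: 1 x 1 conv to one channel, softmax over positions, attention-weighted pooling.
    logits = [sum(w_k[c] * X[c][p] for c in range(C)) for p in range(HW)]
    attn = softmax(logits)
    context = [sum(X[c][p] * attn[p] for p in range(HW)) for c in range(C)]
    # Step 2: bottleneck C -> C/r, LN + ReLU, restore C/r -> C, broadcast-add to the input.
    hidden = [max(0.0, h) for h in layer_norm(matvec(W_down, context))]
    delta = matvec(W_up, hidden)
    return [[X[c][p] + delta[c] for p in range(HW)] for c in range(C)]

# Toy input with C = 4 channels, H*W = 4 positions, and r = 2 (illustrative values only).
X = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0], [4.0, 3.0, 2.0, 1.0]]
w_k = [0.0, 0.0, 0.0, 0.0]                                       # uniform attention
W_down = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]            # C -> C/r
W_up = [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0], [0.0, 0.0]]          # C/r -> C
Y = gca_block(X, w_k, W_down, W_up)
```

The point of the sketch is the broadcast in the last line of `gca_block`: every spatial position of a channel receives the same global correction, which is how one pooled context vector influences the whole feature map.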

3.2.3. Multikernel Maxpooling Residual (MMR) Block

The MMR Block structure is illustrated in Figure 3(e). Generally, a maxpooling operation employs a single pooling kernel, such as 3 × 3. Because the size of the receptive field roughly determines how much context information can be used, we use an MMR Block with four different pooling kernel sizes: 2 × 2, 3 × 3, 5 × 5, and 7 × 7. Each branch, with its own kernel size, outputs feature maps with a different receptive field. To reduce the number of weights and the computational cost, we apply a 1 × 1 convolution after each pooling layer. In this way, if the number of channels of the original feature map is N, the channel dimension of each new feature map is reduced to 1/N of the original. Then, we upsample each new feature map through bilinear interpolation to the same size as the original feature map. Finally, we concatenate the original feature X3 with the upsampled maps to output X3^′.
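The shape bookkeeping of the MMR Block can be sketched as follows. Several points are assumptions rather than details given in the text: the pooling stride is taken to equal the kernel size with no padding, the 1/N reduction is read as one channel per branch, and the 512 × 14 × 14 input is only an example. The function name is hypothetical.

```python
def mmr_output_shapes(channels, height, width, kernels=(2, 3, 5, 7)):
    """Trace (C, H, W) shapes through each pooling branch and the final concatenation."""
    branches = []
    for k in kernels:
        # Max pooling with kernel k and stride k (assumed), no padding.
        pooled = (channels, height // k, width // k)
        # 1 x 1 convolution reduces the N channels to a single channel.
        reduced = (1, pooled[1], pooled[2])
        # Bilinear upsampling restores the input resolution.
        upsampled = (1, height, width)
        branches.append({"kernel": k, "pooled": pooled, "reduced": reduced, "upsampled": upsampled})
    # Concatenating the original N channels with one channel per branch.
    output = (channels + len(kernels), height, width)
    return branches, output

branches, out_shape = mmr_output_shapes(512, 14, 14)
for b in branches:
    print(b)
print("output:", out_shape)
```

Under this reading the block adds only four channels to the feature map, so the multiscale context comes at a small cost in width for the layers that follow.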

3.3. Feature Decoder Module

We use the feature decoder module to recover the high-level semantic features extracted by the feature encoder and context extractor modules. As illustrated in Figure 2, it consists of four decoder blocks, a 4 × 4 transposed convolution, two 3 × 3 convolutions with batch normalization (BN), and a sigmoid, applied consecutively. Based on the skip connections and the decoder blocks, the feature decoder module outputs a mask with the same size as the original input. Next, we introduce the composition of the feature decoder module.

3.3.1. Decoder Block

Similar to [45], we adopt an efficient block to enhance the decoding performance. Figure 4(a) shows that the input feature map is first fed into two consecutive GCA-Residual Blocks and then concatenated with the skip connection. The skip connection brings detailed information from the encoder to the decoder to compensate for the feature loss caused by consecutive pooling and strided convolution operations. After the concatenation, the output feature map is fed to a 4 × 4 transposed convolution, which doubles its spatial dimensions.
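The doubling of the spatial dimensions follows from the usual output-size formula for transposed convolution. The sketch below assumes a stride of 2 and a padding of 1 for the 4 × 4 kernel, which the text does not state, and a 14 × 14 starting resolution chosen purely for illustration.

```python
def transposed_conv_out(size, kernel=4, stride=2, padding=1, output_padding=0, dilation=1):
    """Output length of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# Four decoder blocks applied in sequence, starting from an assumed 14 x 14 feature map.
sizes = [14]
for _ in range(4):
    sizes.append(transposed_conv_out(sizes[-1]))
print(sizes)
```

With kernel 4, stride 2, and padding 1 the formula simplifies to exactly twice the input size, which is why this configuration is a common choice for upsampling layers.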

3.3.2. GCA-Residual Block

A deeper network can significantly improve the model’s performance, but increasing the network depth can cause vanishing or exploding gradients [46]. We use the shortcut connection between layers from the residual learning paradigm to deal with this problem. The GCA-Residual Block (see Figure 3(b)) consists of two 3 × 3 convolutions, a GCA Block, and an identity mapping, where each convolution layer is followed by batch normalization (BN) and a rectified linear unit (ReLU) activation function. The GCA Block (see Figure 3(d)) acts as a context attention mechanism that guides the network to select critical feature units in each feature map and ignore unrelated units. We use identity mapping to connect the input and output of the GCA Block.

3.4. Evaluation Metrics

To evaluate the proposed GCP-Net and SOTA deep learning methods, we used standard evaluation metrics, including the Aggregated Jaccard Index (AJI) [30], the Dice coefficient [47], and panoptic quality (PQ) [48]. The definition of AJI is as follows:

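$$\mathrm{AJI} = \frac{\sum_{i=1}^{n} |G_i \cap P_J|}{\sum_{i=1}^{n} |G_i \cup P_J| + \sum_{k \in N} |P_k|} \tag{5}$$

where J = argmax_k(|G_i ∩ P_k| / |G_i ∪ P_k|), and G = {G₁, G₂, ⋯, G_n} and P = {P₁, P₂, ⋯, P_m} denote the ground truth and the prediction results, respectively. N is the set of indices of prediction results without any corresponding ground truth.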
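The AJI definition can be followed step by step on instances represented as sets of pixel coordinates. This is a minimal sketch, not the evaluation code used in the paper: the function name and the toy instances are assumptions, and a ground-truth nucleus with no overlapping prediction is handled by adding only its own area to the denominator, which is one common implementation choice.

```python
def aji(gt_instances, pred_instances):
    """Aggregated Jaccard Index for lists of instances given as sets of (row, col) pixels."""
    inter_sum, union_sum, used = 0, 0, set()
    for g in gt_instances:
        best_k, best_iou = None, 0.0
        for k, p in enumerate(pred_instances):
            inter = len(g & p)
            if inter == 0:
                continue
            iou = inter / len(g | p)
            if iou > best_iou:
                best_k, best_iou = k, iou
        if best_k is None:
            union_sum += len(g)              # no overlapping prediction at all
        else:
            p = pred_instances[best_k]
            inter_sum += len(g & p)
            union_sum += len(g | p)
            used.add(best_k)
    # Predictions never selected for any ground-truth nucleus are added to the denominator.
    union_sum += sum(len(p) for k, p in enumerate(pred_instances) if k not in used)
    return inter_sum / union_sum if union_sum else 0.0

# Toy example: one partially segmented nucleus, one perfect match, one spurious prediction.
gt = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6)}]
pred = [{(0, 0), (0, 1), (1, 0)}, {(5, 5), (5, 6)}, {(9, 9)}]
score = aji(gt, pred)
print(score)
```

In the toy example the intersections sum to 5 pixels and the denominator to 7 (two unions of 4 and 2 pixels plus 1 pixel of unmatched prediction), so the spurious detection lowers the score even though it overlaps nothing.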

Dice coefficient measures the overlapping degree between the two regions and is given by

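$$\mathrm{Dice} = \frac{2\,|G \cap P|}{|G| + |P|} \tag{6}$$

where G and P here denote the ground-truth region and the predicted region, respectively.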
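For binary masks stored as sets of foreground pixels, the Dice coefficient reduces to a few lines. The sketch below uses a hypothetical function name and toy data, and it returns 1.0 when both masks are empty, which is a convention chosen here rather than one stated in the text.

```python
def dice(gt_pixels, pred_pixels):
    """Dice coefficient between two foreground regions given as sets of (row, col) pixels."""
    total = len(gt_pixels) + len(pred_pixels)
    if total == 0:
        return 1.0  # convention chosen for this sketch: two empty masks agree perfectly
    return 2 * len(gt_pixels & pred_pixels) / total

gt_mask = {(0, 0), (0, 1), (1, 0), (1, 1)}
pred_mask = {(0, 0), (0, 1), (1, 0)}
dice_score = dice(gt_mask, pred_mask)
print(dice_score)
```

Here three of the four ground-truth pixels are recovered with no false positives, giving 2 · 3 / (4 + 3) = 6/7.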

The AJI score has the drawback that it may overpenalize overlapping regions. To avoid this problem, PQ [48], which has been widely adopted in panoptic segmentation tasks and was introduced to nucleus segmentation in [31], is also used to evaluate nuclei segmentation performance. It is defined as follows:

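$$\mathrm{PQ} = \frac{|TP|}{|TP| + \frac{1}{2}|FP| + \frac{1}{2}|FN|} \times \frac{\sum_{(p, g) \in TP} \mathrm{IoU}(p, g)}{|TP|} \tag{7}$$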

where p and g denote a predicted segment and a ground-truth segment, respectively, at the instance level. IoU represents the intersection over union. A (p, g) pair with IoU > 0.5 is regarded as a unique match. True Positives (TP), False Positives (FP), and False Negatives (FN) represent matched pairs of segments, unmatched predicted segments, and unmatched ground-truth segments, respectively.
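The PQ computation can be sketched in the same set-based representation. This is an illustrative implementation under stated assumptions: instances within the ground truth, and within the prediction, do not overlap each other, so a pair with IoU above 0.5 is matched greedily without ambiguity. The function name and toy data are hypothetical.

```python
def panoptic_quality(gt_instances, pred_instances, iou_threshold=0.5):
    """Panoptic quality for lists of instances given as sets of (row, col) pixels."""
    matched_pred, tp, iou_sum = set(), 0, 0.0
    for g in gt_instances:
        for k, p in enumerate(pred_instances):
            if k in matched_pred:
                continue
            union = len(g | p)
            iou = len(g & p) / union if union else 0.0
            if iou > iou_threshold:      # counted as a true positive pair
                matched_pred.add(k)
                tp += 1
                iou_sum += iou
                break
    fp = len(pred_instances) - len(matched_pred)   # unmatched predictions
    fn = len(gt_instances) - tp                    # unmatched ground-truth segments
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0

gt = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6)}]
pred = [{(0, 0), (0, 1), (1, 0)}, {(5, 5), (5, 6)}, {(9, 9)}]
pq = panoptic_quality(gt, pred)
print(pq)
```

The function returns the product of the two factors in Equation (7) in simplified form, since the |TP| terms cancel: the summed IoU of matched pairs divided by |TP| + ½|FP| + ½|FN|. In the toy example this is (0.75 + 1.0) / (2 + 0.5).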

4. Experiments

4.1. Dataset

In this work, we obtained a group of clinical cervical cell images based on the LBC test from the 2nd affiliated hospital of Harbin Medical University and Harbin precise yuan test company. We use an automatic pathology scanner to acquire images, as shown in Figure 5. The master control center is a computer equipped with 8 GB of memory and a 3.3 GHz i5-4590 CPU, and the image acquisition module is shown in Figure 5(b). The industrial camera is a CMOS camera with an image acquisition resolution of 2048 × 2048 pixels, an acquisition frame rate of 50 frames per second, and 256 gray levels. The objective lens magnification of the microscope is 20x. The motorized stage can be set to automatic or manual control mode. In automatic control mode, the system moves the stage, focuses, and triggers the camera to capture images. Image capture begins at the center of the slide and proceeds automatically in a serpentine pattern. Adjacent images overlap by 30 pixels, and each slide can yield over 400 images. In manual control mode, positioning and image capture are performed by operating the manual control bar.

In this paper, we randomly selected 265 clinical cervical cell images from different slides and coarsely segmented the region of the cell clusters, yielding 2363 clustered cell images with different sizes ranging from 150 to 500 pixels. Then, we randomly selected 568 of them as the test set. Finally, we named this dataset ClusteredCell.

Since this dataset comprises cell cluster images, many cases are challenging to handle, as shown in Figure 1. Therefore, we employ curriculum learning [49] to exploit the differences in training difficulty between image cases. Curriculum learning is a training strategy that takes sample difficulty into account: the model learns progressively from easy-to-train data to difficult-to-train data. In this paper, we divided the training set images into three categories (simple, normal, and difficult). The classification criteria are presented as follows, and the grouping results are shown in Figure 6:

(1) Simple. Nucleus had high contrast with cytoplasm, apparent nucleus, and greater distance between each nucleus
(2) Normal. Nucleus had low contrast with cytoplasm, the nucleus is faintly visible and pale-colored, multinuclear, or neutrophil impurities; nucleoli or nuclear groove is prominent; cytoplasm is dark; cytoplasm is vacuolar
(3) Difficult. There is a significant overlap between the nuclei of most of the cells, with the peripheral nuclei faintly visible and the interiors dark in color
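One possible way to turn the three difficulty groups into an easy-to-hard schedule is to grow the training pool stage by stage. The text does not specify the schedule actually used, so the cumulative scheme below, the function name, and the placeholder image identifiers are all assumptions; only the group sizes (200, 1500, and 95) come from Table 1.

```python
import random

DIFFICULTY_ORDER = ("simple", "normal", "difficult")

def curriculum_stages(samples, seed=0):
    """Build cumulative training pools from (image_id, difficulty) pairs, easiest group first."""
    rng = random.Random(seed)
    pool, stages = [], []
    for level in DIFFICULTY_ORDER:
        pool = pool + [s for s in samples if s[1] == level]
        stage = list(pool)
        rng.shuffle(stage)               # shuffle within the stage, not across stages
        stages.append((level, stage))
    return stages

# Placeholder identifiers; group sizes follow the training-set counts in Table 1.
samples = ([(f"simple_{i}", "simple") for i in range(200)]
           + [(f"normal_{i}", "normal") for i in range(1500)]
           + [(f"difficult_{i}", "difficult") for i in range(95)])
stages = curriculum_stages(samples)
sizes = [len(stage) for _, stage in stages]
print(sizes)
```

In this scheme the model would first train on the 200 simple images, then on simple plus normal images, and finally on all 1795 training images, so earlier material is never dropped as harder cases are introduced.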

The details of our dataset are shown in Table 1.

Table 1. The details of the ClusteredCell dataset.

Category	Original image number	Clustered cell image	Group	Number
The training set	265	1795	Simple	200
			Normal	1500
			Difficult	95
The test set		568		568

To prove the efficacy of the proposed algorithm, we selected three nuclei segmentation datasets for comparison. The three datasets used in this paper are described as follows:

(1) MoNuSeg [30]. This dataset contains 30 images of size 1000 × 1000 cropped from whole slide images of seven different organs. We use the same image split as the existing methods [46] (16 for training and 14 for testing, each image is 224 × 224 pixels)
(2) CoNSeP [31]. This dataset contains 41 H&E stained images with 1000 × 1000 pixels at 40x magnification extracted from 16 CRA WSIs. The CoNSeP dataset is split into a train set (n = 27) and a test set (n = 14) as in the original work [31], and each image is cropped into 256 × 256 pixels in the experiment
(3) CPM-17 [32]. This dataset contains 40 pathological images with pixel-level annotations, of which 32 are in the training set and eight are in the test set. Each image, scanned at 40x magnification, has 500 × 500 pixels. In addition, all images in the train set and test set are also cropped into the size 256 × 256

4.2. Implementation Details

GCP-Net was implemented and trained on an NVIDIA GeForce RTX 2080Ti GPU and an Intel Core i7-7700 3.60 GHz CPU using the PyTorch 1.8 framework. We trained the model for 100 epochs using the Adam optimizer, and the learning rate for all experiments was 2e-4. The loss function is a combination of binary cross-entropy [50] and dice loss [51]. All images fed into the networks were resized to 448 × 448. The data augmentation strategies used in the training and testing phases are the same as in [34]. In the training phase, each image in the original dataset is augmented into eight images using horizontal, vertical, and diagonal flips, random shifting, scaling from 90% to 110%, and color jittering in HSV color space. In the testing phase, as in [34, 52, 53], a test-time augmentation strategy is also adopted: each test image is predicted eight times, and the predictions are averaged to obtain the final prediction mask. All baseline methods use the same strategy during the training and testing phases.
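The combined loss can be illustrated on flat lists of per-pixel probabilities. This is a sketch under assumptions: the two terms are weighted equally, the dice term uses a smoothing constant of 1, and probabilities are clipped before taking logarithms. None of these details is stated in the text, and the function name is hypothetical.

```python
import math

def bce_dice_loss(probs, targets, eps=1e-7, smooth=1.0):
    """Binary cross-entropy plus soft dice loss over flat per-pixel lists."""
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]   # avoid log(0)
    bce = -sum(t * math.log(p) + (1 - t) * math.log(1.0 - p)
               for p, t in zip(clipped, targets)) / len(targets)
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)
    return bce + (1.0 - dice)   # equal weighting of the two terms (assumption)

targets = [1, 0, 1, 0]
perfect = bce_dice_loss([1.0, 0.0, 1.0, 0.0], targets)
uncertain = bce_dice_loss([0.5, 0.5, 0.5, 0.5], targets)
print(perfect, uncertain)
```

The cross-entropy term penalizes each pixel independently, while the dice term scores the overlap of the whole mask, which helps when nucleus pixels are much rarer than background pixels.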

4.3. Ablation Study

(1) U-Net. Basic network
(2) Backbone. In the proposed method, we replace the encoder block of U-Net with a pretrained ResNet-34, as shown in Figure 2. We define this modified U-Net with pretrained ResNet-34 as the backbone
(3) Backbone + Decoder Block. We replace the original decoder layer with the proposed decoder block
(4) Backbone + GCP. We integrate the GCP module in the backbone
(5) Backbone + Decoder Block + GCP. This is the final GCP-Net architecture, in which the GCP module and the decoder block are used in combination

Table 2 lists the ablation results of these five configurations performed on our ClusteredCell and two public datasets. Below, we conduct a detailed analysis of different model architectural settings and verify them through the five network configurations.

Table 2. Detailed ablation study of the GCP-Net architecture.

Model	ClusteredCell (ours)			MoNuSeg			CoNSeP
Model	AJI	Dice	PQ	AJI	Dice	PQ	AJI	Dice	PQ
U-Net	0.639	0.827	0.624	0.526	0.780	0.494	0.485	0.741	0.408
Backbone	0.667	0.881	0.672	0.642	0.827	0.594	0.461	0.797	0.448
Backbone + Decoder Block	0.674	0.880	0.680	0.647	0.828	0.598	0.488	0.777	0.467
Backbone +GCP	0.680	0.881	0.681	0.646	0.831	0.597	0.487	0.790	0.452
Backbone + Decoder Block +GCP (proposed)	0.683	0.880	0.687	0.651	0.830	0.601	0.586	0.835	0.563

4.3.1. Effectiveness of Pretrained ResNet-34

Fine-tuning from the pretrained ResNet-34 backbone puts our network in a good initial state, so it can quickly adapt to new modalities of medical images using a relatively small amount of training data. Table 2 shows the performance of the modified U-Net with pretrained ResNet-34 as a backbone. We find that although the pretrained ResNet-34 introduces almost no additional parameters or computation, the gain in segmentation performance is still very noticeable. On the ClusteredCell dataset, there are absolute increases of 2.8%, 5.4%, and 4.8% in AJI, Dice, and PQ, respectively. On the MoNuSeg dataset, there are absolute increases of 11.6%, 4.7%, and 10% in AJI, Dice, and PQ, respectively. On the CoNSeP dataset, despite a 2.4% decline in AJI, Dice and PQ increase by 5.6% and 4%, respectively.
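The gains quoted above are absolute differences between the U-Net and Backbone rows of Table 2, expressed in percentage points. The short sketch below recomputes them from the table values; the dictionary layout and function name are illustrative choices.

```python
# (AJI, Dice, PQ) values copied from the U-Net and Backbone rows of Table 2.
table2 = {
    "ClusteredCell": {"U-Net": (0.639, 0.827, 0.624), "Backbone": (0.667, 0.881, 0.672)},
    "MoNuSeg": {"U-Net": (0.526, 0.780, 0.494), "Backbone": (0.642, 0.827, 0.594)},
    "CoNSeP": {"U-Net": (0.485, 0.741, 0.408), "Backbone": (0.461, 0.797, 0.448)},
}

def gains_in_points(baseline, model):
    """Absolute metric differences in percentage points, rounded to one decimal."""
    return tuple(round((m - b) * 100, 1) for b, m in zip(baseline, model))

gains = {name: gains_in_points(rows["U-Net"], rows["Backbone"]) for name, rows in table2.items()}
for name, (aji_gain, dice_gain, pq_gain) in gains.items():
    print(f"{name}: AJI {aji_gain:+}, Dice {dice_gain:+}, PQ {pq_gain:+}")
```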

4.3.2. Decoder Block’s Effectiveness

By replacing the original decoder layers with the decoder block in the backbone, the network can build long-range dependencies and global context connections in the decoder. As shown in Table 2, the decoder block already achieves better performance than the backbone on the three compared datasets, with improvements of 0.7%, 0.5%, and 2.7% in terms of AJI score and 0.6%, 0.4%, and 1.9% in terms of PQ score, respectively. The result suggests that the decoder block has better learning and generalization ability than the original decoder layers. Therefore, the decoder block design based on the GCA-Residual Block can effectively improve the segmentation performance.

4.3.3. Effectiveness of GCP Module

The Multiscale CG Residual Block in the GCP module adds three multiscale context gating branches and fuses the multiscale feature information through a residual operation. The Global Context Attention Block reweights feature information accordingly to create a more accurate feature map. The Multikernel Maxpooling Residual Block encodes global information and changes the way features are combined. It can be observed in Table 2 that Backbone + GCP achieves AJI improvements of 1.3%, 0.4%, and 2.6% and PQ improvements of 0.9%, 0.3%, and 0.4% on ClusteredCell, MoNuSeg, and CoNSeP compared to Backbone, as well as a Dice improvement of 0.4% on MoNuSeg. That means the GCP module yields a more effective fusion of the feature representations from the multiscale branches and helps achieve better segmentation performance.

4.3.4. Effectiveness of Decoder Block and GCP Module Combination

The proposed GCP-Net architecture combines the decoder block and the GCP module. As Table 2 shows, GCP-Net improves AJI, Dice, and PQ and obtains higher results than Backbone, Backbone + Decoder Block, and Backbone + GCP.

4.4. Attention Module Comparison and Selection

In this paper, both the GCP module and the decoder block use an attention module to assign different weights to the feature maps. To select the attention module, we experimented with five state-of-the-art attention modules in GCP-Net in turn: Shuffle Attention [54], ECA Attention [55], CBAM Attention [56], SE Attention [57], and Global Context Attention [44]. The performance of each module is presented in Table 3. The experimental results show that different attention modules lead to different results, although the differences are small. Comparing the three metrics on the two datasets, Global Context Attention performs best overall: it has the highest score in every column except PQ on MoNuSeg, where Shuffle Attention is marginally higher (0.603 versus 0.601).

Table 3. Results of using different attention modules.

Attention model	ClusteredCell (ours) AJI	ClusteredCell (ours) Dice	ClusteredCell (ours) PQ	MoNuSeg AJI	MoNuSeg Dice	MoNuSeg PQ
Shuffle Attention [54]	0.664	0.877	0.669	0.648	0.828	0.603
ECA Attention [55]	0.667	0.877	0.672	0.635	0.827	0.597
CBAM Attention [56]	0.671	0.879	0.677	0.638	0.825	0.599
SE Attention [57]	0.681	0.878	0.684	0.640	0.824	0.598
Global Context Attention [44]	0.684	0.880	0.688	0.651	0.830	0.601
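
The comparison in Table 3 can be checked mechanically. The following is a minimal sketch, not part of the original study: it copies the values of Table 3 into a dictionary and counts how often each attention module attains the highest score in a column. The names `TABLE3`, `best_per_column`, and `count_wins` are illustrative assumptions.

```python
# Scores copied from Table 3. Column order: ClusteredCell AJI, Dice, PQ,
# then MoNuSeg AJI, Dice, PQ. All names below are illustrative.
COLUMNS = ["ClusteredCell AJI", "ClusteredCell Dice", "ClusteredCell PQ",
           "MoNuSeg AJI", "MoNuSeg Dice", "MoNuSeg PQ"]
TABLE3 = {
    "Shuffle Attention":        [0.664, 0.877, 0.669, 0.648, 0.828, 0.603],
    "ECA Attention":            [0.667, 0.877, 0.672, 0.635, 0.827, 0.597],
    "CBAM Attention":           [0.671, 0.879, 0.677, 0.638, 0.825, 0.599],
    "SE Attention":             [0.681, 0.878, 0.684, 0.640, 0.824, 0.598],
    "Global Context Attention": [0.684, 0.880, 0.688, 0.651, 0.830, 0.601],
}


def best_per_column(table, columns):
    """Map each column to (best module, best score, max-min spread)."""
    result = {}
    for i, col in enumerate(columns):
        scores = {name: row[i] for name, row in table.items()}
        best = max(scores, key=scores.get)
        spread = round(max(scores.values()) - min(scores.values()), 3)
        result[col] = (best, scores[best], spread)
    return result


def count_wins(best):
    """Count how many columns each module wins."""
    wins = {}
    for module, _, _ in best.values():
        wins[module] = wins.get(module, 0) + 1
    return wins


best = best_per_column(TABLE3, COLUMNS)
wins = count_wins(best)
for col, (module, score, spread) in best.items():
    print(f"{col:19s} best={module} ({score:.3f}), spread={spread:.3f}")
print(wins)
```

With these values, Global Context Attention is highest in five of the six columns, and the largest gap between the best and worst module in any column is 0.020 (AJI on ClusteredCell), which agrees with the observation that the differences are small.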

4.5. Experiment Results

To evaluate the proposed model, we compared it with recent segmentation approaches: methods used in computer vision (U-Net [33], UNet++ [58], Attention U-Net [59]) and medical imaging (CE-Net [34]), as well as methods specifically tuned for nuclear segmentation (Hover-Net [31], CIA-Net [36], Triple U-Net [35]). Below, we present quantitative comparison results on four different biomedical imaging datasets.

4.5.1. Results on ClusteredCell Dataset

ClusteredCell is a private cervical cell nuclei segmentation dataset described in detail in Section 4.1. A comparison with seven widely accepted segmentation methods with different backbones (see Table 4) shows that the proposed method performs better than the SOTA methods on the same train-test split.

Table 4. Quantitative comparison with existing SOTA methods.

Model	ClusteredCell (ours) AJI	ClusteredCell (ours) Dice	ClusteredCell (ours) PQ	MoNuSeg AJI	MoNuSeg Dice	MoNuSeg PQ	CoNSeP AJI	CoNSeP Dice	CoNSeP PQ	CPM-17 AJI	CPM-17 Dice	CPM-17 PQ
U-Net [33]	0.638	0.827	0.624	0.526	0.781	0.494	0.485	0.741	0.408	0.556	0.804	0.506
UNet++ [58]	0.654	0.858	0.646	0.620	0.814	0.568	0.556	0.828	0.536	0.649	0.852	0.608
Attention U-Net [59]	0.639	0.847	0.634	0.553	0.800	0.507	0.546	0.827	0.540	0.634	0.846	0.607
CE-Net [34]	0.669	0.878	0.671	0.538	0.801	0.503	0.489	0.754	0.439	0.647	0.871	0.619
CIA-Net [36]	0.672	0.869	0.653	0.623	0.815	0.578	—	—	—	—	—	—
Triple U-Net [35]	0.678	0.837	0.608	0.622	0.834	0.601	0.574	0.839	0.566	0.711	0.856	0.659
Hover-Net [31]	0.670	0.831	0.675	0.619	0.825	0.599	0.574	0.848	0.538	0.705	0.856	0.661
GCP-Net (proposed)	0.684	0.880	0.688	0.651	0.830	0.601	0.586	0.835	0.563	0.719	0.892	0.671
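
To make the margins in Table 4 explicit, the sketch below computes, for every dataset and metric, the difference between GCP-Net and the best competing method. It is an illustrative reading of the table and not the authors' evaluation code. The helper name `margins_in_thousandths` is an assumption, and the entries missing for CIA-Net are stored as `None` and skipped.

```python
# Values copied from Table 4; None marks entries not reported for CIA-Net.
DATASETS = ["ClusteredCell", "MoNuSeg", "CoNSeP", "CPM-17"]
METRICS = ["AJI", "Dice", "PQ"]
COLUMNS = [f"{d} {m}" for d in DATASETS for m in METRICS]
NA = None
TABLE4 = {
    "U-Net":           [0.638, 0.827, 0.624, 0.526, 0.781, 0.494, 0.485, 0.741, 0.408, 0.556, 0.804, 0.506],
    "UNet++":          [0.654, 0.858, 0.646, 0.620, 0.814, 0.568, 0.556, 0.828, 0.536, 0.649, 0.852, 0.608],
    "Attention U-Net": [0.639, 0.847, 0.634, 0.553, 0.800, 0.507, 0.546, 0.827, 0.540, 0.634, 0.846, 0.607],
    "CE-Net":          [0.669, 0.878, 0.671, 0.538, 0.801, 0.503, 0.489, 0.754, 0.439, 0.647, 0.871, 0.619],
    "CIA-Net":         [0.672, 0.869, 0.653, 0.623, 0.815, 0.578, NA, NA, NA, NA, NA, NA],
    "Triple U-Net":    [0.678, 0.837, 0.608, 0.622, 0.834, 0.601, 0.574, 0.839, 0.566, 0.711, 0.856, 0.659],
    "Hover-Net":       [0.670, 0.831, 0.675, 0.619, 0.825, 0.599, 0.574, 0.848, 0.538, 0.705, 0.856, 0.661],
    "GCP-Net":         [0.684, 0.880, 0.688, 0.651, 0.830, 0.601, 0.586, 0.835, 0.563, 0.719, 0.892, 0.671],
}
PROPOSED = "GCP-Net"


def margins_in_thousandths(table, proposed, columns):
    """Map each column to (best rival, proposed minus rival, in 0.001 units)."""
    out = {}
    for i, col in enumerate(columns):
        rivals = {name: row[i] for name, row in table.items()
                  if name != proposed and row[i] is not None}
        rival = max(rivals, key=rivals.get)
        # Rounding to integer thousandths avoids floating-point noise.
        out[col] = (rival, round((table[proposed][i] - rivals[rival]) * 1000))
    return out


margins = margins_in_thousandths(TABLE4, PROPOSED, COLUMNS)
ahead = [c for c, (_, m) in margins.items() if m > 0]
tied = [c for c, (_, m) in margins.items() if m == 0]
behind = [c for c, (_, m) in margins.items() if m < 0]
for col, (rival, m) in margins.items():
    print(f"{col:20s} best rival={rival:15s} margin={m / 1000:+.3f}")
print(len(ahead), tied, behind)
```

On the values in Table 4, GCP-Net is ahead of every competitor in 8 of the 12 columns, ties with Triple U-Net in PQ on MoNuSeg, and trails in Dice on MoNuSeg, Dice on CoNSeP, and PQ on CoNSeP.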

We also show three sample results in Figure 7 to compare our method visually with the other methods. The samples in Figure 7 cover a simple, a normal, and a difficult case. For the simple image in the first row, every method produces a segmentation similar to the ground truth. The second and third rows reveal the differences between the methods and show that our method achieves the best segmentation results.

4.5.2. Results on MoNuSeg, CoNSeP, and CPM-17 Datasets

We evaluated our method with a completely independent comparison across the three largest known exhaustively labeled nucleus segmentation datasets, MoNuSeg, CoNSeP, and CPM-17, using the metrics described in Section 4.1. The results are reported in Table 4 and show that the proposed method can successfully deal with unprocessed data in the three public datasets. Some methods, however, perform poorly on unseen data; in particular, U-Net performs worse than the other competing methods on all three datasets. Triple U-Net and Hover-Net achieve competitive performance in all three generalization tests. In particular, Triple U-Net has proven to detect nuclear pixels successfully: it scores higher than GCP-Net in Dice on the MoNuSeg dataset and in PQ on the CoNSeP dataset. However, the overall segmentation result of GCP-Net is superior (as shown in Figure 8) because the context-aware modules introduced in the network’s feature extractor and decoder let it better analyze the image context information and thus better separate the cell nuclei.
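
The difference between detecting nuclear pixels and separating nuclei can be illustrated with a toy example. The sketch below is an assumption-laden illustration rather than the evaluation code used in this paper: the masks are hypothetical, Dice is computed on the foreground pixels, and PQ follows the standard definition of [48], in which a predicted and a ground-truth instance match when their IoU exceeds 0.5.

```python
def rect(rows, cols):
    """Pixel set of a rectangle given its row and column ranges."""
    return {(r, c) for r in rows for c in cols}


def union(instances):
    """Foreground mask: all pixels covered by any instance."""
    return set().union(*instances)


def dice(pred_fg, gt_fg):
    """Pixel-level Dice coefficient between two foreground masks."""
    return 2 * len(pred_fg & gt_fg) / (len(pred_fg) + len(gt_fg))


def iou(a, b):
    return len(a & b) / len(a | b)


def panoptic_quality(pred_instances, gt_instances, threshold=0.5):
    """PQ = sum of matched IoUs / (TP + 0.5 * FP + 0.5 * FN)."""
    matched_ious, used = [], set()
    for gt in gt_instances:
        for j, pred in enumerate(pred_instances):
            # With a threshold of 0.5 each instance has at most one match.
            if j not in used and iou(gt, pred) > threshold:
                matched_ious.append(iou(gt, pred))
                used.add(j)
                break
    tp = len(matched_ious)
    fp = len(pred_instances) - tp
    fn = len(gt_instances) - tp
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(matched_ious) / denom if denom else 0.0


# Hypothetical ground truth: two touching 4x4 nuclei.
gt = [rect(range(4), range(0, 4)), rect(range(4), range(4, 8))]
# Prediction A covers every nuclear pixel but merges both nuclei.
merged = [rect(range(4), range(0, 8))]
# Prediction B misses some pixels but keeps the nuclei apart.
separated = [rect(range(4), range(0, 3)), rect(range(4), range(5, 8))]

for name, pred in [("merged", merged), ("separated", separated)]:
    d = dice(union(pred), union(gt))
    pq = panoptic_quality(pred, gt)
    print(f"{name:9s} Dice={d:.3f} PQ={pq:.3f}")
```

In this toy case the merged prediction reaches a Dice of 1.000 but a PQ of 0.000, whereas the separated prediction has a lower Dice of about 0.857 and a PQ of 0.750. This shows why a pixel-level metric alone does not indicate how well touching nuclei are separated.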

Figure 9 presents box plots [60], which summarize the overall shape of a data set. The central box spans the data between the quartiles, a black line represents the average, and the “whiskers” extend to the extremes of the data.

A box plot displays the variation of the segmentation results as a statistical distribution. Figure 9 shows the per-image performance of the different models on the test sets of the ClusteredCell, MoNuSeg, CoNSeP, and CPM-17 datasets, respectively. Performance varies widely between methods within each dataset, especially on CoNSeP, which contains a large number of overlapping nuclei. The proposed method outperforms the other methods, which validates the feasibility of applying GCP-Net to different datasets.
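
The quantities that such a box plot draws can be computed directly from per-image scores. The following is a minimal sketch using Python's standard `statistics` module. The scores are hypothetical and are not taken from Figure 9, and the function name `box_plot_summary` is an assumption.

```python
import statistics


def box_plot_summary(scores):
    """Quartiles, extremes, and mean of per-image scores for one model."""
    # Inclusive quartiles treat the sample as covering its full range.
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),   # lower whisker (extreme of the data)
        "q1": q1,             # lower edge of the box
        "median": median,
        "q3": q3,             # upper edge of the box
        "max": max(scores),   # upper whisker (extreme of the data)
        "mean": statistics.fmean(scores),
    }


# Hypothetical per-image scores of one model on one test set.
scores = [0.52, 0.58, 0.61, 0.63, 0.66, 0.68, 0.70, 0.71, 0.74]
summary = box_plot_summary(scores)
for key, value in summary.items():
    print(f"{key:6s} {value:.3f}")
```

A narrow box and short whiskers indicate that a model behaves consistently across test images, while a long lower whisker points to individual images on which it fails.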

5. Conclusions

Accurate segmentation of cell nuclei is an essential step in diagnosis and analysis. Segmentation of clustered cell nuclei in LBC testing has become a challenge in biology and medicine. In this paper, we propose GCP-Net, a deep learning network designed to handle challenging images of clustered cervical cells. The proposed U-Net-based GCP-Net consists of a pretrained ResNet-34 model as the encoder, a GCP module, and a modified decoder. The GCP module is the primary building block of the network and improves the quality of feature learning. It allows GCP-Net to refine the details of feature maps by leveraging multiscale context gating and Global Context Attention for spatial and texture dependencies. The decoder block includes a GCA-Residual Block that helps build long-range dependencies and global context interaction in the decoder to refine the predicted masks. We used ablation experiments to examine the effectiveness of the GCP module and the decoder block, and we conducted extensive comparative experiments with seven existing models on our ClusteredCell dataset and three typical medical image datasets. The experimental results showed that GCP-Net obtained promising results on the three evaluation metrics AJI, Dice, and PQ, demonstrating the superiority and generalizability of GCP-Net for automatic medical image segmentation in comparison with several SOTA baselines. Although we obtained considerable accuracy in our experiments, the method can only serve as AI-assisted cytological screening in actual clinical diagnosis. It helps with primary cytological screening or triage, and challenging cases still require physician confirmation. Further research is therefore necessary. In the future, we will use contrastive learning methods to improve the performance of GCP-Net on more challenging biomedical images.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the Heilongjiang Provincial Key Laboratory of Complex Intelligent System and Integration and by the Natural Science Fund Project of Heilongjiang Province of China under Grant F201222.

Open Research

Data Availability

The dataset is being compiled for publication.

References

1 Bray F., Ferlay J., Soerjomataram I., Siegel R. L., Torre L. A., and Jemal A., Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries, CA: a Cancer Journal for Clinicians. (2018) 68, no. 6, 394–424, https://doi.org/10.3322/caac.21492, 2-s2.0-85053395052, 30207593.
2 Arbyn M., Weiderpass E., Bruni L., de Sanjosé S., Saraiya M., Ferlay J., and Bray F., Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis, The Lancet Global Health. (2020) 8, no. 2, e191–e203, https://doi.org/10.1016/S2214-109X(19)30482-6, 31812369.
3 Davey E., Barratt A., Irwig L., Chan S. F., Macaskill P., Mannes P., and Saville A. M., Effect of study design and quality on unsatisfactory rates, cytology classifications, and accuracy in liquid-based versus conventional cervical cytology: a systematic review, Lancet. (2006) 367, no. 9505, 122–132, https://doi.org/10.1016/S0140-6736(06)67961-0, 2-s2.0-30444459961, 16413876.
4 Saslow D., Solomon D., Lawson H. W., Killackey M., Kulasingam S. L., Cain J. M., Garcia F. A., Moriarty A. T., Waxman A. G., Wilbur D. C., Wentzensen N., Downs LS Jr, Spitzer M., Moscicki A. B., Franco E. L., Stoler M. H., Schiffman M., Castle P. E., Myers E. R., Chelmow D., Herzig A., Kim J. J., Kinney W., Herschel W. L., and Waldman J., American cancer society, american society for colposcopy and cervical pathology, and american society for clinical pathology screening guidelines for the prevention and early detection of cervical cancer, Journal of Lower Genital Tract Disease. (2012) 16, no. 3, 175–204, https://doi.org/10.1097/LGT.0b013e31824ca9d5, 2-s2.0-84863589988, 22418039.
5 Bengtsson E., Recognizing Signs of Malignancy - the Quest for Computer Assisted Cancer Screening and Diagnosis Systems, 2010 IEEE International Conference on Computational Intelligence and Computing Research, 2010, Coimbatore, India, 1–6, https://doi.org/10.1109/ICCIC.2010.5705885, 2-s2.0-79951789211.
6 Zhang L., Kong H., Ting Chin C., Liu S., Fan X., Wang T., and Chen S., Automation-assisted cervical cancer screening in manual liquid-based cytology with hematoxylin and eosin staining, Cytom. Part A. (2014) 85, no. 3, 214–230, https://doi.org/10.1002/cyto.a.22407, 2-s2.0-84894026203, 24376056.
7 Kong H., Gurcan M., and Belkacem-Boussaid K., Partitioning histopathological images: an integrated framework for supervised color-texture segmentation and cell splitting, IEEE Transactions on Medical Imaging. (2011) 30, no. 9, 1661–1677, https://doi.org/10.1109/TMI.2011.2141674, 2-s2.0-80052292123, 21486712.
8 Arslan S., Ersahin T., Cetin-Atalay R., and Gunduz-Demir C., Attributed relational graphs for cell nucleus segmentation in fluorescence microscopy images, IEEE Transactions on Medical Imaging. (2013) 32, no. 6, 1121–1131, https://doi.org/10.1109/TMI.2013.2255309, 2-s2.0-84878533114, 23549886.
9 Moussavi F., Wang Y., Lorenzen P., Oakley J., Russakoff D., and Gould S., A unified graphical models framework for automated mitosis detection in human embryos, IEEE Transactions on Medical Imaging. (2014) 33, no. 7, 1551–1562, https://doi.org/10.1109/TMI.2014.2317836, 2-s2.0-84903757664, 24771573.
10 Zafari S., Eerola T., Sampo J., Kälviäinen H., and Haario H., Segmentation of overlapping elliptical objects in silhouette images, IEEE Transactions on Image Processing. (2015) 24, no. 12, 5942–5952, https://doi.org/10.1109/TIP.2015.2492828, 2-s2.0-84946925336, 26513788.
11 Nosrati M. S. and Hamarneh G., Segmentation of overlapping cervical cells: a variational method with star-shape prior, 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), 2015, Brooklyn, NY, USA, 186–189, https://doi.org/10.1109/ISBI.2015.7163846, 2-s2.0-84944317929.
12 Lu Z., Carneiro G., and Bradley A. P., An improved joint optimization of multiple level set functions for the segmentation of overlapping cervical cells, IEEE Transactions on Image Processing. (2015) 24, no. 4, 1261–1272, https://doi.org/10.1109/TIP.2015.2389619, 2-s2.0-84923856075.
13 Fernando T., Denman S., Sridharan S., and Fookes C., Tracking by prediction: a deep generative model for mutli-person localisation and tracking, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 2018, Lake Tahoe, NV, USA, 1122–1132, https://doi.org/10.1109/WACV.2018.00128, 2-s2.0-85047169976.
14 Phoulady H. A., Goldgof D. B., Hall L. O., and Mouton P. R., A new approach to detect and segment overlapping cells in multi-layer cervical cell volume images, 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), 2016, Prague, Czech Republic, 201–204, https://doi.org/10.1109/ISBI.2016.7493244, 2-s2.0-84978431934.
15 Song Y., Tan E. L., Jiang X., Cheng J. Z., Ni D., Chen S., Lei B., and Wang T., Accurate cervical cell segmentation from overlapping clumps in pap smear images, IEEE Transactions on Medical Imaging. (2017) 36, no. 1, 288–300, https://doi.org/10.1109/TMI.2016.2606380, 2-s2.0-85017836610, 27623573.
16 Tareef A., Song Y., Huang H., Feng D., Chen M., Wang Y., and Cai W., Multi-pass fast watershed for accurate segmentation of overlapping cervical cells, IEEE Transactions on Medical Imaging. (2018) 37, no. 9, 2044–2059, https://doi.org/10.1109/TMI.2018.2815013, 2-s2.0-85043467887, 29993863.
17 Zhang H., Zhu H., and Ling X., Polar coordinate sampling-based segmentation of overlapping cervical cells using attention U-Net and random walk, Neurocomputing. (2020) 383, 212–223, https://doi.org/10.1016/j.neucom.2019.12.036.
18 Umadi A., Nagarajan K., Venkatesha J. B., Ganesh A., and George K., Automated Segmentation of Overlapping Cells in Cervical Cytology Images Using Deep Learning, 2020 IEEE 17th India Council International Conference (INDICON), 2020, New Delhi, India, 1–7, https://doi.org/10.1109/INDICON49873.2020.9342328.
19 Liu Y., Zhang P., Song Q., Li A., Zhang P., and Gui Z., Automatic segmentation of cervical nuclei based on deep learning and a conditional random field, IEEE Access. (2018) 6, 53709–53721, https://doi.org/10.1109/ACCESS.2018.2871153, 2-s2.0-85053635331.
20 Song Y., Zhang L., Chen S., Ni D., Li B., Zhou Y., Lei B., and Wang T., A deep learning based framework for accurate segmentation of cervical cytoplasm and nuclei, 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2014, 2014, Chicago, IL, USA, 2903–2906, https://doi.org/10.1109/EMBC.2014.6944230, 2-s2.0-84929484211.
21 Song Y., Zhang L., Chen S., Ni D., Lei B., and Wang T., Accurate segmentation of cervical cytoplasm and nuclei based on multiscale convolutional network and graph partitioning, IEEE Transactions on Biomedical Engineering. (2015) 62, no. 10, 2421–2433, https://doi.org/10.1109/TBME.2015.2430895, 2-s2.0-84950238277, 25966470.
22 Yang X., Li H., and Zhou X., Nuclei segmentation using marker-controlled watershed, tracking using mean-shift, and Kalman filter in time-lapse microscopy, IEEE Transactions on Circuits and Systems I: Regular Papers. (2006) 53, no. 11, 2405–2414, https://doi.org/10.1109/TCSI.2006.884469, 2-s2.0-33751510756.
23 Lin C. H., Chan Y. K., and Chen C. C., Detection and segmentation of cervical cell cytoplast and nucleus, International Journal of Imaging Systems and Technology. (2009) 19, no. 3, 260–270, https://doi.org/10.1002/ima.20198, 2-s2.0-69749122836.
24 Zhao M., Wang H., Han Y., Wang X., Dai H. N., Sun X., Zhang J., and Pedersen M., SEENS: nuclei segmentation in Pap smear images with selective edge enhancement, Future Generation Computer Systems. (2021) 114, 185–194, https://doi.org/10.1016/j.future.2020.07.045.
25 Taneja A., Ranjan P., and Ujlayan A., Multi-cell nuclei segmentation in cervical cancer images by integrated feature vectors, Multimedia Tools and Applications. (2018) 77, no. 8, 9271–9290, https://doi.org/10.1007/s11042-017-4864-x, 2-s2.0-85020694604.
26 Huang J., Wang T., Zheng D., and He Y., Nucleus segmentation of cervical cytology images based on multi-scale fuzzy clustering algorithm, Bioengineered. (2020) 11, no. 1, 484–501, https://doi.org/10.1080/21655979.2020.1747834, 32279589.
27 Gautam S., Gupta K., Bhavsar A., and Sao A. K., Unsupervised segmentation of cervical cell nuclei via adaptive clustering, Communications in Computer and Information Science. (2017) 723, https://doi.org/10.1007/978-3-319-60964-5_71, 2-s2.0-85023165150.
28 Lee H., Han M., Yoo T., Jung C., Son H. J., and Cho M., Evaluation of nuclear chromatin using grayscale intensity and thresholded percentage area in liquid-based cervical cytology, Diagnostic Cytopathology. (2018) 46, no. 5, 384–389, https://doi.org/10.1002/dc.23906, 2-s2.0-85042181638, 29464913.
29 Desiani A., Erwin M., Suprihatin B., Yahdin S., Putri A. I., and Husein F. R., Bi-path architecture of CNN segmentation and classification method for cervical cancer disorders based on Pap-smear images, IAENG International Journal of Computer Science. (2021) 48, no. 3.
30 Kumar N., Verma R., Anand D., and Zhou Y., A dataset and a technique for generalized nuclear segmentation for computational pathology, IEEE Transactions on Medical Imaging. (2017) 36, no. 7, 1550–1560, https://doi.org/10.1109/TMI.2017.2677499, 2-s2.0-85028392498, 28287963.
31 Graham S., Vu Q. D., Raza S. E. A., Azam A., Tsang Y. W., Kwak J. T., and Rajpoot N., Hover-net: simultaneous segmentation and classification of nuclei in multi-tissue histology images, Medical Image Analysis. (2019) 58, https://doi.org/10.1016/j.media.2019.101563, 2-s2.0-85072568104, 31561183.
32 Vu Q. D., Graham S., Kurc T., To M. N. N., Shaban M., Qaiser T., Koohbanani N. A., Khurram S. A., Kalpathy-Cramer J., Zhao T., Gupta R., Kwak J. T., Rajpoot N., Saltz J., and Farahani K., Methods for segmentation and classification of digital microscopy tissue images, Frontiers in Bioengineering and Biotechnology. (2019) 7, https://doi.org/10.3389/fbioe.2019.00053, 2-s2.0-85064655258, 31001524.
33 Ronneberger O., Fischer P., and Brox T., U-net: convolutional networks for biomedical image segmentation, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015, 9351, 234–241, https://doi.org/10.1007/978-3-319-24574-4_28, 2-s2.0-84951834022.
34 Gu Z., Cheng J., Fu H., Zhou K., Hao H., Zhao Y., Zhang T., Gao S., and Liu J., CE-net: context encoder network for 2D medical image segmentation, IEEE Transactions on Medical Imaging. (2019) 38, no. 10, 2281–2292, https://doi.org/10.1109/TMI.2019.2903562, 2-s2.0-85067101771, 30843824.
35 Zhao B., Chen X., Li Z., Yu Z., Yao S., Yan L., Wang Y., Liu Z., Liang C., and Han C., Triple U-net: hematoxylin-aware nuclei segmentation with progressive dense feature aggregation, Medical Image Analysis. (2020) 65, https://doi.org/10.1016/j.media.2020.101786, 32712523.
36 Zhou Y., Onder O. F., Dou Q., Tsougenis E., Chen H., and Heng P. A., CIA-Net: Robust Nuclei Instance Segmentation with Contour-Aware Information Aggregation, 2019, 11492, Springer International Publishing.
37 Sun Y., Huang X., Zhou H., and Zhang Q., SRPN: similarity-based region proposal networks for nuclei and cells detection in histology images, Medical Image Analysis. (2021) 72, https://doi.org/10.1016/j.media.2021.102142, 34198042.
38 Chen S., Ding C., and Tao D., Boundary-assisted region proposal networks for nucleus segmentation, Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), 2020, 12265, LNCS, 279–288, https://doi.org/10.1007/978-3-030-59722-1_27.
39 Miech A., Laptev I., and Sivic J., Learnable pooling with context gating for video classification, (2017) http://arxiv.org/abs/1706.06905.
40 Chollet F., Xception: deep learning with depthwise separable convolutions, Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, 2017, Honolulu, HI, USA, 1800–1807, https://doi.org/10.1109/CVPR.2017.195, 2-s2.0-85040604274.
41 Chen L. C., Papandreou G., Kokkinos I., Murphy K., and Yuille A. L., Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs, IEEE Transactions on Pattern Analysis and Machine Intelligence. (2018) 40, no. 4, 834–848, https://doi.org/10.1109/TPAMI.2017.2699184, 2-s2.0-85042712042, 28463186.
42 Yang M., Yu K., Zhang C., Li Z., and Yang K., DenseASPP for semantic segmentation in street scenes, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Salt Lake City, UT, USA, 3684–3692, https://doi.org/10.1109/CVPR.2018.00388, 2-s2.0-85055111228.
43 Zheng S., Jayasumana S., Romera-Paredes B., Vineet V., Su Z., Du D., Huang C., and Torr P. H., Conditional random fields as recurrent neural networks, Proceedings of the IEEE international conference on computer vision (ICCV), 2015, Santiago, Chile, 1529–1537.
44 Cao Y., Xu J., Lin S., Wei F., and Hu H., GCNet: non-local networks meet squeeze-excitation networks and beyond, Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, Seoul, Korea (south), 1971–1980, https://doi.org/10.1109/ICCVW.2019.00246.
45 Tomar N. K., Jha D., Ali S., Johansen H. D., Johansen D., Riegler M. A., and Halvorsen P., DDANet: Dual Decoder Attention Network for Automatic Polyp Segmentation, Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021, 12668, Springer, Cham, Lecture Notes in Computer Science, https://doi.org/10.1007/978-3-030-68793-9_23.
46 Tuan T. A., Khoa N. T., Quan T. M., and Jeong W.-K., ColorRL: reinforced coloring for end-to-end instance segmentation, 2021, Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. 2021, 16727–16736.
47 Dice L. R., Measures of the amount of ecologic association between species, Ecology. (1945) 26, no. 3, https://doi.org/10.2307/1932409.
48 Kirillov A., He K., Girshick R., Rother C., and Dollar P., Panoptic segmentation, Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit, 2019, 9396–9405, https://doi.org/10.1109/CVPR.2019.00963.
49 Bengio Y., Louradour J., Collobert R., and Weston J., Curriculum learning, Proceedings of the 26th annual international conference on machine learning, June 2009, Montreal, Quebec, Canada, 41–48.
50 Crum W. R., Camara O., and Hill D. L. G., Generalized overlap measures for evaluation and validation in medical image analysis, IEEE Transactions on Medical Imaging. 25, no. 11, 1451–1461, https://doi.org/10.1109/TMI.2006.880587, 2-s2.0-33750843927, 17117774.
51 Milletari F., Navab N., and Ahmadi S. A., V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation, 2016 Fourth International Conference on 3D Vision (3DV), 2016, Stanford, CA, USA, 565–571, https://doi.org/10.1109/3DV.2016.79, 2-s2.0-85011298810.
52 Zhang Y., Qiu Z., Yao T., Liu D., and Mei T., Fully convolutional adaptation networks for semantic segmentation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Salt Lake City, USA, 6810–6818.
53 Dai J., Li Y., He K., and Sun J., R-fcn: Object detection via region-based fully convolutional networks, Advances in neural information processing systems. (2016) 29.
54 Zhang Q.-L. and Yang Y.-B., SA-Net: shuffle attention for deep convolutional neural networks, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, Toronto, ON, Canada, 2235–2239, https://doi.org/10.1109/icassp39728.2021.9414568.
55 Wang Q., Wu B., Zhu P., Li P., Zuo W., and Hu Q., ECA-net: efficient channel attention for deep convolutional neural networks, Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit, 2020, Seattle, WA, USA, 11531–11539, https://doi.org/10.1109/CVPR42600.2020.01155.
56 Woo S., Park J., Lee J. Y., and Kweon I. S., CBAM: convolutional block attention module, Lecture Notes in Computer Science, 2018, 11211, LNCS, Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, https://doi.org/10.1007/978-3-030-01234-2_1, 2-s2.0-85055111544.
57 Hu J., Shen L., and Sun G., Squeeze-and-excitation networks, Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit, 2018, Salt Lake City, UT, USA, 7132–7141, https://doi.org/10.1109/CVPR.2018.00745, 2-s2.0-85062854803.
58 Zhou Z., Rahman Siddiquee M. M., Tajbakhsh N., and Liang J., Unet++: a nested u-net architecture for medical image segmentation, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2018, 11045, LNCS, https://doi.org/10.1007/978-3-030-00889-5_1, 2-s2.0-85054551189.
59 Oktay O., Schlemper J., Folgoc L. L., Lee M., Heinrich M., Misawa K., Mori K., McDonagh S., Hammerla N. Y., Kainz B., and Glocker B., Attention U-Net: learning where to look for the pancreas, (2018), https://arxiv.org/abs/1804.03999.
60 Kafadar K., Koehler J. R., Venables W. N., and Ripley B. D., Modern Applied Statistics with S-Plus, The American Statistician. (1999) 53, no. 1, https://doi.org/10.2307/2685660.
