ORIGINAL ARTICLE

Open Access

Effects of artificial intelligence assistance on endoscopist performance: Comparison of diagnostic performance in superficial esophageal squamous cell carcinoma detection using video-based models

Naoki Aoyama

orcid.org/0000-0003-3978-2778

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

Department of Gastroenterology and Hepatology, Kyoto University Graduate School of Medicine, Kyoto, Japan

Contribution: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Validation, Visualization, Writing - original draft

Search for more papers by this author

Keiichiro Nakajo,

Corresponding Author

Keiichiro Nakajo

[email protected]

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

NEXT Medical Device Innovation Center, National Cancer Center Hospital East, Chiba, Japan

Correspondence

Keiichiro Nakajo, Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa, Chiba 277-8577, Japan.

Email: [email protected]

Contribution: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Writing - review & editing

Search for more papers by this author

Maasa Sasabe,

Maasa Sasabe

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

Division of Endoscopy, Saitama Cancer Center, Saitama, Japan

Contribution: Data curation, Investigation, Methodology, Resources, Writing - review & editing

Search for more papers by this author

Atsushi Inaba,

Atsushi Inaba

orcid.org/0000-0001-7036-9485

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

Contribution: Data curation, Investigation, Methodology, Resources, Writing - review & editing

Search for more papers by this author

Yuki Nakanishi,

Yuki Nakanishi

Department of Gastroenterology and Hepatology, Kyoto University Graduate School of Medicine, Kyoto, Japan

Contribution: Project administration, Supervision, Writing - review & editing

Search for more papers by this author

Hiroshi Seno,

Hiroshi Seno

Department of Gastroenterology and Hepatology, Kyoto University Graduate School of Medicine, Kyoto, Japan

Contribution: Project administration, Supervision, Writing - review & editing

Search for more papers by this author

Tomonori Yano,

Tomonori Yano

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

NEXT Medical Device Innovation Center, National Cancer Center Hospital East, Chiba, Japan

Contribution: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing - review & editing

Search for more papers by this author

Naoki Aoyama,

Naoki Aoyama

orcid.org/0000-0003-3978-2778

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

Department of Gastroenterology and Hepatology, Kyoto University Graduate School of Medicine, Kyoto, Japan

Contribution: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Validation, Visualization, Writing - original draft

Search for more papers by this author

Keiichiro Nakajo,

Corresponding Author

Keiichiro Nakajo

[email protected]

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

NEXT Medical Device Innovation Center, National Cancer Center Hospital East, Chiba, Japan

Correspondence

Keiichiro Nakajo, Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa, Chiba 277-8577, Japan.

Email: [email protected]

Contribution: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Writing - review & editing

Search for more papers by this author

Maasa Sasabe,

Maasa Sasabe

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

Division of Endoscopy, Saitama Cancer Center, Saitama, Japan

Contribution: Data curation, Investigation, Methodology, Resources, Writing - review & editing

Search for more papers by this author

Atsushi Inaba,

Atsushi Inaba

orcid.org/0000-0001-7036-9485

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

Contribution: Data curation, Investigation, Methodology, Resources, Writing - review & editing

Search for more papers by this author

Yuki Nakanishi,

Yuki Nakanishi

Department of Gastroenterology and Hepatology, Kyoto University Graduate School of Medicine, Kyoto, Japan

Contribution: Project administration, Supervision, Writing - review & editing

Search for more papers by this author

Hiroshi Seno,

Hiroshi Seno

Department of Gastroenterology and Hepatology, Kyoto University Graduate School of Medicine, Kyoto, Japan

Contribution: Project administration, Supervision, Writing - review & editing

Search for more papers by this author

Tomonori Yano,

Tomonori Yano

Department of Gastroenterology and Endoscopy, National Cancer Center Hospital East, Chiba, Japan

NEXT Medical Device Innovation Center, National Cancer Center Hospital East, Chiba, Japan

Contribution: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing - review & editing

Search for more papers by this author

First published: 02 May 2025

https://doi.org/10.1002/deo2.70083

Share a link

Email
Wechat
Bluesky

Abstract

Objectives

Superficial esophageal squamous cell carcinoma (ESCC) detection is crucial. Although narrow-band imaging improves detection, its effectiveness is diminished by inexperienced endoscopists. The effects of artificial intelligence (AI) assistance on ESCC detection by endoscopists remain unclear. Therefore, this study aimed to develop and validate an AI model for ESCC detection using endoscopic video analysis and evaluate diagnostic improvements.

Methods

Endoscopic videos with and without ESCC lesions were collected from May 2020 to January 2022. The AI model trained on annotated videos and 18 endoscopists (eight experts, 10 non-experts) evaluated their diagnostic performance. After 4 weeks, the endoscopists re-evaluated the test data with AI assistance. Sensitivity, specificity, and accuracy were compared between endoscopists with and without AI assistance.

Results

Training data comprised 280 cases (140 with and 140 without lesions), and test data, 115 cases (52 with and 63 without lesions). In the test data, the median lesion size was 14.5 mm (range: 1–100 mm), with pathological depths ranging from high-grade intraepithelial to submucosal neoplasia. The model's sensitivity, specificity, and accuracy were 76.0%, 79.4%, and 77.2%, respectively. With AI assistance, endoscopist sensitivity (57.4% vs. 66.5%) and accuracy (68.6% vs. 75.9%) improved significantly, while specificity increased slightly (87.0% vs. 91.6%). Experts demonstrated substantial improvements in sensitivity (59.1% vs. 70.0%) and accuracy (72.1% vs. 79.3%). Non-expert accuracy increased significantly (65.8% vs. 73.3%), with slight improvements in sensitivity (56.1% vs. 63.7%) and specificity (81.9% vs. 89.2%).

Conclusions

AI assistance enhances ESCC detection and improves endoscopists' diagnostic performance, regardless of experience.

INTRODUCTION

Esophageal cancer, a significant global health burden, ranks seventh in terms of incidence and sixth in terms of mortality. Esophageal squamous cell carcinoma (ESCC) accounts for >90% of esophageal cancers in certain regions like eastern Asia.¹ Early ESCC detection is crucial for improving patient outcomes, as the 5-year survival rate dramatically increases with early diagnosis.^{2, 3} However, the accurate and timely diagnosis of superficial ESCC remains challenging, particularly for inexperienced endoscopists.⁴

The narrow-band imaging technique improves superficial ESCC detection compared with white-light imaging but requires advanced skills, posing difficulties for less experienced endoscopists.⁵ Identifying subtle endoscopic features demands significant training, and a lack of expertise can result in misdiagnoses or delays.

Artificial intelligence (AI) offers a promising solution for endoscopic diagnosis. AI algorithms, particularly those based on deep learning, have demonstrated remarkable capabilities in image recognition and pattern detection. Several studies have reported successful AI applications in gastrointestinal endoscopic image recognition in various situations, including colorectal polyp and gastric cancer detection.^6-11 While prior studies exist on AI for ESCC diagnosis using video data,^12-21 few examine its impact on endoscopists’ diagnostic performance.

This study aimed to develop and validate an AI model for detecting superficial ESCC via video analysis, mimicking real-world clinical scenarios. The AI model's diagnostic accuracy was compared with that of novice and experienced endoscopists.

METHODS

Data collection for AI model development

Endoscopic videos were prospectively collected from May 2020 to September 2021 as training data for the AI model. Additional endoscopic videos were collected between October 2021 and January 2022 as test data to evaluate the developed AI model. Videos were obtained during endoscopic procedures at the National Cancer Center Hospital East (NCCHE) using narrow-band imaging with GIF-Q260, GIF-2T260 M, GIF-H260Z, GIF-H290Z, GIF-EZ1500, and GIF-XZ1200 scopes (Olympus). Additionally, standard video endoscopy systems (EVIS LUCERA CV-260, EVIS LUCERA ELITE CV-290, and EVIS X1 CV-1500; Olympus) were employed. Two types of videos were collected: those with and those without neoplastic lesions. The eligibility criteria for patients with neoplastic lesions included histopathologically diagnosed ESCC or high-grade intraepithelial neoplasia with lesion depth shallower than the submucosa. Patients with residual or locally recurrent lesions after endoscopic resection were excluded. Patients without neoplastic lesions included those without ESCC or high-grade intraepithelial neoplasia, as determined by endoscopic diagnosis. Patients with a history of chemotherapy, radiotherapy, chemoradiotherapy, or surgery for ESCC were excluded (Figure 1).

Details are in the caption following the image — **FIGURE 1**
Open in figure viewer PowerPoint

Flowchart detailing the procedures employed in developing and evaluating the AI model for detecting superficial esophageal squamous cell carcinoma, including the assessment of its assistive effects on the performance of endoscopists. AI, artificial intelligence; NCCHE, National Cancer Center Hospital East.

Data preparation and annotation of lesion frames

Videos depicting mucosa with or without neoplastic lesions were divided into individual frames for subsequent analysis. Within this dataset, the quality of certain frames was suboptimal, characterized by image blurring and artifacts, such as bubbles, to the extent that lesion identification remained feasible. Frames of the ESCC videos were annotated by four skilled endoscopists at the NCCHE using the Visual Object Tagging Tool developed by Microsoft Commercial Software Engineering, under the MIT license (https://github.com/microsoft/VoTT/blob/master/LICENSE). Each frame in the videos was assigned a rectangular box based on reference images, such as iodine-stained images, obtained during pre-endoscopic submucosal dissection (ESD) examination and from iodine-stained ESD specimens. This annotation provided ground-truth data for training and validating the AI model to differentiate videos with and without neoplastic lesions (Figure 2).

AI model development

We utilized the You Only Look Once v3 machine-learning model, a state-of-the-art object-detection algorithm.²² The model was trained using annotated ESCC and normal mucosa videos obtained from the NCCHE in order for it to learn the distinctive patterns and features associated with ESCC. Iterative optimization processes enhanced the model's performance and accuracy, including fine-tuning the model parameters and adjusting the training process based on the validation results.

Evaluation of the AI model

Short video clips lasting 3–5 s were extracted from the captured videos for testing. An equal number of video clips depicting the lesion in either the close-up or distant view were prepared. The distant view refers to a perspective captured from a far position at which abnormal blood vessels cannot be identified. In contrast, the close-up view refers to a perspective captured from a closer position where abnormal blood vessels become visible. Importantly, in the close-up view, we only employed low-level magnification; no high-level magnification was applied for detailed examination. The model's diagnostic capabilities were evaluated. In the per-frame analysis, we defined the detection of lesions by the AI model with IoU (Intersection over Union) ≥0.3 as a true positive. In the per-lesion analysis, a lesion that was correctly detected in five consecutive frames was defined as a true positive (Figure 3). Subsequently, the model's performance was evaluated by comparing its predictions with the diagnoses made by a group of endoscopists, including skilled experts and non-experts. Eighteen endoscopists, comprising eight experts and ten non-experts, were recruited from two institutions. The categorization of endoscopists as experts or non-experts was determined by their certification status as board-certified fellows of the Japan Gastroenterological Endoscopy Society. The endoscopists independently reviewed the test videos captured at the NCCHE and provided their diagnoses. The participants were tasked with determining only the presence or absence of lesions in the videos. Sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and interobserver agreement were calculated to assess the diagnostic performance of both the AI model and endoscopists (Figure 1). This study was designed as a retrospective analysis.

Assessment of AI assistance

To explore the model's assistance ability, endoscopists were asked to diagnose the test data again after a 4-week washout interval while referring to the results predicted by the AI model. The diagnostic performance of the endoscopists with and without AI assistance was compared to assess the effects of AI assistance on their diagnostic accuracy (Figure 1).

Statistical analysis

Descriptive analyses were conducted to summarize the diagnostic performance measures of the AI model and endoscopists. The comparison of diagnostic performance between the AI model and endoscopists was determined to yield a significant difference if the diagnostic performance of the AI was outside the 95% confidence interval of the mean diagnostic performance of the endoscopists. The diagnostic performances of AI-assisted and non-AI-assisted endoscopists were compared using a paired t-test and Wilcoxon matched-pairs signed-rank test. Additionally, the McNemar test was performed to assess whether AI assistance significantly improved the diagnostic performance of endoscopists. Subgroup analyses were performed based on lesion size, depth, location, endoscopic system, and other relevant factors. Statistical significance was set at p < 0.05. Interobserver agreement among the endoscopists was assessed using Fleiss' kappa (κ) coefficient. The criteria for the interpretation of kappa values by Landis and Koch were employed (poor: < 0.00, slight: 0.00–0.20, fair: 0.21–0.40, moderate: 0.41–0.60, substantial: 0.61–0.80, almost perfect agreement: >0.80). Statistical analyses were conducted using JMP Pro, version 17.1.0 (SAS Institute Inc.,) and GraphPad Prism, version 10.0.2, for Windows (GraphPad Software, www.graphpad.com).

Ethical considerations

This study complied with the ethical principles outlined in the Declaration of Helsinki. This study involved the removal of the patients' personal information and was approved by the Institutional Review Board of the National Cancer Center East (2022-162). Informed consent was obtained from all the patients whose videos were included in the study.

RESULTS

Patient and lesion characteristics

The data of 280 patients, including 140 with lesions and 140 without lesions, were included as training data. An additional 115 patients, comprising 52 with lesions and 63 without lesions, were used as test data. The clinicopathological characteristics of the test dataset lesions revealed a median tumor diameter of 14.5 mm (range: 1–100 mm). Regarding the invasion depth of the lesions, the majority were limited to the epithelium (n = 21), followed by the lamina propria mucosae (n = 17) and muscularis mucosae (n = 7). Statistical differences between the training and test data were evaluated using the Wilcoxon rank-sum test for continuous variables and Fisher's exact test for categorical variables (Table 1).

TABLE 1. Patient and lesion characteristics in the training and test datasets.

Patient characteristics	n = 122	n = 47	p-value
	Training dataset	Test dataset
Sex: (male/female)	103 / 19	39 / 8	p = 0.818
Median age: year (range)	71 (37–91)	73 (49–87)	p = 0.436

Lesion characteristics	n = 140	n = 52
Median tumor size: mm (range)	16 (1–107)	14.5 (1–100)	p = 0.938
Size: 1–10 mm / 11–20 mm / ≥21 mm	36 / 54 / 50	17 / 13 / 22	p = 0.210
Macroscopic types: 0-IIa / 0-IIb / 0-IIc	3 / 4 / 133	1 / 1 / 50	p = 1.00
Location: Ce / Ut / Mt / Lt / Ae	1 / 30 / 74 / 35 / 0	0 / 5 / 32 / 15 / 0	p = 0.234
Aw / Pw / Rw / Lw	29 / 34 / 46 / 31	7 / 18 / 19 / 8	p = 0.332
Depth: HGIN / EP / LPM / MM / SM1 / SM2	0 / 74 / 43 / 14 / 4 / 5	3 / 21 / 17 / 7 / 0 / 4	p = 0.0392
Histological types: HGIN / SCC / others	0 / 140 / 0	3 / 49 / 0	p = 0.0190
Gastroscope: H260Z / H290Z / EZ1500 / XZ1200 / others	80 / 1 / 27 / 31 / 1	0 / 0 / 0 / 50 / 2	p <0.0001
Endoscopy system: CV-260 / CV-290 / CV-1500	1 / 17 / 122	0 / 19 / 33	p = 0.0004

Abbreviations: Ae, abdominal esophagus; Aw, anterior wall; Ce, cervical esophagus; EP, epithelial; HGIN, high-grade intraepithelial neoplasia; LPM, lamina propria mucosae; Lt, lower thoracic esophagus; Lw, left wall; MM, muscularis mucosae; Mt, middle thoracic esophagus; Pw, posterior wall; Rw, right wall; SCC, squamous cell carcinoma; SM, submucosa; Ut, upper thoracic esophagus.

Evaluation of AI performance in isolation

The model's performance in detecting superficial ESCC was evaluated using the test data based on endoscopic video analysis. The AI model exhibited a sensitivity, specificity, and accuracy of 76.0%, 79.4%, and 77.2%, respectively.

Assessment of the supportive effects of AI

The effects of AI assistance on the diagnostic performance of endoscopists were assessed. Without AI assistance, the endoscopists demonstrated a sensitivity of 57.4%, a specificity of 87.0%, and an accuracy of 68.6%. However, AI assistance significantly improved the sensitivity (66.5%, p = 0.0019), specificity (91.6%, p = 0.0280), and accuracy (75.9%, p < 0.0001) of the endoscopists (Table 2 and Figure 4). The McNemar test further confirmed significant overall improvement with AI assistance (χ² = 74.34, p < 0.0001). Interobserver agreement among the endoscopists was assessed with and without AI assistance. The Fleiss' kappa statistic (κ) showed an improvement from 0.429 (without AI assistance) to 0.547 (with AI assistance).

TABLE 2. Diagnostic performance of the artificial intelligence (model and endoscopists.

	Sensitivity, % (95% CI)	Specificity, % (95% CI)	Accuracy, % (95% CI)	PPV, % (95% CI)	NPV, % (95% CI)
AI model	76.0	79.4	77.2
Expert (n = 8)
Without AI	59.1 (51.9–66.3)	93.5 (88.8–98.1)	72.1 (67.9–76.3)	94.0 (90.3–97.7)	58.5 (54.2-62.8)
With AI	70.0 (60.8–79.1)^*	94.6 (91.0–98.3)	79.3 (74.5–84.1)^*	95.9 (93.5–98.3)	66.6 (60.2–73.0)^*
Non-expert (n = 10)
Without AI	56.1 (45.1–67.0)	81.9 (71.9–91.9)	65.8 (61.9–69.7)	85.6 (79.5–91.7)	54.3 (49.9–58.7)
With AI	63.7 (57.1–70.2)^*	89.2 (84.2–94.2)^*	73.3 (70.2–76.4)^*	91.3 (87.6–95.0)^*	60.4 (56.5–64.2)^*
All (n = 18)
Without AI	57.4 (51.2–63.6)	87.0 (80.9–93.1)	68.6 (65.6–71.6)	89.3 (85.3–93.3)	56.1 (53.2–59.1)
With AI	66.5 (61.4–71.5)^*	91.6 (88.4–94.8)^*	75.9 (73.1–78.8)^*	93.4 (91.0–95.7)^*	63.1 (59.6–66.6)^*

Abbreviations: AI, artificial intelligence; NPV, negative predictive value; PPV, positive predictive value.
* p < 0.05, vs. without AI.

Subgroup analysis

On analyzing the performances of experts and non-experts separately, similar trends were observed. Among the experts, sensitivity and accuracy significantly improved with AI assistance (59.1% vs. 70.0%, p = 0.0170 and 72.1% vs. 79.3%, p = 0.0096, respectively; Table 2 and Figure 5). In the case of non-experts as well, there was a significant improvement in sensitivity (56.1% vs. 63.7%, p = 0.0351), specificity (81.9% vs. 89.2%, p = 0.0362), and accuracy (65.8% vs. 73.3%, p = 0.0015; Table 2 and Figure 6). The analysis also differentiated between two distinct perspectives: the distant and close-up views. Overall, the results were less favorable in the distant view context (Figure S1). Importantly, a consistent enhancement in performance was evident across both distant and close-up perspectives when AI was integrated. Specifically, in the case of the distant view, there was a notable improvement in sensitivity (51.0%–59.8%, p = 0.0114), specificity (87.0%–91.6%, p = 0.0295), and accuracy (70.7%–77.2%, p < 0.0001; Figures S2 and S3). These improvements were consistently observed across different lesion sizes. Even in the case of small lesions (tumor size: 1–10 mm), significant enhancements in sensitivity (54.1% to 65.4%, p = 0.0070), specificity (87.0%–91.6%, p = 0.0280), and accuracy (75.5%–82.4%, p = 0.0001) were noted. The incorporation of AI resulted in a consistent cumulative benefit irrespective of lesion size (Figure S4). Furthermore, upon examining differences in each evaluation criterion between facilities, no significant difference was observed in any of the criteria (Figure S5). Examples of video-captured images diagnosed by the AI model are shown in Figure 7.

DISCUSSION

In this study, we developed an AI model to assist in the detection and diagnosis of superficial ESCC using endoscopic video analysis. The model's performance was promising, with a sensitivity of 76.0%, specificity of 79.4%, and accuracy of 77.2%. The AI model exhibited higher sensitivity and accuracy than endoscopists alone. Moreover, the diagnostic support provided by the AI to the endoscopists was demonstrated across all the evaluated criteria. This observation was particularly notable in areas where the baseline performance was relatively suboptimal; thus, room for improvement was more substantial. These results highlight the potential of AI technology to support endoscopists in detecting and diagnosing esophageal lesions in a clinical setting.

Several studies have examined the effects of augmenting the ESCC diagnostic capabilities of endoscopists using AI.^{16-18, 23-25} Both previous studies and the current study have recognized the added advantages of AI assistance for endoscopists. Previous studies on the added benefits of AI assistance for endoscopists have predominantly relied on still images for testing, with limited validation using video-based assessments.^{17, 18} Furthermore, most studies were focused on white light imaging, with few investigations using narrow-band imaging.^{17, 24} While two RCTs have been published,^{26, 27} there are some points for discussion, such as the inclusion of lesions detectable only by iodine staining or those identified after iodine chromoendoscopy, as well as how low-grade intraepithelial neoplasia is classified. Our study explicitly excludes these lesions, making it more aligned with real-world clinical practice.

Our study focused on evaluating the effects of AI assistance on the diagnostic performance of both experts and non-experts. The findings revealed that AI assistance significantly improved the diagnostic performance of both skilled and novice endoscopists. This suggests that AI technology can be beneficial across different levels of expertise, supporting endoscopists in their clinical decision-making.

A more detailed analysis of the results revealed that AI assistance resulted in a more significant improvement in the sensitivity of experts and specificity of non-experts. It has been postulated that both experts and non-experts tend to prioritize specificity over sensitivity in endoscopic diagnosis, possibly because training focuses on distinguishing between malignant and benign lesions rather than on identifying the lesion itself. With experience, there has been a shift in focus towards improved sensitivity. In our study, without AI assistance, experts and non-experts differed in specificity rather than in sensitivity. AI helps experts recognize subtle lesions, thereby enhancing sensitivity while maintaining specificity. For non-experts, AI guidance strengthens judgment capabilities, thereby improving specificity. Prioritizing sensitivity in the AI design may balance this inclination towards specificity among endoscopists.

Our study found that AI assistance was valuable across different lesion sizes, suggesting the model's versatility. We also compared distant and close-up lesion observations using this novel approach. Close-up observation plays a critical role in detecting fine details, such as abnormal blood vessels, which are difficult for AI to identify in distant views. Conversely, distant-view observation presents challenges, including difficulties in detecting small or poorly defined lesions and insufficient lighting. Enhancing AI capabilities for distant-view lesion detection could address these limitations and further improve diagnostic accuracy, especially in challenging scenarios.

To minimize selection bias, cases were mechanically classified based on the collection period to establish the training and validation datasets. Consequently, although unintentionally, the validation set primarily comprised videos captured using Olympus's latest equipment (GIF-XZ1200), whereas the majority of the training set comprised videos captured with an older-generation scope (GIF-H260Z).

Generally, high-resolution endoscopic images are presumed to enhance gastrointestinal lesion detection.^28-32 Previous studies comparing AI performance between scopes with low (GIF-Q260) and high resolution (GIF-H260 and H290) did not observe any significant differences.¹⁶ While a discrepancy in the scope specifications between the training and validation datasets was encountered in this study, indicating a potential disadvantage for AI due to being insufficiently trained on high-resolution lesion videos, the results still suggested a positive augmentation effect on the performance of endoscopists, similar to the findings of previous reports. This indicates the applicability of information obtained from older generation scopes relative to that obtained using the latest equipment, even if optical advancements continue. Although the continuous updating of AI models is deemed essential, the ability to utilize older data during updates is crucial.

Although there are benefits of using AI, caution is necessary when using it. The integration of AI as an assistive tool should be approached with care, ensuring that endoscopists possess fundamental endoscopic and critical thinking skills. Interpreting the AI results within the clinical context of patients is essential. These considerations emphasize the need for continuous research and the careful application of AI in clinical practice.

Although our study produced noteworthy results, it is essential to acknowledge its limitations. The foremost is the lack of a real-time diagnostic evaluation in an actual clinical setting, which represents a significant constraint in assessing the practical effectiveness of our AI system.

Additionally, the reliance on data from a single medical facility to develop an AI model presents another limitation. Despite this limitation, a significant aspect of our research was the examination of the effects of AI assistance on the performance of endoscopists from various facilities. Recognizing the possibility of selection bias, it is crucial that endoscopists from diverse facilities beyond the collection site (NCCHE), including Kyoto University, participate in the trial. This inclusion of additional facilities is pivotal for improving the model's generalizability.

Furthermore, it is vital to emphasize that our study consistently demonstrated the positive effects of AI on all evaluation criteria across all facilities. Importantly, when potential differences in each evaluation criterion between facilities were closely examined, no significant variations were found. This result strongly indicates the model's versatility. To further ensure its generalizability, accumulating diverse cases from multiple facilities will be essential in future studies.

In conclusion, our study demonstrated the potential of AI technology to assist endoscopists in detecting and diagnosing superficial ESCC. The performance of the developed AI model in accurately identifying suspicious lesions was favorable, and its integration significantly improved the diagnostic capabilities of endoscopists. These findings have important implications for clinicians, as AI assistance can enhance the accuracy and consistency of the detection of early-stage esophageal cancer, potentially leading to earlier diagnosis and improved patient outcomes. Further research is warranted to validate and refine the AI model, explore its utility in real-world clinical practice, and address the limitations of this study.

ACKNOWLEDGMENTS

I would like to express my gratitude to the endoscopists from the NCCHE and Kyoto University who participated in this study. Their contributions were indispensable for the completion of this study. Additionally, I am thankful to the endoscopists from the NCCHE for conducting the video collection, which was crucial for the development and evaluation of the AI model. I would also like to thank the Olympus Medical Systems Corporation for their technical support in creating the AI model.

CONFLICT OF INTEREST STATEMENT

Tomonori Yano received financial support for the research from Olympus Medical Systems Corporation.

ETHICS STATEMENT

Approval of the research protocol by an Institutional Reviewer Board: This study was approved by the Institutional Review Board of the National Cancer Center East (2022-162).

PATIENT CONSENT STATEMENT

Informed consent was obtained from all the patients whose videos were included in the study.

CLINICAL TRIAL REGISTRATION

N/A

Supporting Information

REFERENCES

1Sung H, Ferlay J, Siegel RL et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021; 71: 209–249.
10.3322/caac.21660
PubMed Web of Science® Google Scholar
2Watanabe M, Toh Y, Ishihara R et al. Comprehensive registry of esophageal cancer in Japan, 2015. Esophagus 2023; 20: 1–28.
10.1007/s10388-022-00950-5
PubMed Web of Science® Google Scholar
3Njei B, McCarty TR, Birk JW. Trends in esophageal cancer survival in United States adults from 1973 to 2009: A SEER database analysis. J Gastroenterol Hepatol 2016; 31: 1141–1146.
10.1111/jgh.13289
PubMed Web of Science® Google Scholar
4Rodriguez de Santiago E, Hernanz N, Marcos-Prieto HM et al. Rate of missed oesophageal cancer at routine endoscopy and survival outcomes: A multicentric cohort study. United European Gastroenterol J 2019; 7: 189–198.
10.1177/2050640618811477
PubMed Web of Science® Google Scholar
5Ishihara R, Takeuchi Y, Chatani R et al. Prospective evaluation of narrow-band imaging endoscopy for screening of esophageal squamous mucosal high-grade neoplasia in experienced and less experienced endoscopists. Dis Esophagus 2010; 23: 480–486.
10.1111/j.1442-2050.2009.01039.x
CAS PubMed Web of Science® Google Scholar
6Misawa M, Kudo SE, Mori Y et al. Artificial intelligence-assisted polyp detection for colonoscopy: Initial experience. Gastroenterology 2018; 154: 2027–2029. e3.
10.1053/j.gastro.2018.04.003
PubMed Web of Science® Google Scholar
7Yamada M, Saito Y, Imaoka H et al. Development of a real-time endoscopic image diagnosis support system using deep learning technology in colonoscopy. Sci Rep 2019; 9: 14465.
10.1038/s41598-019-50567-5
PubMed Web of Science® Google Scholar
8Wallace MB, Sharma P, Bhandari P et al. Impact of artificial intelligence on miss rate of colorectal neoplasia. Gastroenterology 2022; 163: 295–304.e5.
10.1053/j.gastro.2022.03.007
PubMed Web of Science® Google Scholar
9Xu H, Tang RSY, Lam TYT et al. Artificial intelligence-assisted colonoscopy for colorectal cancer screening: A multicenter randomized controlled trial. Clin Gastroenterol Hepatol 2023; 21: 337–346. e3.
10.1016/j.cgh.2022.07.006
PubMed Web of Science® Google Scholar
10Kanesaka T, Lee TC, Uedo N et al. Computer-aided diagnosis for identifying and delineating early gastric cancers in magnifying narrow-band imaging. Gastrointest Endosc 2018; 87: 1339–1344.
10.1016/j.gie.2017.11.029
PubMed Web of Science® Google Scholar
11Ikenoyama Y, Hirasawa T, Ishioka M et al. Detecting early gastric cancer: Comparison between the diagnostic ability of convolutional neural networks and endoscopists. Dig Endosc 2021; 33: 141–150.
10.1111/den.13688
PubMed Web of Science® Google Scholar
12Fukuda H, Ishihara R, Kato Y et al. Comparison of performances of artificial intelligence versus expert endoscopists for real-time assisted diagnosis of esophageal squamous cell carcinoma (with video). Gastrointest Endosc 2020; 92: 848–855.
10.1016/j.gie.2020.05.043
PubMed Web of Science® Google Scholar
13Guo L, Xiao X, Wu C et al. Real-time automated diagnosis of precancerous lesions and early esophageal squamous cell carcinoma using a deep learning model (with videos). Gastrointest Endosc 2020; 91: 41–51.
10.1016/j.gie.2019.08.018
PubMed Web of Science® Google Scholar
14Shimamoto Y, Ishihara R, Kato Y et al. Real-time assessment of video images for esophageal squamous cell carcinoma invasion depth using artificial intelligence. J Gastroenterol 2020; 55: 1037–1045.
10.1007/s00535-020-01716-5
PubMed Web of Science® Google Scholar
15Shiroma S, Yoshio T, Kato Y et al. Ability of artificial intelligence to detect T1 esophageal squamous cell carcinoma from endoscopic videos and the effects of real-time assistance. Sci Rep 2021; 11: 7759.
10.1038/s41598-021-87405-6
CAS PubMed Web of Science® Google Scholar
16Tang D, Wang L, Jiang J et al. A novel deep learning system for diagnosing early esophageal squamous cell carcinoma: A multicenter diagnostic study. Clin Transl Gastroenterol 2021; 12: e00393.
10.14309/ctg.0000000000000393
PubMed Web of Science® Google Scholar
17Waki K, Ishihara R, Kato Y et al. Usefulness of an artificial intelligence system for the detection of esophageal squamous cell carcinoma evaluated with videos simulating overlooking situation. Dig Endosc 2021; 33: 1101–1109.
10.1111/den.13934
PubMed Web of Science® Google Scholar
18Yang XX, Li Z, Shao XJ et al. Real-time artificial intelligence for endoscopic diagnosis of early esophageal squamous cell cancer (with video). Dig Endosc 2021; 33: 1075–1084.
10.1111/den.13908
PubMed Web of Science® Google Scholar
19Tajiri A, Ishihara R, Kato Y et al. Utility of an artificial intelligence system for classification of esophageal lesions when simulating its clinical use. Sci Rep 2022; 12: 6677.
10.1038/s41598-022-10739-2
CAS PubMed Web of Science® Google Scholar
20Tani Y, Ishihara R, Inoue T et al. A single-center prospective study evaluating the usefulness of artificial intelligence for the diagnosis of esophageal squamous cell carcinoma in a real-time setting. BMC Gastroenterol 2023; 23: 184.
10.1186/s12876-023-02788-2
CAS PubMed Web of Science® Google Scholar
21Yuan XL, Zeng XH, Liu W et al. Artificial intelligence for detecting and delineating the extent of superficial esophageal squamous cell carcinoma and precancerous lesions under narrow-band imaging (with video). Gastrointest Endosc 2023; 97: 664–672. e4.
10.1016/j.gie.2022.12.003
PubMed Web of Science® Google Scholar
22Redmon J, Farhadi A. Yolov3: An incremental improvement. arXiv:1804.02767 [Preprint]. 2018 [cited 2018 Apr 8]: [6 p.]. Available from: https://arxiv.org/abs/1804.02767
Google Scholar
23Cai SL, Li B, Tan WM et al. Using a deep learning system in endoscopy for screening of early esophageal squamous cell carcinoma (with video). Gastrointest Endosc 2019; 90: 745–7253. e2.
10.1016/j.gie.2019.06.044
PubMed Web of Science® Google Scholar
24Li B, Cai SL, Tan WM et al. Comparative study on artificial intelligence systems for detecting early esophageal squamous cell carcinoma between narrow-band and white-light imaging. World J Gastroenterol 2021; 27: 281–93.
10.3748/wjg.v27.i3.281
PubMed Web of Science® Google Scholar
25Feng Y, Liang Y, Li P et al. Artificial intelligence assisted detection of superficial esophageal squamous cell carcinoma in white-light endoscopic images by using a generalized system. Discov Oncol 2023; 14: 73.
10.1007/s12672-023-00694-3
CAS PubMed Web of Science® Google Scholar
26Yuan XL, Liu W, Lin YX et al. Effect of an artificial intelligence-assisted system on endoscopic diagnosis of superficial oesophageal squamous cell carcinoma and precancerous lesions: A multicentre, tandem, double-blind, randomised controlled trial. Lancet Gastroenterol Hepatol 2024; 9: 34–44.
10.1016/S2468-1253(23)00276-5
CAS PubMed Web of Science® Google Scholar
27Nakao E, Yoshio T, Kato Y et al. Randomized controlled trial of an artificial intelligence diagnostic system for the detection of esophageal squamous cell carcinoma in clinical practice. Endoscopy Published online: 18 Nov 2024; DOI: 10.1055/a-2421-3194
10.1055/a?2421?3194
Web of Science® Google Scholar
28Kaise M, Kato M, Tajiri H. High-definition endoscopy and magnifying endoscopy combined with narrow band imaging in gastric cancer. Gastroenterol Clin North Am 2010; 39: 771–784.
10.1016/j.gtc.2010.08.028
PubMed Web of Science® Google Scholar
29Sami SS, Subramanian V, Butt WM et al. High definition versus standard definition white light endoscopy for detecting dysplasia in patients with Barrett's esophagus. Dis Esophagus 2015; 28: 742–749.
10.1111/dote.12283
CAS PubMed Web of Science® Google Scholar
30Jrebi NY, Hefty M, Jalouta T et al. High-definition colonoscopy increases adenoma detection rate. Surg Endosc 2017; 31: 78–84.
10.1007/s00464-016-4986-7
PubMed Web of Science® Google Scholar
31Richardson J, Thaventhiran A, Mackenzie H, Stubbs B. The use of high definition colonoscopy versus standard definition: Does it affect polyp detection rate? Surg Endosc 2018; 32: 2676–2682.
10.1007/s00464-017-5962-6
PubMed Google Scholar
32Lee JY, Koh M, Lee JH. Latest generation high-definition colonoscopy increases adenoma detection rate by trainee endoscopists. Dig Dis Sci 2021; 66: 2756–2762.
10.1007/s10620-020-06543-5
PubMed Web of Science® Google Scholar

Volume6, Issue1

April 2026

e70083

Effects of artificial intelligence assistance on endoscopist performance: Comparison of diagnostic performance in superficial esophageal squamous cell carcinoma detection using video-based models