BT11: How to train your artificial intelligence (AI) dragon: an analysis of human triage performance of community-captured images to inform development of AI solutions
Gillian X.M. Chin,1 Callum Hakimi-Khiabani,2 Sanaa Butt,1 Alyson Bryden,1 Shareen Muthiah,3 Tamas Suveges,2 Colin Morton,3 Andrew Coon1 and Colin Fleming1
1Ninewells Hospital, Dundee, UK; 2University of Dundee, Dundee, UK; and 3Stirling Community Hospital, Stirling, UK
Our dermatology department uses low-quality images taken on smartphones or digital cameras by primary care clinicians, and high-quality images taken by medical photographers. Low-quality images can be taken anywhere and have useful triage value. High-quality images afford greater diagnostic power. We studied human triage performance within this system while developing artificial intelligence (AI) based on both low- and high-quality images. Two studies were performed. To test triage performance using low-quality images without clinical data, we reviewed a database of previously triaged low-quality images from 2017. A clinical fellow in dermatology (O1) and a consultant dermatologist (O2) were shown 150 sequentially selected images and asked to triage each as benign or suspicious, and then to provide the most likely diagnosis. Validated diagnoses were determined from a dermatology diagnostic database. We then examined how many patients were directly discharged from triage when either low- or high-quality images were available. The records of 350 sequential patients from 2021, triaged first using low-quality images and then using high-quality images, were analysed to determine the additional benefit of high-quality images in this context. In the first part of the study, 99% (O1: 100%; O2: 97%) of cancerous/precancerous lesions were correctly identified as suspicious. Provisional diagnoses made by the observers matched validated diagnoses in 60% (O1: 58%; O2: 61%) of cases. In the second part of the study, 28% of patients with high-quality images were directly discharged to primary care. A further 32% of these patients completed telephone consultations, meaning that 60% of these patients did not require face-to-face appointments. In summary, cancerous/precancerous lesions were accurately triaged as suspicious. However, low-quality images in triage often yield false-positive suspicious responses and produce lower diagnostic accuracy.
Low-quality image referral in our system, even without clinical data, permits two in 10 patients to be managed remotely; with high-quality images this is around six in 10. This study is the first real-world investigation of the use of low- and high-quality data in triage. It supports previous observations that high-quality images increase the likelihood of accurate diagnoses, even without clinical information. Low-quality data are useful in triage and may therefore be used to develop AI that complements AI based on high-quality data. Some patients will struggle to attend for high-quality image capture, so research should focus on improving capture of both low- and high-quality images, and on developing real-world AI based on varying populations, presentations and image quality.