Exploring the role of an artificial intelligence chatbot on appendicitis management: an experimental study on ChatGPT
Abstract
Background
Appendicitis is a common surgical condition that requires urgent medical attention. Recent advancements in artificial intelligence and large language models, such as ChatGPT, have demonstrated potential in supporting healthcare management and scientific research. This study aims to evaluate the accuracy and comprehensiveness of ChatGPT's knowledge of appendicitis management.
Methods
Six questions related to appendicitis management were created by experienced RACS-qualified general surgeons to assess ChatGPT's ability to provide accurate information. The accuracy of ChatGPT's answers was assessed against current healthcare guidelines for appendicitis and evaluated subjectively by two RACS-qualified general surgeons. ChatGPT was then asked to provide five high-level evidence references to support its responses.
Results
ChatGPT provided clinically relevant information on appendicitis management; however, it did so inconsistently and often gave superficial information. Furthermore, ChatGPT had difficulty generating relevant references, some of which were non-existent or incorrect.
Conclusion
ChatGPT has the potential to provide timely and comprehensible medical information on appendicitis management to laypersons. However, its production of inaccurate information and of non-existent or erroneous references presents a challenge for researchers and clinicians who may inadvertently incorporate such content into their research or clinical practice. Clinicians should therefore exercise caution when using ChatGPT for these purposes.
Conflicts of interest
The authors declare no conflict of interest.
Open Research
Data availability statement
All study data are included in the submission.