Will code one day run a code? Performance of language models on ACEM primary examinations and implications
Abstract
Objective
Large language models (LLMs) have demonstrated mixed results in their ability to pass various specialist medical examinations, and their performance within the field of emergency medicine remains unknown.
Methods
We explored the performance of three prevalent LLMs (OpenAI's GPT series, Google's Bard, and Microsoft's Bing Chat) on a practice ACEM primary examination.
Results
All LLMs achieved a passing score, with GPT 4.0 outperforming the average candidate.
Conclusion
By passing the ACEM primary examination, large language models show potential as tools for medical education and practice. However, limitations exist and are discussed.
Open Research
Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.