Letter

How Does ChatGPT Perform on the Italian Residency Admission National Exam Compared to 15,869 Medical Graduates?

Journal

ANNALS OF BIOMEDICAL ENGINEERING

Publisher

SPRINGER
DOI: 10.1007/s10439-023-03318-7

Keywords

Education; Artificial intelligence; ChatGPT; Medical degree; Residency; Medical knowledge


The study aims to assess the performance of ChatGPT on the Residency Admission National Exam in Italy and compare it to the level of medical knowledge of graduate medical doctors. ChatGPT3 participated in the exam and achieved a score in the top 98.8th percentile among 15,869 medical graduates. The results demonstrate that ChatGPT is proficient in basic science and applied clinical knowledge.
Purpose: The study aims to assess ChatGPT's performance on the Residency Admission National Exam and to evaluate its level of medical knowledge relative to graduate medical doctors in Italy.

Methods: In June 2023, ChatGPT3 was used to undertake the 2022 Italian Residency Admission National Exam, a 140-question multiple-choice computer-based exam taken yearly by all Italian medical graduates to assess basic science and applied medical knowledge. The exam was scored using the same criteria defined by the national educational governing body. ChatGPT's performance was compared to that of the 15,869 medical graduates who took the exam in July 2022. Lastly, the integrity and quality of ChatGPT's responses were evaluated.

Results: ChatGPT answered 122 of 140 questions correctly, ranking in the top 98.8th percentile among the 15,869 medical graduates. Of the 18 incorrect answers, 10 were to direct questions on basic science medical knowledge, while 8 were to questions assessing candidates' applied clinical knowledge and reasoning in the form of case presentations. Errors were logical (2 incorrect answers) or informational (16 incorrect answers) in nature. Explanations accompanying the correct answers were all evaluated as appropriate. Comparison with national statistics on the minimum score needed to match into each specialty demonstrated that ChatGPT's performance would have granted a candidate a match into any specialty.

Conclusion: ChatGPT proved proficient in both basic science medical knowledge and applied clinical knowledge. Future research should assess the impact and reliability of ChatGPT in clinical practice.


