Letter

How Does ChatGPT Perform on the Italian Residency Admission National Exam Compared to 15,869 Medical Graduates?

Journal

ANNALS OF BIOMEDICAL ENGINEERING

Publisher

SPRINGER
DOI: 10.1007/s10439-023-03318-7

Keywords

Education; Artificial intelligence; ChatGPT; Medical degree; Residency; Medical knowledge


The study aims to assess the performance of ChatGPT on the Residency Admission National Exam in Italy and compare it to the level of medical knowledge of graduate medical doctors. ChatGPT3 participated in the exam and achieved a score in the top 98.8th percentile among 15,869 medical graduates. The results demonstrate that ChatGPT is proficient in basic science and applied clinical knowledge.
Purpose: The study aims to assess ChatGPT's performance on the Residency Admission National Exam and to evaluate its level of medical knowledge relative to graduate medical doctors in Italy.

Methods: In June 2023, ChatGPT3 was used to undertake the 2022 Italian Residency Admission National Exam, a 140-question multiple-choice computer-based exam taken yearly by all Italian medical graduates to assess basic science and applied medical knowledge. The exam was scored using the same criteria defined by the national educational governing body. ChatGPT's performance was compared to that of the 15,869 medical graduates who took the exam in July 2022. Lastly, the integrity and quality of ChatGPT's responses were evaluated.

Results: ChatGPT answered 122 of 140 questions correctly, ranking in the top 98.8th percentile among the 15,869 medical graduates. Of the 18 incorrect answers, 10 were to direct questions on basic science medical knowledge, while 8 were to questions assessing candidates' applied clinical knowledge and reasoning in the form of case presentations. Errors were logical (2 incorrect answers) or informational (16 incorrect answers) in nature. Explanations accompanying the correct answers were all evaluated as appropriate. Comparison with national statistics on the minimum score needed to match into each specialty demonstrated that ChatGPT's performance would have granted a candidate a match into any specialty.

Conclusion: ChatGPT proved proficient in both basic science medical knowledge and applied clinical knowledge. Future research should assess the impact and reliability of ChatGPT in clinical practice.


