Article

Does Google's Bard Chatbot perform better than ChatGPT on the European hand surgery exam?

Journal

INTERNATIONAL ORTHOPAEDICS

Publisher

SPRINGER
DOI: 10.1007/s00264-023-06034-y

Keywords

Bard; ChatGPT; Chatbot; Hand Surgery; Multiple-choice question; Artificial intelligence

This study investigated the performance of Google's chatbot Bard® on the European Board of Hand Surgery (EBHS) diploma examination and compared it with ChatGPT. The results showed that the current versions of both ChatGPT and Bard were unable to pass the first part of the EBHS diploma exam.
Purpose: According to previous research, the chatbot ChatGPT® V3.5 was unable to pass the first part of the European Board of Hand Surgery (EBHS) diploma examination. This study aimed to investigate whether Google's chatbot Bard® would outperform ChatGPT on the EBHS diploma examination.
Methods: Chatbots were asked to answer 18 EBHS multiple-choice questions (MCQs) published in the Journal of Hand Surgery (European Volume) in five trials (A1 to A5). After A3, chatbots received correct answers, and after A4, incorrect answers. Their ability to modify their responses was then measured and compared.
Results: Bard® scored 3/18 (A1), 1/18 (A2), 4/18 (A3) and 2/18 (A4 and A5). The average percentage of correct answers was 61.1% for A1, 62.2% for A2, 64.4% for A3, 65.6% for A4, 63.3% for A5 and 63.3% for all trials combined. Agreement was moderate from A1 to A5 (kappa = 0.62 (95% CI = [0.51; 0.73])) as well as from A1 to A3 (kappa = 0.60 (95% CI = [0.47; 0.74])). The formulation of Bard® responses was homogeneous, but its learning capacity is still developing.
Conclusions: The main hypothesis of our study was not supported, since Bard did not score significantly higher than ChatGPT when answering the MCQs of the EBHS diploma exam. In conclusion, neither ChatGPT® nor Bard®, in their current versions, can pass the first part of the EBHS diploma exam.
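
The results quote two kinds of figures: scores out of 18 questions and percentages of correct answers between 61.1% and 65.6%. The sketch below converts the question-level scores to percentages, then checks which denominator the reported percentages are consistent with. The 90-item denominator (18 MCQs with five separately graded items each) is an assumption made for illustration and is not stated in the abstract. The names `bard_scores`, `reported_pct`, and `ASSUMED_ITEMS` are likewise hypothetical.

```python
# Question-level scores for Bard, taken from the abstract (out of 18 MCQs).
N_QUESTIONS = 18
bard_scores = {"A1": 3, "A2": 1, "A3": 4, "A4": 2, "A5": 2}

# "Average percentage of correct answers" per trial, taken from the abstract.
reported_pct = {"A1": 61.1, "A2": 62.2, "A3": 64.4, "A4": 65.6, "A5": 63.3}

# ASSUMPTION (not in the abstract): each MCQ has 5 graded items, 18 * 5 = 90.
ASSUMED_ITEMS = 90


def pct(correct: int, total: int) -> float:
    """Percentage of correct answers, rounded to one decimal place."""
    return round(100 * correct / total, 1)


# Scores out of 18 expressed as percentages.
question_level = {trial: pct(score, N_QUESTIONS)
                  for trial, score in bard_scores.items()}

# Whole-item counts implied by the reported percentages under the assumption.
implied_items = {trial: round(p * ASSUMED_ITEMS / 100)
                 for trial, p in reported_pct.items()}

# Pooled percentage over all five trials under the same assumption.
combined = pct(sum(implied_items.values()), ASSUMED_ITEMS * len(implied_items))

if __name__ == "__main__":
    for trial in bard_scores:
        print(trial, question_level[trial], implied_items[trial])
    print("combined", combined)
```

Under that assumption every reported percentage maps back to a whole number of items, and the pooled value matches the 63.3% reported for all trials combined. This suggests the percentages are item-level rather than question-level figures, although the abstract does not say which unit or group they describe.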
