
Automated Item Generation: impact of item variants on performance and standard setting

BMC MEDICAL EDUCATION (2023)

Journal

BMC MEDICAL EDUCATION

Volume 23, Issue 1, Pages -

Publisher

BMC

DOI: 10.1186/s12909-023-04457-0

Keywords

Assessment; Automated item generation; Multiple choice questions; Standard setting

Categories

Education & Educational Research; Education, Scientific Disciplines

AI Summary

This study used computer software to generate multiple item variants and tested them with final year medical students in the UK. The results showed notable differences in item facility between variants, which may relate to differing clinical reasoning strategies and to school-level factors.

Abstract

Background: Automated Item Generation (AIG) uses computer software to create multiple items from a single question model. There is currently a lack of data on whether item variants of a single question lead to differences in student performance or in human-derived standard setting. The purpose of this study was to use 50 Multiple Choice Questions (MCQs) as models to create four distinct tests, which would be standard set and given to final year UK medical students, and then to compare the performance and standard setting data for each.

Methods: Pre-existing questions from the UK Medical Schools Council (MSC) Assessment Alliance item bank, created using traditional item writing techniques, were used to generate four 'isomorphic' 50-item MCQ tests using AIG software. Isomorphic questions use the same question template with minor alterations to test the same learning outcome. All UK medical schools were invited to deliver one of the four papers as an online formative assessment for their final year students. Each test was standard set using a modified Angoff method. Thematic analysis was conducted for item variants with high and low levels of variance in facility (for student performance) and average scores (for standard setting).

Results: A total of 2,218 students from 12 UK medical schools participated, with each school using one of the four papers. The average facility of the four papers ranged from 0.55 to 0.61, and the cut score ranged from 0.58 to 0.61. Twenty item models had a facility difference > 0.15, and 10 item models had a difference in standard setting of > 0.1. Variation in parameters that could alter clinical reasoning strategies had the greatest impact on item facility.

Conclusions: Item facility varied to a greater extent than the standard set. This difference may relate to variants causing greater disruption of clinical reasoning strategies in novice learners than in experts. However, it is confounded by the possibility that the performance differences may be explained at school level, and it therefore warrants further study.
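The quantities compared in the abstract can be illustrated with a minimal Python sketch. It assumes that item facility is the proportion of candidates answering an item correctly, and it uses plain Angoff averaging (the mean of judges' estimates per item, then the mean over items), because the abstract does not describe how the Angoff method was modified. The function names and all numbers in the usage note are hypothetical and are not taken from the study; only the 0.15 threshold comes from the abstract.

```python
from statistics import mean


def facility(responses):
    """Item facility: proportion of candidates who answered correctly.

    `responses` is a list of 1 (correct) and 0 (incorrect) marks.
    """
    return sum(responses) / len(responses)


def flag_models(values_by_model, threshold):
    """Return the item models whose variants differ by more than `threshold`.

    `values_by_model` maps a model name to one value per variant
    (for example, the facility of that model's variant on each paper).
    """
    return [
        model
        for model, values in values_by_model.items()
        if max(values) - min(values) > threshold
    ]


def angoff_cut_score(judge_estimates_by_item):
    """Plain Angoff cut score (assumption: no modification applied).

    Each inner list holds the judges' estimates of the probability that a
    borderline candidate answers that item correctly.
    """
    return mean(mean(estimates) for estimates in judge_estimates_by_item)
```

For example, with made-up facilities for two item models across four papers, `flag_models({"model_a": [0.50, 0.70, 0.55, 0.60], "model_b": [0.60, 0.62, 0.58, 0.61]}, 0.15)` flags only `model_a`, whose variants span a facility range of about 0.20.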
