Article

Incorporating Test-Taking Engagement into Multistage Adaptive Testing Design for Large-Scale Assessments

Publisher

WILEY
DOI: 10.1111/jedm.12380

The use of multistage adaptive testing (MST) has been increasing in large-scale testing programs as it offers a balance between linear test design and item-level adaptive testing. However, research has shown that a lack of test-taking engagement can impact the measurement accuracy of MST. To address this issue, test-taking engagement can be incorporated into the on-the-fly module assembly procedure to minimize the impact of noneffortful responses.
The use of multistage adaptive testing (MST) has gradually increased in large-scale testing programs as MST achieves a balanced compromise between linear test design and item-level adaptive testing. MST works on the premise that each examinee gives their best effort when attempting the items, and their responses truly reflect what they know or can do. However, research shows that large-scale assessments may suffer from a lack of test-taking engagement, especially if they are low stakes. Examinees with low test-taking engagement are likely to show noneffortful responding (e.g., answering the items very rapidly without reading the item stem or response options). To alleviate the impact of noneffortful responses on the measurement accuracy of MST, test-taking engagement can be operationalized as a latent trait based on response times and incorporated into the on-the-fly module assembly procedure. To demonstrate the proposed approach, a Monte-Carlo simulation study was conducted based on item parameters from an international large-scale assessment. The results indicated that the on-the-fly module assembly considering both ability and test-taking engagement could minimize the impact of noneffortful responses, yielding more accurate ability estimates and classifications. Implications for practice and directions for future research were discussed.
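
The abstract does not spell out the assembly procedure, so the following is only a minimal sketch of the general idea, under loudly stated assumptions. It swaps in a simpler technique than the one described: instead of modeling engagement as a latent trait, it flags responses faster than a fixed response-time threshold as noneffortful, leaves them out of the interim ability estimate, and then assembles the next module from the most informative unused items. The 2PL model, the function names (`eap_ability`, `assemble_next_module`), the 3-second threshold, and the item-pool layout are all hypothetical and are not taken from the article.

```python
import math


def p_correct(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_ability(responses, rt_threshold=3.0):
    """EAP ability estimate on a grid with a standard normal prior.

    Responses with a response time below rt_threshold (seconds) are
    treated as noneffortful and ignored. The threshold is an
    illustrative assumption, not a value from the article.
    """
    grid = [g / 10.0 for g in range(-40, 41)]
    effortful = [r for r in responses if r["rt"] >= rt_threshold]
    weights = []
    for theta in grid:
        log_w = -0.5 * theta * theta  # log of the (unnormalized) prior
        for r in effortful:
            p = p_correct(theta, r["a"], r["b"])
            log_w += math.log(p if r["correct"] else 1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total


def assemble_next_module(pool, theta_hat, module_size, administered=()):
    """Return ids of the most informative unused items at theta_hat."""
    used = set(administered)
    candidates = [item for item in pool if item["id"] not in used]
    candidates.sort(
        key=lambda item: item_information(theta_hat, item["a"], item["b"]),
        reverse=True,
    )
    return [item["id"] for item in candidates[:module_size]]


# Hypothetical five-item pool with equal discrimination.
pool = [{"id": i, "a": 1.0, "b": b} for i, b in enumerate([-2.0, -1.0, 0.0, 1.0, 2.0])]

# Two effortful correct answers followed by three rapid wrong answers.
responses = [
    {"a": 1.0, "b": 0.0, "correct": True, "rt": 25.0},
    {"a": 1.0, "b": 0.5, "correct": True, "rt": 31.0},
    {"a": 1.0, "b": 0.0, "correct": False, "rt": 1.2},
    {"a": 1.0, "b": 0.5, "correct": False, "rt": 0.9},
    {"a": 1.0, "b": 1.0, "correct": False, "rt": 1.1},
]

theta_filtered = eap_ability(responses)                      # rapid responses ignored
theta_unfiltered = eap_ability(responses, rt_threshold=0.0)  # all responses used
next_module = assemble_next_module(pool, theta_filtered, module_size=2)
```

In this toy case the rapid wrong answers pull the unfiltered estimate downward, so the two estimates would route the examinee toward different modules. A fuller treatment along the lines of the abstract would replace the hard threshold with an engagement trait estimated from response times.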
