Article

The interactive reading task: Transformer-based automatic item generation

Journal

FRONTIERS IN ARTIFICIAL INTELLIGENCE
Volume 5

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/frai.2022.903077

Keywords

automatic item generation; reading assessment; language modeling; transformer models; psychometrics

This paper presents the interactive reading task, an approach that uses transformer-based deep language modeling to generate reading comprehension assessments. A large-scale pilot test demonstrates the feasibility of this approach for automatically creating complex educational assessments.
Automatic item generation (AIG) has the potential to greatly expand the number of items for educational assessments, while simultaneously allowing for a more construct-driven approach to item development. However, the traditional item modeling approach in AIG is limited in scope to content areas that are relatively easy to model (such as math problems), and depends on highly skilled content experts to create each model. In this paper we describe the interactive reading task, a transformer-based deep language modeling approach for creating reading comprehension assessments. This approach allows a fully automated process for the creation of source passages together with a wide range of comprehension questions about the passages. The format of the questions allows automatic scoring of responses with high fidelity (e.g., selected response questions). We present the results of a large-scale pilot of the interactive reading task, with hundreds of passages and thousands of questions. These passages were administered as part of the practice test of the Duolingo English Test. Human review of the materials and psychometric analyses of test taker results demonstrate the feasibility of this approach for automatic creation of complex educational assessments.
