Journal
LANGUAGE TESTING
Volume 29, Issue 3, Pages 371-394
Publisher
SAGE PUBLICATIONS LTD
DOI: 10.1177/0265532211425673
Keywords
automated speech scoring; validity; classification trees; TOEFL; practice test
This paper compares two alternative scoring methods - multiple regression and classification trees - for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models is reported in Zechner, Higgins, Xi, & Williamson (2009), which discusses the development of the entire automated speech scoring system; the current paper shifts the focus to the comparison of the two scoring methods, elaborating both technical and substantive considerations and providing a reasoned argument for the trade-off between them. We concluded that a multiple regression model with expert weights was superior to the classification tree model. In addition to comparing the relative performance of the two models, we also evaluated the adequacy of the regression model for the intended use. In particular, the construct representation of the model was sufficiently broad to justify its use in a low-stakes application. The correlation of the model-predicted total test scores with human scores (r = 0.7) was also deemed acceptable for practice purposes.