Article

Complementing human judgment of essays written by English language learners with e-rater® scoring

Journal

LANGUAGE TESTING
Volume 27, Issue 3, Pages 317-334 (2010)

Publisher

SAGE Publications Ltd
DOI: 10.1177/0265532210363144

Keywords

automated essay scoring; e-rater; educational measurement; ESL/EFL writing assessment; TOEFL writing; validity of writing assessment


E-rater® is an automated essay scoring system that uses natural language processing techniques to extract features from essays and to statistically model human holistic ratings. Educational Testing Service has investigated the use of e-rater, in conjunction with human ratings, to score one of the two writing tasks in the TOEFL iBT® writing section. In this article we describe the TOEFL iBT writing section and an e-rater model proposed to provide one of the two ratings for the Independent writing task. We discuss how the evidence for a process that uses both human and e-rater scoring is relevant to four components of a validity argument: (a) Evaluation - observations of performance on the writing task are scored to provide evidence of targeted writing skills; (b) Generalization - scores on the writing task provide estimates of expected scores over relevant parallel versions of the task and across raters; (c) Extrapolation - expected scores on the writing task are consistent with other measures of writing ability; and (d) Utilization - scores on the writing task are useful in educational contexts. Finally, we propose directions for future research that will strengthen the case for using complementary methods of scoring to improve the assessment of EFL writing.
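
To make the scoring approach concrete, the following is a minimal sketch of the general idea only, not e-rater itself: e-rater's actual feature set and statistical model are proprietary to ETS. The sketch assumes a few hypothetical surface features (the extract_features helper), ordinary least-squares regression as a stand-in statistical model, and a combined_score helper that averages one human rating with one machine rating, as in the hybrid scoring process the article investigates. Python with NumPy and scikit-learn is assumed.

    # Illustrative sketch only; features, data, and model are hypothetical
    # stand-ins, not ETS's e-rater.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def extract_features(essay: str) -> list[float]:
        # Toy surface features (real systems measure grammar, usage,
        # mechanics, style, organization, development, and more).
        words = essay.split()
        sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".")
                     if s.strip()]
        avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
        type_token = len({w.lower() for w in words}) / max(len(words), 1)
        return [float(np.log1p(len(words))),           # essay length
                avg_word_len,                          # average word length
                type_token,                            # vocabulary diversity
                len(words) / max(len(sentences), 1)]   # average sentence length

    # Hypothetical training data: essays paired with human holistic ratings.
    training_essays = [
        "Short answer.",
        "A longer, more developed response with varied vocabulary and "
        "several sentences. It elaborates and supports the thesis.",
    ]
    human_ratings = [1.0, 4.0]

    # Statistically model human holistic ratings from the extracted features.
    X = np.array([extract_features(e) for e in training_essays])
    y = np.array(human_ratings)
    model = LinearRegression().fit(X, y)

    def combined_score(essay: str, human_rating: float) -> float:
        # One human rating plus one machine rating, averaged: the
        # human/e-rater hybrid process the article discusses.
        machine_rating = float(model.predict([extract_features(essay)])[0])
        return (human_rating + machine_rating) / 2.0

    print(combined_score("A new essay submitted for scoring.", 3.0))

In an operational setting the machine score would also be checked against the human score, with a second human rater adjudicating large discrepancies; the averaging here is only the simplest version of combining the two ratings.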

Authors

Mary K. Enright, Thomas Quinlan (Educational Testing Service)