Article

A Meta-Analysis of Machine Learning-Based Science Assessments: Factors Impacting Machine-Human Score Agreements

Journal

JOURNAL OF SCIENCE EDUCATION AND TECHNOLOGY
Volume 30, Issue 3, Pages 361-379

Publisher

SPRINGER
DOI: 10.1007/s10956-020-09875-z

Keywords

Machine learning; Science assessment; Meta-analysis; Interrater reliability; Validity; Cohen's kappa; Artificial Intelligence

Funding

  1. National Science Foundation [DUE-1561159]

Abstract

This study conducted a meta-analysis of machine scoring in science assessment, identifying six factors that impact scoring success and showing that algorithm and subject domain have significant effects on scoring success.
Machine learning (ML) has been increasingly employed in science assessment to facilitate automatic scoring efforts, although with varying degrees of success (i.e., magnitudes of machine-human score agreements [MHAs]). Little work has empirically examined the factors that impact MHA disparities in this growing field, thus constraining the improvement of machine scoring capacity and its wide applications in science education. We performed a meta-analysis of 110 studies of MHAs in order to identify the factors most strongly contributing to scoring success (i.e., high Cohen's kappa [kappa]). We empirically examined six factors proposed as contributors to MHA magnitudes: algorithm, subject domain, assessment format, construct, school level, and machine supervision type. Our analyses of 110 MHAs revealed substantial heterogeneity in kappa (weighted mean = .64; range = .09-.97). Using three-level random-effects modeling, MHA score heterogeneity was explained by the variability both within publications (i.e., the assessment task level: 82.6%) and between publications (i.e., the individual study level: 16.7%). Our results also suggest that all six factors have significant moderator effects on scoring success magnitudes. Among these, algorithm and subject domain had significantly larger effects than the other factors, suggesting that technical features and assessment external features might be primary targets for improving MHAs and ML-based science assessments.
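The agreement metric the meta-analysis pools is Cohen's kappa, which corrects raw human-machine agreement for agreement expected by chance. A minimal stdlib sketch of the computation, using hypothetical human and machine scores on a 3-level rubric (illustrative data only, not drawn from the study):

```python
from collections import Counter

def cohens_kappa(human, machine):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    assert len(human) == len(machine) and human
    n = len(human)
    # Observed proportion of exact score agreements
    observed = sum(h == m for h, m in zip(human, machine)) / n
    # Chance agreement from each rater's marginal label frequencies
    ph, pm = Counter(human), Counter(machine)
    expected = sum((ph[c] / n) * (pm[c] / n) for c in set(human) | set(machine))
    return (observed - expected) / (1 - expected)

# Hypothetical rubric scores (0 = low, 2 = high) for ten responses
human   = [0, 1, 2, 2, 1, 0, 2, 1, 1, 0]
machine = [0, 1, 2, 1, 1, 0, 2, 2, 1, 0]
print(round(cohens_kappa(human, machine), 3))  # → 0.697
```

Here observed agreement is .80 but chance agreement is .34, giving kappa ≈ .70, i.e., slightly above the weighted mean MHA of .64 reported in the abstract.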
