Article

Testing the Impact of Novel Assessment Sources and Machine Learning Methods on Predictive Outcome Modeling in Undergraduate Biology

Journal

JOURNAL OF SCIENCE EDUCATION AND TECHNOLOGY
Volume 30, Issue 2, Pages 193-209

Publisher

SPRINGER
DOI: 10.1007/s10956-020-09888-8

Keywords

Machine learning; Assessment; Predictive learning analytics; Concept inventories; Course- vs. institution-specific data sources; Introductory biology

Funding

  1. Howard Hughes Medical Institute Science Education Program

This study explores whether incorporating concept inventories and using machine learning (ML) methods helps predict and address attrition in undergraduate science courses. The results show that including course-specific data significantly improves prediction performance, with ensemble ML methods yielding higher AUC values than non-ensemble techniques. Logistic regression performed worst, and increasing corpus size did not meaningfully affect prediction success. The study discusses the potential roles of novel assessment types and ML techniques in enhancing predictive learning analytics and reducing attrition in undergraduate science education.
High levels of attrition characterize undergraduate science courses in the USA. Predictive analytics research seeks to build models that identify at-risk students and suggest interventions that enhance student success. This study examines whether incorporating a novel assessment type (concept inventories [CI]) and using machine learning (ML) methods (1) improves prediction quality, (2) enables successful prediction at an earlier time point, and (3) suggests more actionable course-level interventions. A corpus of university and course-level assessment and non-assessment variables (53 variables in total) from 3225 students (over six semesters) was gathered. Five ML methods (two individual, three ensemble) were employed at three time points (pre-course, week 3, week 6) to quantify predictive efficacy. Inclusion of course-specific CI data along with university-specific corpora significantly improved prediction performance. Ensemble ML methods, in particular the generalized linear model with elastic net (GLMNET), yielded significantly higher area under the curve (AUC) values than non-ensemble techniques. Logistic regression consistently achieved the poorest prediction performance. Surprisingly, increasing corpus size (i.e., amount of historical data) did not meaningfully impact prediction success. We discuss the roles that novel assessment types and ML techniques may play in advancing predictive learning analytics and addressing attrition in undergraduate science education.
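
The abstract compares models by their area under the ROC curve (AUC). As a rough illustration of what that metric measures, the sketch below computes AUC from the rank-sum identity: the fraction of (at-risk, not-at-risk) student pairs in which the at-risk student receives the higher predicted risk. The labels, scores, and the names `auc`, `model_a`, and `model_b` are hypothetical and are not taken from the study; the outcome coding is also an assumption.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for the positive class (here, assumed to mean an at-risk
            student), 0 otherwise.
    scores: model-predicted risk; higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes for six students (1 = at risk, 0 = not at risk).
labels = [1, 0, 1, 0, 0, 1]
# Hypothetical risk scores from two models at the same time point.
model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]  # ranks every at-risk student highest
model_b = [0.6, 0.5, 0.3, 0.4, 0.7, 0.8]  # misranks several students

print(auc(labels, model_a))
print(auc(labels, model_b))
```

An AUC of 1.0 means every at-risk student is ranked above every other student, while 0.5 corresponds to chance-level ranking. That is why a higher AUC at an earlier time point (pre-course versus week 3 or week 6) would indicate a more useful early-warning model.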
