Article

Testing the Impact of Novel Assessment Sources and Machine Learning Methods on Predictive Outcome Modeling in Undergraduate Biology

Journal

JOURNAL OF SCIENCE EDUCATION AND TECHNOLOGY
Volume 30, Issue 2, Pages 193-209

Publisher

SPRINGER
DOI: 10.1007/s10956-020-09888-8

Keywords

Machine learning; Assessment; Predictive learning analytics; Concept inventories; Course- vs. institution-specific data sources; Introductory biology

Funding

  1. Howard Hughes Medical Institute Science Education Program


High levels of attrition characterize undergraduate science courses in the USA. Predictive analytics research seeks to build models that identify at-risk students and suggest interventions that enhance student success. This study examines whether incorporating a novel assessment type (concept inventories [CI]) and using machine learning (ML) methods (1) improves prediction quality, (2) enables successful prediction at earlier time points, and (3) suggests more actionable course-level interventions. A corpus of university- and course-level assessment and non-assessment variables (53 variables in total) from 3225 students (over six semesters) was gathered. Five ML methods were employed (two individual, three ensemble) at three time points (pre-course, week 3, week 6) to quantify predictive efficacy. Inclusion of course-specific CI data along with university-specific corpora significantly improved prediction performance. Ensemble ML methods, in particular the generalized linear model with elastic net (GLMNET), yielded significantly higher area under the curve (AUC) values compared with non-ensemble techniques. Logistic regression consistently achieved the poorest prediction performance. Surprisingly, increasing corpus size (i.e., the amount of historical data) did not meaningfully impact prediction success. We discuss the roles that novel assessment types and ML techniques may play in advancing predictive learning analytics and addressing attrition in undergraduate science education.
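The comparison the abstract describes, scoring a plain logistic regression against a GLMNET-style elastic-net learner by AUC, can be sketched as follows. This is not the authors' pipeline: the data here are synthetic stand-ins for the 53-variable student corpus, and scikit-learn's elastic-net logistic regression is used as an analog of GLMNET.

```python
# Hedged sketch: compare AUC of plain logistic regression vs an elastic-net
# (GLMNET-style) model on synthetic data shaped like the study's corpus
# (3225 students, 53 variables, minority "at-risk" class). Hypothetical data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the student corpus; class 1 = at-risk (minority).
X, y = make_classification(n_samples=3225, n_features=53, n_informative=10,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    # Elastic net mixes L1/L2 penalties (l1_ratio=0.5), as GLMNET does.
    "elastic_net": LogisticRegression(penalty="elasticnet", solver="saga",
                                      l1_ratio=0.5, C=1.0, max_iter=5000),
}
aucs = {name: roc_auc_score(y_te, m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
        for name, m in models.items()}
print(aucs)
```

On real course data one would also repeat this at each time point (pre-course, week 3, week 6) as predictors accumulate; AUC is used because it is insensitive to the class imbalance typical of attrition outcomes.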

