4.7 Article

Predictive modeling of clinical trial terminations using feature engineering and embedding learning

Journal

SCIENTIFIC REPORTS
Volume 11, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41598-021-82840-x

Funding

  1. U.S. National Science Foundation [IIS-2027339, IIS-1763452, CNS-1828181]

The study uses machine learning to analyze terminated clinical trials, identifying common factors associated with trial termination and predicting termination with over 67% Balanced Accuracy and over 0.73 AUC.
In this study, we propose to use machine learning to understand terminated clinical trials. Our goal is to answer two fundamental questions: (1) what are the common factors/markers associated with terminated clinical trials? and (2) how can we accurately predict whether a clinical trial will be terminated? The answer to the first question gives stakeholders an effective way to understand the characteristics of terminated trials and to plan their own trials better; the answer to the second can directly estimate a trial's chance of success in order to minimize costs. Using 311,260 trials to build a testbed with 68,999 samples, we apply feature engineering to create 640 features reflecting clinical trial administration, eligibility, study information, criteria, etc. Feature ranking shows that a handful of features, such as trial eligibility, trial inclusion/exclusion criteria, and sponsor types, are related to clinical trial termination. Using sampling and ensemble learning, we achieve over 67% Balanced Accuracy and over 0.73 AUC (Area Under the Curve) in predicting clinical trial termination, indicating that machine learning can help achieve satisfactory prediction results for clinical trial studies.
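The two reported metrics, and the idea of sampling to balance classes, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels and scores, and the choice of random undersampling are all assumptions made for illustration, since the abstract does not say which sampling scheme or ensemble was used.

```python
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def undersample(labels, seed=0):
    """Indices with the majority class randomly cut down to the minority size.

    Assumption: plain random undersampling, chosen only for illustration.
    """
    rng = random.Random(seed)
    pos = [i for i, t in enumerate(labels) if t == 1]
    neg = [i for i, t in enumerate(labels) if t == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return sorted(minority + rng.sample(majority, len(minority)))

# Hypothetical toy data: 1 = terminated trial, 0 = completed trial.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.35, 0.05]  # made-up model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(round(balanced_accuracy(y_true, y_pred), 4))  # 0.6667
print(round(roc_auc(y_true, scores), 4))            # 0.9167
print(len(undersample(y_true)))                     # 4 (2 per class)
```

Balanced accuracy averages the per-class recalls, so a classifier that always predicts the majority class (completed trials) scores only 50%, which is why it is a more informative measure than plain accuracy on imbalanced data like this.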
