Article

Training sample selection: Impact on screening automation in diagnostic test accuracy reviews

Journal

RESEARCH SYNTHESIS METHODS
Volume 12, Issue 6, Pages 831-841

Publisher

WILEY
DOI: 10.1002/jrsm.1518

Keywords

cosine similarity; computerised support; machine learning; screening automation; training sample selection

Funding

  1. SURF Foundation

This paper introduces an approach to select data for model training and compares it with established methods using 50 Cochrane diagnostic test accuracy reviews. The study suggests that models perform best with a larger number of reviews in the training set and when the research subject of the target review is similar to other reviews in the dataset.
When performing a systematic review, researchers screen the articles retrieved by a broad search strategy one by one, which is time-consuming. Computerised support for this screening process has been applied with varying success, partly because models that predict inclusion depend on large amounts of training data. In this paper, we present an approach for choosing which data to use in model training and compare it with established approaches. We used a dataset of 50 Cochrane diagnostic test accuracy reviews, each of which served in turn as the target review. From the remaining 49 reviews, we selected those whose clinical topic most closely resembled that of the target review, as measured by cosine similarity. Included and excluded studies from the selected reviews were then used to develop our prediction models. The performance of models trained on the selected reviews was compared with that of models trained on studies from all available reviews. The prediction models performed best with a larger number of reviews in the training set and on target reviews whose research subject was similar to that of other reviews in the dataset. Our approach using cosine similarity may reduce the computational cost of model training and the duration of the screening process.
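The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how each review's clinical topic is represented, so this example assumes plain bag-of-words term counts over a short topic description. The function names, the review identifiers, and the topic texts are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def select_training_reviews(target_id, topics, k):
    """Return the ids of the k reviews most similar to the target review."""
    target_vec = term_vector(topics[target_id])
    scored = [
        (cosine_similarity(target_vec, term_vector(text)), review_id)
        for review_id, text in topics.items()
        if review_id != target_id  # the target never trains its own model
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [review_id for _, review_id in scored[:k]]


# Hypothetical topic descriptions, invented for illustration only.
topics = {
    "r1": "ultrasound for diagnosis of appendicitis",
    "r2": "computed tomography for diagnosis of appendicitis",
    "r3": "rapid antigen tests for malaria",
    "r4": "microscopy for malaria in endemic regions",
}
selected = select_training_reviews("r1", topics, k=1)
print(selected)
```

In this toy setup the included and excluded studies of the selected reviews would then form the training set for the target review's prediction model. A real pipeline would likely use a weighted representation such as TF-IDF and remove stop words, but that choice is an assumption here as well.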