Article

Identifying unreported links between ClinicalTrials.gov trial registrations and their published results

Journal

RESEARCH SYNTHESIS METHODS
Volume 13, Issue 3, Pages 342-352

Publisher

WILEY
DOI: 10.1002/jrsm.1545

Keywords

clinical trials; information retrieval; trial registration

Funding

  1. National Library of Medicine, National Institutes of Health [R01LM012976]


A substantial proportion of trial registrations are not linked to their corresponding published articles, which limits analyses and the tools that depend on those links. In this study, researchers developed a method that uses a classifier with distance metrics as features to identify missing links between trial registrations and articles. The proposed method outperformed the baseline method at ranking the correct article or registration, making it feasible to identify missing links in practice. The method has implications for improving the coupling of ClinicalTrials.gov and PubMed and for automating systematic review and evidence synthesis processes.
A substantial proportion of trial registrations are not linked to corresponding published articles, limiting analyses and new tools. Our aim was to develop a method for finding articles reporting the results of trials that are registered on ClinicalTrials.gov when they do not include metadata links. We used a set of 27,280 trial registration and article pairs to train and evaluate methods for identifying missing links in both directions: from articles to registrations and from registrations to articles. We trained a classifier with six distance metrics as feature representations to rank the correct article or registration, using recall@K to evaluate performance and compare to baseline methods. When identifying links from registrations to published articles, the classifier ranked the correct article first (recall@1) among 378,048 articles in 80.8% of evaluation cases, compared to 34.9% for the baseline method. Recall@10 was 85.1% compared to 60.7% for the baseline. When predicting links from articles to registrations, recall@1 was 83.4% for the classifier and 39.8% for the baseline. Recall@10 was 89.5% compared to 65.8% for the baseline. The proposed method improves on our baseline document similarity method enough to be feasible for identifying missing links in practice. Given a registration, a user checking 10 ranked articles can expect to identify the matching article in at least 85% of cases, if the trial has been published. The proposed method can be used to improve the coupling of ClinicalTrials.gov and PubMed, with applications related to automating systematic review and evidence synthesis processes.
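The recall@K figures above can be read as the fraction of queries (registrations or articles) whose true match appears among the top K ranked candidates. The following is a minimal sketch of that metric only, not the authors' implementation. The function name `recall_at_k`, the toy identifiers, and the rankings are all hypothetical and chosen for illustration.

```python
from typing import Dict, Sequence


def recall_at_k(ranked: Dict[str, Sequence[str]], truth: Dict[str, str], k: int) -> float:
    """Fraction of queries whose true match is within the top-k ranked candidates."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not truth:
        raise ValueError("truth must contain at least one query")
    hits = sum(
        1
        for query, target in truth.items()
        if target in ranked.get(query, [])[:k]
    )
    return hits / len(truth)


# Hypothetical toy data: each registration maps to candidate articles, best first.
ranked_articles = {
    "NCT-A": ["pmid-3", "pmid-1", "pmid-7"],
    "NCT-B": ["pmid-2", "pmid-9", "pmid-4"],
    "NCT-C": ["pmid-8", "pmid-6", "pmid-5"],
    "NCT-D": ["pmid-11", "pmid-12", "pmid-10"],
}
# Hypothetical ground truth: the one article known to report each registration.
true_article = {
    "NCT-A": "pmid-1",
    "NCT-B": "pmid-2",
    "NCT-C": "pmid-5",
    "NCT-D": "pmid-13",  # never ranked, so it counts as a miss at every k
}

r1 = recall_at_k(ranked_articles, true_article, 1)  # 0.25
r2 = recall_at_k(ranked_articles, true_article, 2)  # 0.5
r3 = recall_at_k(ranked_articles, true_article, 3)  # 0.75
```

In this toy setup, a reported recall@10 of 85.1% would correspond to the true article appearing somewhere in the first ten ranked candidates for about 85 of every 100 evaluated registrations.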


