Article

Comparing machine and human reviewers to evaluate the risk of bias in randomized controlled trials

Journal

RESEARCH SYNTHESIS METHODS
Volume 11, Issue 3, Pages 484-493

Publisher

WILEY
DOI: 10.1002/jrsm.1398

Keywords

artificial intelligence; health technology assessment (HTA); inter-rater reliability; randomized controlled trial; risk of bias; systematic review

Funding

  1. Alberta Innovates - Health Solutions
  2. Canadian Institutes of Health Research
  3. Government of Alberta
  4. Institute of Health Economics
  5. Physiotherapy Foundation of Canada
  6. Knowledge translation initiative grant

Abstract

Background: Evidence from new health technologies is growing, along with demands for evidence to inform policy decisions, creating challenges in completing health technology assessments (HTAs) and systematic reviews (SRs) in a timely manner. Software can reduce the time and burden by automating parts of the process, but evidence validating such software is limited. We tested the accuracy of RobotReviewer, a semi-autonomous risk of bias (RoB) assessment tool, and its agreement with human reviewers.

Methods: Two reviewers independently conducted RoB assessments on a sample of randomized controlled trials (RCTs), and their consensus ratings were compared with those generated by RobotReviewer. Agreement with the human reviewers was assessed using percent agreement and weighted kappa (κ). The accuracy of RobotReviewer was also assessed by calculating sensitivity, specificity, and the area under the curve, with the human reviewers' consensus ratings as the reference standard.

Results: The study included 372 RCTs. Inter-rater reliability ranged from κ = -0.06 (no agreement) for blinding of participants and personnel to κ = 0.62 (good agreement) for random sequence generation (excluding overall RoB). RobotReviewer supported a high percentage of its RoB assessments with irrelevant quotations for blinding of participants and personnel (72.6%), blinding of outcome assessment (70.4%), and allocation concealment (54.3%).

Conclusion: RobotReviewer can assist with RoB assessment of RCTs but cannot replace human evaluation. Reviewers should therefore check and validate RobotReviewer's RoB assessments against the original article whenever the supporting quotations it provides are not relevant, in line with the developers' own recommendation.
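As a rough illustration of the agreement and accuracy metrics named in the Methods, the Python sketch below computes percent agreement, weighted kappa, sensitivity, specificity, and AUC from hypothetical ratings. The sample data, the numeric coding of Cochrane-style RoB judgments, the linear weighting for kappa, and the low-vs-not-low dichotomization are all assumptions for illustration; the paper does not report its computational details.

  # A minimal sketch (not the authors' code) of the metrics described above.
  # Ratings are hypothetical RoB judgments: 0 = low, 1 = unclear, 2 = high risk.
  import numpy as np
  from sklearn.metrics import cohen_kappa_score, confusion_matrix, roc_auc_score

  human = np.array([0, 2, 1, 0, 2, 0, 1, 2])  # human reviewers' consensus
  robot = np.array([0, 2, 0, 0, 2, 1, 1, 2])  # RobotReviewer ratings

  # Percent agreement: share of trials with identical ratings.
  percent_agreement = np.mean(human == robot) * 100

  # Weighted kappa: chance-corrected agreement on the ordinal scale
  # (linear weights assumed; the paper does not state the weighting).
  kappa = cohen_kappa_score(human, robot, weights="linear")

  # Dichotomize (1 = any RoB concern) and treat the human consensus
  # as the reference standard for RobotReviewer's accuracy.
  y_true, y_pred = (human > 0).astype(int), (robot > 0).astype(int)
  tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
  sensitivity = tp / (tp + fn)
  specificity = tn / (tn + fp)
  auc = roc_auc_score(y_true, y_pred)  # with hard labels, AUC = (sens + spec) / 2

  print(f"agreement={percent_agreement:.1f}%  kappa={kappa:.2f}  "
        f"sens={sensitivity:.2f}  spec={specificity:.2f}  AUC={auc:.2f}")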
