Article

Comparing machine and human reviewers to evaluate the risk of bias in randomized controlled trials

Journal

RESEARCH SYNTHESIS METHODS
Volume 11, Issue 3, Pages 484-493

Publisher

WILEY
DOI: 10.1002/jrsm.1398

Keywords

artificial intelligence; health technology assessment (HTA); inter-rater reliability; randomized controlled trial; risk of bias; systematic review

Funding

  1. Alberta Innovates - Health Solutions
  2. Canadian Institutes of Health Research
  3. Government of Alberta
  4. Institute of Health Economics
  5. Physiotherapy Foundation of Canada
  6. Knowledge translation initiative grant

Abstract

Background: Evidence from new health technologies is growing, along with demands for evidence to inform policy decisions, creating challenges in completing health technology assessments (HTAs) and systematic reviews (SRs) in a timely manner. Software can reduce the time and burden by automating parts of the process, but evidence validating such software is limited. We tested the accuracy of RobotReviewer, a semi-autonomous risk of bias (RoB) assessment tool, and its agreement with human reviewers.

Methods: Two reviewers independently conducted RoB assessments on a sample of randomized controlled trials (RCTs), and their consensus ratings were compared with those generated by RobotReviewer. Agreement with the human reviewers was assessed using percent agreement and weighted kappa (κ). The accuracy of RobotReviewer was also assessed by calculating sensitivity, specificity, and the area under the curve, with the human reviewers' consensus ratings as the reference standard.

Results: The study included 372 RCTs. Inter-rater reliability ranged from κ = -0.06 (no agreement) for blinding of participants and personnel to κ = 0.62 (good agreement) for random sequence generation (excluding overall RoB). RobotReviewer supported a high percentage of its RoB assessments with irrelevant quotations for blinding of participants and personnel (72.6%), blinding of outcome assessment (70.4%), and allocation concealment (54.3%).

Conclusion: RobotReviewer can assist with RoB assessment of RCTs but cannot replace human evaluation. Reviewers should therefore check and validate RobotReviewer's RoB assessments against the original article whenever the supporting quotations it provides are not relevant, in line with the developers' own recommendation.
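As a rough illustration of the agreement and accuracy metrics named in the Methods, the Python sketch below computes percent agreement, weighted kappa, sensitivity, specificity, and AUC from hypothetical ratings. The sample data, the numeric coding of Cochrane-style RoB judgments, the linear weighting for kappa, and the low-vs-not-low dichotomization are all assumptions for illustration; the paper does not report its computational details.

  # A minimal sketch (not the authors' code) of the metrics described above.
  # Ratings are hypothetical RoB judgments: 0 = low, 1 = unclear, 2 = high risk.
  import numpy as np
  from sklearn.metrics import cohen_kappa_score, confusion_matrix, roc_auc_score

  human = np.array([0, 2, 1, 0, 2, 0, 1, 2])  # human reviewers' consensus
  robot = np.array([0, 2, 0, 0, 2, 1, 1, 2])  # RobotReviewer ratings

  # Percent agreement: share of trials with identical ratings.
  percent_agreement = np.mean(human == robot) * 100

  # Weighted kappa: chance-corrected agreement on the ordinal scale
  # (linear weights assumed; the paper does not state the weighting).
  kappa = cohen_kappa_score(human, robot, weights="linear")

  # Dichotomize (1 = any RoB concern) and treat the human consensus
  # as the reference standard for RobotReviewer's accuracy.
  y_true, y_pred = (human > 0).astype(int), (robot > 0).astype(int)
  tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
  sensitivity = tp / (tp + fn)
  specificity = tn / (tn + fp)
  auc = roc_auc_score(y_true, y_pred)  # with hard labels, AUC = (sens + spec) / 2

  print(f"agreement={percent_agreement:.1f}%  kappa={kappa:.2f}  "
        f"sens={sensitivity:.2f}  spec={specificity:.2f}  AUC={auc:.2f}")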
