Article

The Reproducibility of Statistical Results in Psychological Research: An Investigation Using Unpublished Raw Data

Journal

PSYCHOLOGICAL METHODS
Volume 26, Issue 5, Pages 527-546

Publisher

AMER PSYCHOLOGICAL ASSOC
DOI: 10.1037/met0000365

Keywords

reanalysis; reproducible research; reporting errors; p values; transparency

Funding

  1. KU Leuven Research Grant [C14/19/054]


The study investigated the reproducibility of major statistical conclusions from 46 articles published in 2012 in three APA journals, finding that 70% of statistical claims were successfully reproduced, with 7% of originally significant claims no longer significant. Successfully reproduced results were often the outcome of cumbersome and time-consuming trial-and-error work, suggesting that APA reporting style may make verification of statistical results difficult or impossible.

We investigated the reproducibility of the major statistical conclusions drawn in 46 articles published in 2012 in three APA journals. After identifying 232 key statistical claims, we tried to reproduce, for each claim, the test statistic, its degrees of freedom, and the corresponding p value, starting from the raw data provided by the authors and closely following the Method section of the article. Of the 232 claims, we were able to successfully reproduce 163 (70%), 18 of which only by deviating from the article's analytical description. Thirteen (7%) of the 185 claims deemed significant by the authors are no longer so. The reproduction successes were often the result of cumbersome and time-consuming trial-and-error work, suggesting that APA-style reporting in conjunction with raw data makes numerical verification at least hard, if not impossible. This article discusses the types of mistakes we could identify and the tediousness of our reproduction efforts in the light of a newly developed taxonomy for reproducibility. We then link our findings with other empirical research on this topic, give practical recommendations on how to achieve reproducibility, and discuss the challenges of large-scale reproducibility checks as well as promising ideas that could considerably increase the reproducibility of psychological research.

Translational Abstract

Reproducible findings, that is, findings that can be verified by an independent researcher using the same data and repeating the exact same calculations, are a pillar of empirical scientific research. We investigated the reproducibility of the major statistical conclusions drawn in 46 scientific articles from 2012. After identifying over 200 key statistical conclusions drawn in those articles, we tried to reproduce, for each conclusion, the underlying statistical results, starting from the raw data provided by the authors and closely following the descriptions in the article. We were unable to successfully reproduce the underlying statistical results for almost one third of the identified conclusions. Moreover, around 5% of these conclusions no longer hold. Successfully reproduced conclusions were often the result of cumbersome and time-consuming trial-and-error work, suggesting that the prevailing reporting style in psychology makes verification of statistical results through an independent reanalysis at least hard, if not impossible. This work discusses the types of mistakes we could identify and the tediousness of our reproduction efforts in the light of a newly developed taxonomy for reproducibility. We then link our findings with other empirical research on this topic, give practical recommendations on how to achieve reproducibility, and discuss the challenges of large-scale reproducibility checks as well as promising ideas that could considerably increase the reproducibility of psychological research.
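The two abstracts quote different percentages for what appear to be the same counts (7% versus "around 5%", 70% versus "almost one third" not reproduced). The short Python sketch below recomputes the proportions from the counts stated in the abstract. The variable names are illustrative, and relating "around 5%" to all 232 claims is one possible interpretation, not something the abstract states explicitly.

```python
# Counts taken from the abstract; names are illustrative only.
TOTAL_CLAIMS = 232                 # key statistical claims identified
REPRODUCED = 163                   # claims successfully reproduced
REPRODUCED_WITH_DEVIATION = 18     # reproduced only by deviating from the article
ORIGINALLY_SIGNIFICANT = 185       # claims deemed significant by the authors
NO_LONGER_SIGNIFICANT = 13         # of those, no longer significant on reanalysis


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


summary = {
    # Share of all claims that could be reproduced (reported as 70%).
    "reproduced": percent(REPRODUCED, TOTAL_CLAIMS),
    # Share that could not be reproduced ("almost one third").
    "not_reproduced": percent(TOTAL_CLAIMS - REPRODUCED, TOTAL_CLAIMS),
    # Share reproduced while following the article's description as written.
    "reproduced_as_described": percent(
        REPRODUCED - REPRODUCED_WITH_DEVIATION, TOTAL_CLAIMS
    ),
    # Lost significance, relative to originally significant claims (reported as 7%).
    "lost_significance_of_significant": percent(
        NO_LONGER_SIGNIFICANT, ORIGINALLY_SIGNIFICANT
    ),
    # Lost significance, relative to all claims (assumed basis of "around 5%").
    "lost_significance_of_all": percent(NO_LONGER_SIGNIFICANT, TOTAL_CLAIMS),
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

Under this reading, the figures in the two abstracts are consistent: they differ only in the denominator used.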
