Article

A statistical methodology for analyzing co-occurrence data from a large sample

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 40, Issue 3, Pages 343-352

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2006.11.003

Keywords

associations; co-occurrence; two-way tables; volume test adjustments; p-value plot; large-scale testing

Funding

  1. NLM NIH HHS [R01 LM006910-04, R01 LM006910-06, R01 LM006910, R01 LM006910-05, R01 LM06910] Funding Source: Medline

Determining important associations among items in a large database is challenging because many hypotheses are tested simultaneously and because it is easy to select weak associations that are statistically but not clinically significant. Simply applying the χ² test to all possible pairs of items results in mostly inappropriate associations surpassing the traditional (α = .05, χ² = 3.84) threshold. One can choose a stricter threshold to find stronger associations, but the choice may be arbitrary. We combined the volume test of Diaconis and Efron with a p-value plot to select a more rigorous and less arbitrary threshold. The volume test adjusts the p-value of the χ²-statistic. A plot of adjusted p-values (1 - p versus N_p), where N_p is the number of test statistics with a p-value greater than p, should be linear if there are no true associations. The point where the plot deviates from a line can be used as a threshold. We used linear regression to select the threshold in a reproducible fashion. In one experiment, we found that the method selected a threshold similar to that previously obtained by manually reviewing associations. © 2006 Elsevier Inc. All rights reserved.
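
The p-value plot and regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The volume test adjustment of Diaconis and Efron is omitted, so unadjusted χ² p-values (1 degree of freedom) stand in for the adjusted ones. The function names, the parameters fit_upto and tol_frac, the deviation rule, and the simulated data are all assumptions made for illustration.

```python
import bisect
import math
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def chi2_sf_1df(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def p_value_plot(p_values):
    """Points (1 - p, N_p), sorted by 1 - p; N_p counts p-values strictly greater than p."""
    ps = sorted(p_values)
    m = len(ps)
    return [(1.0 - p, m - bisect.bisect_right(ps, p)) for p in reversed(ps)]


def fit_line(points):
    """Ordinary least-squares line through (x, y) points; returns (slope, intercept)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def select_threshold(p_values, fit_upto=0.5, tol_frac=0.1):
    """Return the largest p at which N_p rises above the fitted line, or None.

    Assumed rule (hypothetical, not taken from the paper): fit the line on the
    part of the plot with 1 - p <= fit_upto, where true associations should be
    rare, then report the first point beyond that region whose N_p exceeds the
    extrapolated line by more than tol_frac * (number of tests).
    """
    pts = p_value_plot(p_values)
    slope, intercept = fit_line([pt for pt in pts if pt[0] <= fit_upto])
    tol = tol_frac * len(pts)
    for x, n_p in pts:
        if x > fit_upto and n_p - (slope * x + intercept) > tol:
            return 1.0 - x
    return None


# Simulated input: 2000 null tests (uniform p-values) plus 500 strong associations.
random.seed(0)
p_values = [random.random() for _ in range(2000)]
p_values += [random.random() * 1e-4 for _ in range(500)]
threshold = select_threshold(p_values)
```

With no true associations the points lie close to a straight line through the origin, so select_threshold is expected to return None in most runs. The fit region and tolerance control how sensitive the rule is, and both would need tuning on real data.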
