Article

Correlated z-Values and the Accuracy of Large-Scale Statistical Estimates

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 105, Issue 491, Pages 1042-1055

Publisher

AMER STATISTICAL ASSOC
DOI: 10.1198/jasa.2010.tm09129

Keywords

Acceleration; Correlation penalty; Empirical process; Mehler's identity; Nonnull z-values; RMS correlation

Funding

  1. NIH [8R01 EB002784]
  2. NSF [DMS0505673]
  3. Division Of Mathematical Sciences
  4. Direct For Mathematical & Physical Scien [0854973] Funding Source: National Science Foundation

Abstract

We consider large-scale studies in which there are hundreds or thousands of correlated cases to investigate, each represented by its own normal variate, typically a z-value. A familiar example is provided by a microarray experiment comparing healthy with sick subjects' expression levels for thousands of genes. This paper concerns the accuracy of summary statistics for the collection of normal variates, such as their empirical cdf or a false discovery rate statistic. It might seem that we must estimate an N by N correlation matrix, where N is the number of cases, but our main result shows that this is not necessary: good accuracy approximations can be based on the root mean square correlation over all N(N - 1)/2 pairs, a quantity often easily estimated. A second result shows that z-values closely follow normal distributions even under nonnull conditions, supporting application of the main theorem. Practical application of the theory is illustrated for a large leukemia microarray study.
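To make the root mean square correlation concrete, here is a minimal Python sketch of a naive estimate computed from a data matrix. It is an illustration only, not the estimator used in the paper: the function name `rms_correlation`, the layout of `X` (one row per case, one column per sample), and the use of plain sample correlations are all assumptions made here. The naive value tends to run high when there are few samples, because each sample correlation is itself noisy, and no correction for that is attempted.

```python
import numpy as np

def rms_correlation(X):
    """Naive root mean square of the sample correlations over all
    N(N - 1)/2 pairs of rows of X (hypothetical helper, not from the paper).

    X has shape (N cases, n samples). Rows are assumed to be non-constant.
    """
    X = np.asarray(X, dtype=float)
    N, n = X.shape
    # Center each row and scale it to unit length, so that the dot product
    # of two rows equals their sample correlation.
    Z = X - X.mean(axis=1, keepdims=True)
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    # The sum of all squared entries of the N x N correlation matrix Z Z^T
    # equals the sum of squared entries of the n x n matrix Z^T Z, so the
    # N x N matrix never has to be formed.
    G = Z.T @ Z
    total_sq = np.sum(G * G)
    # Remove the N diagonal entries (each equal to 1); clip rounding error.
    off_diag_sq = max(total_sq - N, 0.0)
    # N(N - 1) ordered off-diagonal pairs; the mean over unordered pairs
    # is the same value.
    return np.sqrt(off_diag_sq / (N * (N - 1)))
```

For a matrix with thousands of rows and only dozens of columns, the n by n shortcut keeps memory use small; a direct computation with `np.corrcoef` gives the same number and is a convenient cross-check on small inputs.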

