Article

SUBSAMPLING METHODS FOR GENOMIC INFERENCE

Journal

ANNALS OF APPLIED STATISTICS
Volume 4, Issue 4, Pages 1660-1697

Publisher

INST MATHEMATICAL STATISTICS
DOI: 10.1214/10-AOAS363

Keywords

Genome Structure Correction (GSC); subsampling; piecewise stationary model; segmentation-block bootstrap; feature overlap

Funding

  1. NIH [U01 HG004695, 5R01GM075312]

Abstract

Large-scale statistical analysis of data sets associated with genome sequences plays an important role in modern biology. A key component of such statistical analyses is the computation of p-values and confidence bounds for statistics defined on the genome. Currently such computation is commonly achieved through ad hoc simulation measures. The method of randomization, which is at the heart of these simulation procedures, can significantly affect the resulting statistical conclusions. Most simulation schemes introduce a variety of hidden assumptions regarding the nature of the randomness in the data, resulting in a failure to capture biologically meaningful relationships. To address the need for a method of assessing the significance of observations within large-scale genomic studies, where there often exists a complex dependency structure between observations, we propose a unified solution built upon a data subsampling approach. We propose a piecewise stationary model for genome sequences and show that the subsampling approach gives correct answers under this model. We illustrate the method on three simulation studies and two real data examples.
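
To make the idea of subsampling under dependence concrete, the following is a minimal sketch of plain block subsampling for a feature-overlap statistic. It is not the Genome Structure Correction procedure from the paper: it assumes a single stationary segment (no segmentation step), and the function names, the simulated Markov-chain tracks, and the block length are all illustrative assumptions. The statistic is recomputed on randomly placed contiguous blocks, and the spread across blocks is rescaled by sqrt(block_len / n) to approximate the standard error on the full sequence.

```python
import numpy as np


def make_track(n, p_stay, rng):
    """Simulate a clustered binary feature as a two-state Markov chain.

    Hypothetical stand-in for a genomic annotation track: the state
    switches at each position with probability 1 - p_stay.
    """
    flips = rng.random(n) > p_stay          # True where the state switches
    return (np.cumsum(flips) % 2).astype(bool)


def overlap_stat(a, b):
    """Observed overlap fraction minus the overlap expected under independence."""
    return np.mean(a & b) - np.mean(a) * np.mean(b)


def block_subsample_se(a, b, block_len, n_blocks, rng):
    """Approximate the standard error of overlap_stat on the full tracks.

    Contiguous blocks keep the local dependence structure intact, which
    position-wise shuffling would destroy.
    """
    n = len(a)
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    stats = np.array(
        [overlap_stat(a[s:s + block_len], b[s:s + block_len]) for s in starts]
    )
    # Rescale block-level variability to the full sequence length.
    return stats.std(ddof=1) * np.sqrt(block_len / n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = make_track(20_000, 0.95, rng)
    b = make_track(20_000, 0.95, rng)       # generated independently of a
    se = block_subsample_se(a, b, block_len=2_000, n_blocks=200, rng=rng)
    z = overlap_stat(a, b) / se
    print(f"overlap statistic z-score: {z:.2f}")
```

The block length trades bias against variance: blocks must be long enough to span the dependence range of the features but short enough that many distinct blocks fit in the sequence. The value used here is an arbitrary choice for the simulated data.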
