Article

The practical effect of batch on genomic prediction

Publisher

WALTER DE GRUYTER GMBH
DOI: 10.1515/1544-6115.1766

Keywords

batch effects; prediction; microarrays; reproducibility; research design

Funding

  1. NCRR NIH HHS [R01 RR021967] Funding Source: Medline
  2. NIGMS NIH HHS [R01 GM103552, R01 GM083084] Funding Source: Medline

Measurements from microarrays and other high-throughput technologies are susceptible to non-biological artifacts like batch effects. It is known that batch effects can alter or obscure the set of significant results and biological conclusions in high-throughput studies. Here we examine the impact of batch effects on predictors built from genomic technologies. To investigate batch effects, we collected publicly available gene expression measurements with known outcomes, and estimated batches using date. Using these data we show (1) the impact of batch effects on prediction depends on the correlation between outcome and batch in the training data, and (2) removing expression measurements most affected by batch before building predictors may improve the accuracy of those predictors. These results suggest that (1) training sets should be designed to minimize correlation between batches and outcome, and (2) methods for identifying batch-affected probes should be developed to improve prediction results for studies with high correlation between batches and outcome.
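The two recommendations in the abstract can be illustrated with a small sketch. It is not the authors' method. It assumes Cramér's V as one possible measure of batch-outcome correlation and a per-probe one-way ANOVA F-statistic as one possible measure of how strongly a probe is affected by batch. The function names (`cramers_v`, `batch_f_statistic`, `drop_batch_affected`), the `frac` parameter, and the simulated data are all hypothetical.

```python
import numpy as np


def cramers_v(batch, outcome):
    """Cramér's V between two label vectors (0 = independent, 1 = fully confounded)."""
    _, b = np.unique(np.asarray(batch).ravel(), return_inverse=True)
    _, o = np.unique(np.asarray(outcome).ravel(), return_inverse=True)
    # Contingency table of batch (rows) by outcome (columns).
    table = np.zeros((b.max() + 1, o.max() + 1))
    np.add.at(table, (b, o), 1)
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    k = min(table.shape) - 1
    return float(np.sqrt(chi2 / (n * k))) if k > 0 else 0.0


def batch_f_statistic(X, batch):
    """One-way ANOVA F-statistic per probe; X is samples x probes.

    Assumes at least two batches, more samples than batches, and
    non-zero within-batch variance for every probe.
    """
    X = np.asarray(X, dtype=float)
    labels, b = np.unique(np.asarray(batch).ravel(), return_inverse=True)
    n, k = X.shape[0], labels.size
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for g in range(k):
        Xg = X[b == g]
        batch_mean = Xg.mean(axis=0)
        ss_between += Xg.shape[0] * (batch_mean - grand_mean) ** 2
        ss_within += ((Xg - batch_mean) ** 2).sum(axis=0)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def drop_batch_affected(X, batch, frac=0.1):
    """Drop the fraction `frac` of probes with the largest batch F-statistic."""
    X = np.asarray(X, dtype=float)
    f = batch_f_statistic(X, batch)
    n_drop = int(round(frac * X.shape[1]))
    keep = np.sort(np.argsort(f)[: X.shape[1] - n_drop])
    return X[:, keep], keep


# Simulated example: 40 samples, 50 probes, two batches of 20 samples each.
rng = np.random.default_rng(0)
batch = np.repeat([0, 1], 20)
confounded_outcome = batch.copy()        # outcome identical to batch
balanced_outcome = np.tile([0, 1], 20)   # outcome split evenly within each batch

X = rng.normal(size=(40, 50))
X[batch == 1, :5] += 3.0                 # first five probes carry a batch shift

v_confounded = cramers_v(batch, confounded_outcome)
v_balanced = cramers_v(batch, balanced_outcome)
X_kept, keep = drop_batch_affected(X, batch, frac=0.1)
```

In a balanced design the association is zero, which is the situation the first recommendation aims for. When batch and outcome are strongly correlated, a filter like this one can also discard probes that carry real outcome signal, so any gain in accuracy would need to be checked on held-out data from a different batch.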
