4.6 Article

An evaluation of statistical methods for DNA methylation microarray data analysis

Journal

BMC BIOINFORMATICS
Volume 16, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/s12859-015-0641-x

Keywords

DNA methylation; Power; Stability

Funding

  1. University of Rochester's Clinical and Translational Science Award (CTSA) from National Center for Advancing Translational Sciences of National Institutes of Health [UL1 TR000042]
  2. Center for Biomedical Research Excellence (COBRE) [P20GM103516]


Background: DNA methylation offers an excellent example for elucidating how epigenetic information affects gene expression. Beta values and M values are commonly used to quantify DNA methylation. Statistical methods applicable to DNA methylation data analysis include the Wilcoxon rank sum test, the t-test, the Kolmogorov-Smirnov test, the permutation test, the empirical Bayes method, and the bump hunting method. Nonetheless, selecting an optimal statistical method can be challenging when different methods generate inconsistent results from the same data set.

Results: We compared six statistical approaches relevant to DNA methylation microarray analysis in terms of false discovery rate (FDR) control, statistical power, and stability, using simulation studies and real data examples. Differences between beta values and M values were observed only when methylation levels were correlated across CpG loci. For small sample sizes (n = 3 or 6 in each group), both the empirical Bayes and bump hunting methods showed appropriate FDR control and the highest power when methylation levels across CpG loci were independent. Only the bump hunting method showed appropriate FDR control and the highest power when methylation levels across CpG loci were correlated. For medium (n = 12 in each group) and large (n = 24 in each group) sample sizes, all methods compared had similar power, except for the permutation test whenever the proportion of differentially methylated loci was low. For all sample sizes, the bump hunting method had the lowest stability, measured as the standard deviation of total discoveries, whenever the proportion of differentially methylated loci was large. The apparent test power comparisons based on raw p-values from DNA methylation studies on ovarian cancer and rheumatoid arthritis were consistent with the results of the simulation studies. Overall, these results provide guidance for selecting an optimal statistical method under different scenarios.

Conclusions: For DNA methylation studies with small sample sizes, the bump hunting method and the empirical Bayes method are recommended when DNA methylation levels across CpG loci are independent, while only the bump hunting method is recommended when DNA methylation levels are correlated across CpG loci. All methods are acceptable for medium or large sample sizes.
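
For readers unfamiliar with the two scales mentioned in the Background, the following is a minimal Python sketch of the commonly used relationship between them, M = log2(beta / (1 - beta)). It is an illustration only and is not taken from the paper's own code; the function names `beta_to_m` and `m_to_beta` and the clamping constant `eps` are assumptions introduced here.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a beta value in [0, 1] to an M value with a base-2 logit."""
    # Clamp away from 0 and 1 so the logarithm stays finite.
    # eps is an illustrative choice, not a value from the paper.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Convert an M value back to the beta scale (inverse of the logit)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


if __name__ == "__main__":
    # Print a few beta values alongside their M-value counterparts.
    for beta in (0.1, 0.5, 0.8):
        print(beta, beta_to_m(beta))
```

Beta values are bounded between 0 and 1, whereas M values are unbounded, which is one reason the two scales can behave differently under the same statistical test.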
