4.7 Article

Characterizing the properties of bisulfite sequencing data: maximizing power and sensitivity to identify between-group differences in DNA methylation

Journal

BMC GENOMICS
Volume 22, Issue 1

Publisher

BMC
DOI: 10.1186/s12864-021-07721-z

Keywords

DNA methylation; Bisulfite sequencing; RRBS; Epigenetics; Power; Read depth; Sample size

Funding

  1. BBSRC CASE PhD studentship
  2. UK Medical Research Council [MR/R005176/1]
  3. Medical Research Council (MRC) Proximity to Discovery: Industry Engagement Fund (Precision Medicine Exeter Innovation Platform) [MC_PC_14127]
  4. Alzheimer's Research UK [ARUK-PG2018B-016]
  5. Wellcome Trust Multi User Equipment Award [WT101650MA]
  6. Medical Research Council (MRC) Clinical Infrastructure Funding [MR/M008924/1]


The combination of sodium bisulfite treatment with highly-parallel sequencing is commonly used to quantify DNA methylation levels. Factors such as read depth, sample size, and DNA methylation differences between groups all influence the power to detect differences. A tool called POWEREDBiSeq has been developed to predict study-specific power for identifying DNA methylation differences, taking into account read depth filtering parameters and sample size requirements.
Background: The combination of sodium bisulfite treatment with highly-parallel sequencing is a common method for quantifying DNA methylation across the genome. The power to detect between-group differences in DNA methylation using bisulfite-sequencing approaches is influenced by both experimental (e.g. read depth, missing data and sample size) and biological (e.g. mean level of DNA methylation and difference between groups) parameters. There is, however, no consensus about the optimal thresholds for filtering bisulfite sequencing data with implications for the reproducibility of findings in epigenetic epidemiology.

Results: We used a large reduced representation bisulfite sequencing (RRBS) dataset to assess the distribution of read depth across DNA methylation sites and the extent of missing data. To investigate how various study variables influence power to identify DNA methylation differences between groups, we developed a framework for simulating bisulfite sequencing data. As expected, sequencing read depth, group size, and the magnitude of DNA methylation difference between groups all impacted upon statistical power. The influence on power was not dependent on one specific parameter, but reflected the combination of study-specific variables. As a resource to the community, we have developed a tool, POWEREDBiSeq, which utilizes our simulation framework to predict study-specific power for the identification of DNAm differences between groups, taking into account user-defined read depth filtering parameters and the minimum sample size per group.

Conclusions: Our data-driven approach highlights the importance of filtering bisulfite-sequencing data by minimum read depth and illustrates how the choice of threshold is influenced by the specific study design and the expected differences between groups being compared. The POWEREDBiSeq tool, which can be applied to different types of bisulfite sequencing data (e.g. RRBS, whole genome bisulfite sequencing (WGBS), targeted bisulfite sequencing and amplicon-based bisulfite sequencing), can help users identify the level of data filtering needed to optimize power and aims to improve the reproducibility of bisulfite sequencing studies.
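
The dependence of power on read depth, group size and effect size can be illustrated with a small Monte Carlo sketch. This is not the POWEREDBiSeq implementation or the authors' simulation framework; it is a simplified toy model under stated assumptions: a single methylation site, every sample sequenced at the same depth, binomial sampling of methylated reads as the only source of noise (no biological variability between samples), and a Welch-style statistic compared against a normal critical value, which is slightly anti-conservative for small groups. The function names `observed_methylation` and `estimate_power` and all parameter names are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def observed_methylation(true_level, depth, rng):
    """Proportion of methylated reads at one site in one sample.

    Assumption: each read is methylated independently with
    probability `true_level` (binomial sampling noise only).
    """
    methylated = sum(rng.random() < true_level for _ in range(depth))
    return methylated / depth


def estimate_power(mean_a, mean_b, depth, n_per_group,
                   n_sims=1000, alpha=0.05, seed=0):
    """Fraction of simulated studies that detect the group difference."""
    if n_per_group < 2:
        raise ValueError("need at least two samples per group")
    rng = random.Random(seed)
    # Two-sided critical value from the normal approximation (an assumption;
    # a t distribution would be more accurate for small groups).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    detected = 0
    for _ in range(n_sims):
        a = [observed_methylation(mean_a, depth, rng) for _ in range(n_per_group)]
        b = [observed_methylation(mean_b, depth, rng) for _ in range(n_per_group)]
        diff = mean(a) - mean(b)
        se = ((variance(a) + variance(b)) / n_per_group) ** 0.5
        if se == 0:
            # No within-group spread: call it detected only if the means differ.
            significant = diff != 0
        else:
            significant = abs(diff) / se > z_crit
        detected += significant
    return detected / n_sims


if __name__ == "__main__":
    # Illustrative values only: 40% vs 60% methylation, 10 samples per group.
    for depth in (5, 10, 30):
        print(depth, estimate_power(0.4, 0.6, depth=depth, n_per_group=10))
```

In this toy model, raising either the read depth or the number of samples per group shrinks the standard error of the between-group difference, so no single parameter determines power on its own, which matches the qualitative point made in the Results.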
