Article

Bigmelon: tools for analysing large DNA methylation datasets

Journal

BIOINFORMATICS
Volume 35, Issue 6, Pages 981-986

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bty713

Funding

  1. Economic and Social Research Council [ES/N00812X/1]
  2. Essex University
  3. ESRC [ES/M008592/1]
  4. Medical Research Council [MR/K013807/1]
  5. ESRC [ES/N00812X/1, ES/M008592/1, ES/S008349/1] Funding Source: UKRI
  6. MRC [G1001799, MR/N01104X/2, MR/N01104X/1, MR/K013807/1] Funding Source: UKRI

Motivation: The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result there is an increasing need for computationally efficient methodologies to perform meaningful analysis on high dimensional data.

Results: Here we introduce the bigmelon R package, which provides a memory efficient workflow that enables users to perform the complex, large scale analyses required in epigenome wide association studies (EWAS) without the need for large RAM. Building on top of the CoreArray Genomic Data Structure file format and libraries packaged in the gdsfmt package, we provide a practical workflow that facilitates the reading-in, preprocessing, quality control and statistical analysis of DNA methylation data. We demonstrate the capabilities of the bigmelon package using a large dataset consisting of 1193 human blood samples from the Understanding Society: UK Household Longitudinal Study, assayed on the EPIC micro-array platform.

Availability and implementation: The bigmelon package is available on Bioconductor (http://bioconductor.org/packages/bigmelon/). The Understanding Society dataset is available at https://www.understandingsociety.ac.uk/about/health/data upon request.

Supplementary information: Supplementary data are available at Bioinformatics online.
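The memory-efficient idea in the Results can be illustrated with a minimal, language-agnostic sketch: keep the probe-by-sample matrix on disk and read only a small block of rows at a time. The Python below is not the bigmelon or gdsfmt API. The flat binary file layout and the helper names `write_matrix` and `row_means_chunked` are assumptions made purely for illustration.

```python
import os
import tempfile
from array import array


def write_matrix(path, rows):
    """Write a row-major matrix of doubles to disk, one row at a time."""
    with open(path, "wb") as fh:
        for row in rows:
            array("d", row).tofile(fh)


def row_means_chunked(path, n_cols, chunk_rows=2):
    """Compute per-row means while holding at most chunk_rows rows in memory."""
    itemsize = array("d").itemsize
    n_rows = os.path.getsize(path) // (n_cols * itemsize)
    means = []
    with open(path, "rb") as fh:
        for first in range(0, n_rows, chunk_rows):
            count = min(chunk_rows, n_rows - first)
            block = array("d")
            block.fromfile(fh, count * n_cols)  # read only this block of rows
            for start in range(0, len(block), n_cols):
                row = block[start:start + n_cols]
                means.append(sum(row) / n_cols)
    return means


if __name__ == "__main__":
    # Toy "beta values": 5 probes (rows) by 3 samples (columns). Illustrative only.
    toy = [
        [0.1, 0.2, 0.3],
        [0.5, 0.5, 0.5],
        [0.0, 1.0, 0.5],
        [0.9, 0.8, 0.7],
        [0.25, 0.75, 0.5],
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "betas.bin")
        write_matrix(path, toy)
        print(row_means_chunked(path, n_cols=3, chunk_rows=2))
```

Peak memory here depends on `chunk_rows` rather than on the total number of probes, which is the property the abstract attributes to an on-disk workflow. A real analysis would use the package's own file format and functions instead of this toy layout.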
