Journal
BMC BIOINFORMATICS
Volume 18, Issue -, Pages -
Publisher
BIOMED CENTRAL LTD
DOI: 10.1186/s12859-016-1455-1
Keywords
Omics data; Data integration; Data infrastructure; Data organization; R
Funding
- Spanish Ministry of Economy and Competitiveness [MTM2015-68140-R]
- European Community's Seventh Framework Programme (FP7) - the HELIX project [308333]
- Catalan Government [016FI_B 00272]
Background: Reductions in the cost of genomic assays have generated large amounts of biomedical data. As a result, current studies perform multiple experiments on the same subjects. While Bioconductor's methods and classes, implemented in different packages, manage individual experiments, there is no standard class to properly manage different omic datasets from the same subjects. In addition, most R/Bioconductor packages designed to integrate and visualize biological data use basic data structures that lack clear general methods for operations such as subsetting features or selecting samples.
Results: To cover this need, we have developed MultiDataSet, a new R class based on Bioconductor standards and designed to encapsulate multiple data sets. MultiDataSet deals with the usual difficulties of managing multiple and incomplete data sets while offering a simple and general way of subsetting features and selecting samples. We illustrate the use of MultiDataSet in three common situations: 1) performing integration analysis with third-party packages; 2) creating new methods and functions for omic data integration; 3) encapsulating new, not yet implemented data types from any biological experiment.
Conclusions: MultiDataSet is a suitable class for data integration within the R and Bioconductor framework.