Article

CORAL: A framework for rigorous self-validated data modeling and integrative, reproducible data analysis

Journal

GIGASCIENCE
Volume 11, Issue -, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/gigascience/giac089

Keywords

FAIR data; contexton; microtype; data management; provenance; data analysis; Jupyter

Funding

  1. US Department of Energy, Office of Science, Office of Biological and Environmental Research [DE-AC02-05CH11231]


This article introduces CORAL, a platform that facilitates adherence to the FAIR principles. CORAL makes heterogeneous datasets interoperable and reusable by requiring data generators to document the context for all data extensively and by maintaining that context throughout the entire analysis pipeline. CORAL also provides web and Jupyter notebook interfaces for data upload, exploration, and analysis.
Background: Many organizations face challenges in managing and analyzing data, especially when relevant datasets arise from multiple sources and methods. Analyzing heterogeneous datasets and additional derived data requires rigorous tracking of their interrelationships and provenance. This task has long been a Grand Challenge of data science and has more recently been formalized in the FAIR principles: that all data objects be Findable, Accessible, Interoperable, and Reusable, both for machines and for people. Adherence to these principles is necessary for proper stewardship of information, for testing regulatory compliance, for measuring the efficiency of processes, and for facilitating reuse of data-analytical frameworks.
Findings: We present the Contextual Ontology-based Repository Analysis Library (CORAL), a platform that greatly facilitates adherence to all 4 of the FAIR principles, including the especially difficult challenge of making heterogeneous datasets Interoperable and Reusable across all parts of a large, long-lasting organization. To achieve this, CORAL's data model requires that data generators extensively document the context for all data, and our tools maintain that context throughout the entire analysis pipeline. CORAL also features a web interface for data generators to upload and explore data, as well as a Jupyter notebook interface for data analysts, both backed by a common API.
Conclusions: CORAL enables organizations to build FAIR data types on the fly as they are needed, avoiding the expense of bespoke data modeling. CORAL provides a uniquely powerful platform to enable integrative cross-dataset analyses, generating deeper insights than are possible using traditional analysis tools.
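
The idea of requiring documented context and carrying it through an analysis can be illustrated with a minimal Python sketch. This is not CORAL's API or data model: the `Context` and `Dataset` classes, their fields, and the `derive` method are hypothetical names chosen only for illustration, and the validation rule (every context field must be non-empty) is an assumed stand-in for real self-validation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Context:
    """Hypothetical context record: what was measured, in which units, by what."""
    term: str    # label of the measured quantity, e.g. an ontology term
    units: str   # units of the stored values
    source: str  # generating process or instrument


@dataclass
class Dataset:
    """Hypothetical dataset that refuses to exist without complete context."""
    name: str
    context: Context
    values: List[float]
    provenance: Tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Assumed validation rule: every context field must be non-empty.
        missing = [k for k, v in vars(self.context).items() if not v.strip()]
        if missing:
            raise ValueError(f"{self.name}: missing context fields {missing}")

    def derive(self, name: str, fn: Callable[[float], float], units: str) -> "Dataset":
        """Apply fn to each value; keep term and source, record the parent dataset."""
        ctx = Context(self.context.term, units, self.context.source)
        return Dataset(name, ctx, [fn(v) for v in self.values],
                       self.provenance + (self.name,))


# Usage: a derived dataset still knows its term, source, and parent.
raw = Dataset("nitrate_raw",
              Context("nitrate concentration", "milligram per liter", "field probe"),
              [1500.0, 250.0])
converted = raw.derive("nitrate_g_per_l", lambda v: v / 1000, "gram per liter")
print(converted.context, converted.provenance)
```

In this sketch the derived dataset cannot lose its context, because the only way to build one is through a constructor that validates it.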
