4.7 Article

BigDataScript: a scripting language for data pipelines

Journal

BIOINFORMATICS
Volume 31, Issue 1, Pages 10-16

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btu595

Funding

  1. NIH [T2DGENES, U01 DK085545-01]
  2. NSERC Discovery
  3. McGill CIHR Systems Biology training program
  4. Fonds de la Recherche en Sante du Quebec and a New Investigator
  5. Canadian Institutes of Health Research

Motivation: The analysis of large biological datasets often requires complex processing pipelines that run for a long time on large computational infrastructures. We designed and implemented a simple script-like programming language with a clean and minimalist syntax to develop and manage pipeline execution and provide robustness to various types of software and hardware failures as well as portability.

Results: We introduce the BigDataScript (BDS) programming language for data processing pipelines, which improves abstraction from hardware resources and assists with robustness. Hardware abstraction allows BDS pipelines to run without modification on a wide range of computer architectures, from a small laptop to multi-core servers, server farms, clusters and clouds. BDS achieves robustness by incorporating the concepts of absolute serialization and lazy processing, thus allowing pipelines to recover from errors. By abstracting pipeline concepts at programming language level, BDS simplifies implementation, execution and management of complex bioinformatics pipelines, resulting in reduced development and debugging cycles as well as cleaner code.
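The lazy-processing idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not BDS syntax and not the authors' implementation; it assumes one common interpretation, in which a pipeline step is skipped when all of its output files already exist and are at least as new as its inputs. The helper names `needs_update` and `lazy_step` are hypothetical.

```python
import os
from typing import Callable, Sequence


def needs_update(outputs: Sequence[str], inputs: Sequence[str]) -> bool:
    """Return True if any output is missing or older than the newest input.

    Assumption: file modification times are a good enough proxy for
    "this step already ran successfully". Missing inputs raise OSError.
    """
    if not all(os.path.exists(path) for path in outputs):
        return True
    if not inputs:
        return False
    newest_input = max(os.path.getmtime(path) for path in inputs)
    oldest_output = min(os.path.getmtime(path) for path in outputs)
    return oldest_output < newest_input


def lazy_step(outputs: Sequence[str], inputs: Sequence[str],
              action: Callable[[], None]) -> bool:
    """Run `action` only when the outputs are stale; report whether it ran."""
    if needs_update(outputs, inputs):
        action()
        return True
    return False
```

With steps written this way, re-running a pipeline after a failure repeats only the steps whose outputs are missing or out of date, which is one way a pipeline can resume rather than restart from scratch.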
