4.6 Article

Nanopore sequencing data analysis using Microsoft Azure cloud computing service

期刊

PLOS ONE
卷 17, 期 12, 页码 -

出版社

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0278609

关键词

-

资金

  1. Innovation Grant (Microsoft Australia)

向作者/读者索取更多资源

Genetic information provides insights into the organism's exome, genome, epigenetics, and structural organization. The development of a cloud-based automated analysis pipeline using Microsoft Azure has been shown to improve the efficiency and cost-effectiveness of HLA genotyping services.
Genetic information provides insights into the exome, genome, epigenetics and structural organisation of the organism. Given the enormous amount of genetic information, scientists are able to perform mammoth tasks to improve the standard of health care such as determining genetic influences on outcome of allogeneic transplantation. Cloud based computing has increasingly become a key choice for many scientists, engineers and institutions as it offers on-demand network access and users can conveniently rent rather than buy all required computing resources. With the positive advancements of cloud computing and nanopore sequencing data output, we were motivated to develop an automated and scalable analysis pipeline utilizing cloud infrastructure in Microsoft Azure to accelerate HLA genotyping service and improve the efficiency of the workflow at lower cost. In this study, we describe (i) the selection process for suitable virtual machine sizes for computing resources to balance between the best performance versus cost effectiveness; (ii) the building of Docker containers to include all tools in the cloud computational environment; (iii) the comparison of HLA genotype concordance between the in-house manual method and the automated cloud-based pipeline to assess data accuracy. In conclusion, the Microsoft Azure cloud based data analysis pipeline was shown to meet all the key imperatives for performance, cost, usability, simplicity and accuracy. Importantly, the pipeline allows for the on-going maintenance and testing of version changes before implementation. This pipeline is suitable for the data analysis from MinION sequencing platform and could be adopted for other data analysis application processes.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据