Article; Proceedings Paper

Hybrid cloud and cluster computing paradigms for life science applications

BMC BIOINFORMATICS (2010)

Journal

BMC BIOINFORMATICS

Volume 11, Issue -, Pages -

Publisher

BIOMED CENTRAL LTD

DOI: 10.1186/1471-2105-11-S12-S3

Categories

Biochemical Research Methods; Biotechnology & Applied Microbiology; Mathematical & Computational Biology

Funding

NHGRI NIH HHS [RC2HG005806-02] Funding Source: Medline

Abstract

Background: Clouds and MapReduce have proven broadly useful for scientific computing, especially for parallel, data-intensive applications. However, they have limited applicability to some areas such as data mining, because MapReduce performs poorly on problems with the iterative structure present in the linear algebra that underlies much data analysis. Such problems can be run efficiently on clusters using MPI, leading to a hybrid cloud and cluster environment. This motivates the design and implementation of Twister, an open source Iterative MapReduce system.

Results: Comparisons of Amazon, Azure, and traditional Linux and Windows environments on common applications have shown encouraging performance and usability in several important non-iterative cases. These are linked to MPI applications for the final stages of the data analysis. Further, we have released the open source Twister Iterative MapReduce and benchmarked it against basic MapReduce (Hadoop) and MPI in information retrieval and life sciences applications.

Conclusions: The hybrid cloud (MapReduce) and cluster (MPI) approach offers an attractive production environment, while Twister promises a uniform programming environment for many life sciences applications.

Methods: We used the commercial clouds Amazon and Azure and the NSF resource FutureGrid to perform detailed comparisons and evaluations of different approaches to data-intensive computing. Several applications were developed in MPI, MapReduce, and Twister in these different environments.
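The iterative structure that the abstract refers to can be illustrated with a small sketch. The Python below is not Twister's API and is not taken from the paper; it is a minimal, single-process illustration, assuming one-dimensional k-means as the iterative problem. The function names (`map_partition`, `reduce_partials`, `iterative_kmeans`) and the sample data are hypothetical. The point it tries to show is that the data partitions stay fixed across iterations while only the small centroid list changes, which is the pattern an iterative MapReduce runtime can exploit by keeping static data in place instead of reloading it on every pass.

```python
from collections import defaultdict


def map_partition(points, centroids):
    """Map step: for one data partition, accumulate (sum, count) per nearest centroid."""
    partial = defaultdict(lambda: [0.0, 0])
    for x in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        partial[nearest][0] += x
        partial[nearest][1] += 1
    return partial


def reduce_partials(partials, centroids):
    """Reduce step: merge per-partition sums and recompute every centroid."""
    total = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for k, (s, n) in partial.items():
            total[k][0] += s
            total[k][1] += n
    # A centroid that attracted no points keeps its previous value.
    return [
        total[k][0] / total[k][1] if total[k][1] else old
        for k, old in enumerate(centroids)
    ]


def iterative_kmeans(partitions, centroids, max_iter=100, tol=1e-9):
    """Driver loop: repeat map and reduce until the centroids stop moving.

    `partitions` is never modified or reloaded; only `centroids` is passed
    into each round, standing in for the small per-iteration broadcast.
    """
    iteration = 0
    for iteration in range(1, max_iter + 1):
        partials = [map_partition(p, centroids) for p in partitions]
        updated = reduce_partials(partials, centroids)
        shift = max(abs(a - b) for a, b in zip(updated, centroids))
        centroids = updated
        if shift < tol:
            break
    return centroids, iteration


if __name__ == "__main__":
    # Hypothetical sample data split into two partitions.
    data = [[1.0, 2.0, 9.0], [3.0, 10.0, 11.0]]
    print(iterative_kmeans(data, [0.0, 5.0]))  # ([2.0, 10.0], 3)
```

In a plain MapReduce job chain, each pass of the driver loop would typically be a separate job that rereads its input, which is one plausible reading of the performance problem the abstract describes; an MPI program would instead keep the data in memory and exchange the partial sums directly.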