Article

iProX in 2021: connecting proteomics data sharing with big data

Journal

NUCLEIC ACIDS RESEARCH
Volume 50, Issue D1, Pages D1522-D1527

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkab1081

Keywords

-

Funding

  1. National Key Research Program of China [2020YFE0202200]
  2. Innovation special zone [18-163-15-ZT-001-006-07]
  3. Program for Guangdong Introducing Innovative and Entrepreneurial Teams [2016ZT06D211]
  4. National Natural Science Foundation of China [U1811461]
  5. Innovation project [16CXZ027]

The iProX integrated proteome resource has been greatly improved with an up-to-date big data platform to support large-scale data storage, efficient querying, and reanalysis, meeting the demands of the rapidly growing field of proteomics.
The rapid development of proteomics studies has resulted in large volumes of experimental data. The emergence of big data platforms provides the opportunity to handle these large amounts of data. The integrated proteome resource, iProX (https://www.iprox.cn), which was initiated in 2017, has been greatly improved with an up-to-date big data platform implemented in 2021. Here, we describe the main iProX developments since its first publication in Nucleic Acids Research in 2019. First, a hyper-converged architecture with high scalability supports the submission process. A Hadoop cluster stores large volumes of proteomics datasets, and a distributed, RESTful Elasticsearch engine can query millions of records within one second. In addition, several new features have been added to iProX for better open data sharing: the Universal Spectrum Identifier (USI) mechanism proposed by ProteomeXchange, a RESTful Web Service API, and a high-efficiency reanalysis pipeline. By the end of August 2021, 1526 datasets had been submitted to iProX, reaching a total data volume of 92.42 TB. With the implementation of the big data platform, iProX can support petabyte-scale data storage, hundreds of billions of spectrum records, and service latency on the order of seconds, meeting the requirements of the fast-growing field of proteomics.
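
To illustrate the USI mechanism mentioned above, the following is a minimal Python sketch that splits a Universal Spectrum Identifier into its components, following the general `mzspec:<collection>:<run>:<index type>:<index>[:<interpretation>]` layout of the PSI USI specification. It is not iProX's implementation: the `USI` dataclass and the `parse_usi` function are hypothetical names introduced here, the example identifier is purely illustrative, and the parser makes the simplifying assumption that the first `scan`, `index`, or `nativeId` field after the run name is the index-type flag.

```python
from dataclasses import dataclass
from typing import Optional

# Index-type flags defined by the USI layout.
INDEX_TYPES = ("scan", "index", "nativeId")


@dataclass(frozen=True)
class USI:
    """Components of a Universal Spectrum Identifier (hypothetical helper type)."""
    collection: str                 # dataset identifier, e.g. a ProteomeXchange accession
    ms_run: str                     # MS run (file) name, without extension
    index_type: str                 # one of INDEX_TYPES
    index: str                      # spectrum index within the run
    interpretation: Optional[str]   # optional peptide/charge interpretation


def parse_usi(usi: str) -> USI:
    """Split a USI string into its fields.

    Simplifying assumption: the first field equal to an index-type flag
    (searching after the run name has started) ends the run name.
    """
    parts = usi.split(":")
    if len(parts) < 5 or parts[0] != "mzspec":
        raise ValueError(f"not a USI: {usi!r}")
    collection = parts[1]
    # The run name may itself contain colons, so look for the index-type flag.
    for i in range(3, len(parts) - 1):
        if parts[i] in INDEX_TYPES:
            ms_run = ":".join(parts[2:i])
            rest = parts[i + 2:]
            # The interpretation may contain colons (e.g. modification accessions).
            interpretation = ":".join(rest) if rest else None
            return USI(collection, ms_run, parts[i], parts[i + 1], interpretation)
    raise ValueError(f"no index type flag found in USI: {usi!r}")


if __name__ == "__main__":
    example = "mzspec:PXD000561:Adult_Frontalcortex_bRP_Elite_85_f09:scan:17555:VLHPLEGAVVIIFK/2"
    print(parse_usi(example))
```

Keeping the fields in a small immutable record makes it straightforward to build a query against a spectrum index from the collection, run, and scan number, whatever the actual lookup service looks like.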
