Article

Boosting MS1-only Proteomics with Machine Learning Allows 2000 Protein Identifications in Single-Shot Human Proteome Analysis Using 5 min HPLC Gradient

JOURNAL OF PROTEOME RESEARCH (2021)

Journal

JOURNAL OF PROTEOME RESEARCH

Volume 20, Issue 4, Pages 1864-1873

Publisher

AMER CHEMICAL SOC

DOI: 10.1021/acs.jproteome.0c00863

Keywords

protein identification; mass spectrometry; high-throughput proteomics; machine learning; retention time prediction

Category

Biochemical Research Methods

Funding

Russian Science Foundation [20-14-00229]
VILLUM Foundation [7292]
PRO-MS: Danish National Mass Spectrometry Platform for Functional Proteomics [5072-00007B]

Summary

This study extends DirectMS1, a fast MS1-only proteomic method, with a machine-learning scoring algorithm and other tools, and identifies more than 2000 proteins at 1% FDR in a 5 min gradient analysis.

Abstract

Proteome-wide analyses rely on tandem mass spectrometry and the extensive separation of proteolytic mixtures. This imposes considerable instrumental time consumption, which is one of the main obstacles in the broader acceptance of proteomics in biomedical and clinical research. Recently, we presented a fast proteomic method termed DirectMS1 based on ultrashort LC gradients as well as MS1-only mass spectra acquisition and data processing. The method allows significant reduction of the proteome-wide analysis time to a few minutes at the depth of quantitative proteome coverage of 1000 proteins at 1% false discovery rate (FDR). In this work, to further increase the capabilities of the DirectMS1 method, we explored the opportunities presented by the recent progress in the machine-learning area and applied the LightGBM decision tree boosting algorithm to the scoring of peptide feature matches when processing MS1 spectra. Furthermore, we integrated the peptide feature identification algorithm of DirectMS1 with the recently introduced peptide retention time prediction utility, DeepLC. Additional approaches to improve the performance of the DirectMS1 method are discussed and demonstrated, such as using FAIMS for gas-phase ion separation. As a result of all improvements to DirectMS1, we succeeded in identifying more than 2000 proteins at 1% FDR from the HeLa cell line in a 5 min gradient LC-FAIMS/MS1 analysis. The data sets generated and analyzed during the current study have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD023977.
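The abstract reports identifications "at 1% FDR" without describing how the threshold is applied. As a rough illustration only, the sketch below assumes the common target-decoy approach, in which scored matches are ranked and the false discovery rate is estimated as the ratio of decoy to target matches above a cutoff. The function name `filter_at_fdr` and the `(score, is_decoy)` input format are hypothetical and are not taken from DirectMS1 or its software.

```python
from typing import List, Tuple

Match = Tuple[float, bool]  # (score, is_decoy); higher score means a better match


def filter_at_fdr(matches: List[Match], fdr: float = 0.01) -> List[Match]:
    """Return target matches accepted at the requested FDR.

    Assumption: FDR is estimated as (#decoys / #targets) among the
    top-ranked matches. This is a generic target-decoy estimate,
    not necessarily the one used by the paper's authors.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = 0  # number of top-ranked matches to keep
    for rank, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimated FDR
        # is still within the requested threshold.
        if targets > 0 and decoys / targets <= fdr:
            cutoff = rank
    # Report only target matches; decoys serve solely for error estimation.
    return [m for m in ranked[:cutoff] if not m[1]]
```

In a pipeline like the one the abstract describes, the scores passed to such a filter would come from the machine-learning model (LightGBM in the paper) rather than from a single hand-picked feature. Keeping the deepest rank that satisfies the threshold, rather than stopping at the first violation, is a design choice made here for simplicity.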
