4.7 Article Proceedings Paper

pNovo 3: precise de novo peptide sequencing using a learning-to-rank framework

Journal

BIOINFORMATICS
Volume 35, Issue 14, Pages I183-I190

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btz366

Funding

  1. National Key Research and Development Program of China [2016YFA0501300]
  2. National Natural Science Foundation of China [31470805]
  3. Youth Innovation Promotion Association CAS [2014091]
  4. National High Technology Research and Development Program of China (863) [2014AA020902, 2014AA020901]

Motivation: De novo peptide sequencing based on tandem mass spectrometry data is the key technology of shotgun proteomics for identifying peptides without any database and assembling unknown proteins. However, owing to the low ion coverage in tandem mass spectra, the order of certain consecutive amino acids cannot be determined if all of their supporting fragment ions are missing, which results in low precision of de novo sequencing.

Results: To solve this problem, we developed pNovo 3, which uses a learning-to-rank framework to distinguish similar peptide candidates for each spectrum. Three metrics measuring the similarity between each experimental spectrum and its corresponding theoretical spectrum are used as important features; the theoretical spectra are predicted precisely by the deep-learning-based pDeep algorithm. On seven benchmark datasets from six diverse species, pNovo 3 recalled 29-102% more correct spectra, and its precision was 11-89% higher, than three other state-of-the-art de novo sequencing algorithms. Furthermore, compared with the newly developed DeepNovo, which also uses deep learning, pNovo 3 still identified 21-50% more spectra on the nine datasets used in the DeepNovo study. In summary, the deep learning and learning-to-rank techniques implemented in pNovo 3 significantly improve the precision of de novo sequencing, and such a machine learning framework is worth extending to related research fields to distinguish similar sequences.

Availability and implementation: pNovo 3 can be freely downloaded from http://pfind.ict.ac.cn/software/pNovo/index.html.

Supplementary information: Supplementary data are available at Bioinformatics online.
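
The idea of scoring peptide candidates by how closely a predicted spectrum matches the experimental one can be illustrated with a minimal sketch. It is not the pNovo 3 implementation. The abstract does not name the three similarity metrics, so cosine similarity and Pearson correlation are stand-ins chosen for illustration. The function names, the toy intensity vectors, and the assumption that all spectra are already binned onto a shared fragment-ion grid are hypothetical. Sorting by a single feature replaces the learned ranking model, which would combine several features.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation between two equal-length intensity vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / math.sqrt(var_a * var_b)


def rank_candidates(observed, candidates):
    """Sort candidate peptides by cosine similarity, best first.

    `candidates` maps a peptide string to its predicted intensity vector.
    A learning-to-rank model would combine several features instead of
    sorting on one; the single-feature sort is a simplification.
    """
    scored = [
        (peptide, cosine_similarity(observed, predicted))
        for peptide, predicted in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): two candidates that differ only in the order
# of their last two residues, which is the ambiguity the abstract describes.
observed = [0.0, 1.0, 0.5, 0.0]
candidates = {
    "PEPTIDE": [0.0, 0.9, 0.6, 0.1],
    "PEPTIED": [0.8, 0.1, 0.0, 0.7],
}
for peptide, score in rank_candidates(observed, candidates):
    print(peptide, round(score, 3))
```

In this toy case the two candidates have the same amino acid composition, so they cannot be separated by mass alone. Only the predicted fragment intensities tell them apart.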
