Predicting Intensity Ranks of Peptide Fragment Ions

Journal

JOURNAL OF PROTEOME RESEARCH
Volume 8, Issue 5, Pages 2226-2240

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/pr800677f

Keywords

MS/MS; peptide; fragmentation; prediction; machine learning; ranking; boosting; MRM

Funding

  1. National Center for Research Resources of the NIH [P-41-111124851]
  2. UCSD FWGrid Project
  3. NSF Research Infrastructure [NSF EIA-0303622]

Abstract

Accurate modeling of peptide fragmentation is necessary for the development of robust scoring functions for peptide-spectrum matches, which are the cornerstone of MS/MS-based identification algorithms. Unfortunately, peptide fragmentation is a complex process that can involve several competing chemical pathways, which makes it difficult to develop generative probabilistic models that describe it accurately. However, the vast amounts of MS/MS data now being generated make it possible to use data-driven machine learning methods to develop discriminative ranking-based models that predict the intensity ranks of a peptide's fragment ions. We use simple sequence-based features, which a boosting algorithm combines into models that predict peak ranks with high accuracy. In an accompanying manuscript, we demonstrate how these prediction models are used to significantly improve the performance of peptide identification algorithms. The models can also be useful in the design of optimal multiple reaction monitoring (MRM) transitions in cases where there is insufficient experimental data to guide the peak selection process. The prediction algorithm can also be run independently through PepNovo+, which is available for download from http://bix.ucsd.edu/Software/PepNovo.html.
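To make the prediction target concrete, the following minimal sketch enumerates the singly charged b and y fragment ions of a peptide and converts a set of peak intensities into ranks (1 = most intense). It is not the paper's model and does not reproduce PepNovo+ code: the function names `fragment_ions` and `intensity_ranks` are hypothetical, and the intensities are made-up values used only for illustration. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def fragment_ions(peptide):
    """Return {ion_label: m/z} for singly charged b and y ions (hypothetical helper)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = {}
    for i in range(1, len(peptide)):
        # b_i: first i residues plus one proton.
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        # y_i: last i residues plus water plus one proton.
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions


def intensity_ranks(intensities):
    """Map each ion label to its intensity rank; rank 1 is the strongest peak."""
    ordered = sorted(intensities, key=intensities.get, reverse=True)
    return {label: rank for rank, label in enumerate(ordered, start=1)}


ions = fragment_ions("PEPTIDE")

# Made-up intensities for a few of the ions above (illustration only).
toy_intensities = {"b2": 120.0, "y1": 40.0, "y4": 310.0, "b3": 75.0}
ranks = intensity_ranks(toy_intensities)
```

A rank-prediction model of the kind described in the abstract would aim to reproduce an ordering like `ranks` from the peptide sequence alone, without access to the measured intensities.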
