4.7 Article

MyriMatch: Highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis

Journal

JOURNAL OF PROTEOME RESEARCH
Volume 6, Issue 2, Pages 654-661

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/pr0604054

Keywords

proteomics; identification; statistical distribution; reversed database; peak filtering

Funding

  1. NCI NIH HHS [R01 CA126218-01, R01 CA126218, U24 CA126479-01, U24 CA126479, 1R01 CA 126218-01, 1U24 CA 126479-01] Funding Source: Medline
  2. NHLBI NIH HHS [HL 071002, R01 HL071002] Funding Source: Medline
  3. NIEHS NIH HHS [P30 ES000267, P30 ES000267-40, P30 ES 000267] Funding Source: Medline

Shotgun proteomics experiments depend on database search engines to identify peptides from tandem mass spectra. Many of these algorithms score potential identifications by counting the fragment ions matched between each peptide sequence and an observed spectrum. These systems, however, generally do not distinguish between matching an intense peak and matching a minor peak. We have developed a statistical model for scoring peptide matches that is based upon the multivariate hypergeometric distribution. This scorer, part of the MyriMatch database search engine, places greater emphasis on matching intense peaks. The probability that the best match for each spectrum has occurred by random chance can be employed to separate correct matches from random ones. We evaluated this software on data sets from three different laboratories employing three different ion trap instruments. Employing a novel system for testing discrimination, we demonstrate that stratifying peaks into multiple intensity classes improves the discrimination of scoring. We compare MyriMatch results to those of Sequest and X!Tandem, revealing that it is capable of higher discrimination than either of these algorithms. When minimal peak filtering is employed, performance plummets for a scoring model that does not stratify matched peaks by intensity. In contrast, MyriMatch discrimination improves as more peaks are retained in each spectrum. MyriMatch also scales well to tandem mass spectra from high-resolution mass analyzers. These findings may indicate limitations for existing database search scorers that count matched peaks without differentiating them by intensity. The software and source code are available under the Mozilla Public License at http://www.mc.vanderbilt.edu/msrc/bioinformatics/.
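
To make the scoring idea concrete, the following is a minimal Python sketch of a multivariate hypergeometric score. It is not MyriMatch's implementation. The abstract does not specify how the intensity classes are defined, so the class sizes, the treatment of empty m/z bins as a fourth class, and the function names `mvh_probability` and `mvh_score` are all assumptions made for illustration. The sketch only shows why two matches with the same number of matched peaks can receive different scores once peaks are stratified by intensity.

```python
from math import comb, log


def mvh_probability(class_sizes, matched):
    """Multivariate hypergeometric point probability.

    class_sizes[i] is the number of items in class i (here: m/z bins holding
    a high-, medium-, or low-intensity peak, plus empty bins).
    matched[i] is how many of the predicted fragment ions landed in class i.
    The number of draws is sum(matched).
    """
    if len(class_sizes) != len(matched):
        raise ValueError("class_sizes and matched must have the same length")
    total = sum(class_sizes)
    draws = sum(matched)
    ways = 1
    for size, k in zip(class_sizes, matched):
        ways *= comb(size, k)  # comb returns 0 when k > size
    return ways / comb(total, draws)


def mvh_score(class_sizes, matched):
    """Negative log probability: a higher score means a less likely chance match."""
    return -log(mvh_probability(class_sizes, matched))


# Hypothetical spectrum: 10 intense, 20 medium, and 40 minor peaks,
# with 930 empty bins (all numbers are illustrative assumptions).
class_sizes = [10, 20, 40, 930]

# Two candidate peptides, each with 20 predicted fragments and 10 matched peaks.
mostly_intense = [6, 3, 1, 10]  # matches concentrated in the intense class
mostly_minor = [1, 3, 6, 10]    # matches concentrated in the minor class

score_intense = mvh_score(class_sizes, mostly_intense)
score_minor = mvh_score(class_sizes, mostly_minor)
print(score_intense, score_minor)
```

A scorer that only counts matched peaks would treat both candidates identically, since each matches 10 peaks. Under the stratified model, matching six of the ten intense peaks by chance is less probable than matching six of the forty minor peaks, so the first candidate scores higher.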
