4.7 Article

Probability-based pattern recognition and statistical framework for randomization: modeling tandem mass spectrum/peptide sequence false match frequencies

向作者/读者索取更多资源

Motivation: In proteomics, reverse database searching is used to control the false match frequency for tandem mass spectrum/peptide sequence matches, but reversal creates sequences devoid of patterns that usually challenge database-search software. Results: We designed an unsupervised pattern recognition algorithm for detecting patterns with various lengths from large sequence datasets. The patterns found in a protein sequence database were used to create decoy databases using a Monte Carlo sampling algorithm. Searching these decoy databases led to the prediction of false positive rates for spectrum/peptide sequence matches. We show examples where this method, independent of instrumentation, database-search software and samples, provides better estimation of false positive identification rates than a prevailing reverse database searching method. The pattern detection algorithm can also be used to analyze sequences for other purposes in biology or cryptology. Availability: On request from the authors. Contact: Bret.Cooper@ars.usda.gov Supplementary information: http://bioinformatics.psb.ugent.be/.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据