Journal
BIOINFORMATICS
Volume 23, Issue 3, Pages 277-280
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btl595
Funding
- NIGMS NIH HHS [R01 GM076680-01A1] Funding Source: Medline
- Direct For Biological Sciences
- Div Of Biological Infrastructure [0839970] Funding Source: National Science Foundation
Motivation: Tandem mass spectrometry of trypsin digests, followed by database searching, is one of the most popular approaches in high-throughput proteomics studies. Peptides are considered identified if they pass certain scoring thresholds. To avoid false-positive protein identifications, at least two unique peptides identified within a single protein are generally recommended. Still, in a typical high-throughput experiment, hundreds of proteins are identified by only a single peptide. We introduce here a method for distinguishing between true and false identifications among single-hit proteins. The approach is based on randomized database searching and the use of logistic regression models with cross-validation. Applied to three bacterial samples, the approach recovered 68-98% of the correct single-hit proteins with an error rate of < 2%. This results in a 22-65% increase in the number of identified proteins. Identifying true single-hit proteins will lead to the discovery of many crucial regulators, biomarkers and other low-abundance proteins.
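The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of pairing decoy-labelled hits with a cross-validated logistic regression. Everything in it is an assumption made for illustration: the two numeric score features, the toy data generator `make_toy_hits`, the labelling rule (hits from the randomized database count as false, all others as true), the gradient-descent fit, and the 5-fold split. It is not the authors' model or feature set.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def cross_validate(X, y, k=5, seed=0):
    """Return the mean held-out accuracy over k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(
            (predict_proba(w, X[i]) >= 0.5) == (y[i] == 1) for i in fold
        )
        accuracies.append(correct / len(fold))
    return sum(accuracies) / k

def make_toy_hits(n=200, seed=1):
    """Synthetic stand-in for single-hit peptide scores (hypothetical features)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = hit in the real database, 0 = hit in the randomized one
        center = 1.5 if label else -1.5
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y

X, y = make_toy_hits()
cv_accuracy = cross_validate(X, y)
print(f"mean held-out accuracy over 5 folds: {cv_accuracy:.3f}")
```

In a real analysis the features would come from the search engine's peptide scores, and the probability threshold would be chosen to meet a target error rate rather than fixed at 0.5 as it is here.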