Article

Reanalysis of ProteomicsDB Using an Accurate, Sensitive, and Scalable False Discovery Rate Estimation Approach for Protein Groups

Journal

MOLECULAR & CELLULAR PROTEOMICS
Volume 21, Issue 12

Publisher

ELSEVIER
DOI: 10.1016/j.mcpro.2022.100437

Funding

  1. German Federal Ministry of Education and Research (BMBF) [031L0168]
  2. European Research Council (ERC) Advanced Grant [833710]

This article presents an extension to the Picked Protein FDR method that can handle protein groups, and introduces new strategies to obtain accurate FDR estimates. The validation analysis shows that the new method produces reliable protein group-level FDR estimates regardless of the dataset size.
Estimating false discovery rates (FDRs) of protein identification continues to be an important topic in mass spectrometry-based proteomics, particularly when analyzing very large datasets. One performant method for this purpose is the Picked Protein FDR approach, which is based on a target-decoy competition strategy on the protein level that ensures that FDRs scale to large datasets. Here, we present an extension to this method that can also deal with protein groups, that is, proteins that share common peptides, such as protein isoforms of the same gene. To obtain well-calibrated FDR estimates that preserve protein identification sensitivity, we introduce two novel ideas: first, the picked group target-decoy strategy and, second, the rescued subset grouping strategy. Using entrapment searches and simulated data for validation, we demonstrate that the new Picked Protein Group FDR method produces accurate protein group-level FDR estimates regardless of the size of the dataset. The validation analysis also uncovered that applying the commonly used Occam's razor principle leads to anticonservative FDR estimates for large datasets. This is not the case for the Picked Protein Group FDR method. Reanalysis of deep proteomes of 29 human tissues showed that the new method identified up to 4% more protein groups than MaxQuant. Applying the method to the reanalysis of the entire human section of ProteomicsDB led to the identification of 18,000 protein groups at 1% protein group-level FDR. The analysis also showed that about 1250 genes were represented by at least two identified protein groups. To make the method accessible to the proteomics community, we provide a software tool including a graphical user interface that enables merging results from multiple MaxQuant searches into a single list of identified and quantified protein groups.
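
The protein-level target-decoy competition that the abstract refers to can be illustrated with a minimal Python sketch. It covers only the basic picked idea for single proteins, not the protein group extension or the rescued subset grouping introduced in the article, and it is not the authors' software. The function name, the `REV__` decoy prefix, and the input layout (a mapping from protein name to best score) are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed decoy naming convention, used here for illustration only.
DECOY_PREFIX = "REV__"


def picked_protein_qvalues(scores: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Return (protein, score, q-value) tuples after pairwise target-decoy competition.

    `scores` maps a protein name to its best score; higher is better.
    """
    # Step 1: each target competes with its own decoy and only the
    # higher-scoring member of the pair survives (on a tie, the first one seen).
    winners: Dict[str, Tuple[str, float]] = {}
    for name, score in scores.items():
        base = name[len(DECOY_PREFIX):] if name.startswith(DECOY_PREFIX) else name
        if base not in winners or score > winners[base][1]:
            winners[base] = (name, score)

    # Step 2: rank the survivors by score, best first.
    ranked = sorted(winners.values(), key=lambda item: item[1], reverse=True)

    # Step 3: running FDR estimate at each rank = decoys so far / targets so far.
    fdrs: List[float] = []
    targets = decoys = 0
    for name, _ in ranked:
        if name.startswith(DECOY_PREFIX):
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # Step 4: q-value = smallest FDR estimate at this rank or any lower rank,
    # which makes the values non-decreasing down the list.
    qvalues: List[float] = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(name, score, q) for (name, score), q in zip(ranked, qvalues)]


if __name__ == "__main__":
    example = {"A": 10.0, "REV__A": 2.0, "B": 8.0, "REV__B": 9.0, "C": 7.0, "REV__C": 1.0}
    for protein, score, q in picked_protein_qvalues(example):
        print(protein, score, round(q, 3))
```

Because each pair contributes at most one entry to the ranked list, the number of surviving decoys does not grow with the number of low-scoring decoy matches accumulated across many searches, which is the property that lets the estimate scale to large datasets.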
