Group-walk: a rigorous approach to group-wise false discovery rate analysis by target-decoy competition

Journal

BIOINFORMATICS
Volume 38, Issue -, Pages ii82-ii88

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btac471

Funding

  1. [ECCB2022]

Target-decoy competition (TDC) is a commonly used method to control false discovery rate (FDR) in tandem mass spectrometry data analysis. However, the effectiveness of TDC depends on the homogeneity of the data. We developed Group-walk, a procedure that controls FDR in TDC by considering the group structure, leading to substantial power gains.
Motivation: Target-decoy competition (TDC) is a commonly used method for false discovery rate (FDR) control in the analysis of tandem mass spectrometry data. This type of competition-based FDR control has recently gained significant popularity in other fields after Barber and Candès laid its theoretical foundation in a more general setting that included the feature selection problem. In both cases, the competition is based on a head-to-head comparison between an (observed) target score and a corresponding decoy (knockoff) score. However, the effectiveness of TDC depends on whether the data are homogeneous, which is often not the case: in many settings, the data consist of groups with different score profiles or different proportions of true nulls. In such cases, applying TDC while ignoring the group structure often yields imbalanced lists of discoveries, where some groups include relatively many false discoveries and others relatively few. On the other hand, as we show, the alternative approach of applying TDC separately to each group does not rigorously control the FDR.

Results: We developed Group-walk, a procedure that controls the FDR in the target-decoy/knockoff setting while taking into account a given group structure. Group-walk is derived from the recently developed AdaPT, a general framework for controlling the FDR with side-information. We show using simulated and real datasets that when the data naturally divide into groups with different characteristics, Group-walk can deliver consistent power gains that in some cases are substantial. These groupings include the precursor charge state (4% more discovered peptides at a 1% FDR threshold), the peptide length (3.6% increase) and the mass difference due to modifications (26% increase).

Availability and implementation: Group-walk is available at .

Supplementary information: Supplementary information is available at Bioinformatics online.
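The head-to-head competition described under Motivation can be illustrated with a minimal sketch of plain, ungrouped TDC. This is not Group-walk and is not taken from the paper's software. It assumes the commonly used estimate (decoy wins + 1) / (target wins) for the FDR among the top-scoring winners. The function name `tdc_discoveries`, its parameters, and the tie-handling rule (a tie counts as a decoy win) are hypothetical choices made for illustration. Ties between winning scores at the cutoff are not treated specially.

```python
import numpy as np

def tdc_discoveries(target_scores, decoy_scores, alpha=0.01):
    """Illustrative ungrouped TDC (hypothetical helper, not the paper's code).

    Returns the sorted indices of target wins reported at FDR level alpha.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    decoy_scores = np.asarray(decoy_scores, dtype=float)

    # Head-to-head competition: each hypothesis keeps its better score.
    # Assumption: a tie is counted as a decoy win (a conservative choice).
    target_wins = target_scores > decoy_scores
    winning = np.where(target_wins, target_scores, decoy_scores)

    # Rank all winners from best to worst score.
    order = np.argsort(-winning, kind="stable")
    wins_sorted = target_wins[order]

    # Running counts of target and decoy wins down the ranked list.
    n_targets = np.cumsum(wins_sorted)
    n_decoys = np.cumsum(~wins_sorted)

    # Estimated FDR among the top-k winners: (decoys + 1) / targets.
    est_fdr = (n_decoys + 1) / np.maximum(n_targets, 1)

    # Take the largest k whose estimated FDR is at most alpha.
    ok = np.nonzero(est_fdr <= alpha)[0]
    if ok.size == 0:
        return np.array([], dtype=int)
    top = order[: ok[-1] + 1]

    # Only target wins within the top-k are reported as discoveries.
    return np.sort(top[target_wins[top]])
```

Running this once on pooled data ignores the group structure, and running it separately per group is the approach the abstract states does not rigorously control the FDR; the sketch only makes the baseline concrete.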
