4.7 Article

Comparing Top-Down Proteoform Identification: Deconvolution, PrSM Overlap, and PTM Detection

Journal

JOURNAL OF PROTEOME RESEARCH
Volume 22, Issue 7, Pages 2199-2217

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jproteome.2c00673

Keywords

bioinformatics; deconvolution; identification algorithms; post-translational modifications; proteoforms; top-down proteomics

This study evaluates four leading algorithms for top-down identification by their yield of proteoform-spectrum matches (PrSMs) at a controlled false discovery rate. All four produce high PrSM yields, but approximately half of the identified proteoforms are specific to a single algorithm. Deconvolution algorithms disagree on precursor charges and masses, which contributes to identification variability. Detection of post-translational modifications is also inconsistent among algorithms. Applying multiple search engines gives a more comprehensive assessment of an experiment, and top-down algorithms would benefit from greater interoperability.
Generating top-down tandem mass spectra (MS/MS) from complex mixtures of proteoforms benefits from improvements in fractionation, separation, fragmentation, and mass analysis. The algorithms to match MS/MS to sequences have undergone a parallel evolution, with both spectral alignment and match-counting approaches producing high-quality proteoform-spectrum matches (PrSMs). This study assesses state-of-the-art algorithms for top-down identification (ProSight PD, TopPIC, MSPathFinderT, and pTop) in their yield of PrSMs while controlling false discovery rate. We evaluated deconvolution engines (ThermoFisher Xtract, Bruker AutoMSn, Matrix Science Mascot Distiller, TopFD, and FLASHDeconv) in both ThermoFisher Orbitrap-class and Bruker maXis Q-TOF data (PXD033208) to produce consistent precursor charges and mass determinations. Finally, we sought post-translational modifications (PTMs) in proteoforms from bovine milk (PXD031744) and human ovarian tissue. Contemporary identification workflows produce excellent PrSM yields, although approximately half of all identified proteoforms from these four pipelines were specific to only one workflow. Deconvolution algorithms disagree on precursor masses and charges, contributing to identification variability. Detection of PTMs is inconsistent among algorithms. In bovine milk, 18% of PrSMs produced by pTop and TopMG were singly phosphorylated, but this percentage fell to 1% for one algorithm. Applying multiple search engines produces more comprehensive assessments of experiments. Top-down algorithms would benefit from greater interoperability.
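
To make the "specific to only one workflow" figure concrete, the following is a minimal sketch of how such a fraction could be computed from per-workflow identification lists. It is not the authors' procedure: the function name `workflow_specific_fraction`, the toy identifiers `P1` to `P6`, and the assumption that proteoform identifiers have already been harmonized across tools (same sequence, mass, and modification notation) are all hypothetical and introduced here for illustration only.

```python
from collections import Counter


def workflow_specific_fraction(ids_by_workflow):
    """Return the fraction of distinct proteoforms reported by exactly one workflow.

    ids_by_workflow maps a workflow name to an iterable of proteoform
    identifiers. Identifiers are assumed (hypothetically) to be comparable
    across workflows, which in practice requires harmonizing each tool's output.
    """
    counts = Counter()
    for proteoforms in ids_by_workflow.values():
        # set() ensures a proteoform counts once per workflow,
        # no matter how many PrSMs support it.
        counts.update(set(proteoforms))
    if not counts:
        return 0.0
    specific = sum(1 for n in counts.values() if n == 1)
    return specific / len(counts)


# Toy, made-up identifiers; these are not results from the study.
example = {
    "ProSight PD": {"P1", "P2", "P3"},
    "TopPIC": {"P2", "P3", "P4"},
    "MSPathFinderT": {"P3", "P5"},
    "pTop": {"P3", "P6"},
}
fraction = workflow_specific_fraction(example)
```

In this toy input, four of the six distinct identifiers appear in a single workflow. With real data, the result depends heavily on how proteoform identity is defined when matching entries between tools.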
