Journal
JOURNAL OF PROTEOME RESEARCH
Volume 9, Issue 4, Pages 1716-1726Publisher
AMER CHEMICAL SOC
DOI: 10.1021/pr900850m
Keywords
mutation; bioinformatics; hydroxyproline; sequence tagging
Categories
Funding
- NIH [R01 CA126218, U24 CA126479]
Ask authors/readers for more resources
Shotgun proteomics produces collections of tandem mass spectra that contain all the data needed to identify mutated peptides from clinical samples. Identifying these sequence variations, however, has not been feasible with conventional database search strategies, which require exact matches between observed and expected sequences. Searching for mutations as mass shifts on specified residues through database search can incur significant performance penalties and generate substantial false positive rates. Here we describe TagRecon, an algorithm that leverages inferred sequence tags to identify unanticipated mutations in clinical proteomic data sets. TagRecon identifies unmodified peptides as sensitively as the related MyriMatch database search engine. In both LTQ and Orbitrap data sets, TagRecon outperformed state of the art software in recognizing sequence mismatches from data sets with known variants. We developed guidelines for filtering putative mutations from clinical samples, and we applied them in an analysis of cancer cell lines and an examination of colon tissue. Mutations were found in up to 6% of identified peptides, and only a small fraction corresponded to dbSNP entries. The RKO cell line, which is DNA mismatch repair deficient, yielded more mutant peptides than the mismatch repair proficient SW480 line. Analysis of colon cancer tumor and adjacent tissue revealed hydroxyproline modifications associated with extracellular matrix degradation. These results demonstrate the value of using sequence tagging algorithms to fully interrogate clinical proteomic data sets.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available