Article

JUMPg: An Integrative Proteogenomics Pipeline Identifying Unannotated Proteins in Human Brain and Cancer Cells

Journal

JOURNAL OF PROTEOME RESEARCH
Volume 15, Issue 7, Pages 2309-2320

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jproteome.6b00344

Keywords

genomics; proteomics; mass spectrometry; proteogenomics; RNA-seq; database search; multistage analysis; spectrum quality control

Funding

  1. National Institutes of Health [R01AG047928, R01GM114260, U24NS072026, P30AG19610]
  2. Arizona Department of Health Services [211002]
  3. Arizona Biomedical Research Commission [4001, 0011, 05-901, 1001]
  4. Michael J. Fox Foundation
  5. ALSAC (American Lebanese Syrian Associated Charities)
  6. NIH Cancer Center [P30CA021765]

Abstract

Proteogenomics is an emerging approach to improve gene annotation and the interpretation of proteomics data. Here we present JUMPg, an integrative proteogenomics pipeline that includes customized database construction, tag-based database search, peptide-spectrum match filtering, and data visualization. JUMPg creates multiple databases of DNA polymorphisms, mutations, splice junctions, and partial trypticity, as well as protein fragments translated from the whole transcriptome in all six frames upon RNA-seq de novo assembly. We use a multistage strategy to search these databases sequentially, in which performance is optimized by re-searching only unmatched high-quality spectra and reusing amino acid tags generated by the JUMP search engine. The identified peptides/proteins are displayed with gene loci using the UCSC genome browser. We then apply JUMPg to a label-free mass spectrometry data set of Alzheimer's disease postmortem brain, uncovering 496 new peptides arising from amino acid substitutions, alternative splicing, frame shifts, and noncoding gene translation. The novel protein PNMA6BL, specifically expressed in the brain, is highlighted. We also tested JUMPg on a stable-isotope-labeled data set of multiple myeloma cells, revealing 991 sample-specific peptides that include protein sequences in the immunoglobulin light chain variable region. Thus, the JUMPg program is an effective proteogenomics tool for multiomics data integration.
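
The multistage strategy described in the abstract can be illustrated with a minimal Python sketch. This is not the JUMPg implementation: the `Spectrum` class, the `match_by_tag` substring matcher, the `min_quality` threshold, and the stage names are all hypothetical, chosen only to show the control flow of searching databases sequentially, carrying forward only unmatched high-quality spectra, and reusing tags computed once per spectrum.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Spectrum:
    """Hypothetical stand-in for an MS/MS spectrum."""
    scan_id: str
    quality: float          # assumed spectrum-quality score in [0, 1]
    tags: Tuple[str, ...]   # amino acid tags, derived once and reused in every stage


def match_by_tag(spectrum: Spectrum, database: Sequence[str]) -> Optional[str]:
    """Toy matcher (assumption): first peptide containing any of the spectrum's tags."""
    for peptide in database:
        if any(tag in peptide for tag in spectrum.tags):
            return peptide
    return None


def multistage_search(
    spectra: Sequence[Spectrum],
    stages: Sequence[Tuple[str, Sequence[str]]],
    min_quality: float = 0.5,   # hypothetical quality cutoff
) -> Dict[str, Tuple[str, str]]:
    """Search the databases in order; return scan_id -> (stage name, peptide)."""
    assigned: Dict[str, Tuple[str, str]] = {}
    remaining: List[Spectrum] = list(spectra)
    for stage_name, database in stages:
        unmatched: List[Spectrum] = []
        for spectrum in remaining:
            peptide = match_by_tag(spectrum, database)
            if peptide is not None:
                assigned[spectrum.scan_id] = (stage_name, peptide)
            else:
                unmatched.append(spectrum)
        # Only unmatched, high-quality spectra are re-searched in the next stage.
        remaining = [s for s in unmatched if s.quality >= min_quality]
    return assigned
```

In this sketch a spectrum is removed from later stages as soon as it matches, so each successive (typically larger) database is searched with fewer spectra. A real pipeline would score peptide-spectrum matches and control the false discovery rate at each stage rather than accept the first substring hit.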
