4.7 Article

Validating Amino Acid Variants in Proteogenomics Using Sequence Coverage by Multiple Reads

Journal

JOURNAL OF PROTEOME RESEARCH
Volume 21, Issue 6, Pages 1438-1448

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jproteome.2c00033

Keywords

proteogenomics; single nucleotide variant; single amino acid variant; missense mutation; shotgun proteomics; data-dependent acquisition; SNP calling; false discovery rate; protease

Funding

  1. Russian Science Foundation [20-15-00072]

Abstract

This study proposes a method for interpreting proteomics data by determining the protein sequence coverage using multiple reads. The approach improves the reliability of peptide variant identification and reveals single amino acid variants in the HEK-293 cell line.
Mass spectrometry-based proteome analysis involves matching the mass spectra of proteolytic peptides to amino acid sequences predicted from genomic sequences. Reliability of peptide variant identification in proteogenomic studies is often lacking. We propose a way to interpret shotgun proteomics results, specifically in the data-dependent acquisition mode, as protein sequence coverage by multiple reads, as is done in nucleic acid sequencing for calling single nucleotide variants. Multiple reads for each sequence position could be provided by overlapping distinct peptides, thus confirming the presence of certain amino acid residues in the overlapping stretch with a lower false discovery rate. Overlapping distinct peptides originate from miscleaved tryptic peptides in combination with their properly cleaved counterparts and from peptides generated by multiple proteases after the same specimen is subjected to parallel digestion and analyzed separately. We illustrate this approach using publicly available multiprotease data sets and our own data generated for the HEK-293 cell line digests obtained using trypsin, LysC, and GluC proteases. In total, up to 30% of the whole proteome was covered by tryptic peptides, with up to 7% covered twofold or more. The proteogenomic analysis of the HEK-293 cell line revealed 36 single amino acid variants, seven of which were supported by multiple reads.
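
The idea of coverage by multiple reads can be illustrated with a minimal sketch that counts, for every residue of a protein, how many distinct identified peptides overlap it. This is not the authors' pipeline: the function names `coverage_depth` and `covered_fraction`, the toy protein sequence, and the peptide list are hypothetical and chosen only for illustration. Peptides are placed by exact substring matching, which is a simplification (no modifications, no variant-containing peptides, and no handling of sequences shared between proteins).

```python
from typing import Iterable, List


def coverage_depth(protein: str, peptides: Iterable[str]) -> List[int]:
    """Return, for each residue of `protein`, the number of distinct
    peptides that cover it (the per-position "read depth")."""
    depth = [0] * len(protein)
    for pep in set(peptides):  # count each distinct peptide sequence once
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:  # a peptide may match at more than one position
            for i in range(start, start + len(pep)):
                depth[i] += 1
            start = protein.find(pep, start + 1)
    return depth


def covered_fraction(depth: List[int], min_reads: int = 1) -> float:
    """Fraction of positions covered by at least `min_reads` peptides."""
    if not depth:
        return 0.0
    return sum(d >= min_reads for d in depth) / len(depth)


# Hypothetical toy example: a properly cleaved tryptic peptide, its
# miscleaved counterpart, and an overlapping peptide of the kind a
# second protease could produce.
protein = "MKTAYIAKQRQISFVK"
peptides = ["TAYIAK", "TAYIAKQR", "QRQISFVK"]

depth = coverage_depth(protein, peptides)
print(depth)
print(covered_fraction(depth, min_reads=1))  # covered at least once
print(covered_fraction(depth, min_reads=2))  # covered twofold or more
```

In this toy case the stretch `TAYIAKQR` is covered twofold because the miscleaved peptide overlaps both of the other peptides. An amino acid variant falling in such a stretch would be supported by two independent identifications rather than one.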
