Article

Improving the quality of protein identification in non-model species. Characterization of Quercus ilex seed and Pinus radiata needle proteomes by using SEQUEST and custom databases

JOURNAL OF PROTEOMICS (2014)

Journal

JOURNAL OF PROTEOMICS

Volume 105, Pages 85-91

Publisher

ELSEVIER SCIENCE BV

DOI: 10.1016/j.jprot.2014.01.027

Keywords

Protein identification; Custom databases; Non-model species; SEQUEST; ESTs; NGS

Category

Biochemical Research Methods

Funding

predoctoral scholarship program Itaipia Binacional-Paraguay (Paraguay Government)
FPU predoctoral program (Ministry of Education, Spain)
FCT postdoctoral fellowship (Portuguese Government)

Abstract

The most widely used pipeline for protein identification currently relies on comparing MS/MS spectra against reference databases. Search algorithms compare the acquired spectra with an in silico digestion of a sequence database to find exact matches. In this context, the database is of paramount importance and largely determines the number and quality of identifications, which is especially relevant for non-model plant species. Using a single Viridiplantae database (NCBI, UniProt) and TAIR is not the best choice for non-model species, since these species are underrepresented in databases, resulting in poor identification rates. We demonstrate that the rate and quality of identifications in two orphan species, Quercus ilex and Pinus radiata, can be improved by using SEQUEST with a combination of public databases (Viridiplantae NCBI, UniProt) and a custom-built, species-specific database containing 593,294 and 455,096 peptide sequences (Quercus and Pinus, respectively). These databases were built by gathering and processing (trimming, contig assembly, 6-frame translation) publicly available RNA sequences, mostly ESTs and NGS reads. A total of 149 and 1533 proteins were identified from Quercus seeds and Pinus needles, respectively, representing a 3.1- and 1.5-fold increase in the number of protein identifications and scores compared with the use of a single database. Since this approach greatly improves the identification rate and is not significantly more complicated or time-consuming than other approaches, we recommend its routine use when working with non-model species.

Biological significance

In this work we demonstrate that constructing a custom database (DB) from all available RNA sequences and using it in combination with public Viridiplantae DBs (NCBI, UniProt) significantly improves protein identification when working with non-model species. The protein identification rate and quality are higher than those obtained with routine procedures based on a single database (commonly Viridiplantae from NCBI), as we demonstrated by analyzing Quercus seeds and Pinus needles. Building a custom database as proposed is neither difficult nor time-consuming, so we recommend its routine use when working with non-model species. This article is part of a Special Issue entitled: Proteomics of non-model organisms. © 2014 Elsevier B.V. All rights reserved.
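
The 6-frame translation step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`six_frame_translation`, `orf_peptides`) and the `min_len` cutoff are hypothetical and chosen only for illustration, and the sketch assumes already-assembled nucleotide contigs and the standard genetic code.

```python
# Minimal sketch of 6-frame translation of a nucleotide contig.
# Assumptions (not from the article): standard genetic code, DNA alphabet,
# hypothetical function names and minimum peptide length.

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Map every codon (TTT, TTC, ...) to its amino acid; '*' marks a stop codon.
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Translate a sequence in three forward (+1..+3) and three reverse (-1..-3) frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


def orf_peptides(protein: str, min_len: int = 6) -> list:
    """Split a translated frame at stop codons and keep stretches of at least min_len residues."""
    return [piece for piece in protein.split("*") if len(piece) >= min_len]


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCAAATAG").items():
        print(frame, protein)
```

In a database-building workflow, each retained stretch would be written out as a FASTA entry and the resulting file searched alongside the public databases; the length cutoff used for that filtering is a design choice the abstract does not specify.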
