Article

Protein sequence databases generated from metagenomics and public databases produced similar soil metaproteomic results of microbial taxonomic and functional changes

Journal

PEDOSPHERE
Volume 32, Issue 4, Pages 507-520

Publisher

SCIENCE PRESS
DOI: 10.1016/S1002-0160(21)60016-4

Keywords

bioinformatics; differentially accumulated protein; functional annotation; functional microorganism; Meta-database; microbial community; microbial species; Public database

Funding

  1. National Key Research and Development Program of China [2016YFD0200-308]
  2. National Key Basic Research Program of China [2015CB150501]
  3. Project of Priority and Key Areas, Institute of Soil Science, Chinese Academy of Sciences [ISSASIP1605, ISSASIP1640]

Soil metaproteomics has potential for studying structural and functional changes in soil microbial communities, but selecting a protein sequence database remains a challenge. Using a Meta-database can identify more proteins and microbial species, yet the two kinds of database yield similar main findings and functional analyses.
Soil metaproteomics has excellent potential as a tool to elucidate the structural and functional changes in soil microbial communities in response to environmental alterations. However, soil metaproteomics is hindered by several challenges and gaps. Soil microbial communities have an extremely complex composition, including many uncultured microorganisms for which no whole-genome sequence is available. Thus, selecting a suitable protein sequence database remains a challenge in soil metaproteomics. In this study, the Public database and the Meta-database were constructed using protein sequences from public databases and metagenomics, respectively. We comprehensively analyzed and compared the soil metaproteomic results obtained with these two kinds of protein sequence databases for protein identification, based on published soil metaproteomic raw data. The results demonstrated that many more proteins, higher sequence coverage, and more microbial species and functional annotations could be identified using the Meta-database than using the Public database. These findings indicated that the Meta-database was more specific as a protein sequence database. However, the follow-up in-depth metaproteomic analyses exhibited similar main results regardless of the database used. The microbial community composition at the genus level was similar between the two databases, especially for species annotations with many peptide-spectrum matches and high abundance. The functional analyses in response to stress, such as the gene ontology enrichment of biological process and molecular function and the key functional microorganisms, were also similar regardless of the database. Our analysis revealed that the Public database could also meet the demand to explore the functional responses of microbial proteins to some extent.
This study provides valuable insights into the choice of protein sequence databases and their impacts on subsequent bioinformatic analysis in soil metaproteomic research and will facilitate the optimization of experimental design for different purposes.
