GMrepo v2: a curated human gut microbiome database with special focus on disease markers and cross-dataset comparison

Journal

NUCLEIC ACIDS RESEARCH
Volume 50, Issue D1, Pages D777-D784

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkab1019

Funding

  1. National Key Research and Development Program of China [2019YFA0905600]
  2. National Natural Science Foundation of China [81803850, 61932008, 61772368]
  3. National Key R&D Program of China [2020YFA0712403]
  4. Shanghai Science and Technology Innovation Fund [19511101404]
  5. Shanghai Municipal Science and Technology Major Project [2018SHZDZX01]

GMrepo is a curated database of human gut metagenomes aimed at increasing data reusability and accessibility, and enabling cross-project and phenotype comparisons. The latest version, GMrepo v2, includes more projects and samples obtained through different sequencing methods. Various disease markers have been identified and compared across datasets to facilitate the discovery of consistent microbial markers.
GMrepo (data repository for Gut Microbiota) is a database of curated and consistently annotated human gut metagenomes. Its main purposes are to increase the reusability and accessibility of human gut metagenomic data, and to enable cross-project and cross-phenotype comparisons. To achieve these goals, we manually curated the metadata and organized the datasets in a phenotype-centric manner. GMrepo v2 contains 353 projects and 71,642 runs/samples, a substantial increase over the previous version. Of these runs/samples, 45,111 were obtained by 16S rRNA amplicon sequencing and 26,531 by whole-genome metagenomic sequencing. We also increased the number of phenotypes from 92 to 133. In addition, we introduced disease-marker identification and cross-project/phenotype comparison. We first identified disease markers between two phenotypes (e.g. healthy versus diseased) on a per-project basis for selected projects. We then compared the identified markers for each phenotype pair across datasets to facilitate the identification of microbial markers that are consistent across datasets. Finally, we provided a marker-centric view that lets users check whether a marker shows different trends in different diseases. So far, GMrepo includes 592 marker taxa (350 species and 242 genera) for 47 phenotype pairs, identified from 83 selected projects. GMrepo v2 is freely available at: https://gmrepo.humangut.info.
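
The cross-dataset comparison described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not use the GMrepo API. It assumes that per-project marker calls are already available as simple tuples, and every project ID, taxon name, phenotype label, and direction label below is hypothetical. The sketch only shows the bookkeeping step: grouping per-project markers by phenotype pair and taxon, then keeping those reported with the same direction in several projects.

```python
from collections import defaultdict

# Hypothetical per-project marker calls: (project, phenotype_pair, taxon, direction).
# All identifiers and values are made up for illustration; they are not GMrepo data.
marker_calls = [
    ("projA", ("Health", "Disease X"), "Taxon 1", "enriched_in_disease"),
    ("projB", ("Health", "Disease X"), "Taxon 1", "enriched_in_disease"),
    ("projC", ("Health", "Disease X"), "Taxon 1", "enriched_in_health"),
    ("projA", ("Health", "Disease X"), "Taxon 2", "enriched_in_health"),
    ("projB", ("Health", "Disease X"), "Taxon 2", "enriched_in_health"),
]


def summarize_markers(calls):
    """Count, per (phenotype pair, taxon), how many distinct projects report each direction."""
    projects = defaultdict(lambda: defaultdict(set))
    for project, pair, taxon, direction in calls:
        projects[(pair, taxon)][direction].add(project)
    return {
        key: {direction: len(ids) for direction, ids in by_direction.items()}
        for key, by_direction in projects.items()
    }


def consistent_markers(calls, min_projects=2):
    """Keep markers seen in at least min_projects projects, all agreeing on direction."""
    result = {}
    for key, by_direction in summarize_markers(calls).items():
        if len(by_direction) == 1:
            direction, n_projects = next(iter(by_direction.items()))
            if n_projects >= min_projects:
                result[key] = direction
    return result


if __name__ == "__main__":
    for (pair, taxon), direction in consistent_markers(marker_calls).items():
        print(f"{taxon}: {direction} ({pair[0]} vs {pair[1]})")
```

In this toy input, the hypothetical "Taxon 1" is dropped because one project reports the opposite direction, while "Taxon 2" is kept. The `min_projects` threshold is an assumption of the sketch, not a value taken from the paper.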
