mobileOG-db: a Manually Curated Database of Protein Families Mediating the Life Cycle of Bacterial Mobile Genetic Elements

Journal

APPLIED AND ENVIRONMENTAL MICROBIOLOGY
Volume 88, Issue 18

Publisher

AMER SOC MICROBIOLOGY
DOI: 10.1128/aem.00991-22

Keywords

antibiotic resistance; bacteriophages; insertion sequence; integrative elements; metagenomics; mobile genetic elements; mobilome; plasmids; transposons

Funding

  1. NSF PIRE Award [1545756]
  2. NSF CI4WARS Award [2004751]
  3. NSF NRT Award [2125798]
  4. USDA National Institute of Food and Agriculture [2017-68003-26498]
  5. Water Research Foundation Project [4961]
  6. Genetics, Bioinformatics, and Computational Biology Interdisciplinary Graduate Education Program (IGEP)
  7. Virginia Tech Sustainable NanoTechnology IGEP
  8. Fralin Life Sciences Institute
  9. Virginia Tech Open Access Support Fund
  10. Virginia Tech ICTAS Center for Science and Engineering of the Exposome
  11. NanoEarth
  12. National Science Foundation, Directorate for STEM Education, Division of Graduate Education [2125798]

This study analyzes more than 10 million protein sequences from eight databases of bacterial mobile genetic elements (MGEs) and presents mobileOG-db, a database of over 6,000 manually curated protein families for the annotation and analysis of MGEs. The database offers a multilevel classification scheme covering plasmids, phages, integrative elements, and transposable elements.
Bacterial mobile genetic elements (MGEs) encode functional modules that perform both core and accessory functions for the element, the latter of which are often only transiently associated with the element. The presence of these accessory genes, which are often close homologs of primarily immobile genes, incurs high rates of false positives and therefore limits the usability of existing MGE databases for MGE annotation. To overcome this limitation, we analyzed 10,776,849 protein sequences derived from eight MGE databases to compile a comprehensive set of 6,140 manually curated protein families that are linked to the life cycle (integration/excision, replication/recombination/repair, transfer, stability/transfer/defense, and phage-specific processes) of plasmids, phages, and integrative, transposable, and conjugative elements. We overlay experimental information where available to create a tiered annotation scheme that distinguishes high-quality annotations from annotations inferred exclusively through bioinformatic evidence. We additionally provide an MGE-class label for each entry (e.g., plasmid or integrative element) and assign each entry a major and a minor category. The resulting database, mobileOG-db (for mobile orthologous groups), comprises over 700,000 deduplicated sequences encompassing five major mobileOG categories and more than 50 minor categories, providing a structured language and interpretable basis for an array of MGE-centered analyses. mobileOG-db can be accessed at mobileogdb.flsi.cloud.vt.edu/, where users can select, refine, and analyze custom subsets of the dynamic mobilome.

IMPORTANCE The analysis of bacterial mobile genetic elements (MGEs) in genomic data is a critical step toward profiling the root causes of antibiotic resistance, phenotypic or metabolic diversity, and the evolution of bacterial genera. Existing methods for MGE annotation require substantial biological and computational expertise to harness properly. To bridge this gap, we systematically analyzed 10,776,849 proteins derived from eight databases of MGEs to identify 6,140 MGE protein families that can serve as candidate hallmarks, i.e., proteins that can be used as signatures of MGEs to aid annotation. The resulting resource, mobileOG-db, provides a multilevel classification scheme that encompasses plasmid, phage, integrative, and transposable element protein families categorized into five major mobileOG categories and more than 50 minor categories. mobileOG-db thus provides a rich resource for simple and intuitive element annotation that can be integrated seamlessly into existing MGE detection pipelines and colocalization analyses.
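
As a rough illustration of how the multilevel scheme described above might be consumed downstream, the following Python sketch tallies annotation hits per major category, optionally keeping only the experimentally supported tier. It is not the database's actual schema or API: the `Entry` fields, the `summarize` helper, and the example records (including the placeholder minor-category names) are assumptions made for this sketch. Only the five major category names are taken from the abstract.

```python
from collections import Counter
from dataclasses import dataclass

# The five major categories named in the abstract.
MAJOR_CATEGORIES = (
    "integration/excision",
    "replication/recombination/repair",
    "transfer",
    "stability/transfer/defense",
    "phage",
)


@dataclass(frozen=True)
class Entry:
    """One annotation hit. All field names are hypothetical."""
    family_id: str   # placeholder identifier, not a real mobileOG ID
    major: str       # one of MAJOR_CATEGORIES
    minor: str       # placeholder minor-category label
    mge_class: str   # e.g. "plasmid" or "integrative element"
    curated: bool    # True if backed by experimental evidence (assumed flag)


def summarize(hits, curated_only=False):
    """Count hits per major category, optionally restricted to the curated tier."""
    counts = Counter()
    for hit in hits:
        if hit.major not in MAJOR_CATEGORIES:
            raise ValueError(f"unknown major category: {hit.major!r}")
        if curated_only and not hit.curated:
            continue
        counts[hit.major] += 1
    return dict(counts)


# Invented example records, for illustration only.
example_hits = [
    Entry("fam1", "transfer", "minor-a", "plasmid", True),
    Entry("fam2", "transfer", "minor-b", "plasmid", False),
    Entry("fam3", "phage", "minor-c", "phage", True),
]

if __name__ == "__main__":
    print(summarize(example_hits))
    print(summarize(example_hits, curated_only=True))
```

Keeping the evidence tier as a separate flag, rather than folding it into the category label, lets the same tally run over either the full set or the high-confidence subset.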
