4.6 Article

mobileOG-db: a Manually Curated Database of Protein Families Mediating the Life Cycle of Bacterial Mobile Genetic Elements

Journal

Publisher

AMER SOC MICROBIOLOGY
DOI: 10.1128/aem.00991-22

Keywords

antibiotic resistance; bacteriophages; insertion sequence; integrative elements; metagenomics; mobile genetic elements; mobilome; plasmids; transposons

Funding

  1. NSF PIRE Award [1545756]
  2. NSF CI4WARS Award [2004751]
  3. NSF NRT Award [2125798]
  4. USDA National Institute of Food and Agriculture [2017-68003-26498]
  5. Water Research Foundation Project [4961]
  6. Genetics, Bioinformatics, and Computational Biology Interdisciplinary Graduate Education Program (IGEP)
  7. Virginia Tech Sustainable NanoTechnology IGEP
  8. Fralin Life Sciences Institute
  9. Virginia Tech Open Access Support Fund
  10. Virginia Tech ICTAS Center for Science and Engineering of the Exposome
  11. NanoEarth
  12. Directorate for STEM Education
  13. Division of Graduate Education [2125798] (funding source: National Science Foundation)

This study provides a comprehensive analysis of bacterial mobile genetic elements (MGEs) and presents a database, mobileOG-db, containing over 6,000 protein families for the annotation and analysis of MGEs. The database offers a multilevel classification scheme, allowing for the annotation of plasmids, phages, integrative elements, and transposable elements.
Bacterial mobile genetic elements (MGEs) encode functional modules that perform both core and accessory functions for the element, the latter of which are often only transiently associated with the element. The presence of these accessory genes, which are often close homologs of primarily immobile genes, incurs high rates of false positives and therefore limits the usability of existing MGE databases for MGE annotation. To overcome this limitation, we analyzed 10,776,849 protein sequences derived from eight MGE databases to compile a comprehensive set of 6,140 manually curated protein families that are linked to the life cycle (integration/excision, replication/recombination/repair, transfer, stability/transfer/defense, and phage-specific processes) of plasmids, phages, and integrative, transposable, and conjugative elements. We overlay experimental information where available to create a tiered annotation scheme that distinguishes high-quality annotations from annotations inferred exclusively through bioinformatic evidence. We additionally provide an MGE-class label for each entry (e.g., plasmid or integrative element) and assign each entry a major and a minor category. The resulting database, mobileOG-db (for mobile orthologous groups), comprises over 700,000 deduplicated sequences encompassing five major mobileOG categories and more than 50 minor categories, providing a structured language and interpretable basis for an array of MGE-centered analyses. mobileOG-db can be accessed at mobileogdb.flsi.cloud.vt.edu/, where users can select, refine, and analyze custom subsets of the dynamic mobilome.
IMPORTANCE The analysis of bacterial mobile genetic elements (MGEs) in genomic data is a critical step toward profiling the root causes of antibiotic resistance, phenotypic or metabolic diversity, and the evolution of bacterial genera. Existing methods for MGE annotation demand considerable biological and computational expertise to use properly. To bridge this gap, we systematically analyzed 10,776,849 proteins derived from eight databases of MGEs to identify 6,140 MGE protein families that can serve as candidate hallmarks, i.e., proteins that can be used as signatures of MGEs to aid annotation. The resulting resource, mobileOG-db, provides a multilevel classification scheme that encompasses plasmid, phage, integrative, and transposable element protein families categorized into five major mobileOG categories and more than 50 minor categories. mobileOG-db thus provides a rich resource for simple and intuitive element annotation that can be integrated seamlessly into existing MGE detection pipelines and colocalization analyses.
