4.7 Article

Comparative pangenomics: analysis of 12 microbial pathogen pangenomes reveals conserved global structures of genetic and functional diversity

Journal

BMC GENOMICS
Volume 23, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s12864-021-08223-8

Keywords

Pangenome; Core genome; Comparative genomics; Multispecies; Heaps' law; Functional diversity; Sequence diversity; Protein domains; Aminoacyl-tRNA synthetases

Funding

  1. National Institute of Allergy and Infectious Diseases [AI124316]
  2. National Institutes of Health [T32GM8806]

With the growth of publicly available genome sequences, comparative pangenomics methods can provide insight into genetic diversity across multiple species. Across 12 microbial pathogen species, the study found that pangenome openness is associated with a species' phylogenetic placement, that relationships between gene function and gene frequency are conserved across species, that the core-genome genes with the highest sequence diversity are functionally diverse, and that certain protein domains are consistently mutation-enriched across multiple species.
Background: With the exponential growth of publicly available genome sequences, pangenome analyses have provided increasingly complete pictures of genetic diversity for many microbial species. However, relatively few studies have scaled beyond single pangenomes to compare global genetic diversity both within and across different species. We present here several methods for comparative pangenomics that can be used to contextualize multi-pangenome scale genetic diversity with gene function for multiple species at multiple resolutions: pangenome shape, genes, sequence variants, and positions within variants.
Results: Applied to 12,676 genomes across 12 microbial pathogenic species, we observed several shared resolution-specific patterns of genetic diversity. First, pangenome openness is associated with species' phylogenetic placement. Second, relationships between gene function and frequency are conserved across species, with core genomes enriched for metabolic and ribosomal genes and accessory genomes for trafficking, secretion, and defense-associated genes. Third, genes in core genomes with the highest sequence diversity are functionally diverse. Finally, certain protein domains are consistently mutation-enriched across multiple species, especially among aminoacyl-tRNA synthetases, where the extent of a domain's mutation enrichment is strongly function-dependent.
Conclusions: These results illustrate the value of each resolution for uncovering distinct aspects of the relationship between genetic and functional diversity across multiple species. With the continued growth of the number of sequenced genomes, these methods will reveal additional universal patterns of genetic diversity at the pangenome scale.
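The abstract refers to pangenome openness, and Heaps' law appears among the keywords. As a rough illustration only, the sketch below fits the commonly used power-law form P(n) = kappa * n^gamma, where P(n) is the pangenome size after sampling n genomes; a larger gamma is usually read as a more open pangenome. This parameterization, the function name `fit_heaps_law`, and the synthetic data are all assumptions made for this example and are not taken from the paper, whose exact formulation and fitting procedure are not given in the abstract.

```python
import math


def fit_heaps_law(genome_counts, pangenome_sizes):
    """Fit P(n) = kappa * n**gamma by least squares on log-log data.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(p) for p in pangenome_sizes]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    # Ordinary least-squares slope and intercept in log-log space.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    kappa = math.exp(mean_y - gamma * mean_x)
    return kappa, gamma


# Synthetic, made-up data generated from a known power law (not real genomes).
counts = [1, 2, 4, 8, 16, 32]
sizes = [4000 * n ** 0.3 for n in counts]

kappa, gamma = fit_heaps_law(counts, sizes)
print(f"kappa = {kappa:.1f}, gamma = {gamma:.3f}")
```

In practice, pangenome size depends on the order in which genomes are added, so such curves are typically averaged over many random orderings before fitting; that step is omitted here for brevity.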
