Article

Epidemiological associations with genomic variation in SARS-CoV-2

Journal

SCIENTIFIC REPORTS
Volume 11, Issue 1, Pages -

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41598-021-02548-w

Keywords

-

Funding

  1. National Science Foundation [DEB-2028280, DEB-2109688]

The study identified that the nonstructural protein 3 (nsp3) and Spike protein (S) of SARS-CoV-2 have the highest variation in the genome, and are most correlated with the viral whole-genome variation. The country of origin and time since the start of the pandemic were found to be the most influential metadata associated with genomic variation.
SARS-CoV-2 (CoV) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. We divided the CoV genome into 29 constituent regions and applied novel analytical approaches to identify associations between CoV genomic features and epidemiological metadata. Our results show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and greatest correlation with the viral whole-genome variation. S protein variation is correlated with nsp3, nsp6, and 3'-to-5' exonuclease variation. Country of origin and time since the start of the pandemic were the most influential metadata associated with genomic variation, while host sex and age were the least influential. We define a novel statistic, coherence, and show its utility in identifying geographic regions (populations) with unusually high (many new variants) or low (isolated) viral phylogenetic diversity. Interestingly, at both global and regional scales, we identify geographic locations with high coherence neighboring regions of low coherence; this emphasizes the utility of this metric to inform public health measures for disease spread. Our results provide a direction to prioritize genes associated with outcome predictors (e.g., health, therapeutic, and vaccine outcomes) and to improve DNA tests for predicting disease status.
