4.7 Article

Accurate reconstruction of viral genomes in human cells from short reads using iterative refinement

Journal

BMC GENOMICS
Volume 23, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s12864-022-08649-8

Keywords

Viral genomes; Epstein-Barr Virus; Cancer; Whole-genome sequencing; Iterative refinement

Funding

  1. Hong Kong Research Grants Council Theme-based Research Scheme [T12-401/13-R]
  2. Hong Kong Research Grants Council Area of Excellence [AoE/M-401/20]
  3. Collaborative Research Fund [C4001-18GF]
  4. Hong Kong Research Grants Council Collaborative Research Funds [C4045-18WF, C4054-16G, C4057-18EF, C7044-19GF]
  5. General Research Funds [14113620, 14107420, 14170217, 14203119]
  6. Hong Kong Epigenomics Project (EpiHK)
  7. Chinese University of Hong Kong


This study proposes a pipeline called ASPIRE for accurately reconstructing viral genomes from short-read data from human samples. Beyond basic assembly, ASPIRE adds components such as iterative refinement and sequence correction that improve the quality of the reconstructed genomes, especially for samples that differ significantly from the reference genome.
Background: After an infection, human cells may contain viral genomes in the form of episomes or integrated DNA. Comparing the genomic sequences of different strains of a virus in human cells can often provide useful insights into its behaviour, activity and pathology, and may help develop methods for disease prevention and treatment. To support such comparative analyses, the viral genomes need to be accurately reconstructed from a large number of samples. Previous efforts either rely on customized experimental protocols or require high similarity between the sequenced genomes and a reference, both of which limit their general applicability. In this study, we propose a pipeline, named ASPIRE, for accurately reconstructing viral genomes from short-read data from human samples, which are increasingly available from genome projects and personal genomics. ASPIRE contains a basic part that involves de novo assembly, tiling and gap filling, and additional components for iterative refinement, sequence correction and wrapping.

Results: As measured by the alignment quality of sequencing reads to the reconstructed genomes, these additional components improve assembly quality in general, and quite substantially in some samples, especially when the sequenced genome differs significantly from the reference. We use ASPIRE to reconstruct the genomes of Epstein-Barr virus (EBV) from the whole-genome sequencing data of 61 nasopharyngeal carcinoma (NPC) samples and provide these sequences as a resource for EBV research.

Conclusions: ASPIRE improves on the quality of the EBV genomes reconstructed in published studies and outperforms TRACESPipe in some of the samples considered.
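The idea of iterative refinement mentioned in the abstract can be illustrated with a toy sketch. This is not ASPIRE's implementation: the abstract does not describe the pipeline's internals, and a real pipeline would rely on established assemblers and gapped read aligners. The function names (`best_offset`, `refine_once`, `iterative_refinement`, `mean_mismatches`), the ungapped read placement, the majority-vote consensus and the `max_mismatch_frac` threshold are all assumptions made purely for illustration.

```python
from collections import Counter


def best_offset(read, genome):
    """Return (offset, mismatches) for the best ungapped placement of read.

    Toy stand-in for a read aligner: tries every offset and keeps the
    first one with the fewest mismatches.
    """
    best = (0, len(read) + 1)
    for off in range(len(genome) - len(read) + 1):
        window = genome[off:off + len(read)]
        mism = sum(1 for a, b in zip(read, window) if a != b)
        if mism < best[1]:
            best = (off, mism)
    return best


def refine_once(genome, reads, max_mismatch_frac=0.3):
    """Place every read on the current draft and take a per-base majority vote."""
    votes = [Counter() for _ in genome]
    for read in reads:
        off, mism = best_offset(read, genome)
        if mism > max_mismatch_frac * len(read):
            continue  # skip reads that do not fit the draft well enough
        for i, base in enumerate(read):
            votes[off + i][base] += 1
    # Positions without read support keep the base from the current draft.
    return "".join(
        v.most_common(1)[0][0] if v else g for v, g in zip(votes, genome)
    )


def iterative_refinement(draft, reads, max_iters=10):
    """Repeat place-and-vote until the sequence stops changing."""
    genome = draft
    for _ in range(max_iters):
        updated = refine_once(genome, reads)
        if updated == genome:
            break
        genome = updated
    return genome


def mean_mismatches(genome, reads):
    """Crude alignment-quality score: average mismatches per read (lower is better)."""
    return sum(best_offset(r, genome)[1] for r in reads) / len(reads)
```

Starting from a reference-like draft, each round re-places the reads on the updated sequence, so positions where the sample differs from the reference can be corrected step by step. Comparing `mean_mismatches` before and after refinement loosely mirrors the abstract's use of read alignment quality as the evaluation criterion.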
