Article

Microseek: A Protein-Based Metagenomic Pipeline for Virus Diagnostic and Discovery

Journal

VIRUSES-BASEL
Volume 14, Issue 9

Publisher

MDPI
DOI: 10.3390/v14091990

Keywords

metagenomics; virus; discovery; diagnostic; bioinformatics; pipeline

Microseek is a pipeline for virus identification and discovery based on the RVDB-prot database. It analyzes raw metagenomic next-generation sequencing (mNGS) data and assigns viral sequences through Lowest Common Ancestor (LCA) scoring. Results on human tissue and plasma samples show that Microseek identifies known viral sequences and detects distant pseudoviral sequences while minimizing non-relevant hits.
We present Microseek, a pipeline for virus identification and discovery based on RVDB-prot, a comprehensive, curated, and regularly updated database of viral proteins. Microseek analyzes raw metagenomic next-generation sequencing (mNGS) data by performing quality steps and de novo assembly, and by scoring the Lowest Common Ancestor (LCA) from translated reads and contigs. Microseek runs on a local computer, and its output is displayed through a user-friendly, dynamic graphical interface. Using two representative mNGS datasets derived from human tissue and plasma specimens, we illustrate how Microseek works and report its performance. We used in silico spikes of known viral sequences as well as spikes of fake Neopneumovirus viral sequences generated at variable evolutionary distances from known members of the Pneumoviridae family. Results were compared to Chan Zuckerberg ID (CZ ID), a reference cloud-based mNGS pipeline. We show that Microseek reliably identifies known viral sequences and performs well in detecting distant pseudoviral sequences, especially in complex samples such as human plasma, while minimizing non-relevant hits.
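
The abstract does not specify how the LCA is computed or scored. As a rough illustration of the general idea only, the sketch below finds the deepest taxon shared by the lineages of several protein hits for one read or contig. The function name `lowest_common_ancestor`, the input layout (root-to-leaf lists of taxon names), and the simplified example lineages are assumptions made for this example, not Microseek's implementation.

```python
from typing import Optional, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> Optional[str]:
    """Return the deepest taxon shared by all root-to-leaf lineages, or None."""
    if not lineages:
        return None
    lca = None
    # Walk the lineages rank by rank, starting from the root.
    for ranks in zip(*lineages):
        if all(taxon == ranks[0] for taxon in ranks):
            lca = ranks[0]  # still shared: move one level deeper
        else:
            break  # lineages diverge here
    return lca


# Hypothetical, simplified lineages for two hits of the same contig.
hits = [
    ["Viruses", "Riboviria", "Pneumoviridae", "Orthopneumovirus"],
    ["Viruses", "Riboviria", "Pneumoviridae", "Metapneumovirus"],
]
print(lowest_common_ancestor(hits))  # Pneumoviridae
```

In this toy case the two hits disagree at the genus level, so the contig is assigned to the family rather than to either genus. This is the behavior that lets an LCA approach report a divergent sequence at a higher taxonomic rank instead of forcing a species-level call.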
