4.0 Article

MetaTransformer: deep metagenomic sequencing read classification using self-attention models

期刊

NAR GENOMICS AND BIOINFORMATICS
卷 5, 期 3, 页码 -

出版社

OXFORD UNIV PRESS
DOI: 10.1093/nargab/lqad082

关键词

-

向作者/读者索取更多资源

Deep learning has had a significant impact on scientific research and this paper introduces a self-attention-based deep learning tool called MetaTransformer for metagenomic analysis. MetaTransformer outperforms previous methods in species and genus classification and achieves improved performance and reduced memory consumption through different embedding schemes.
Deep learning has emerged as a paradigm that revolutionizes numerous domains of scientific research. Transformers have been utilized in language modeling outperforming previous approaches. Therefore, the utilization of deep learning as a tool for analyzing the genomic sequences is promising, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine learning-based classifier, has recently been introduced for taxonomic prediction at species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells resulting in slow runtimes and excessive memory requirements, hampering its effective usability. We present MetaTransformer, a self-attention-based deep learning metagenomic analysis tool. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in terms of species and genus classification abilities. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we are able to achieve 2x to 5x speedup for inference compared to DeepMicrobes while keeping a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus and 16 hours for species prediction. Our results demonstrate performance improvements due to self-attention models and the impact of embedding schemes in deep learning on metagenomic sequencing data.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.0
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据