Article

Unitig level assembly graph based metagenome-assembled genome refiner (UGMAGrefiner): A tool to increase completeness and resolution of metagenome-assembled genomes

Journal

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
Volume 21, Pages 2394-2404

Publisher

ELSEVIER
DOI: 10.1016/j.csbj.2023.03.030

Keywords

Metagenome; Metagenomic assembly; Assembly graph; Binning refinement; Genome specific unitig cluster

This study proposes UGMAGrefiner, an approach that uses the connection and coverage information in unitig-level assembly graphs to improve the quality of metagenome-assembled genomes (MAGs). In the reported evaluation, UGMAGrefiner outperforms state-of-the-art binning refinement tools and can identify genome-specific regions of mixed genomes. It provides an efficient way to obtain more complete MAGs and to study genome-specific functions.
De novo assembly of next-generation metagenomic reads is widely used to provide taxonomic and functional information about the genomes in a microbial community. Because strains are functionally specific, recovering strain-resolved genomes is important but remains a challenge. Unitigs and assembly graphs are intermediate products generated while reads are assembled into contigs, and they provide higher-resolution information on how sequences connect. In this study, we propose a new approach, UGMAGrefiner (a unitig-level assembly graph-based metagenome-assembled genome refiner), which uses the connection and coverage information from unitig-level assembly graphs to recruit unbinned unitigs to MAGs, adjust binning results, and infer unitigs shared by multiple MAGs. On two simulated datasets (Simdata and CAMI data) and one real dataset (GD02), it outperforms two state-of-the-art assembly graph-based binning refinement tools in improving MAG quality by stably increasing genome completeness. UGMAGrefiner can identify genome-specific clusters for genomes whose homologous sequences share less than 99% average nucleotide identity. For MAGs that mix genome clusters of 99% similarity, it distinguished 8 of 9 genomes in Simdata and 8 of 12 genomes in CAMI data. In the GD02 data, it identified, from a total of 135 MAGs, 16 new unitig clusters representing genome-specific regions of mixed genomes and 4 unitig clusters representing new genomes, all available for further functional analysis. UGMAGrefiner provides an efficient way to obtain more complete MAGs and to study genome-specific functions. It will be useful for improving the taxonomic and functional information of genomes after de novo assembly.
© 2023 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
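
To make the idea in the abstract concrete, the following is a minimal Python sketch of how connection and coverage information could be combined to recruit unbinned unitigs and to flag unitigs shared by several MAGs. It is not the UGMAGrefiner implementation: the function name `recruit_unitigs`, the median-coverage comparison, the tolerance `tol`, and the rule that a shared unitig's coverage is roughly the sum of the coverages of the MAGs it touches are all illustrative assumptions, and the toy graph is invented for the example.

```python
from statistics import median


def recruit_unitigs(graph, coverage, bins, tol=0.25):
    """Toy recruitment rule (an assumption, not the published algorithm).

    graph:    dict mapping each unitig to a list of neighbouring unitigs
    coverage: dict mapping each unitig to its read coverage
    bins:     dict mapping already-binned unitigs to a MAG label
    tol:      hypothetical relative coverage tolerance
    """
    # Median coverage of each MAG, computed from its already-binned unitigs.
    mag_cov = {
        mag: median(coverage[u] for u, m in bins.items() if m == mag)
        for mag in set(bins.values())
    }

    recruited, shared = {}, {}
    for unitig in graph:
        if unitig in bins:
            continue  # only unbinned unitigs are candidates
        neighbour_mags = sorted({bins[v] for v in graph[unitig] if v in bins})
        if not neighbour_mags:
            continue  # no connection to any MAG, so nothing to decide
        if len(neighbour_mags) == 1:
            # Connected to a single MAG: recruit if coverage is compatible.
            mag = neighbour_mags[0]
            if abs(coverage[unitig] - mag_cov[mag]) <= tol * mag_cov[mag]:
                recruited[unitig] = mag
        else:
            # Connected to several MAGs: treat as shared if its coverage
            # is close to the sum of those MAGs' coverages (assumed rule).
            expected = sum(mag_cov[m] for m in neighbour_mags)
            if abs(coverage[unitig] - expected) <= tol * expected:
                shared[unitig] = neighbour_mags
    return recruited, shared


# Invented toy data: two MAGs (A, B) and three unbinned unitigs.
graph = {
    "a1": ["a2", "x1"], "a2": ["a1", "s1"],
    "b1": ["b2", "s1"], "b2": ["b1", "n1"],
    "x1": ["a1"], "s1": ["a2", "b1"], "n1": ["b2"],
}
coverage = {"a1": 20, "a2": 22, "b1": 50, "b2": 52, "x1": 21, "s1": 70, "n1": 5}
bins = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

recruited, shared = recruit_unitigs(graph, coverage, bins)
print(recruited)  # {'x1': 'A'}
print(shared)     # {'s1': ['A', 'B']}
```

In this toy case `x1` joins MAG A because it is connected only to A and has matching coverage, `s1` is flagged as shared between A and B, and `n1` stays unbinned because its coverage is far from that of B. A real refiner would also need to handle repeats, uneven coverage, and graph paths longer than one edge.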
