Article

A Pipeline NanoTRF as a New Tool for De Novo Satellite DNA Identification in the Raw Nanopore Sequencing Reads of Plant Genomes

Journal

PLANTS-BASEL
Volume 11, Issue 16, Pages -

Publisher

MDPI
DOI: 10.3390/plants11162103

Keywords

satellite DNA; Nanopore sequencing; genome; tandem repeats; pipeline

Funding

  1. Russian Science Foundation [22-26-00222]

High-copy tandemly organized repeats (TRs), or satellite DNA (satDNA), are an important and still poorly understood component of eukaryotic genomes. In this study, we developed NanoTRF, a new Python pipeline for the de novo identification of TRs in raw Nanopore sequencing data. NanoTRF generates informative reports on TR genome abundance, monomer sequence, and monomer length. It also annotates transposable element sequences within or near satDNA arrays, providing insights into the co-evolution of TRs and transposable elements in the genome.
High-copy tandemly organized repeats (TRs), or satellite DNA (satDNA), are an important but still enigmatic component of eukaryotic genomes. TRs comprise arrays of multi-copy, highly similar repeat units, which makes their elucidation very challenging. Oxford Nanopore sequencing data provide a valuable source of information on TR organization at the single-molecule level. However, bioinformatics tools for de novo identification of TRs in raw Nanopore data have not been reported so far. We developed NanoTRF, a new Python pipeline for TR identification, characterization, and consensus monomer sequence assembly. The pipeline requires only a raw Nanopore read file from low-depth (<1x) genome sequencing. The program generates an informative HTML report and figures on TR genome abundance, monomer sequence, and monomer length. In addition, NanoTRF annotates transposable element (TE) sequences within or near satDNA arrays, and this information can be used to elucidate how TRs and TEs co-evolve in the genome. Moreover, we validated by FISH that the NanoTRF report is useful for evaluating whether TR chromosome organization is clustered or dispersed. Our findings show that NanoTRF is a robust method for the de novo identification of satellite repeats in raw Nanopore data without prior read assembly. The obtained sequences can be used in many downstream analyses, including genome assembly assistance and gap estimation, chromosome mapping, and cytogenetic marker development.
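The abstract does not describe NanoTRF's internal algorithm, so the following is only a toy sketch of the general idea of finding a tandem repeat directly in a single unassembled read and estimating its abundance from a read set. It is not NanoTRF's method. All function names, parameters, and thresholds (`period_score`, `find_monomer_length`, `tr_abundance`, `min_score`) are hypothetical, and the approach assumes substitution-only noise; real Nanopore reads contain insertions and deletions that this simple position-wise comparison would not tolerate.

```python
def period_score(seq: str, k: int) -> float:
    """Fraction of positions i where seq[i] == seq[i + k]."""
    n = len(seq) - k
    if n <= 0:
        return 0.0
    matches = sum(1 for i in range(n) if seq[i] == seq[i + k])
    return matches / n


def find_monomer_length(seq, min_len=2, max_len=None, min_score=0.9):
    """Return the smallest period whose self-match score reaches
    min_score, or None if the read shows no such periodicity.

    Hypothetical helper: thresholds are illustrative, not tuned.
    """
    if max_len is None:
        max_len = len(seq) // 2
    for k in range(min_len, max_len + 1):
        if period_score(seq, k) >= min_score:
            return k
    return None


def tr_abundance(reads, **kwargs):
    """Share of sequenced bases that lie in reads flagged as tandem arrays.

    With low-depth random sampling of the genome, this share is a rough
    proxy for the genome proportion occupied by tandem repeats.
    """
    total = sum(len(r) for r in reads)
    tr_bases = sum(
        len(r) for r in reads if find_monomer_length(r, **kwargs) is not None
    )
    return tr_bases / total if total else 0.0


if __name__ == "__main__":
    # A made-up 10 bp monomer repeated 20 times stands in for a satDNA read.
    read = "ACGTTGCAAT" * 20
    print(find_monomer_length(read))  # 10
```

The sketch scans candidate periods from shortest to longest so that the reported length is the monomer itself rather than a multiple of it. A practical tool would additionally need error-tolerant alignment, clustering of monomers across reads, and consensus building, which are outside the scope of this illustration.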
