Journal
BIOINFORMATICS
Volume 37, Issue 21, Pages 3947-3949
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btab430
Funding
- Museum National d'Histoire Naturelle (MNHN)
- Institut Universitaire de France (IUF)
Genomic sequences are commonly used to infer evolutionary history. MNHN-Tree-Tools is a high-performance set of algorithms for sequence clustering and tree building. Because it does not rely on multiple sequence alignment, it is suitable for large datasets; applications include classifying human alpha-satellite repeats and deriving a tree of life from 16S/18S rDNA sequences.
Genomic sequences are widely used to infer the evolutionary history of a given group of individuals. Many methods have been developed for sequence clustering and tree building. In the early days of genome sequencing, these were often limited to hundreds of sequences, but with the surge of high-throughput sequencing it is now common to have millions of sampled sequences at hand. We introduce MNHN-Tree-Tools, a high-performance set of algorithms that builds multi-scale, nested clusters of sequences found in a FASTA file. MNHN-Tree-Tools does not rely on multiple sequence alignment and can thus be used on large datasets to infer a sequence tree. Herein, we outline two applications: a classification of human alpha-satellite repeats and the derivation of a tree of life from 16S/18S rDNA sequences.