Article

Do Alignment and Trimming Methods Matter for Phylogenomic (UCE) Analyses?

Journal

SYSTEMATIC BIOLOGY
Volume 70, Issue 3, Pages 440-462

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/sysbio/syaa064

Keywords

Alignment; concatenated analysis; phylogenomics; sequence length heterogeneity; species-tree analysis; trimming

Funding

  1. U.S. National Science Foundation [DEB 1655690]

Different alignment and trimming methods can significantly impact various aspects of phylogenomic data sets, but generally have little impact on the recovery and support values for well-established clades. The choice of phylogenetic methods has the strongest impact on the phylogenetic results, with concatenated analyses recovering significantly more well-established clades with stronger support than species-tree analyses.
Alignment is a crucial issue in molecular phylogenetics because different alignment methods can potentially yield very different topologies for individual genes. But it is unclear if the choice of alignment methods remains important in phylogenomic analyses, which incorporate data from hundreds or thousands of genes. For example, problematic biases in alignment might be multiplied across many loci, whereas alignment errors in individual genes might become irrelevant. The issue of alignment trimming (i.e., removing poorly aligned regions or missing data from individual genes) is also poorly explored. Here, we test the impact of 12 different combinations of alignment and trimming methods on phylogenomic analyses. We compare these methods using published phylogenomic data from ultraconserved elements (UCEs) from squamate reptiles (lizards and snakes), birds, and tetrapods. We compare the properties of alignments generated by different alignment and trimming methods (e.g., length, informative sites, missing data). We also test whether these data sets can recover well-established clades when analyzed with concatenated (RAxML) and species-tree methods (ASTRAL-III), using the full data (approximately 5000 loci) and subsampled data sets (10% and 1% of loci). We show that different alignment and trimming methods can significantly impact various aspects of phylogenomic data sets (e.g., length, informative sites). However, these different methods generally had little impact on the recovery and support values for well-established clades, even across very different numbers of loci. Nevertheless, our results suggest several best practices for alignment and trimming. Intriguingly, the choice of phylogenetic methods impacted the phylogenetic results most strongly, with concatenated analyses recovering significantly more well-established clades (with stronger support) than the species-tree analyses.
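
The alignment properties named in the abstract (length, informative sites, missing data) and the subsampling of loci can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names `alignment_stats` and `subsample_loci`, the use of parsimony-informative sites as the measure of "informative sites", the treatment of `-`, `?`, and `N` as missing data, and the rounding rule for subsample size are all assumptions made for illustration.

```python
import random
from collections import Counter

# Assumption: gaps, '?' and 'N' all count as missing data.
MISSING = set("-?N")


def alignment_stats(seqs):
    """Summarize one alignment given as {taxon: aligned_sequence}.

    Returns alignment length, the number of parsimony-informative sites
    (columns with at least two states that each occur in >= 2 taxa),
    and the fraction of cells that are missing.
    Ambiguity codes other than N are treated as ordinary states here.
    """
    rows = [s.upper() for s in seqs.values()]
    if not rows or not rows[0]:
        raise ValueError("alignment is empty")
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("sequences must have equal length")

    informative = 0
    missing = 0
    for column in zip(*rows):
        counts = Counter(c for c in column if c not in MISSING)
        missing += len(column) - sum(counts.values())
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1

    return {
        "length": length,
        "informative_sites": informative,
        "missing_fraction": missing / (length * len(rows)),
    }


def subsample_loci(loci, fraction, seed=0):
    """Draw a random subset of loci (e.g. fraction=0.1 or 0.01)."""
    loci = list(loci)
    k = max(1, round(len(loci) * fraction))
    return random.Random(seed).sample(loci, k)


# Toy alignment, invented for illustration only.
toy = {
    "taxon1": "ACGT-A",
    "taxon2": "ACGTNA",
    "taxon3": "AGGTCA",
    "taxon4": "AGTTCA",
}
stats = alignment_stats(toy)
ten_percent = subsample_loci(range(5000), 0.10)
one_percent = subsample_loci(range(5000), 0.01)
```

Running `alignment_stats` on the output of each alignment and trimming combination for the same locus would give a like-for-like comparison of the properties the abstract lists. Fixing `seed` keeps the subsampled locus sets reproducible across methods.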
