4.7 Article

Phylogenetic analysis at deep timescales: Unreliable gene trees, bypassed hidden support, and the coalescence/concatalescence conundrum

期刊

MOLECULAR PHYLOGENETICS AND EVOLUTION
卷 80, 期 -, 页码 231-266

出版社

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.ympev.2014.08.013

关键词

Mammalia; Supermatrix; Species tree; Gene tree; Deep coalescence

资金

  1. NSF United States [EF0629860]

向作者/读者索取更多资源

Large datasets are required to solve difficult phylogenetic problems that are deep in the Tree of Life. Currently, two divergent systematic methods are commonly applied to such datasets: the traditional supermatrix approach (= concatenation) and shortcut coalescence (= coalescence methods wherein gene trees and the species tree are not co-estimated). When applied to ancient clades, these contrasting frameworks often produce congruent results, but in recent phylogenetic analyses of Placentalia (placental mammals), this is not the case. A recent series of papers has alternatively disputed and defended the utility of shortcut coalescence methods at deep phylogenetic scales. Here, we examine this exchange in the context of published phylogenomic data from Mammalia; in particular we explore two critical issues the delimitation of data partitions (genes) in coalescence analysis and hidden support that emerges with the combination of such partitions in phylogenetic studies. Hidden support increased support for a clade in combined analysis of all data partitions relative to the support evident in separate analyses of the various data partitions, is a hallmark of the supermatrix approach and a primary rationale for concatenating all characters into a single matrix. In the most extreme cases of hidden support, relationships that are contradicted by all gene trees are supported when all of the genes are analyzed together. A valid fear is that shortcut coalescence methods might bypass or distort character support that is hidden in individual loci because small gene fragments are analyzed in isolation. Given the extensive systematic database for Mammalia, the assumptions and applicability of shortcut coalescence methods can be assessed with rigor to complement a small but growing body of simulation work that has directly compared these methods to concatenation. We document several remarkable cases of hidden support in both supermatrix and coalescence paradigms and argue that in most instances, the emergent support in the shortcut coalescence analyses is an artifact. By referencing rigorous molecular clock studies of Mammalia, we suggest that inaccurate gene trees that imply unrealistically deep coalescences debilitate shortcut coalescence analyses of the placental dataset. We document contradictory coalescence results for Placentalia, and outline a critical conundrum that challenges the general utility of shortcut coalescence methods at deep phylogenetic scales. In particular, the basic unit of analysis in coalescence analysis, the coalescence-gene, is expected to shrink in size as more taxa are analyzed, but as the amount of data for reconstruction of a gene tree ratchets downward, the number of nodes in the gene tree that need to be resolved ratchets upward. Some advocates of shortcut coalescence methods have attempted to address problems with inaccurate gene trees by concatenating multiple coalescence-genes to yield gene trees that better match the species tree. However, this hybrid concatenation/coalescence approach, concatalescence, contradicts the most basic biological rationale for performing a coalescence analysis in the first place. We discuss this reality in the context of recent simulation work that also suggests inaccurate reconstruction of gene trees is more problematic for shortcut coalescence methods than deep coalescence of independently segregating loci is for concatenation methods. (C) 2014 Elsevier Inc. All rights reserved.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据