Journal
SYSTEMATIC BIOLOGY
Volume 63, Issue 1, Pages 66-82
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/sysbio/syt059
Keywords
Bayesian; false negative; false positive; gene tree; maximum likelihood; phylogenetics; Robinson-Foulds; statistical consistency
Funding
- National Science Foundation [DBI-1103639, DBI-1146722]
- New Zealand Marsden Fund
- National Institute for Mathematical and Biological Synthesis
- National Science Foundation
- US Department of Homeland Security
- U.S. Department of Agriculture through NSF [EF-0832858]
- University of Tennessee, Knoxville
- National Science Foundation, Directorate for Biological Sciences, Division of Biological Infrastructure [1300426]
To infer species trees from gene trees estimated from phylogenomic data sets, tractable methods are needed that can handle dozens to hundreds of loci. We examine several computationally efficient approaches (MP-EST, STAR, STEAC, STELLS, and STEM) for inferring species trees from gene trees estimated using maximum likelihood (ML) and Bayesian approaches. Among the methods examined, we found that topology-based methods often performed better using ML gene trees, whereas methods employing coalescent times typically performed better using Bayesian gene trees, with MP-EST, STAR, STEAC, and STELLS outperforming STEM under most conditions. We examine why the STEM tree (also called GLASS or Maximum Tree) is less accurate on estimated gene trees by comparing estimated and true coalescence times, performing species tree inference using simulations, and analyzing a great ape data set while keeping track of false positive and false negative rates for inferred clades. We find that although true coalescence times are more ancient than speciation times under the multispecies coalescent model, estimated coalescence times are often more recent than speciation times. This underestimation can lead to increased bias and lack of resolution with increased sampling (of either alleles or loci) when gene trees are estimated with ML. The problem appears to be less severe using Bayesian gene-tree estimates.
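The sampling effect described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation design or the STEM software. It assumes that a GLASS-style estimate of a pairwise divergence time is the minimum coalescence time across loci, that true coalescence times equal the speciation time plus an exponential waiting time, and that estimation error is Gaussian. The names `simulate_pair_times` and `glass_minimum`, and the values chosen for `tau`, `mean_wait`, and `noise_sd`, are hypothetical and serve only as illustration.

```python
import random


def simulate_pair_times(n_loci, tau, mean_wait, noise_sd, rng):
    """Simulate coalescence times for one species pair across loci.

    Assumptions (illustrative only): the true time is the speciation
    time tau plus an exponential waiting time with mean mean_wait, and
    the estimated time is the true time plus Gaussian error, floored at 0.
    """
    true_times, est_times = [], []
    for _ in range(n_loci):
        # True coalescence always predates the speciation event.
        t = tau + rng.expovariate(1.0 / mean_wait)
        # Estimation error can push the estimate below tau.
        e = max(0.0, t + rng.gauss(0.0, noise_sd))
        true_times.append(t)
        est_times.append(e)
    return true_times, est_times


def glass_minimum(times):
    """GLASS-style pairwise estimate: the minimum time across loci."""
    return min(times)


rng = random.Random(1)
tau = 1.0  # hypothetical speciation time, arbitrary units
true_times, est_times = simulate_pair_times(
    200, tau, mean_wait=0.5, noise_sd=0.4, rng=rng
)

# Compare the minimum over nested subsets of loci.
for n in (5, 20, 200):
    print(
        n,
        round(glass_minimum(true_times[:n]), 3),
        round(glass_minimum(est_times[:n]), 3),
    )
```

Because each subset of loci is nested in the next, the minimum can only stay the same or decrease as loci are added. With true times it is bounded below by `tau`, but with noisy estimates it has no such bound, which is one way to read the abstract's point that underestimated coalescence times can increase bias as sampling increases.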