Journal
SYSTEMATIC BIOLOGY
Volume 59, Issue 1, Pages 42-58
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/sysbio/syp075
Keywords
Alignment; GenBank; phyloinformatics; rogue taxa; supermatrix; taxonomy; Testudines; turtle phylogeny
Funding
- National Science Foundation Doctoral Dissertation Improvement Grant [DEB-0710380]
- UC Davis Center for Population Biology
- National Science Foundation [DEB-0507916, DEB-0213155, DEB-0817042]
- UC Davis Agricultural Experiment Station
As phylogenetic data sets grow in size and number, objective methods to summarize this information are becoming increasingly important. Supermatrices can combine existing data directly and in principle provide effective syntheses of phylogenetic information that may reveal new relationships. However, several serious difficulties exist in the construction of large supermatrices that must be overcome before these approaches will enjoy broad utility. We present analyses that examine the performance of sparse supermatrices constructed from large sequence databases for the reconstruction of species-level phylogenies. We develop a largely automated informatics pipeline that allows for the construction of sparse supermatrices from GenBank data. In doing so, we develop strategies for alleviating some of the outstanding impediments to accurate phylogenetic inference using these approaches. These include taxonomic standardization, automated alignment, and the identification of rogue taxa. We use turtles as an exemplar clade and present a well-supported species-level phylogeny for two-thirds of all turtle species based on an approximately 50 kb supermatrix consisting of 93% missing data. Finally, we discuss some of the remaining pitfalls and concerns associated with supermatrix analyses, provide comparisons to supertree approaches, and suggest areas for future research.
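To make the notion of a sparse supermatrix concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each locus has already been aligned and is given as a mapping from taxon name to aligned sequence. The function names `build_supermatrix` and `missing_fraction` are hypothetical, as is the choice of `?` to pad taxa that were not sampled for a locus.

```python
from typing import Dict, List


def build_supermatrix(loci: List[Dict[str, str]]) -> Dict[str, str]:
    """Concatenate per-locus alignments into one row per taxon.

    Assumption: every locus is a non-empty, pre-aligned {taxon: sequence}
    mapping. Taxa absent from a locus are padded with '?' for its full width.
    """
    taxa = sorted({taxon for locus in loci for taxon in locus})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for locus in loci:
        lengths = {len(seq) for seq in locus.values()}
        if len(lengths) != 1:
            raise ValueError("each locus must be non-empty and aligned to one length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(locus.get(taxon, "?" * width))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


def missing_fraction(matrix: Dict[str, str], missing: str = "?-") -> float:
    """Return the share of cells that are missing-data or gap characters."""
    total = sum(len(seq) for seq in matrix.values())
    absent = sum(seq.count(ch) for seq in matrix.values() for ch in missing)
    return absent / total if total else 0.0
```

When most taxa are sampled for only a few loci, as with sequences gathered from a public database, the padded cells dominate the matrix. A missing fraction such as the 93% reported above is this quantity computed over the full concatenated alignment.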