Article

Prospects for Building Large Timetrees Using Molecular Data with Incomplete Gene Coverage among Species

MOLECULAR BIOLOGY AND EVOLUTION (2014)

Journal

MOLECULAR BIOLOGY AND EVOLUTION

Volume 31, Issue 9, Pages 2542-2550

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/molbev/msu200

Keywords

divergence time; timetree; incomplete data

Categories

Biochemistry & Molecular Biology; Evolutionary Biology; Genetics & Heredity

Funding

National Institutes of Health (NIH) [HG002096-12, HG006039-02]
National Science Foundation [DBI-0850013]
[NIH R25 GM099650]
Div Of Biological Infrastructure
Direct For Biological Sciences [1445187] Funding Source: National Science Foundation

Abstract

Scientists are assembling sequence data sets from increasing numbers of species and genes to build comprehensive timetrees. However, data are often unavailable for some species and gene combinations, and the proportion of missing data is often large for data sets containing many genes and species. Surprisingly, there has not been a systematic analysis of the effect of the degree of sparseness of the species-gene matrix on the accuracy of divergence time estimates. Here, we present results from computer simulations and empirical data analyses to quantify the impact of missing gene data on divergence time estimation in large phylogenies. We found that estimates of divergence times were robust even when sequences from a majority of genes for most of the species were absent. From the analysis of such extremely sparse data sets, we found that the most egregious errors occurred for nodes in the tree that had no common genes for any pair of species in the immediate descendant clades of the node in question. These problematic nodes can be easily detected prior to computational analyses based only on the input sequence alignment and the tree topology. We conclude that it is best to use larger alignments, because adding both genes and species to the alignment augments the number of genes available for estimating divergence events deep in the tree and improves their time estimates.
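
The abstract notes that problematic nodes can be found from the alignment and the topology alone. The following is a minimal Python sketch of one way such a check might look; it is not the authors' implementation. It assumes the tree is given as nested tuples of species names and that gene coverage is a dictionary mapping each species to the set of genes sequenced for it. The names `find_unlinked_nodes`, `tree`, and `coverage` are hypothetical. A node is flagged when no two of its immediate descendant clades have any gene in common, which is equivalent to saying that no pair of species drawn from different child clades shares a gene.

```python
from itertools import combinations


def find_unlinked_nodes(tree, coverage):
    """Return internal nodes whose immediate descendant clades share no gene.

    tree     -- nested tuples of species names (assumed representation)
    coverage -- dict: species name -> set of gene names present for it
    Each flagged node is reported as a sorted tuple of its descendant species.
    """
    flagged = []

    def visit(node):
        # A leaf is a species name; return its species set and its genes.
        if isinstance(node, str):
            return frozenset([node]), set(coverage[node])

        # Internal node: collect (species, genes) for every child clade.
        child_info = [visit(child) for child in node]

        # The node is "linked" if some pair of child clades has a gene
        # in common, i.e. some cross-clade species pair shares a gene.
        linked = any(g1 & g2
                     for (_, g1), (_, g2) in combinations(child_info, 2))

        species = frozenset().union(*(s for s, _ in child_info))
        genes = set().union(*(g for _, g in child_info))
        if not linked:
            flagged.append(tuple(sorted(species)))
        return species, genes

    visit(tree)
    return flagged


# Illustrative (made-up) data: the clade (D, E) shares no gene with (A, B, C).
tree = ((("A", "B"), "C"), ("D", "E"))
coverage = {
    "A": {"g1"},
    "B": {"g1", "g2"},
    "C": {"g2"},
    "D": {"g3"},
    "E": {"g3"},
}
print(find_unlinked_nodes(tree, coverage))
```

In this toy example only the root is flagged, because its two child clades have disjoint gene sets. Adding a single shared gene to any species on the other side of the root (for example, `g1` for species `D`) would remove the flag, which mirrors the abstract's point that adding genes and species increases the data available for deep nodes.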
