Journal
GENOME BIOLOGY AND EVOLUTION
Volume 9, Issue 6, Pages 1519-1527
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/gbe/evx109
Keywords
computer simulation; disease genes; BLAST; false negatives; de novo gene origination
Funding
- U.S. National Institutes of Health [GM120093]
Abstract
Phylostratigraphy, originally designed for gene age estimation by BLAST-based protein homology searches of sequenced genomes, has been widely used for studying patterns and inferring mechanisms of gene origination and evolution. We previously showed by computer simulation that phylostratigraphy underestimates gene age for a non-negligible fraction of genes and that the underestimation is more severe for genes with certain properties, such as fast evolution and short protein sequences. Consequently, many previously reported age distributions of gene properties may have been methodological artifacts rather than biological realities. Domazet-Lošo and colleagues recently argued that our simulations were flawed and that phylostratigraphic bias does not impact inferences about gene emergence and evolution. Here we discuss conceptual difficulties of phylostratigraphy, identify numerous problems in Domazet-Lošo et al.'s argument, reconfirm phylostratigraphic error using simulations suggested by Domazet-Lošo and colleagues, and demonstrate that a phylostratigraphic trend claimed to be robust to error disappears when genes likely to be error-resistant are analyzed. We conclude that extreme caution is needed in interpreting phylostratigraphic results because of the inherent biases of the method, and that reanalysis using genes exhibiting no error in realistic simulations may help reduce spurious findings.