Article

Systematic errors in orthology inference and their effects on evolutionary analyses

Journal

ISCIENCE
Volume 24, Issue 2

Publisher

CELL PRESS
DOI: 10.1016/j.isci.2021.102110

Funding

  1. BBSRC [BB/R016240/1]
  2. European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant [764840]
  3. European Research Council [ERC-2012-AdG 322790]

The availability of complete gene sets from many organisms makes it possible to identify genes that are unique to, or lost from, specific clades. Simulation shows that errors in predicting orthologs increase with higher rates of evolution. Simulated data containing only orthology prediction errors closely replicate findings from empirical data, suggesting that downstream analyses should account for the effect of these errors on inferred patterns of gene evolution.
The availability of complete sets of genes from many organisms makes it possible to identify genes unique to (or lost from) certain clades. This information is used to reconstruct phylogenetic trees; to identify genes involved in the evolution of clade-specific novelties; and for phylostratigraphy, that is, identifying the ages of genes in a given species. These investigations rely on accurately predicted orthologs. Here we use simulation to produce sets of orthologs that experience no gains or losses. We show that errors in identifying orthologs increase with higher rates of evolution. We use the predicted sets of orthologs, with errors, to reconstruct phylogenetic trees; to count gains and losses; and for phylostratigraphy. Our simulated data, containing information only from errors in orthology prediction, closely recapitulate findings from empirical data. We suggest that published downstream analyses must be informed to a large extent by errors in orthology prediction that mimic expected patterns of gene evolution.
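
The link between evolutionary rate and apparent gene age can be illustrated with a toy sketch. This is not the authors' simulation: it assumes, purely for illustration, that the chance of detecting an ortholog decays exponentially with rate times divergence, and the function names (`detection_probability`, `apparent_age`, `mean_apparent_age`) and divergence values are hypothetical. Every gene is truly present in every species, so any apparent "loss" or "young" gene arises only from detection error.

```python
import math
import random


def detection_probability(rate, divergence):
    """Assumed model: detection chance decays as exp(-rate * divergence)."""
    return math.exp(-rate * divergence)


def apparent_age(rate, divergences, rng):
    """Deepest divergence at which an ortholog of the gene is detected.

    The gene is truly present in every species; a missed detection
    is an orthology prediction error, not a real loss.
    Returns 0.0 if the gene is detected in no other species.
    """
    detected = [
        d for d in divergences
        if rng.random() < detection_probability(rate, d)
    ]
    return max(detected, default=0.0)


def mean_apparent_age(rate, divergences, n_genes=2000, seed=0):
    """Average apparent age over n_genes genes sharing the same rate."""
    rng = random.Random(seed)
    total = sum(apparent_age(rate, divergences, rng) for _ in range(n_genes))
    return total / n_genes


if __name__ == "__main__":
    # Hypothetical divergence times between the focal species and four others.
    divergences = [0.5, 1.0, 2.0, 4.0]
    for rate in (0.0, 0.5, 2.0):
        print(rate, round(mean_apparent_age(rate, divergences), 2))
```

Under these assumptions, faster-evolving genes appear younger on average even though no gene was ever gained or lost, which is the kind of artifact the abstract describes for phylostratigraphy.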
