Review

Inferring the Deep Past from Molecular Data

Journal

GENOME BIOLOGY AND EVOLUTION
Volume 13, Issue 5

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/gbe/evab067

Keywords

phylogenetics; tree of life; substitution models; eukaryote origins; microbial evolution

Funding

  1. Royal Society University Fellowship
  2. NERC [NE/P00251X/1]
  3. Foundation for Science and Technology (FCT) [UIDB/04326/2020]
  4. operational programs CRESC Algarve 2020 and COMPETE 2020 [ALG-01-0145-FEDER-022121, ALG-01-0145-FEDER-022231]
  5. European Research Council under the European Union's Horizon 2020 research and innovation program [714774]

The analysis of molecular sequences can help distinguish between alternative hypotheses for ancient relationships, but the choice of phylogenetic methods and data types is crucial for recovering historical signal. Using overly simple models to analyze molecular data can influence the topology of the trees obtained, as demonstrated by examples where inferred relationships have changed with improvements in models and methods. Because molecular sequence data are complex and heterogeneous, even the best available models may struggle with some problems, which highlights the importance of maintaining a critical attitude towards phylogenetic analyses.
There is an expectation that analyses of molecular sequences might be able to distinguish between alternative hypotheses for ancient relationships, but the phylogenetic methods used and types of data analyzed are of critical importance in any attempt to recover historical signal. Here, we discuss some common issues that can influence the topology of trees obtained when using overly simple models to analyze molecular data that often display complicated patterns of sequence heterogeneity. To illustrate our discussion, we have used three examples of inferred relationships which have changed radically as models and methods of analysis have improved. In two of these examples, the sister-group relationship between thermophilic Thermus and mesophilic Deinococcus, and the position of long-branch Microsporidia among eukaryotes, we show that recovering what is now generally considered to be the correct tree is critically dependent on the fit between model and data. In the third example, the position of eukaryotes in the tree of life, the hypothesis that is currently supported by the best available methods is fundamentally different from the classical view of relationships between major cellular domains. Since heterogeneity appears to be pervasive and varied among all molecular sequence data, and even the best available models can still struggle to deal with some problems, the issues we discuss are generally relevant to phylogenetic analyses. It remains essential to maintain a critical attitude to all trees as hypotheses of relationship that may change with more data and better methods.
