Article

Influence of substitution model selection on protein phylogenetic tree reconstruction

Journal

GENE
Volume 865

Publisher

ELSEVIER
DOI: 10.1016/j.gene.2023.147336

Keywords

Substitution models of protein evolution; Substitution model selection; Molecular evolution; Phylogenetic tree reconstruction; Protein evolution; Phylogenetics

Probabilistic phylogenetic tree reconstruction has traditionally been performed by selecting the best-fitting substitution model of molecular evolution. Recent studies have proposed that this step is unnecessary, leading to a debate in the field. However, in the case of protein sequences, the selection of a substitution model has a significant influence on phylogenetic tree reconstruction. Our analysis of real and simulated data shows that phylogenetic trees reconstructed using the best-fitting substitution model of protein evolution are the most accurate in terms of topology and branch lengths.
Probabilistic phylogenetic tree reconstruction is traditionally performed under a best-fitting substitution model of molecular evolution, previously selected according to diverse statistical criteria. Interestingly, some recent studies proposed that this procedure is unnecessary for phylogenetic tree reconstruction, leading to a debate in the field. In contrast to DNA sequences, phylogenetic tree reconstruction from protein sequences is traditionally based on empirical exchangeability matrices that can differ among taxonomic groups and protein families. Considering this aspect, here we investigated the influence of selecting a substitution model of protein evolution on phylogenetic tree reconstruction through analyses of real and simulated data. We found that phylogenetic tree reconstructions based on a selected best-fitting substitution model of protein evolution are the most accurate, in terms of topology and branch lengths, compared with those derived from substitution models whose amino acid replacement matrices are far from the selected best-fitting model, especially when the data have large genetic diversity. Indeed, we found that substitution models with similar amino acid replacement matrices produce similar reconstructed phylogenetic trees, suggesting the use of substitution models as similar as possible to the selected best-fitting model when the latter cannot be used. Therefore, we recommend the traditional protocol of selection among substitution models of evolution for protein phylogenetic tree reconstruction.
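
The abstract refers to selecting a best-fitting model "according to diverse statistical criteria" without naming them. As an illustration only, the sketch below assumes two commonly used criteria, AIC and BIC, and ranks candidate models by score. The log-likelihoods, parameter counts, and alignment length are hypothetical placeholders, not values from the paper, and the function names are invented for this example.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike information criterion: lower is better."""
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood, num_params, num_sites):
    """Bayesian information criterion: lower is better."""
    return num_params * math.log(num_sites) - 2 * log_likelihood


def select_best_model(candidates, num_sites, criterion="bic"):
    """Return the name of the lowest-scoring model and all scores.

    candidates maps a model name to (log_likelihood, num_free_params).
    """
    scores = {}
    for name, (lnl, k) in candidates.items():
        if criterion == "bic":
            scores[name] = bic(lnl, k, num_sites)
        else:
            scores[name] = aic(lnl, k)
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical log-likelihoods and free-parameter counts, for illustration only.
candidates = {
    "JTT": (-12450.3, 37),
    "WAG": (-12390.8, 37),
    "LG": (-12310.5, 37),
    "LG+G": (-12105.2, 38),
}

# num_sites is a hypothetical alignment length.
best, scores = select_best_model(candidates, num_sites=350)
for name in sorted(scores, key=scores.get):
    print(f"{name}: {scores[name]:.1f}")
print("Selected model:", best)
```

In practice the log-likelihoods would come from fitting each candidate model to the alignment with a phylogenetics program; the ranking step itself is as simple as shown here.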
